On the Long-Range Effects of Concentration Camp Internment
on Nazi Victims: 25 Years Later


Netta Kohn Dor-Shav
Bar-Ilan University, Ramat-Gan, Israel


This study investigated long-range effects of concentration camp internment on survivors. The underlying hypothesis was that the extreme and prolonged stress suffered by victims could be expected to have resulted in impoverishment of personality and dedifferentiation in both personality and perceptual-cognitive functioning. Subjects were 42 survivors, 26 males and 16 females, ranging in age from 42 to 67, and 20 controls, 9 men and 11 women, of similar age, background, and education but who had escaped incarceration. The study was carried out in Israel, using a double-blind paradigm. All subjects were tested with three of Witkin's measures of psychological differentiation: the Embedded-Figures Test, Human-Figure Drawings, and the Block Design subtest of the Wechsler Adult Intelligence Scale. They were also given the Rorschach Inkblot Test, the Bender Gestalt Test, and the 16 Personality Factor Questionnaire. Findings tend to support the hypothesis. Survivors manifested evidence of impoverishment and constriction of personality and appeared to be less accessible, less connected, and more labile. In perceptual-cognitive functioning, they tended to be more global, less complex, and less differentiated, and there were indications of breakdown of ego boundaries. Some evidence suggests that earlier incarceration led to more severe impairment. The price is still being paid!


This study was carried out with the help of funding granted by the Committee for Research of Bar-Ilan University. The author gratefully acknowledges the support given.

Requests for reprints should be sent to Netta Kohn Dor-Shav, Department of Psychology, Bar-Ilan University, Ramat-Gan, Israel.

Although it is more than a quarter of a century since the liberation of Auschwitz in January of 1945, it is still difficult to address oneself objectively to the effects of that holocaust on those few who survived. Nevertheless, it was felt that if the survivors of that ineffable madness were not reached soon, scientific assessment of the sequelae of concentration camp internment would be made impossible by the passage of time.

The present study, therefore, was initiated 25 years after the liberation in an attempt to ascertain the price paid, and still being paid, by the victims of the most inhuman stress experience ever perpetrated by man against man, before it became too late to reach the victims.

The psychological literature, of course, reflects considerable interest in the effects of laboratory-induced or situational stress, as well as in the aftereffects of stress experiences such as that of the prisoner-of-war, criminal imprisonment, sensory-deprivation states, and natural and man-made disasters.

However, none of these compares with (and only the prisoner-of-war experience in some cases approached) the degree of stress experienced in the concentration camp, a stress situation that taxed to the extreme human ability to survive in the face of manifold, prolonged, and unprecedented stresses.

It was, of course, recognized immediately after the liberation that the camp experiences had left indelible effects on the few pathetic survivors (Boder, 1949; Grygier, 1954; Nieremberski, 1946). Unfortunately, though understandably, no scientific, systematic study of survivors has been undertaken over the years. There have been, in fact, only a very few studies, and these tended to focus on attitudes, adjustment, clinical symptomatology, and life-style, suggesting a unique, long-lasting syndrome among survivors characterized by chronicity, vagueness, depression, fatigue, sleep disorders, emotional lability, blunted affect, and withdrawal (Chodoff, 1963; de Wind, 1968; Niederland, 1968; Simenauer, 1968). Also noted were the absence of psychotic symptomatology and difficulties in interpersonal relationships.

Thus a review of the literature found no 
studies that attempted to assess the intellec- 
tual and perceptual-cognitive functioning of 
the survivors, nor were there studies aimed 
at a comprehensive and objective evaluation 
of personality functioning; no attempts to 
identify, assess, and define the nature of the 
psychological deficit or to examine the per- 
sonality structure and function of survivors 
in a controlled study seemed to have been 
made. It was to fill this gap that the present 
study was undertaken. 

Specifically, the study made use of clinical psychological tests and research measures in an attempt to assess the personality functioning, ego boundaries, and perceptual-cognitive processes of concentration camp victims.

Underlying hypotheses were that the severe and prolonged stress endured by survivors may be expected to have resulted in impoverishment of personality, a process of dedifferentiation in the perceptual-cognitive sphere, and primitivization in both personality and perceptual-cognitive functioning.

Thus the research aimed to answer the following basic questions: (a) Is there evidence of primitivization and/or dedifferentiation in perceptual-cognitive or personality functioning of survivors? and (b) Are there demonstrable decrements or differences in personality structure or functioning in a group of survivors as compared to controls? In addition, the study was interested in ascertaining if there was evidence of breakdown of ego boundaries, differential effects related to age at the time of incarceration, or actual brain damage.


Method 
Subjects 


There were a total of 62 subjects of both sexes in the study, ranging in age from 42 to 67, and living within the greater Tel Aviv area in Israel. All subjects were tested individually in their homes or places of work. Testing sessions lasted from 2 to 3 hours.

Group 1, the main experimental group, consisted of 42 survivors (26 men and 16 women), who had been incarcerated in the most severe of the Nazi concentration camps during World War II. Subjects in this group were taken from a list supplied by Yad Vashem, the Israeli organization and institute devoted to the memory and study of the holocaust, with whom all survivors in Israel are registered. Of those included in the study, 13 had been interned at Auschwitz, 5 at Bergen Belsen, 6 in Maidenek, 8 in Treblinka, and 10 in other camps. Ages in this group ranged from 42 to 67.

Group 2, the control group, consisted of 20 subjects (9 men and 11 women) matched to subjects in Group 1 for age, academic training, occupation, and place of origin. These subjects were chosen via the population registry and were contacted individually. (As there were, unfortunately, an unexpected number of refusals, the N for this group is considerably smaller than that for Group 1. The subjects in this group ranged in age from 42 to 65.)


Measures 


Each subject was tested with a battery of tests chosen for their appropriateness for measuring perceptual-cognitive and personality functioning.

Three measures of psychological differentiation defined by Witkin, Dyk, Faterson, Goodenough, and Karp (1962) were included.

1. The Embedded Figures Test (EFT). This test, introduced by Witkin et al. (1962), is considered to be a basic and reliable measure of psychological differentiation. The subject is required to disembed, from within a complex geometrical constellation, a previously presented simpler figure that has been integrated into the new context.

2. The Human Figure Drawing (HFD), as a measure of psychological differentiation, is scored for degree of sophistication and articulation of the body concept.

3. The Block Design subtest of the Wechsler Adult Intelligence Scale (WAIS, [...] the assessment of a hypothesis of brain damage.

In addition, the battery included (a) the Rorschach [...] a measure of both perceptual [...] as well as of [...] the Bender Gestalt [...] personality questionnaires that has been carefully translated into Hebrew and for which norms are in the process of being established in Israel.


Design and Procedure

To mitigate the effects of possible bias, the study was conducted using a double-blind paradigm. Only one assistant (the one in charge of identifying, locating, and contacting subjects, as well as of providing testers with lists of individuals to be tested and of keeping records) knew to which group a particular subject belonged.

Data were gathered by a total of five testers, working two or three at a time. Each tester was carefully trained and supervised by the chief researcher before being allowed to test subjects in the field. Testers did not know the purpose of the study or the nature of the sample groups, nor did they themselves score the tests in the battery. Testers were given lists of subjects to be tested, and they contacted the subjects and tested them individually at the subject's home or place of business. Testing sessions lasted 2-3 hours. Testers were MA students in psychology, with one exception who was in the [...]st year of BA studies.

Scoring was done by a total of three scorers, all of whom were MA candidates in psychology. Again, in accord with the double-blind nature of the study, the scorers remained ignorant of the purpose of the research, as well as of the nature of the sample groups included.

Interrater reliability for scores used was ascertained and was uniformly high, ranging from .42 to .99. (See next section.) However, in the interest of facilitating data processing, one of the scorers whose own intrarater reliability was excellent (r = .89) rescored all protocols. It was these scores that were used in the final statistical analyses.


Scoring


Tests in the battery were scored according to the following criteria: (a) For the EFT, the score used was the total time in seconds that the subject took to disembed the 12 figures, with a maximum of 180 assigned for any one figure or for failure (Witkin et al., 1962). (b) The Block Design subtest of the WAIS was assigned its scaled score value. (c) HFDs were scored according to the 9-point version of the Marlens scale (Dershowitz, 1971; Witkin et al., 1962). (d) Rorschach protocols were scored both according to standard clinical methodology and for cognitive-differentiation complexity using specially developed criteria described below. (A detailed account of the scoring method and criteria is under preparation for separate publication.)
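The EFT scoring rule stated above (total seconds over 12 figures, with 180 assigned as the ceiling for any one figure or for a failure) can be sketched as follows. This is only an illustrative sketch: the function name, the use of `None` to mark a failed figure, and the input checks are assumptions made here, not part of the published procedure.

```python
from typing import Optional, Sequence

MAX_SECONDS_PER_FIGURE = 180  # ceiling per figure, as stated in the text
N_FIGURES = 12                # number of figures, as stated in the text


def eft_total_score(times: Sequence[Optional[float]]) -> float:
    """Sum the solution times in seconds across the 12 figures.

    A value of None marks a failure (an assumed convention) and is
    scored as the ceiling; times above the ceiling are capped at it.
    """
    if len(times) != N_FIGURES:
        raise ValueError(f"expected {N_FIGURES} solution times, got {len(times)}")
    total = 0.0
    for t in times:
        if t is None:
            total += MAX_SECONDS_PER_FIGURE
        elif t < 0:
            raise ValueError("solution times must be non-negative")
        else:
            total += min(t, MAX_SECONDS_PER_FIGURE)
    return total
```

Under this rule the score can range from 0 to 2,160 seconds, and higher totals indicate slower disembedding.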

Briefly, a number of scores, as well as a differentiation index and complexity ratio, were developed to reflect the degree of differentiation, integration, and complexity of the Rorschach responses given by the subject.


Thus WA and WD are both scores reflecting un- 
differentiated whole responses, with the former re- 
ferring to simple, vague, or global gestalts and the 
latter to simple aggregates of unintegrated or un- 
defined forms (e.g., “bunches of cotton” on Card 7). 

WB and WC, conversely, are scores reflecting both 
differentiated and relatively complex percepts, as 
well as ones having a good deal of integration; the 
distinction between the two is that WB is assigned 
to responses that seem to progress from the whole 
to the parts and WC for the reverse process. (WB 
responses are relatively rare in any population.) 

In addition, two composite scores were developed. 
The first of these, the differentiation index, reflects 
not only the quality of the whole responses but also 
the number of parts or details of the blots that were 
specifically included in the percepts, as well as the 
total response pattern. Thus the higher the index, 
the better the level of articulation and integration. 

Specifically, the formula for the differentiation index used is:

\[
\text{Differentiation index} = \frac{[3(WA)] + [3(WB) + ND_b] + [3(WC) + ND_c] + [3(WD)] + ND + 4\,Nd}{R}
\]

where WA, WB, WC, and WD reflect the qualities described above, ND_b and ND_c refer to the large details of these differentiated responses, respectively, and ND and Nd refer to the total number of large detail (D) and small detail (d) responses given independent of the whole responses. The denominator, R, refers to the total number of responses in the protocol.

The complexity ratio reflects the degree of complexity and differentiation of the Rorschach whole responses by contrasting the number of simple, global, amorphous responses to the differentiated, complex ones. Thus the formula for the ratio is

\[
\text{Complexity ratio} = \frac{WB + WC}{WA + WD}
\]

The higher the ratio, the more complex and articulated the responses.
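As a worked illustration of the complexity ratio, the following minimal sketch computes (WB + WC) / (WA + WD) for a single protocol. The function name, the example counts, and the handling of a zero denominator are assumptions introduced here; the text does not say how a protocol with no WA or WD responses was treated.

```python
import math


def complexity_ratio(wa: int, wb: int, wc: int, wd: int) -> float:
    """Return (WB + WC) / (WA + WD) for one Rorschach protocol."""
    if min(wa, wb, wc, wd) < 0:
        raise ValueError("response counts must be non-negative")
    simple = wa + wd        # undifferentiated whole responses
    complex_ = wb + wc      # differentiated, integrated whole responses
    if simple == 0:
        # Assumption: the source does not define this case.
        return math.inf if complex_ > 0 else math.nan
    return complex_ / simple


# Hypothetical protocol: 4 WA, 1 WB, 1 WC, 0 WD responses.
example = complexity_ratio(wa=4, wb=1, wc=1, wd=0)  # 2 / 4 = 0.5
```

A protocol dominated by vague whole responses therefore yields a ratio below 1, and one dominated by integrated whole responses yields a ratio above 1.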

With regard to the reliability and validity of these specially developed measures, the following was found:

Inter- and intrarater reliabilities for WA and WD were better than .90 (p < .02), and for WB and WC reliabilities ranged from .56 to .78 (p < .05). For the differentiation index reliabilities were better than .88 (p < .02). (Reliabilities for the complexity ratio, as it is simply a mathematical function of WA, WB, WC, and WD, were not computed separately.)

In addition, in conjunction with an earlier study 
that may be considered a pretest with respect to 
the present one, test-retest data yielded reliability 
coefficients for the scores and indices significant at 
between .05 and .001. 

With regard to validity, of course, the problem is more complex. A cogent argument for face and content validity can be made based on the nature of the procedures and criteria used. The fact that the measures differentiated between groups, according to prediction, in a number of pretests using independent variables such as age and current stress, as well as in the present study (see Results), is also considered to be in a sense validating.

The 16 PF was scored for its component factors, 
and these scores were compared across groups. 


Results 


Addressing ourselves first to the data rele- 
vant to the hypothesis that the severe and ex- 
treme stress endured by the concentration 
camp victims would manifest itself in de- 
creased differentiation, survivors were com- 
pared to the control group on those measures 
that were considered relevant for cognitive- 
complexity differentiation, that is, the EFT, 
the Block Design subtest of the WAIS, the 
HFD, and the specially developed Rorschach
measures described in the preceding section. 
Table 1 summarizes these findings and also 
includes data for the Bender Gestalt test. 

Inspection of the data in Table 1 shows 
that no differences were found between group 
means on the EFT, BD, or HFD, nor were 
there significant differences on the Bender. 
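For readers who wish to check a group comparison from the summary statistics alone, the sketch below computes a pooled-variance Student t from means, standard deviations, and group sizes. The text does not state which form of the t test was used, so the pooled-variance form is an assumption made here, and the function name is hypothetical. The example uses the Block Design row of Table 1 (n = 40 and 20, M = 9.05 and 8.75, SD = 2.40 and 2.22).

```python
import math
from typing import Tuple


def pooled_t(m1: float, sd1: float, n1: int,
             m2: float, sd2: float, n2: int) -> Tuple[float, int]:
    """Pooled-variance t statistic and degrees of freedom from summary stats."""
    if n1 < 2 or n2 < 2:
        raise ValueError("each group needs at least two observations")
    df = n1 + n2 - 2
    # Weighted average of the two sample variances.
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    # Standard error of the difference between the two means.
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / se, df


# Block Design, concentration camp group versus controls.
t_value, df = pooled_t(9.05, 2.40, 40, 8.75, 2.22, 20)
print(round(t_value, 2), df)  # roughly 0.47 with 58 degrees of freedom
```

A t of this size with 58 degrees of freedom is far from any conventional significance threshold, in line with the statement that the groups did not differ on Block Design. Other rows need not reproduce exactly under this assumed form of the test.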

With regard to the Rorschach, findings 
were mixed, with support for the hypothesis 
emerging only from significant differences on 
the differentiated WC measure, as well as for 
the complexity ratio, though there was also 
evidence of trends in the expected direction 
for the WA and WB scores, as well as for the 
differentiation index. 

It had been noted, however, as inspection 
of Table 1 shows, that there was a great 
deal of variability with regard to the EFT, 
Witkin et al.’s (1962) basic measure of psy- 
chological differentiation, especially among 
the concentration camp group. Upon analysis, this difference yielded an F of 2.28, significant at the .05 level, a finding that in itself indicates, at the very least, that the two groups cannot be considered as having come from the same population. Similarly, a significant difference in variability between the groups was found for the differentiation index (.05 < p < .10).

This was confirmed in a nonparametric analysis, in which the groups were each divided into high, medium, and low performers. The difference between the groups was found to be significant at the .10 level of confidence (one-tailed).1

1 It should be noted that due to the considerable difficulty, in general, in obtaining significant differences in clinical research, it was decided to use the 10% level of confidence.


Table 1

Comparison of All Concentration Camp Survivors and Control Subjects on
Perceptual-Cognitive and Differentiation Measures

Measure and group           n          M       SD       t    p (one-tailed)a
Block Design
  CC                       40       9.05     2.40
  Control                  20       8.75     2.22     .47    .32
Embedded Figures
  CC                       35   1,028.57    572.5
  Control                  15   1,009.10    399.2     .12    .45
Bender Gestalt
  CC                       42      72.26    19.94
  Control                  20      66.10    18.34     .79    .22
Differentiation index
  CC                       42       2.55      .57
  Control                  20       2.70      [?]     [?]    [?]
WA
  CC                       42       4.23     2.58
  Control                  20       3.40     2.03    1.28    [?]
WB
  CC                       42        .81      .90
  Control                  20       1.00      .85     [?]    [?]
WC
  CC                       42       1.02     1.24
  Control                  20       1.55     1.53    1.45    .08*
WD
  CC                       42        .60      .95
  Control                  20        [?]      [?]     [?]    [?]
Complexity ratio
  CC                       42        .50      .53
  Control                  20        .84      .80    1.70    [?]
Human Figure Drawings
 Males
  CC                       38       7.31     2.29
  Control                  14       7.92     1.90     .89    [?]
 Females
  CC                       37       7.35     2.09
  Control                  14       7.78     2.08     .66    [?]

Note. CC = concentration camp.
a Asterisks indicate significance.


With regard to the HFDs that had not, in this analysis, shown any significant differences between groups when scored according to the Marlens scale, it became apparent during subsequent content analyses (not reported here) that the drawings of the concentration camp group tended to be rendered with fragmented broken lines, with vague gestalt, and with inconsistency in the level of detail. Thus, such drawings might include secondary features or items of dress, and even accessories, while omitting essentials. Such differences, however, did not necessarily reflect themselves in the scoring, as a drawing with fragmented lines, which indicated sex and included most body parts, although having obvious evidence of a breakdown of the body boundary concept, might well get a better score than a relatively "healthy" stick figure or a "snowman." These stick figure and snowman styles, however, though obviously undifferentiated and rather primitive, do not reflect a breakdown of ego boundaries. It was decided, therefore, to submit the drawings to a senior clinical psychologist with instructions to separate them into two groups, those that he thought belonged to concentration camp victims and those not. The prediction was clearly significant, χ²(1) = 9.47, p < .01, indicating that on a global comparison, clinical differences are apparent and that the drawings can be differentiated.

At this point—especially due to the varia- 
bility of the scores—it was decided to ascer- 
tain whether age was a factor that should be 
taken into further account. With an age range 
of from 42 to 67, it was reasonable to as- 
sume that there were a considerable number 
of subjects included for whom a diminution 
of powers might be expected on the basis of 
age alone. 

Accordingly, both concentration camp and 
control subjects were divided into above 50 
and 50 and below brackets, and comparisons 
were made for the scores. This analysis con- 
firmed that age was indeed a factor on the 
differentiation index, as well as for the Bender. 

This led to a replication of all analyses for the perceptual-cognitive and differentiation measures, excluding subjects above 50 in both groups. Table 2 presents the findings.

As will be noted from Table 2, excluding the older subjects, as expected, sharpened the findings, resulting in considerable support of the hypothesis. For Witkin et al.'s measures of psychological differentiation it was found that although findings for the EFT were not significant, the Marlens ratings for the HFD yielded significant differences for both the male and female drawings in the predicted direction.

With regard to the special Rorschach scores, the differentiation index yielded a significant difference between the groups, and the level of significance for the WC measure increased from .08 to .02. (No computation was made for the complexity ratio for the younger subjects taken alone, since it was apparent that any differences obtained would reflect the difference on WC with some contribution from the trend (p < .20) for an increase in WA in the concentration camp group.)

For the Bender, too, the elimination of the older subjects resulted in a significant difference between the groups, t(24) = 1.60, .05 < p < .10, one-tailed.


Table 2

Comparison of Subjects Aged 50 and Below on the Perceptual-Cognitive and
Differentiation Measures

Measure and group           n          M       SD       t    p (one-tailed)a
Block Design
  CC                       17       8.88     1.79
  Control                 [?]        [?]      [?]     [?]    [?]
Embedded Figures
  CC                       15   1,014.59   621.30
  Control                   5     877.20   480.65     [?]    .33
Bender Gestalt
  CC                       17      70.23    19.75
  Control                 [?]        [?]      [?]    1.60    [?]
Differentiation index
  CC                       17        [?]      [?]
  Control                 [?]        [?]      [?]     [?]    [?]
WA
  CC                       17       4.64     3.12
  Control                   8       3.37      [?]    1.12    .14
WB
  CC                       17        .66      .90
  Control                   8        [?]      [?]     [?]    [?]
WC
  CC                       17       1.00     1.02
  Control                 [?]        [?]      [?]     [?]    [?]
WD
  CC                       17        .55     1.04
  Control                   8        [?]      [?]     [?]    [?]
Human Figure Drawings
 Males
  CC                       15       6.93     2.31
  Control                 [?]        [?]      [?]     [?]    [?]
 Females
  CC                       15       6.80      [?]
  Control                 [?]        [?]      [?]     [?]    [?]

Note. CC = concentration camp.
a Asterisks indicate significance.


Table 3

Comparison of All Concentration Camp Survivors and Control Subjects on
Selected Rorschach Determinants

Score and group             M       SD       t    p (one-tailed)a
Productivity
  CC                    18.92    11.85
  Control               20.70    11.60     .55    .29
Form %
  CC                    69.92    19.48
  Control               69.60    14.26     .07    .48
Form + %
  CC                    75.45    11.81
  Control               77.60     7.86     .74    .23
Human Movement
  CC                     1.44     1.43
  Control                2.05     1.50    1.53    .07*
Form Primary
  CC                      .92      [?]
  Control                 .30      .57    1.58    .06*
Color Primary
  CC                      .62      .95
  Control                 .55     1.23     .27    .39
Pure Color
  CC                      [?]      .68
  Control                 .20      [?]    1.36    .09*
Texture
  CC                      .46      .59
  Control                 .72      .49    1.70    .05*
Sum C
  CC                     1.49     1.82
  Control                1.30     1.62     .41    .34
Fisher Penetration
  CC                     2.17     2.45
  Control                2.90     2.34    1.14    .13

Note. CC = concentration camp. n = 42 for CC subjects, and n = 20 for controls.
a Asterisks indicate significance.


Of considerable interest, of course, was the 


analysis of the Rorschach in accordance with 
the usual scoring methods. Thus the Ror- 
schach protocols were rated not only for 
cognitive-differentiation complexity but also 
for the generally accepted formal and content 
criteria. Tables 3 and 4 present the results for Rorschach determinants and content, respectively, for the concentration camp and control groups as a whole. Table 3, in addition, presents data for the Combined Color (Sum C) and Fisher Penetration scores.

As will be seen from Table 3, there were significant differences between the groups in the predicted direction for the Human Movement (.05 < p < .10), Texture (p < .05), and Pure Color (.05 < p < .10) determinants, respectively. In addition, there was a significant difference (.05 < p < .10) on the Form Primary measure, favoring the concentration camp subjects. (Preliminary indications with regard to Productivity and the Form + %, reflecting deficits in the concentration camp group, were not confirmed, although slight trends remained in the predicted directions.) The differences for the Sum C and the Fisher Penetration scores were not significant.


Table 4

Comparison of All Concentration Camp Survivors and Control Subjects on
Selected Content Aspects of the Rorschach

Content and group           M       SD       t    p
Human
  CC                    15.21    12.10
  Control               16.76    12.78     [?]    [?]
Animal
  CC                    51.07    17.84
  Control               45.29    16.48    1.25    [?]
Anatomy
  CC                     6.28     9.06
  Control                5.48     8.35     .34    .37
Fire
  CC                     1.67     4.55
  Control                 .81     2.79     .80    .22
Abstract
  CC                      .40     1.31
  Control                1.86     3.68    2.33    .01*
Sex
  CC                      .63     2.85
  Control                2.76     9.67    1.34    [?]*
Blood
  CC                     1.42     5.35
  Control                 .86     2.99     .45    .33
Nature
  CC                     2.14     5.70
  Control                5.00     7.37    1.71    .05*

Note. CC = concentration camp. n = 43 for CC subjects, and n = 21 for controls.
* Asterisks indicate significance.


With respect to content, as will be noted, concentration camp victims, as expected, gave significantly fewer abstract responses (p < .01), nature responses (p < .01), and sex responses (.05 < p < .10). In addition, again as would be expected, they tended to give more animal responses (.10 < p < .20) and showed a weak trend toward an increase in incidence of fire responses.

Next, because age was a serious factor in 
our analyses for the cognitive-differentiation 
complexity Rorschach measures, it was de- 
cided to also compare several of the tradi- 
tional Rorschach measures for the younger 
subjects only. Table 5 summarizes the findings. 

As can be noted from Table 5, this analysis
again yielded a significant difference for the 
Pure Color determinant, as well as a signifi- 
cant difference on Color Primary. Findings 
for the Texture and the Form Primary deter- 
minants, however, were weaker than for the 
entire sample, though strong trends remained 
(ps < .10 and .13, respectively). 

Finally, scores on the 16 PF were com- 
pared for each scale, both for the whole sam- 
ples as well as for the younger subjects taken 
separately. Tables 6 and 7 present these data, 
respectively. 

As can be noted from the tables, only the 
Imaginative scale showed a significant differ- 
ence between the concentration camp and con- 
trol groups, with trends apparent also for the 
Conscientious and Experimenting factors. 
Taking the younger subjects alone, only the 
Shrewd and Controlled scales yielded signifi- 
cant differences, with a trend evident for the 
Self-sufficient factor also. Adopting the standard meanings of the factors for the purpose of interpretation, and exercising a considerable degree of caution, we find that, as expected, concentration camp victims were rated as more practical, careful, and conventional than the controls (Imaginative). They tend to be more conservative, staid, and persevering (Conscientious), as well as being more likely to accept established ideas and more tolerant of traditional difficulties (Experimenting). For the younger subjects only, the differences indicated the victims to be more penetrating, more shrewd, more calculating, and more worldly (Shrewd); less disciplined and more likely to follow their own urges and to disregard protocol (Controlled); as well as tending to be more group dependent and more prone to group adherence than the controls (Self-sufficient).


Table 5

Comparison of Subjects Aged 50 and Below on the Selected Rorschach
Determinants

Score and group             M       SD       t    p (one-tailed)a
Productivity
  CC                    22.47    16.02
  Control               20.25     7.72     .37    .36
Form %
  CC                    74.41    13.97
  Control               73.62     9.87     .14    .45
Form + %
  CC                    76.35     9.79
  Control               79.37     5.63     [?]    [?]
Human Movement
  CC                      [?]     1.65
  Control                2.00      [?]     [?]    [?]
Form Primary
  CC                      .83     1.24
  Control                 .25      .46    1.27    [?]
Color Primary
  CC                      .44      .85
  Control                 .00      .00    1.45    .08*
Pure Color
  CC                      .66      .84
  Control                 .25      .46    1.31    .10*
Texture
  CC                      .50      .48
  Control                 [?]      [?]     [?]    [?]

Note. CC = concentration camp. n = 17 for CC subjects; n = 8 for controls.
a Asterisk indicates significance.


It should be noted that although no predictions had been made with regard to the subscales of the 16 PF before the project was initiated, they were made before the data were analyzed. All findings, with the exception of those for the Controlled scale, were predicted, though a number of scales for which predictions had been made did not show significant differences.


Table 6

Comparison of All Concentration Camp Survivors and Control Subjects on the
Sixteen Personality Factor Questionnaire Scales

Scale and group             M       SD       t    p
Warm-hearted
  CC                    11.15      [?]
  Control               11.90     2.42     .68    [?]
Emotionally stable
  CC                    12.40     3.98
  Control               13.45     4.10     [?]    .23
Assertive
  CC                    12.28     4.66
  Control               11.63     4.31     .40    .35
Enthusiastic
  CC                    10.65     3.12
  Control               10.09     4.03     .48    .32
Conscientious
  CC                    14.65     3.44
  Control               13.27      [?]     [?]    .12
Venturesome
  CC                    14.59     6.05
  Control               15.00     4.85     .20    .42
Sensitive
  CC                    10.75     3.68
  Control               11.36     3.98     .47    .32
Suspicious
  CC                    10.50     2.88
  Control               11.81     4.02     .28    .39
Imaginative
  CC                    11.68     2.94
  Control               13.18     2.60    1.49    .07*
Shrewd
  CC                    11.62     2.99
  Control               11.09     1.37     .51    .29
Worried
  CC                    11.71     4.47
  Control               12.09     3.59     .25    .40
Experimenting
  CC                     8.43     2.23
  Control                9.45     3.42     .99    .16
Self-sufficient
  CC                    11.25     3.48
  Control               12.18     4.09     .73    .24
Controlled
  CC                    12.32     2.93
  Control               12.54     2.62     [?]    .43
Tense
  CC                    14.15     5.46
  Control               12.90     4.27     [?]    .25

Note. CC = concentration camp. n = 32 for the concentration camp subjects, and n = 11 for the controls.
* Asterisk indicates significance.



Discussion 


Findings are discussed first in terms of the hypothesis predicting a deficit in cognitive-differentiation complexity for the concentration camp victims, after which the question of personality functioning and possible impoverishment will be addressed.

It was expected that the severe and prolonged stress undergone by the survivors would have resulted in a process of dedifferentiation and primitivization in the perceptual-cognitive sphere, as well as in personality structure and function. Obviously, it is not possible to demonstrate dedifferentiation or change retroactively; however, data in support of a hypothesis of differences between the groups on the measures allow us to draw inferences with regard to the hypotheses.

On measures of psychological differentiation taken from Witkin et al. (1962), findings were mixed. For the EFT a significant difference was found in variability between the groups, and chi-square was significant at the .10 level, lending some, albeit inconclusive, support to the hypothesis. The Block Design subtest of the WAIS did not differentiate, and the HFD, rated according to Marlens' scale, did not differentiate between the groups when the entire age range was included. However, when the older subjects were excluded from the analysis, ratings for psychological differentiation of the drawings yielded significant differences between the groups for both the male and female drawings, as predicted. The above findings, in themselves, indicate some support of the underlying hypothesis.
The failure to find differences on Block Design may be explained by what is, in effect, below-par performance within a rather narrow range for both the concentration camp and the control groups. This finding may be a function of the tendency, found in other studies (Dershowitz, 1971), for Jewish subjects at various age levels to do relatively poorly on the performance subtests of the WAIS, as well as on other measures of psychological differentiation. Thus the fact that group differences were found for the EFT and HFD lends more support to the hypothesis of dedifferentiation here.

It is from the special, original Rorschach 
measures of cognitive-differentiation complex- 
ity that considerable additional support was 
obtained for the hypothesis, however. Thus, 
when the groups as a whole were compared 
to each other, differences were found on the 
WC measure, indicating a lower incidence of 
well-articulated and integrated whole re- 
sponses among the survivors and an overall 
lower complexity ratio, which reflected the 
relative frequency of simple, amorphous, non- 
articulated responses as compared to differ- 
entiated, well-articulated, and integrated ones. 
Further, when the older subjects were ex- 
cluded from the analysis, we found that the 
difference on the differentiation index, which 
had only shown a trend for the groups as a 
whole, became significant. This measure, be- 
cause it takes both the qualitative and the 
quantitative aspects of Rorschach responding 
into account, is considered to be the most 
sensitive of the measures for cognitive-differ- 
entiation complexity. Thus the finding of a 
difference at better than the .10 level in the 
predicted direction lends considerable support 
to the hypothesis. 

Some difficulty is encountered in addressing 
the question of whether the eventual amount 
of impairment was greater for victims incar- 
cerated at an earlier age, as has been sug- 
gested in the literature regarding survivors 
(Eitinger, 1964). Obviously, it is the older 
group of victims who were interned at a later 
age; however, comparison of older to younger 
subjects cannot ignore the effects of the ag- 
ing process itself, which seems to mitigate 
the differences between the concentration 
camp and control groups when taken as a 
whole. The fact that significant differences
with age were found within the control group 
for several of the perceptual-cognitive mea- 
sures, that is, the EFT, the Bender, the WC, 
and the HFD, indicates that the involutional 
process per se leads to poorer performance on
these measures. It becomes all the more interesting, therefore, to note
that for the concentration camp victims, but not for the controls,
there was evidence of a reversal on the differentiation index, with
younger subjects having significantly lower indices, t(41) = 2.24,
p < .02. As other research has shown a decrease with age for this
index, the present finding, in effect, lends some support to the
hypothesis that incarceration at an earlier age was more damaging.


Table 7

Comparison of Subjects Aged 50 and Below
on the Sixteen Personality Factor
Questionnaire Scales

Scale and group          M        SD        t       p(a)

Warm
  CC                   11.21     3.51
  Control              [illegible]
Emotionally stable
  CC                   13.36     4.12
  Control              [illegible]
Assertive
  CC                   11.28     5.06
  Control              [illegible]
Enthusiastic
  CC                   11.42     2.95
  Control              [illegible]
Conscientious
  CC                   14.14     3.71
  Control              [illegible]
Venturesome
  CC                   14.57     4.41
  Control              [illegible]
Sensitive
  CC                   11.07     4.34
  Control              [illegible]
Suspicious
  CC                   10.21     [illegible]
  Control              [illegible]
Imaginative
  CC                   13.28     2.58
  Control              [illegible]
Shrewd
  CC                   [illegible]
  Control              [illegible]
Worried
  CC                   10.64     [illegible]
  Control              10.80     3.11
Experimenting
  CC                    9.07     3.43     1.00     .17
  Control              11.00     4.47
Self-sufficient
  CC                   [illegible]
  Control              [illegible]
Controlled
  CC                   [illegible]
  Control              13.80     2.16     1.49     .08
Tense                  14.85     5.46
  CC                   [illegible]
  Control              [illegible]

Note. CC = concentration camp. n = 14 for CC
subjects, n = 5 for controls.
(a) Asterisks indicate significance.

At this point, the data for the Bender be- 
come of considerable interest as well. It can 
be recalled that when all subjects were taken 
together, there was no significant difference 
between the groups. However, when the older 
subjects were excluded from the samples, a 
difference between the concentration camp 
and control groups did obtain. This finding— 
especially in view of the fact that here, as in 
the case of the differentiation index, a sig- 
nificant intragroup difference between older 
and younger subjects was found in the con- 
trol group but not in the concentration camp 
sample—can be interpreted as reflecting 
greater deficit in the younger subjects. Re- 
garding the possibility of actual brain dam- 
age among concentration camp victims, the 
findings remain inconclusive. Though the 
Bender might suggest damage in the younger 
subjects, there is no reason to suspect that 
head trauma as such would be directed more 
at younger subjects than at older ones. Furthermore,
the Block Design scores, which are
also sensitive to brain damage when it is 
present, were, if anything, better for the con- 
centration camp group both when the samples 
were taken as a whole, as well as when only 
the younger subjects were included, though 
the difference was of course not significant. 

Looking at personality functioning, it was found, as predicted, that
there were significant differences between the groups on a number of
standard Rorschach scores, in particular on those that measure full,
rich, open, and creative functioning. Thus the Rorschach measure that
reflects the quality of the inner life, creativity, and perception of
the other (the Human Movement response) indicated that concentration
camp survivors gave fewer such responses, reflecting a more
constricted, impoverished inner life, decreased creativity, and
limitation in the perception of the human “other.” The lower incidence
of the Texture determinant shows, as predicted, that the survivors had
greater difficulty in being accessible to others and in being
connected, aware, and open, whereas the increase in Color Form and
Color Primary indicates poorer emotional control. In addition, the data
indicating significantly fewer abstract, nature, and sex responses also
indicate a more restricted, impoverished inner life, whereas the trend
to an increase in the number of Animal responses suggests the
possibility that for these subjects, the animal has, in a sense, come
to replace the human. This interpretation, though of course highly
speculative, is given some support by the fact that the animal
responses in these protocols are very frequently given in generic,
rather than in specific, terms, something that is relatively rare
otherwise. Thus we do not find names of specific animals, but rather,
“an animal,” “a wild animal,” “a beast of some sort,” “behemoth,” and
so on, responses that it is felt reflect the subject’s experiences with
the wild beast unleashed in man and the absence of man’s humanity.

From the 16 PF, we found that the survivors, on the whole, tend to be
more conservative, careful, conventional, and practical, as well as
more staid and persevering, qualities that are consonant with the
picture of greater constriction and lessened differentiation gleaned
from the Rorschach data. In addition, the suggestion that the survivors
tend more to follow their own urges and to be disregarding of accepted
protocol fits the finding of an increase in Color Form and Color
Primary on the Rorschach. The finding (Self-sufficient) that the
survivors tend to be more group dependent and more prone to group
adherence supports suggestions to that effect in the literature (Bondy,
1943) and would seem to indicate a basic seeking for safety and
affiliation within a group, rather than genuine interaction and
belonging.

Taken as a body, then, the data can be seen as providing evidence
pointing to impoverishment and dedifferentiation, among concentration
camp survivors, in both perceptual-cognitive and personality
functioning. The clinical picture is made even more poignant when we
consider that the survivors, though tending to seek affiliation and
group belongingness overtly, are impaired in the ability to achieve
actual connectedness and openness; having been hurt so badly, they
would appear to fear close emotional contact. In addition, the
breakdown of ego boundaries suggested by the HFD, as well as the
dedifferentiation process, can be expected, of course, to make relating
objectively difficult.

To summarize: The underlying hypothesis that prolonged and severe
stress leads to impoverishment, primitivization, and dedifferentiation
seems to have found support in a good deal of the data; the price is
still being paid!



Self-disclosure as a Function of State and Trait Anxiety


Amy L. Post and Bruce C. Wittmaier 
Kirkland College 


Mitchell E. Radin 
Hamilton College 


The influence of state and trait anxiety on self-disclosure was investigated. 
Debilitative and facilitative test-anxious subjects participated in a verbal learn- 
ing experiment under high and low anxiety conditions. Self-disclosure was 
elicited by a personal information questionnaire. Subjects’ responses to the 
questionnaire items were assessed for breadth or amount of self-disclosure and 
depth or intimacy of self-disclosure as well as positive-negative self-evaluation 
by content analysis of their statements. Results confirmed the hypothesis that
individuals experiencing state anxiety disclose less than “normals.” These results
are discussed in light of various conceptual approaches to anxiety. 


Although self-disclosure has been studied in relation to a number of
personality, social, and situational variables, the effects of
emotional arousal, specifically anxiety, on self-disclosure have yet to
be experimentally assessed. As self-disclosure is often elicited in
situations such as psychotherapy, in which anxiety may be a
predominating factor, understanding the role of anxiety in
self-disclosure is of considerable importance. The present study,
therefore, attempts to determine the influence of anxiety on
self-disclosure.

Self-disclosure has been variously defined 
as making yourself overt, showing yourself 
so that others can see you (Jourard, 1964), 
and the 


explicit communication to others of some personal 
information which the others would be unlikely to 
acquire unless the person himself disclosed it and 
which is of such a nature that the individual is not 
likely to disclose it to everyone who asks for it. 
(Sermat & Smyth, 1973, p. 332) 


Self-disclosure is, thus, the communication of 
information about one’s affects, behaviors, 
and cognitions with the implication that the 
material disclosed is either secret, intimate, 
or emotionally charged. 


Requests for reprints should be sent to Bruce C.
Wittmaier, who is now at the Lancaster Guidance
Clinic, 630 Janet Avenue, Lancaster, Pennsylvania
17601.


The concept of self-disclosure is quite complex, since it encompasses
both the qualifying of verbalizations and the assessment of the content
and direction of verbalizations. Cozby (1973) has proposed that the
dimensions of self-disclosure are the breadth or amount of information
disclosed, depth or intimacy of this information, and the duration of
time spent in disclosure. Two different parameters, suggested by
Chelune (1975), are the affective manner of presentation of the
disclosed material and the flexibility of the disclosure pattern. In
addition, other authors have focused on the positive-negative
self-evaluative aspect of the content of disclosures (e.g., Sarason &
Ganzer, 1962). In the present study, the breadth, depth, and
positive-negative aspects of individuals’ self-disclosure were
assessed.
The failure of self-report measures of self-disclosure to predict
actual self-disclosure situations has led several authors to emphasize
the need for behavioral assessment of self-disclosure (e.g., Chaiken &
Derlega, 1974). However, there are several issues inherent in the
scoring of the content of verbalizations in terms of self-disclosure
that present problems. These problems include value judgments as to
whether statements about feelings are more disclosing than [...]sured
verbalizations, whether statements in the present tense are more
disclosing than those in the past or future tenses, and whether
negative information about the self is more disclosing than positive
information. In addition, it has been found that the level of
self-disclosure demanded by questions, modeling, or reciprocity in
various situations has a profound effect on the level of
self-disclosure elicited (Chaiken & Derlega, 1974; Cozby, 1973;
Jourard, 1971). In light of these considerations, we have opted to
measure self-disclosure by content analysis of statements made in an
experimental situation with a relatively high demand for
self-disclosure and to assess self-disclosure on the basis of popular
consensus by using a sizable sample of raters for this analysis.

The concept of anxiety, too, is a complex 
one. Anxiety has been defined as “an un- 
pleasant emotional state or condition which 
is characterized by subjective feelings of ten- 
sion, apprehension, and worry, and by activa- 
tion or arousal of the autonomic nervous 
system” (Spielberger, 1972, p. 482). Spiel- 
berger (1972), in attempting to clarify the 
concept of anxiety, makes a distinction be- 
tween state and trait anxiety. He stated that 
“an anxiety trait (A-Trait) is not directly
manifested in behavior, but may be inferred 
from the frequency and the intensity of an 
individual’s elevations in A-State over time” 
(Spielberger, 1972, p. 482). State anxiety (A- 
State) is thus defined by the stress level of 
the situation and the individual’s experience 
of it, whereas trait anxiety (A-Trait) is de- 
fined in terms of the individual’s propensity 
to experience state anxiety. 

Much of the confusion in the anxiety literature is a result of the
failure to examine separately the relationship of A-State and A-Trait
with behavior (Spielberger, 1972; Wittmaier, 1974). In the present
study the self-disclosure of subjects manifesting high or low trait
anxiety was assessed under experimentally created conditions of high or
low state anxiety.

Alpert and Haber (1960) have shown that trait measures of anxiety that
are geared to specific stress situations (e.g., academic testing) have
significantly better predictive validity for both manifest anxiety and
performance in those situations than do general trait measures. The
Alpert-Haber Achievement Anxiety Test (AAT) is an A-Trait measure
composed of two subscales that discriminate responses to anxiety that
improve performance (Facilitating subscale) from those that interfere
with performance (Debilitating subscale). Scores on the Debilitating
subscale correlate positively with other measures of A-Trait, whereas
scores on the Facilitating subscale correlate negatively and thus
reflect low A-Trait (Alpert & Haber, 1960). A useful measure of A-State
is the anxiety factor of the Mood Adjective Checklist (MACL; Nowlis,
1965).

Although there have been no direct em- 
pirical investigations of the influence of anx- 
iety on self-disclosure, research in this area 
with subjects characterized by other traits 
related to anxiety suggests that these subjects 
tend to disclose less than “normal” subjects. 
One such trait (need approval, or social de- 
sirability) may be thought of as apprehen- 
sion about negative evaluation, analogous to 
this type of apprehension in test-anxious sub- 
jects (see Sarason, 1975). Anchor, Vojtisek, 
and Berger (1972) found that psychotic pa- 
tients who scored extremely high on the 
Marlowe-Crowne Social Desirability Scale 
(MCSDS; Crowne & Marlowe, 1964) made 
a significantly lower proportion of self-state- 
ments in a group therapy session than did 
those scoring low or moderately high in so- 
cial desirability. Anxiety over negative evalua- 
tion appeared to outweigh the tendency to 
comply with the demand characteristics of 
therapy for high-need-approval individuals. 
Similarly, Burhenne and Mirels (1970) found 
high need for approval to be correlated with 
low self-disclosure on five essay-type ques- 
tions. Kopfstein and Kopfstein (1973) were 
unable to find a significant correlation be- 
tween self-disclosure and social desirability 
as measured by the MCSDS, but they did 
find that high-need-approval subjects, as measured by other scales,
were more impersonal and more evasive than their low-need-approval
counterparts. Subjects with another trait related to general anxiety,
uncertainty anxiety, were found to anticipate having a more superficial
conversation and more personal discomfort for a subsequent conversation
with an interviewer (Doster & Slaymaker, 1972). Sarason and Ganzer
(1962) reported
that high-test-anxious subjects presented more 
negative self-references than low-anxious sub- 
jects and responded to evaluative threat with 
increased negative self-references when asked 
to describe themselves prior to evaluative 
threat and no-threat testing. 

Other evidence suggests that trait anxiety 
may influence self-disclosure such that the 
anxious individual does not modulate level 
of disclosure to correspond to situational de- 
mands. The anxious individual may be char- 
acterized by overly consistent patterns of 
self-disclosure. Chaiken, Derlega, Bayma,
and Shaw (1975) measured the self-disclo- 
sure of normal subjects and subjects high in 
neuroticism in response to a high- or low- 
disclosing confederate. These authors found a 
significant interaction between neuroticism 
and experimental condition such that the 
disclosure of normal subjects was considerably 
higher in the high-confederate-disclosure con- 
dition than in the low condition, whereas 
neurotic subjects maintained a moderate level 
regardless of condition. Finally, since com- 
munication involving self-disclosure is central 
to affiliative behavior, research on anxiety 
states and affiliation is quite relevant to the 
study of the effect of state anxiety on self- 
disclosure. Sarnoff and Zimbardo’s (1961) 
finding that the desire for affiliation decreases 
with increased anxiety suggests that subjects 
in a high-anxiety condition will also disclose 
less about themselves. 

It was therefore hypothesized that subjects 
in this experiment with higher debilitating 
anxiety, as well as those in a high anxiety 
state, would disclose less about themselves
than other subjects. 


Method 
General Design


Subjects were identified as debilitators (high A-Trait) or facilitators
(low A-Trait) using scores on the AAT. After performing a learning task
under conditions designed to create high or low state anxiety, they
completed a self-disclosure questionnaire. The effectiveness of the
manipulations was checked using scores on the Anxiety factor of the
Mood Adjective Check List. Thus, self-disclosure could be assessed as a
function of both A-Trait and A-State.

Subjects


Subjects were 48 freshmen males drawn at random from the class of 1979
at Hamilton College. Subjects were solicited by telephone to
participate in a “memory experiment” and were paid for their
participation.


Pretesting 


The AAT (Alpert & Haber, 1960) was administered to all freshmen during
orientation. The subject pool was divided between facilitators and
debilitators according to scores on the AAT. Facilitators were those
scoring at least 27 on the Facilitative scale (M = 31.16) and no more
than 24 on the Debilitating scale (M = 19.75). Those with intermediate
scores were not included in either group. Twenty-four individuals from
each group participated in this study and were assigned randomly to
experimental conditions. No mention of the pretesting was made 5-6
months later when the experiment was conducted.
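
The facilitator cutoffs stated above can be written as a simple rule. The sketch below is illustrative only: the function and constant names are hypothetical, and because the passage gives explicit cutoffs for facilitators only, no debilitator rule is assumed here.

```python
# Cutoffs quoted in the Pretesting paragraph (facilitator group only).
FACILITATING_MIN = 27  # "at least 27 on the Facilitative scale"
DEBILITATING_MAX = 24  # "no more than 24 on the Debilitating scale"


def is_facilitator(facilitating_score: int, debilitating_score: int) -> bool:
    """Return True when both stated facilitator cutoffs are met."""
    return (
        facilitating_score >= FACILITATING_MIN
        and debilitating_score <= DEBILITATING_MAX
    )
```

A subject scoring 27 and 24 sits exactly on both cutoffs and qualifies; a score of 26 on the Facilitative scale or 25 on the Debilitating scale does not.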


Procedure 


Each subject participated in the experiment individually. Upon entering
the room for the experiment, the subject was told by the male
experimenter that the study was designed as an investigation of verbal
and nonverbal learning and memory.

All subjects then completed the same learning task. A list of 60 words
was presented on a memory drum with a 4-sec exposure per word, and the
subjects were asked to memorize these words in any manner they wished.
Subjects were tested after each trial until at least 30 words were
recalled. Those in the low-anxiety condition were then told that they
had performed very well on the task.

At this point the procedure for the high- and low-anxiety conditions
began to differ considerably. Both groups completed a digits backward
task. Subjects were read increasing numbers of digits until they failed
to respond accurately. The same number of digits was then repeated. Two
successive failures on any number of digits were considered criterion.
Subjects were then administered nine more trials, two at one digit less
than criterion, three at the criterion level, and four at one digit
more than criterion, in random order, for a net failure effect.
Subjects in the low anxiety condition were told that the task was very
difficult and not to worry about the score. After completion of the
task, they were told that they had performed well. Subjects in the high
anxiety condition were told that the task measured nonverbal memory and
thinking ability. At the end of the digits backward task, the
experimenter sympathetically inquired as to the health of the subject,
ostensibly to ascertain whether the subject had performed as well as he
could have. He did not tell high A-State subjects specifically that
they had performed poorly.
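
The nine follow-up digits backward trials described above can be sketched as a schedule of digit-span lengths. This is a minimal illustration, not the authors' procedure as run: the function name is hypothetical, and the use of a seeded pseudorandom shuffle to produce the "random order" is an assumption.

```python
import random


def follow_up_trial_lengths(criterion: int, rng: random.Random) -> list[int]:
    """Build the nine post-criterion trials: two at criterion - 1,
    three at criterion, and four at criterion + 1, in shuffled order."""
    lengths = [criterion - 1] * 2 + [criterion] * 3 + [criterion + 1] * 4
    rng.shuffle(lengths)  # assumed mechanism for the "random order"
    return lengths


# Example: a subject whose criterion (two successive failures) was 6 digits.
trials = follow_up_trial_lengths(criterion=6, rng=random.Random(0))
```

Because seven of the nine trials are at or above the length the subject has already failed twice, most trials are likely to be failed, which is the "net failure effect" the text refers to.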

Subjects in the high anxiety condition were then 
led into a small adjacent room where a recall test 
for the memorized words was administered. The 
room had two one-way mirrors and was equipped 
with an intercom and a clock with a sweep second 
hand. The experimenter left the subject in the room
after explaining how the clock worked and pro- 
ceeded to give the rest of the directions over the 
intercom. The experimenter, in an authoritative 
voice, told the subject that he would have 


exactly 50 seconds, no more and no less, for the 
next task. Your instructions for the task will be 
given, and then the task will be described. I will 
say “ready, set, begin,” and then I will start the 
clock. When the 50 seconds are up, I will say 
“stop.” You will immediately stop writing and 
raise your pen. Do not complete what you are 
doing and do not start anything new! Do you 
understand? 


The subject was then instructed to put his name on 
the paper in front of him, but as he began to write 
the experimenter said “not yet” to make him aware 
that he was being watched by the experimenter. 
The subject was then instructed to turn over the 
paper and was told: “Now you will have 50 seconds 
to write down all of the words that appeared in the 
memory drum list. Ready! Set! Begin!” 

After the recall test the subject was asked to com- 
plete the MACL (Nowlis, 1965) to assess the effec- 
tiveness of this anxiety manipulation. The anxiety 
scale consisted of seven adjectives (uneasy, tense, 
nervous, jittery, on edge, fearful, and “clutched 
up”), each rated on a 4-point scale, yielding scores 
from a low of 7 to a high of 28. 
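
The scoring of the anxiety scale can be sketched as a sum over the seven adjectives. The function name is hypothetical, and coding each rating as an integer from 1 to 4 is an assumption inferred from the stated score range of 7 to 28.

```python
ANXIETY_ADJECTIVES = (
    "uneasy", "tense", "nervous", "jittery",
    "on edge", "fearful", "clutched up",
)


def anxiety_score(ratings: dict[str, int]) -> int:
    """Sum the seven adjective ratings (each assumed to be coded 1-4)."""
    for adjective in ANXIETY_ADJECTIVES:
        if adjective not in ratings:
            raise ValueError(f"missing rating for {adjective!r}")
        if not 1 <= ratings[adjective] <= 4:
            raise ValueError(f"rating for {adjective!r} must be 1-4")
    return sum(ratings[adjective] for adjective in ANXIETY_ADJECTIVES)


lowest = anxiety_score({a: 1 for a in ANXIETY_ADJECTIVES})
highest = anxiety_score({a: 4 for a in ANXIETY_ADJECTIVES})
```

Under this coding the minimum and maximum totals reproduce the 7 and 28 given in the text.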

Subjects in the low anxiety condition remained 
in the main room after the digits backward task. 
These subjects were given the following instructions: 


The next part of this study requires your com- 
plete cooperation and understanding. You are 
playing an important role in this study, since we 
will be using your scores as control scores. In 
other words, your scores will be used as the basis 
of comparison for other students who will be per- 
forming similar tasks but under somewhat differ- 
ent conditions. Try to do as well as you can, 
since we would like to have a realistic measure of 
the number of items that a person can complete 
in a certain amount of time. Please put your 
name on the top of the paper. OK? You will now 
have about 50 seconds to write down as many 
of the words as you can remember from the word 
list you just learned from the memory drum. 
Please begin and stop when I ask you to. Are you 
ready? All right, begin. 


All of these instructions were delivered without any authoritative tone
of voice. When the 50 sec had elapsed, the experimenter told the
subject to “please stop.” These subjects then completed the MACL.


Self-disclosure Questionnaire 


After completing the MACL following the recall test, all subjects were
asked to complete a self-disclosure questionnaire and another MACL.
They were then briefed on the purpose of the experiment. The
self-disclosure questionnaire consisted of four questions modeled after
questions rated to demand moderate to high levels of intimacy from a
40-item Self-Disclosure Questionnaire (in Jourard, 1971). Each question
had a positive and a negative pole (e.g., abilities and weaknesses), so
that positive and negative self-evaluation could be examined. Two forms
of the questionnaire, with the arrangement of questions and positive
and negative aspects of questions reversed, were administered to
control for order effects. In addition, the ordering of positive and
negative parts of questions was randomized within each form of the
questionnaire. Subjects were given about 10 minutes to write responses
to the questions. The decision as to when to terminate writing was left
to the subject; the subject could write as much or as little as he
chose.

The questions were presented to the subject as a 
possibility for the experimenter to learn significant 
information about the subject’s personality rather 
than as an investigation of self-disclosure per se. 
The following is one form of the self-disclosure 
questionnaire used in this study: 


Often psychological research ignores most of the 
experiences, feelings, and thoughts that make up 
an individual. We believe this leads to many over- 
sights and oversimplifications. We are, therefore, 
interested in knowing more about you as a per- 
son. Please take about 10 minutes to answer the 
following questions as honestly as possible. Your
answers to these questions will be kept strictly 
confidential. 

1. How do you react to other people's criticism
and praise of you? What things about you do 
people tend to criticize and praise? 

2. How satisfied are you with different parts of 
your body, for example, weight, height, build, hair? 
Do members of the opposite sex find you sexually 
desirable? 

3. What are your academic abilities and weak- 
nesses? How do they relate realistically to your 
plans for the future? 

4. What do you do when you feel depressed?
When you feel anxious? When you feel affection- 
ate? When you feel happy? 


Breadth or amount of self-disclosure was assessed in terms of the total
number of words written by each subject.
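
The breadth measure is a plain word count over a subject's written answers. The sketch below assumes that a "word" is any whitespace-separated token; the article does not say how words were counted, and the function name and sample answers are hypothetical.

```python
def breadth_of_disclosure(answers: list[str]) -> int:
    """Total number of whitespace-separated words across all answers."""
    return sum(len(answer.split()) for answer in answers)


# Invented sample answers, for illustration only.
sample_answers = [
    "I usually take criticism badly.",
    "I am fairly satisfied with my height.",
]
total_words = breadth_of_disclosure(sample_answers)
```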


Content Analysis 


Depth or intimacy of self-disclosure and positive-negative
self-evaluation were assessed by content analysis of protocols from the
self-disclosure questionnaire. Subjects’ responses to Question 2, the
question demanding the most intimate answer as determined by
independent raters, were typed in random order on 6 pages of eight
statements each. Five of the six sheets were collated into a leaflet.
Twenty such leaflets were prepared with all possible combinations of
the six sheets to yield 16 separate ratings of intimacy and
self-evaluation for each statement. The leaflet comprised a content
analysis questionnaire, which was distributed to 20 upper-class college
students, chosen at random, who were asked to rate each statement on an
intimacy and a positive-negative scale.
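
The leaflet design can be checked with a short count. There are six ways to choose five sheets out of six, so 20 leaflets must reuse combinations. The sketch below assumes, hypothetically, that the six combinations were cycled through in turn; the article does not say how they were distributed across the 20 leaflets.

```python
from collections import Counter
from itertools import combinations, cycle, islice

N_SHEETS = 6
SHEETS_PER_LEAFLET = 5
N_LEAFLETS = 20


def build_leaflets(n_leaflets: int = N_LEAFLETS) -> list[tuple[int, ...]]:
    """Cycle through every 5-of-6 sheet combination until n_leaflets exist."""
    combos = list(combinations(range(1, N_SHEETS + 1), SHEETS_PER_LEAFLET))
    return list(islice(cycle(combos), n_leaflets))


def ratings_per_sheet(leaflets: list[tuple[int, ...]]) -> Counter:
    """Count how many leaflets (that is, raters) include each sheet."""
    return Counter(sheet for leaflet in leaflets for sheet in leaflet)


leaflets = build_leaflets()
counts = ratings_per_sheet(leaflets)
```

Twenty leaflets of five sheets give 100 sheet ratings spread over six sheets, so under this allocation each sheet, and therefore each statement, is rated 16 or 17 times, in line with the 16 ratings per statement reported above.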

The intimacy, or depth of self-disclosure, of each 
statement was rated on a 9-point Likert-type scale 
from most intimate to nonintimate according to 
guidelines explicated in the instructions for the con- 
tent analysis questionnaire. Similarly, raters were 
asked to assess the subjects’ self-evaluations on a 
9-point Likert-type scale from most positive to most 
negative. The following are the guidelines that raters 
were instructed to follow in assessing the subjects’ 
statements: 


Intimacy Scale 


1. Most intimate: The person wrote things about 
himself that were of an extremely personal, emo- 
tional, secret, or embarrassing nature. 

3. Very intimate: The person wrote things about
himself that were quite personal, emotional, secret,
or embarrassing, although perhaps not consistently
so.

5. Somewhat intimate: The person wrote some
things about himself that were personal, emotional,
secret, or embarrassing, but he may have been eva-
sive or defensive in responding.

7. Less intimate: The person generally said little
about himself that was of a personal, emotional,
secret, or embarrassing nature.

9. Nonintimate: The person said nothing about
himself that was of a personal, emotional, secret,
or embarrassing nature.


Positive-Negative Scale 


1. Most positive: The person is extremely positive
about himself. He is very pleased with his body and
believes members of the opposite sex find him quite
attractive.

3. Positive: Although he may have some reserva-
tions, the person tends to be positive about himself
and appears satisfied.

5. Neutral or ambivalent: The person’s positive
and negative remarks about himself seem to balance
or counteract each other.

7. Although the person may indicate some strong
points he appears dissatisfied or negative about him-
self.

9. Most negative: The person is very displeased
by his appearance and believes that members of the
opposite sex are not attracted to him at all.


Results 


Significant main effects for both state and trait anxiety were found on
the analysis of variance of scores on the Anxiety scale of the MACL
administered immediately after the recall test. A significant main
effect for state was also found on the analysis of variance of scores
on the Anxiety scale of the MACL administered after completion of the
self-disclosure questionnaire (Table 1). Thus, after the recall test,
subjects in the high anxiety condition and those with high debilitating
anxiety reported experiencing moods characterized by terms like tense,
nervous, jittery, and on edge (i.e., anxiety). Subjects in the high
anxiety condition continued to experience more of this type of negative
affect after completing the self-disclosure questionnaire.


Table 1
Means and Analysis of Variance for the
Anxiety Scale of the Mood Adjective Check List

                              Anxiety condition
Measure                        High        Low

After recall test
  Debilitating                18.67      14.25
  Facilitating                14.08       9.91
After self-disclosure
questionnaire
  Debilitating                13.25      10.91
  Facilitating                11.08       9.33

Analysis of variance

Source                   df        MS          F

After recall test
  State (A)               1      111.02    [illegible]
  Trait (B)               1      123.52      5.00*
  A × B                   1       22.69    [illegible]
  Error                  44       24.70
After self-disclosure
questionnaire
  State (A)               1      363.00      4.09*
  Trait (B)               1    [illegible]
  A × B                   1       27.00
  Error                  44       88.66

* p < .01, one-tailed.


Significant main effects for both state and trait were also found on
the analysis of variance of the number of words written on the
self-disclosure questionnaire, the measure of the breadth of
self-disclosure. As predicted, subjects in the high anxiety condition
and those with greater debilitating anxiety wrote significantly fewer
words on the questionnaire (Table 2). Intercorrelations of the number
of words written per question were significant for all but one
comparison. Eight out of nine of these were significant at or beyond
the .001 level, indicating a high degree of consistency in each
subject’s response to the questionnaire and a high reliability for the
measure.
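
Each F value in the analysis-of-variance tables is the mean square for an effect divided by the error mean square. As a rough check, the sketch below recomputes the F ratios from the mean squares printed in Table 2 for the number of words written. The function and variable names are hypothetical; the numbers are copied from the table.

```python
def f_ratio(ms_effect: float, ms_error: float) -> float:
    """F = mean square for the effect / mean square for error."""
    if ms_error <= 0:
        raise ValueError("error mean square must be positive")
    return ms_effect / ms_error


# Mean squares as printed in Table 2 (number of words written).
MS_ERROR = 5159.65
TABLE_2_MEAN_SQUARES = {
    "State (A)": 20090.08,
    "Trait (B)": 16354.08,
    "A x B": 1900.08,
}

f_values = {
    source: round(f_ratio(ms, MS_ERROR), 2)
    for source, ms in TABLE_2_MEAN_SQUARES.items()
}
```

The recomputed values are 3.89 for state, 3.17 for trait, and 0.37 for the interaction, which agree with the printed 3.90, 3.17, and .37 to within rounding.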

The analysis of variance of the depth of 
self-disclosure as rated on the intimacy scale 
of the content analysis questionnaire yielded 
a significant main effect for state. Subjects in 
the high anxiety condition were seen to have 
disclosed less intimately about themselves 
than their control counterparts. There was no 
significant main effect for trait on this mea-
sure nor any interaction effect (Table 3). A
Pearson correlation of .75 (p < .001) was
found for the split-half interrater reliability
on the intimacy scale.


Table 2
Means, Analysis of Variance, and
Intercorrelations of the Number of Words
Written on the Self-disclosure Questionnaire

Condition        Debilitating    Facilitating
High anxiety     207.42          231.75
Low anxiety      235.75          285.25

Analysis of variance
Source           df    MS           F
State (A)        1     20,090.08    3.90*
Trait (B)        1     16,354.08    3.17*
A × B            1     1,900.08     .37
Error            44    5,159.65

Intercorrelation of words written per question
Item             2      3             4             Total
1. Question 1    .13    [illegible]   [illegible]   .68***
2. Question 2           .50**         [illegible]   [illegible]
3. Question 3                         [illegible]   [illegible]
4. Question 4                                       [illegible]

* p < .05, one-tailed.
** p < .01.
*** p < .001.


Table 3
Means and Analysis of Variance for Rated
Intimacy from the Content Analysis
Questionnaire

Condition        Debilitating    Facilitating
High anxiety     6.33            6.70
Low anxiety      5.66            5.63

Analysis of variance
Source           df    MS       F
State (A)        1     4.91     3.21*
Trait (B)        1     .05      .03
A × B            1     .02      .01
Error            44    67.38

* p < .05, one-tailed.

There was no significant main effect on 
the positive-negative scale of the content 
analysis questionnaire (Table 4). The State 
× Trait interaction approached but did not
attain significance, F(1, 44) = 2.23, p < .13, 
two-tailed. The split-half interrater reliability 
was .93 (p < .001). Subjects with high fa-
cilitating anxiety tended to express more
positive self-evaluations in the low anxiety 
condition than the control condition, whereas 
subjects with high debilitating anxiety tended 
to express more positive self-evaluations in 
the experimental condition than in the control. 
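
The F ratios reported in these analyses of variance can be rechecked from the printed mean squares, since each F in a fixed-effects design is the effect mean square divided by the error mean square. The following is a minimal sketch of that check using the Table 2 values. The function names and the rounding tolerance of .02 are illustrative assumptions, not part of the original analysis.

```python
# Recompute F ratios as MS_effect / MS_error from printed mean squares.
# Function names and the tolerance are illustrative assumptions.


def f_ratio(ms_effect, ms_error):
    """F statistic for a fixed-effects ANOVA term: effect MS over error MS."""
    return ms_effect / ms_error


# Mean squares and reported F values as printed in Table 2
# (number of words written on the self-disclosure questionnaire).
TABLE_2_MS_ERROR = 5159.65
TABLE_2_ROWS = {
    "State (A)": (20090.08, 3.90),
    "Trait (B)": (16354.08, 3.17),
    "A x B": (1900.08, 0.37),
}


def check_table(rows, ms_error, tolerance=0.02):
    """Return {source: (recomputed F, reported F, agrees within tolerance)}."""
    report = {}
    for source, (ms_effect, reported_f) in rows.items():
        recomputed = f_ratio(ms_effect, ms_error)
        agrees = abs(recomputed - reported_f) <= tolerance
        report[source] = (round(recomputed, 2), reported_f, agrees)
    return report


if __name__ == "__main__":
    for source, (f, reported, ok) in check_table(TABLE_2_ROWS, TABLE_2_MS_ERROR).items():
        print(f"{source}: recomputed F = {f:.2f}, reported F = {reported:.2f}, agrees = {ok}")
```

The same division applied to the Table 1 trait row after the recall test (123.52 over an error mean square of 24.70) reproduces the printed F of 5.00. Small discrepancies in the second decimal place are to be expected when the printed mean squares were themselves rounded.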


Discussion 


The effectiveness of the situational cues of 
the high anxiety condition to elicit state anx- 
iety was borne out by the difference of scores 
on the Anxiety scale of the MACL completed 
after the recall test. Those in the high anx-
iety condition did, in fact, report more anxiety
than did those in the control condition. In 
both conditions subjects with high debilitat- 
ing test anxiety reported experiencing more 
anxiety than facilitators. This result gives 
credence to Spielberger’s contention that trait 
anxiety, in this case debilitating test anxiety, 
can be operationalized as the propensity to 
experience state anxiety with greater fre-
quency or intensity, even in relatively innoc-
uous testing situations like the control con-
dition. Subjects in the high anxiety condition
reported experiencing more anxiety than con-
trols even after the self-disclosure question-
naire was administered, though now there
were no differences between debilitators and
facilitators. This result may be viewed as sup-
porting the notion of the highly situation-
specific nature of the trait, debilitating test
anxiety.


Table 4
Means and Analysis of Variance for Rated
Self-evaluation from the Content Analysis
Questionnaire

Condition        Debilitating    Facilitating
High anxiety     4.47            4.97
Low anxiety      4.98            4.27

Analysis of variance
Source           df    MS      F
State (A)        1     .11     .06
Trait (B)        1     .13     .07
A × B            1     4.42    2.23
Error            44    1.99

As predicted, high state anxiety resulted in 
less breadth of self-disclosure as well as 
less intimate disclosure. Spence and Spence 
(1966) have theorized that anxiety functions 
to activate dominant responses that have a 
high degree of habit strength. The lower self- 
disclosure of subjects in the high anxiety con- 
dition thus follows, in that introspective, per- 
sonal, emotional statements are certainly not 
the most salient responses in a testing situa- 
tion. Indeed, the physiological arousal in- 
volved in the “fight or flight” response (Selye, 
1956) seems antithetical to introspection.
Sarason (1975) noted, moreover, that the 
internal focus of attention of high-test-anx- 
ious subjects (A-Trait) leads to failure to 
attend to external task-relevant cues. In the 
present study, those in the high anxiety con- 
dition, who reported experiencing greater anx- 
iety than controls both before and after com- 
pleting the self-disclosure questionnaire (and 
to some extent debilitators in the control con- 
dition), may well have been attending to 
their present arousal state. As a result, per- 
haps, of this internal focus of attention, they 
failed to attend to information about them-
selves that was relevant to the self-disclosure
questionnaire. 

Although debilitators in both conditions did 
produce less disclosure quantitatively, no dif- 
ference was seen in the intimacy with which 
debilitators and facilitators wrote about them-
selves. As noted above, the debilitators in
the control condition experienced lower anx- 
iety than those in the high anxiety condition. 
Their intermediate level of arousal may have 
functioned to moderate the number of words 
they wrote but not their overall intimacy. 
Working on the self-disclosure questionnaire 
may have served to moderate the anxiety 
experienced by debilitators. As intimacy was 
rated on the basis of responses to the second 
question (on the alternate form, the third), 
responding to the preceding question(s) may 
have served to dissipate anxiety and, conse- 
quently, to increase intimacy. Although there 
has been no empirical demonstration that 
self-disclosure once elicited functions to ameli- 
orate state anxiety, the finding that self- 
disclosure is greater in a prolonged stress
condition than in a control situation (Altman
& Haythorn, 1965) suggests that self-disclo-
sure may serve to reduce anxiety.

The lower intimacy of self-disclosure of 
subjects in the high anxiety condition may 
have been, above all, a function of the evalua- 
tive threat involved in disclosing intimate 
information about the self in this particular 
situation. The anxiety manipulation used in 
this experiment pivoted on inducing threat 
to self-esteem by making tasks like the digit 
span ego involving and creating a failure ex- 
perience. Although, contrary to expectations, 
this threat did not manifest itself signifi- 
cantly in the subject’s self-evaluations, it may 
have been crucial in the lower self-disclosure 
of those in the high anxiety condition. In- 
deed, low levels of disclosure would seem to 
function to protect the individual from threat 
to his most secret or emotionally charged be- 
havior, affect, and thought, especially when 
this threat is a salient feature of the situa- 
tion. That debilitators in the control condition 
did not differ from facilitators in the inti- 
macy of their self-disclosures again suggests 
that they did not discriminate the self-dis-
closure questionnaire as a threatening stimu-
lus, whereas the recall test was obviously
anxiety evoking for debilitators, again sup- 
porting the notion of the situation-specific na- 
ture of debilitating test anxiety. 

Although this experiment has shown that 
state anxiety arising from both situational 
cues (A-State) and the individual’s propen- 
sity to experience anxiety (A-Trait) results 
in both lower breadth and depth of self-dis- 
closure, much more empirical work is needed 
on the relationship of anxiety to self-disclo-
sure. For example, would the pattern of
findings be similar to those of the present 
study if verbal, rather than written, self-dis- 
closure were used (Anonymous reviewer, Note 
1)? Understanding this relationship is crucial 
to developing effective psychotherapeutic in- 
terventions for anxious patients as well as 
for understanding social maladjustments and 
their consequences for anxious individuals. 
Although this experiment did attempt to 
assess the effect of anxiety on self-disclosure, 
it did not assess the effect of self-disclosure 
on anxiety. If self-disclosure does aid in cop- 
ing with anxiety, as was suggested previously, 
then inducing anxious subjects to self-disclose 
in an appropriate situation should lower their 
level of anxiety. This point, too, needs to be 
investigated. 


Reference Note 


1. Anonymous reviewer. Personal communication,
May 1977.


Effectiveness of Widows’ Groups in Facilitating Change


Carol J. Barrett
Wichita State University


Seventy urban widows participated in one of three group treatments for a
7-week period or a waiting list control group. Two therapists each led a self-
help group, a “confidant” group, and a women’s consciousness-raising group.
Personality, attitude, and behavioral measures were obtained at pretest, post-
test, and at a 14-week follow-up. At posttest, subjects in all conditions had
significantly higher self-esteem, experienced a significant increase in intensity
of grief, and espoused significantly more negative attitudes toward remarriage.
Experimental subjects showed significant improvement in their ratings of future
health and became significantly less other-oriented in their attitudes toward
women relative to the controls. The therapist variable produced few differences
in response to treatment. At follow-up, treatment gains were maintained. Life
changes were significantly more positive in the women’s consciousness-raising
groups, and posttest evaluations of the program by these subjects were signifi-
cantly higher. All treatments resulted in high rates of contact among partic-
ipants in the group.


Research has demonstrated amply that
widowhood is a stage in the life cycle during
which multiple stresses occur. The degree of
change in one’s life necessary for adjustment
to the spouse’s death is deemed greater than
that required for 42 other major life events
(Holmes & Masuda, Note 1). There is strong
evidence that widowed persons experience im-
pairment in both mental and physical health
relative to married persons of the same age
(Barrett, 1977). Psychiatric symptoms (Clay-
ton, 1974; Parkes, 1964), mental illness (Bel-
lin & Hardt, 1958), inception of psychiatric
outpatient and inpatient services (Robertson,
1974; Stein & Susser, 1969), and suicide
(Bock & Webber, 1972; Cosneck, 1966;
Segal, 1969) are all more frequent among the
widowed than the married.


This article is based on the author’s doctoral dis-
sertation research at the University of Southern Cali-
fornia in 1974. A report of the preliminary findings
was made at the Gerontological Society Meeting in
Miami Beach, Florida, in 1973 (The Gerontologist,
13, p. 66).
as one of the group therapists, and Elwin Barrett
and Sally Daniel assisted in the data analysis.

Disabling illness (Woolsey, 1952) and hos-
pitalization for medical problems (California
Department of Public Health, 1958; Rosen-
feld, Katz, & Donabedian, 1957; Rosenfeld,
Mott, & Taylor, 1951) are also more frequent
among the widowed. Physical symptoms were
more prevalent among young widows relative
to married women (Maddison & Viola, 1968;
Parkes & Brown, 1972) but not among an
elderly widowed group (Heyman & Gianturco,
1973). In a landmark study, Kraus and
Lilienfeld (1959) documented higher death
rates among both black and white widowed
persons of both sexes, for all age groups and
for all causes of death, when compared to
married persons of similar age and socioeco-
nomic status. The discrepancy in death rates
between widowed and married persons was
largest at the younger levels.

There are convincing arguments that such
differences in the level of functioning of same-
age married and widowed individuals cannot
be accounted for totally by predisposing char-
acteristics operative prior to marriage, the
effects of a shared unfavorable environment,
artifacts in reporting procedures, and the pos-
sibility of higher remarriage rates by the
healthy widowed (e.g., Parkes, Benjamin, &
Fitzgerald, 1969; Stein & Susser, 1969). With
an occasional exception (Lowenthal, 1964), 
widowhood has been shown to increase social 
isolation (Adams, 1968; Berardo, 1970; Mar- 
ris, 1958). In Lopata’s (1973) extensive study 
of 301 widows over age 50 in the Chicago 
area, as well as in other research (Nuckols, 
1973), loneliness was indeed perceived as the
worst problem. A substantial proportion of 
widowed persons must assume the additional 
responsibility of single parenthood. (Widows 
constitute the largest group of single parents; 
Schlesinger, 1971.) Financial problems are 
often severe (Nuckols, 1973; Palmore, Stan- 
ley, & Cormier, 1963), and legal problems 
complicate the resolution of grief that must 
proceed with little institutionalized support 
(Gorer, 1965). The available evidence, the 
stress theory of mental illness (e.g., Simon, 
1970) and of physical illness (Holmes & 
Masuda, Note 1), and the application of so- 
cial role theory to widowhood (Lopata, 1973) 
support the hypothesis that the experience of 
widowhood itself may produce the observed 
physical and emotional distress. 

The importance of this position is high- 
lighted by the size of the population currently 
and potentially at risk. The 12 million wid- 
owed persons in this country already number 
almost 5% of the total population (U.S. 
Bureau of the Census, 1970). The projected 
increase in the size of the older population
enhances the probability that the proportion 
of widowed persons will continue to increase. 
The probability of widowhood is much higher 
for women, who usually marry men older than 
themselves, despite the fact that they outlive 
men by 7-8 years. Approximately three out 
of four married women will become widows
(Lewis & Berns, 1975). Currently there are
more than four times as many widows as wid-
owers. Widowhood is seldom a brief episode
at the end of the life span; the average dura-
tion of widowhood for widows who do not
remarry and who die from natural causes is
18½ years (Carter & Glick, 1970).

In recognition of the special needs of the 
widowed, the high-risk nature of this group, 
and the dearth of responsive programming in
traditional mental health agencies (Silverman,
1966), a variety of therapeutic programs have
recently been developed by mental health 
professionals (Abrahams, 1972; Miles & 
Hays, 1975; Silverman, 1970; Woods, 1973; 
Antoniak, Note 2; Van Coevering, Note 3; 
Parkes, Note 4) and by lay organizations 
(e.g., NAIM, a Catholic-sponsored organiza- 
tion; THEOS (They Help Each Other Spir- 
itually), a nondenominational organization 
based in Pittsburgh; and American Associa- 
tion of Retired Persons’ Widowed-to-Wid- 
owed Program). Most utilize widowed para- 
professionals and offer individual help to the 
recently widowed; others arrange discussion 
groups. Unfortunately, controlled outcome 
studies of such innovative programs are non- 
existent. 

The present study sought to develop and 
evaluate three different therapeutic group in- 
terventions for widowed women and to com- 
pare the results of treatment to a waiting list 
control group. Although the propensity of 
widowhood is higher among older women, re- 
search suggests that stresses may be greater 
for younger widows (Kraus & Lilienfeld, 
1959; Parkes, 1964; Robertson, 1974). 
Hence, the treatment groups were designed 
to respond to the needs of widows of all ages. 
Since many of the social and economic stresses 
of widowhood persist long after the husband’s 
death, no limit was set on the duration of 
widowhood for participants in the study. 

A basic premise in all three treatment 
groups was that widowed women would be 
able to help each other cope with the stresses 
of their situation (Silverman, 1970). The de- 
sign of each treatment reflected a different 
research base or theoretical posture. The pos-
sible superiority of self-help strategies in com-
parison with traditional psychotherapy among
groups with pervasive common problems (e.g.,
Hurvitz, 1970) provided a rationale for the
self-help groups. The advent of peer counseling
[...] and the research of Lowenthal and
Haven (1968) provided the rationale for the
confidant groups. These investigators have
shown that older persons who have a con- 
fidant have better subsequent mental health 
than those without one. Since many widows 
will have lost a primary confidant in their 
husbands, the development of a compatible 
widow confidant might be therapeutic, par- 
ticularly in light of Lopata’s (1973) analysis 
of the difficulties that widows have in mak- 
ing new friends. The advent of the conscious- 
ness raising group in the women’s movement 
provided a model for the widows’ conscious- 
ness raising groups. Although consciousness 
raising groups enjoy immense popularity and 
personal accounts of their helpfulness to 
women abound, little systematic research has 
accompanied them. Since a number of the 
stresses of widowhood derive from sex roles 
operative both in marriage and subsequently, 
a group experience focusing on sex roles was 
considered to have potential therapeutic value 
to widows. Since there were no previous out- 
come studies to guide the selection of de- 
pendent variables relevant to widowed per- 
sons’ services, a variety of potentially useful 
measures was included. 


Method 
Subjects 


Two hundred and thirty-nine widows who re- 
sponded to a news item in the Los Angeles Times 
or one of six Los Angeles area newspapers that 
briefly described the program were invited to an 
orientation meeting. (Eighty-one subsequent inquiries 
were referred to local community mental health 
centers.) One hundred and twenty-six attended. Of 
these, 83¹ (66%) participated in the program. Ten
out of 28 subjects randomly assigned to the wait- 
ing list control group, and 25 out of 86 assigned 
to treatment did not return. Eight subjects who 
could not participate immediately due to prior com- 
mitments were permitted to join the waiting list 
group (4 did so), but their data were omitted to 
prevent confounding of subject characteristics with
control group status. Eight subjects in the treat- 
ment groups who attended fewer than four of the 
seven sessions (4 confidant subjects, 2 self-help sub- 
jects, and 2 consciousness raising subjects) were 
dropped from the analysis along with 1 control sub- 
ject later identified as a divorcée. This left a total 
of 70 subjects, including 17 controls and 53 as- 
signed to treatment groups. 
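
The subject accounting in this paragraph can be retraced step by step. The sketch below is one reading of the figures, not the author's own tabulation: it assumes that the 83 participants comprise the subjects who returned after random assignment plus the 4 late joiners to the waiting list, and the variable and function names are hypothetical. The group sizes are those printed in Table 1.

```python
# Retrace the subject flow from orientation to the final analysed sample.
# Assumption: the 83 "participants" are returning assigned subjects
# plus the 4 late joiners whose data were later omitted.

ATTENDED_ORIENTATION = 126
ASSIGNED = {"control": 28, "treatment": 86}
DID_NOT_RETURN = {"control": 10, "treatment": 25}
LATE_JOINERS_TO_WAITING_LIST = 4          # participated, data omitted
LOW_ATTENDANCE_DROPPED = {"confidant": 4, "self-help": 2, "consciousness raising": 2}
CONTROLS_EXCLUDED_AS_DIVORCED = 1

# Final group sizes as printed in Table 1 (Distribution of Subjects).
TABLE_1_TREATMENT_GROUPS = {
    "Self-help 1": 8, "Self-help 2": 10,
    "Confidant 1": 8, "Confidant 2": 7,
    "Consciousness raising 1": 11, "Consciousness raising 2": 9,
}


def attrition_summary():
    """Return the counts implied by the reported assignment and dropout figures."""
    returned = {g: ASSIGNED[g] - DID_NOT_RETURN[g] for g in ASSIGNED}
    participants = sum(returned.values()) + LATE_JOINERS_TO_WAITING_LIST
    final_treatment = returned["treatment"] - sum(LOW_ATTENDANCE_DROPPED.values())
    final_control = returned["control"] - CONTROLS_EXCLUDED_AS_DIVORCED
    return {
        "participants": participants,
        "participation_pct": round(100 * participants / ATTENDED_ORIENTATION),
        "final_treatment": final_treatment,
        "final_control": final_control,
        "final_total": final_treatment + final_control,
    }


if __name__ == "__main__":
    for key, value in attrition_summary().items():
        print(f"{key}: {value}")
```

Under this reading the counts agree with the reported 66% participation rate and with the final sample of 53 treatment subjects and 17 controls.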

The subjects ranged in age from 32 to 74, with
a mean age of 55.7 (SD = 9.2 years). The duration
of widowhood ranged from less than 1 month to
22 years, with a mean of 4 years 9 months and a
median of 3 years 9 months. Subjects widowed for
2 years or less comprised about one third of the
sample. Ten subjects had been widowed over [illegible]
years. The majority (87%) had been married
once. About a third still had children at home.
Twenty-eight were Jewish; 23, Protestant; [illegible],
Catholic; and 9 indicated some other religious cate-
gory. Fifty-three percent were employed, 30% were
retired or not employed by choice, and 17% were
unemployed. More than 70% of the sample had at
least some college education, and 16% had a gradu-
ate or professional degree. The modal current
monthly income from all sources was $600-$900; the
modal household income prior to widowhood was
over $1,500 a month.


Procedure 


Each subject attended one of four orientation
meetings, which began with a brief lecture on wid-
owhood. They were told that they would be at-
tending one of three types of discussion groups
focusing on “specific problems of widowhood” (the
self-help groups), “the development of friendships”
(the confidant groups), or “the roles of women in
society” (the consciousness raising groups). Subjects
completed demographic information and the pretest
dependent measures and indicated their availability
for meeting times (without knowledge of which
group would meet at which time). A $15 check
payable at the first group session and to be returned
at follow-up was required to reduce subject at-
trition.²

All subjects who attended orientation (except [illegible]
who withdrew) were assigned as randomly as pos-
sible to one of three treatment conditions or a
waiting list control group. Several factors prevented
complete random assignment. The first was a limita-
tion on the availability of subjects. No subject was
assigned to a group that it was impossible for her
to attend. To prevent confounding of subject char-
acteristics such as employment status with treat-
ment, one group of each of the treatments met on
a weekday, and one met on Saturday.

A second factor was the desirability of pairing
confidant group subjects with someone who lived
nearby to increase the likelihood of continued rela-
tionships posttreatment. To ensure geographic prox-
imity, confidant group assignments were made from
among the two most heavily represented of the five
toll-free Los Angeles County regions designated by
the telephone company. Subjects in one group all
resided in the San Fernando Valley; subjects in the
other group lived in the central city area. (Addi-
tional subjects from these regions were assigned to
other treatment conditions.)


1 Inquiries were received from all over Los Angeles
County. Phone calls from a number of potential sub-
jects hoping for a neighborhood group suggested that
the attendance at orientation and the reduced num-
ber of actual participants were in part a reflection
of the excessive travel time required in the city.

2 Informal feedback suggested that this procedure
may have backfired. Some potential subjects who at-
tended orientation but did not subsequently join a
group may have been deterred by the fee. Several
had no checking account.

The third factor was the desirability of beginning 
the groups as soon as possible after orientation to 
reduce the likelihood of attrition. Subjects from the 
first orientation meetings were assigned to the first 
therapist or the control group; subjects from the 
final meetings were assigned to the second therapist 
or the control group. The final number of subjects 
in each group appears in Table 1. 

All experimental subjects began a 7-week treat- 
ment group within 1-2 weeks of orientation. Ap- 
proximately 14 weeks after the posttest (13-15 
weeks), and 6 months after the program announce- 
ment, all subjects attended a follow-up session. Af- 
ter completion of data forms, the preliminary results 
of the study were described. Control subjects par-
ticipated in a treatment group following the col-
lection of posttest data and hence were omitted from 
the follow-up analysis. 

All groups met for 2 hours once a week for 7 
weeks at the University of Southern California. Two 
(nonwidowed) female doctoral students in clinical 
psychology each led one of three different groups 
pretested in a pilot study. The purpose of the self- 
help groups was to encourage participants to help 
each other find solutions to the problems of widow- 
hood. These groups were told that widowed persons 
are the experts on widowhood. The therapist's role 
was to facilitate discussion; all direct requests for 
advice were referred back to group members, and 
reticent members were encouraged to share their 
experiences. The therapist praised group members
when specific suggestions were offered to others. The 
problems discussed were those initiated by the mem-
bers themselves and included loneliness, grief, single
parenting, reduced financial resources and employ-
ment difficulties, decisions about living arrangements,
strained relationships with relatives and married
friends, new relationships with men, and legal
problems.

The purpose of the confidant groups was to facili- 
tate the development of a close friendship between 
pairs of widows. Subjects were paired for the dura- 
tion of the group in one of two ways. At the close 
of the first group discussion,³ subjects could con-
fidentially indicate preferences for pair assignments.
Those who did not wish to make this decision were 
assigned primarily on the basis of matched age and 
duration of widowhood. The group format con-
sisted of an intimacy training task in pairs, during
which the therapist moved from pair to pair as
needed, followed by a discussion of the experience
by the whole group. The intimacy tasks were ar-
ranged in gradually increasing difficulty over the
7-week period; they proceeded from a request to
find three things the pair had in common to a re-
quest to share personal problems and to make an
explicit offer of help to one’s confidant.


Table 1
Distribution of Subjects

Group                        n
Self-help 1                  8
Self-help 2                  10
Confidant 1                  8
Confidant 2                  7
Consciousness raising 1      11
Consciousness raising 2      9
All experimental subjects    53
Waiting list control         17
Total no. subjects           70

The purpose of the consciousness raising groups 
was to facilitate the participants’ awareness of how 
their experiences as widows relate to them as women. 
The group structure was modeled after that de- 
veloped by the Consciousness Raising Committee of 
the Los Angeles chapter of the National Organiza- 
tion for Women (Freeman, Note 5). Each group 
was given a list of possible sex role topics of par-
ticular relevance to widows⁴; the group selected the
topic to be discussed 1 week in advance. Each ses-
sion consisted of (a) introductory comments about
the topic from the leader’s own experience, (b) 5-10 
minutes for each member in sequence around the 
group to express her reactions to the topic without 
interruption, and (c) open discussion by all partici-
pants. “No confrontation” was an enforced rule.
Members could ask questions after the individual 
reactions but were not permitted to make comments 
until their “turns.” 


Measures 


Eighteen personality, attitude, and behavioral 
measures were obtained by written self-report; 12 
at pretest, posttest, and follow-up; 2 at posttest and 
follow-up; and 4 at follow-up only (see Table 2). 
Ten measures, comprised of 6 rating instruments 
and 4 behavioral indices, were developed by me to 
assess physical, emotional, and social functioning in 
widowhood.⁵ These included the frequency of physi-
cal complaints (summed across six ailments that are
common in bereavement), a prediction of one’s
health in 5 years (poor, fair, good, and excellent
were assigned scores of 1-4, respectively), intensity
of grief (a sum of 12 7-point rating scales assess-
ing feelings of loneliness, inability to cope, sorrow,
guilt, anger, and depression), attitudes toward wid-
owhood and remarriage (2 7-point scales), and so-
cial role involvement (sum of ratings representing
involvement in eight major social roles). A major
behavioral measure was the quality of life change
since enrollment in the program; responses to an
open-ended follow-up question were rated indepen-
dently on a 5-point scale from extremely positive
to extremely negative by two persons blind with
respect to treatment condition. Responses were also
content analyzed. Additional behavioral measures
permitted an evaluation of the extent of social ac-
tivity among members of the various groups sub-
sequent to treatment. These were the number of
spontaneous group meetings during the follow-up
period, the number of group members contacted in-
dividually, and the total number of individual
contacts during follow-up.


Table 2
Dependent Variables

Pretest
  Frequency of physical complaints
  5-year health prediction
  Intensity of grief
  Attitude toward widowhood
  Attitude toward remarriage
  Social role involvement
  Self-esteem
  Locus of control
  Life satisfaction
  Attitudes toward women
  Other-orientation
  Self-orientation
  Radical vs. conservatism

Posttest
  Pretest plus
  Extent of help from group
  Extent of learning in group

Follow-up*
  Pretest plus posttest plus
  Quality of life change
  Number of group meetings
  Number of members contacted
  Total number of individual contacts

* Included experimental subjects only.


3 The announced beginning topic in all groups was
the question of whether widows or widowers experi-
ence greater stress. However, both therapists observed
that the actual primary topic was the circumstances
of the husband’s death, often tearfully recounted.
A similar phenomenon occurred in Lopata’s inter-
views (Note 7).

4 Available from the author on request. Popular
topics included: “Does widowhood oppress women?”
and “Are you still a wife?”

5 Copies of instruments not previously published
are available from the author.

Three personality variables were included: Rosen-
berg’s (1962) self-esteem measure; Rotter’s (1966)
Internal-External Locus of Control (I-E) Scale;
and Neugarten, Havighurst, and Tobin’s (1961) Life
Satisfaction Index A. Three measures of attitudes
toward women were also included: a 15-item radical
versus conservatism scale adapted from Spence and
Helmreich (1972) and Gump’s (1972) other-orientation
and self-orientation measures. Other-orientation is
a composite of three factors representing the views
that identity is derived through traditional roles,
that woman’s role is submissive, and that home
orientation and duty to children should be stressed.
Self-orientation is a composite of four factors re-
flecting a need for individualistic achievement and
satisfactions, a sense of autonomy and heightened
independence, and the beliefs that the traditional
role implies some relinquishing of needs for per-
sonal fulfillment and that the family is inadequate
to completely fulfill one’s needs. At posttest and
follow-up, subjects evaluated the degree of help re-
ceived from the group and the degree to which
they learned from the group on 7-point scales.


Results 
Pretest 


Multivariate and univariate analyses of 
variance performed on the 12 pretested de- 
pendent variables resulted in no significant 
differences between the four conditions (self- 
help, confidant, consciousness raising, and 
control groups) at the start of treatment. Nor 
was there any difference in the average age 
or duration of widowhood among subjects in
the four conditions.


Posttest 


To determine whether treatment groups 
could be combined across therapists, a multi- 
variate analysis of variance and univariate 
analyses of variance were performed on the 
pre-post change scores for the six available
widowhood functioning variables and the 
three personality variables, with therapist 
(two levels) and treatment (three levels) as 
independent factors. The only significant 
therapist effect was for attitude toward re-
marriage, F(1, 40) = 6.44, p < .05 (see
Table 3), and none of the therapist-treatment 
interactions was significant. Similarly, tests 
for therapist and therapist-treatment inter- 
action effects on the attitudes toward women 
change scores were not significant. 

Hence, one-way (four-level) multivariate 
analyses of variance were performed on the 
pre-post change scores for the nine personal- 
ity and widowhood functioning variables, and 
separately for the three attitudes toward
women variables, on all subjects (including the
control group) for whom data were complete.
The degrees of freedom in the treatment fac- 
tor were partitioned to enable planned com- 
parisons to assess (a) whether any significant 
change occurred in the variables from pre- 
test to posttest, (b) whether the change 
among experimental subjects differed from 
the change among controls, and (c) whether 
there were any between-treatment differences 
in the change scores. 
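
The univariate side of this analysis can be illustrated with a minimal sketch. It assumes equal weighting of subjects, uses invented numbers, and covers only two pieces: the omnibus one-way F on change scores and an F test of whether the overall mean change differs from zero, with the pooled within-group mean square as the error term. The function names are hypothetical, and the multivariate tests and planned comparisons reported here are not reproduced.

```python
def change_scores(pre, post):
    """Pre-post change (post minus pre) for paired lists of scores."""
    return [b - a for a, b in zip(pre, post)]


def _mean(values):
    return sum(values) / len(values)


def _within_ss(groups):
    """Pooled within-group sum of squares."""
    return sum((x - _mean(g)) ** 2 for g in groups for x in g)


def one_way_f(groups):
    """Omnibus one-way ANOVA on change scores: (F, df_between, df_within)."""
    scores = [x for g in groups for x in g]
    grand = _mean(scores)
    ss_between = sum(len(g) * (_mean(g) - grand) ** 2 for g in groups)
    df_between = len(groups) - 1
    df_within = len(scores) - len(groups)
    f = (ss_between / df_between) / (_within_ss(groups) / df_within)
    return f, df_between, df_within


def mean_change_f(groups):
    """F test that the overall mean change is zero: (F, 1, df_within)."""
    scores = [x for g in groups for x in g]
    df_within = len(scores) - len(groups)
    ms_within = _within_ss(groups) / df_within
    f = len(scores) * _mean(scores) ** 2 / ms_within
    return f, 1, df_within


# Invented change scores for three small groups (illustration only).
example = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
print(one_way_f(example))      # between-group differences in change
print(mean_change_f(example))  # overall change versus zero
```

With 63 subjects in four conditions, the second test has 1 and 59 degrees of freedom, which matches the F(1, 59) values reported for the personality and widowhood functioning variables.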

Both multivariate tests for pre-post change 
across all subjects were significant, F(9, 51) 
= 3.09, p < .005, F(3, 60) = 4.20, p < .01. 
Subsequent univariate tests revealed a highly 
significant increase in self-esteem, F(1, 59) = 
14.07, p < .001, a significant increase in the 
intensity of grief, F(1, 59) = 9.92, p = .003, 
significantly more negative attitudes toward 
remarriage, F(1, 59) = 8.07, p = .006, a significant
reduction in other-oriented attitudes
toward women, F(1, 62) = 7.50, p = .008, 
and an almost-significant increase in self- 
orientation (p = .07). The mean changes in 
these variables by treatment appear in Table 
3. 

Although neither multivariate test of ex- 
perimental—control group differences was sig- 
nificant, two univariate tests were significant. 
Experimental subjects showed a more positive 
change in their 5-year health predictions, 
F(1, 59) = 4.12, p < .05, and became less 
other-oriented than controls, F(1, 62) = 3.86, 
p = .05. Treatment means for these variables
appear in Table 3. Scrutiny of the group 
means revealed that the significant reduction 
in other-orientation across all subjects was
totally accounted for by the substantial
change among experimental subjects. There
were no significant between-treatment
differences on any of the 12 pre-post variables.

A 3 × 2 multivariate analysis of variance
on the two posttest experimental group
evaluation measures yielded a significant main
effect for treatment, F(4, 92) = 2.62, p < .05,
and no therapist effect or Therapist ×
Treatment interaction. Both extent of help
from the group, F(2, 47) = 4.05, p < .05,
and extent of learning in the group, F(2, 47)
= 5.03, p < .01, varied with the treatment.
Table 4 presents the mean evaluation scores
by treatment. The highest ratings on both
variables occurred in the consciousness raising
groups and the lowest in the self-help
groups.


Table 3
Mean Pre-Post Change in Selected Variables

                              Consciousness
Variable                      raising        Confidant    Self-help    Control

Self-esteem                   2.53           [illegible]  [illegible]  4.58
Intensity of grief            [illegible]    .14          -.06         -.24
Health prediction             .47            -.86         [illegible]  -.65
Attitude toward remarriage*   1.83           -1.53        -2.38        .29
Other-orientation             [illegible]    [illegible]  1.31         1.29
Self-orientation              [illegible]    [illegible]  [illegible]  [illegible]
Radical versus conservative
  attitudes toward women      2.13           -.69         [illegible]  [illegible]

* Change in attitude toward remarriage, by therapist, was [illegible]04
for Therapist 1 and -1.18 for Therapist 2.


Table 4
Mean Evaluation Scores by Treatment at Posttest and Follow-up

                     Consciousness
Variable             raising        Confidant    Self-help

Helpfulness
  Posttest           [illegible]    [illegible]  [illegible]
  Follow-up          [illegible]    [illegible]  [illegible]
Educational value
  Posttest           5.75           5.54         4.34
  Follow-up          5.85           5.57         4.70


Follow-up


A one-way (three-level) multivariate analysis
of variance and univariate analyses of
variance on the change scores from pretest
to follow-up of the experimental subjects for
whom complete data were available on the
nine pretested widowhood functioning and
personality variables still did not yield any
significant differences among the three
treatments, but the multivariate test for non-zero
change from pretest to follow-up was highly
significant, F(10, 32) = 3.79, p = .002.
Univariate tests indicated that experimental
subjects maintained the significant increase in
self-esteem, F(1, 41) = 18.98, p < .001, and
in intensity of grief, F(1, 41) = 23.7, p < .001,
and also maintained their more negative
attitudes toward remarriage, F(1, 41) = 6.94,
p < .05. A trend toward increased social role
involvement by follow-up was also observed
(p = .08). Mean changes by treatment are
reported in Table 5.

Although differences between treatments did
not occur in the change in attitudes toward
women from pretest to follow-up among
the experimental subjects for whom complete
data were available, here again an almost
significant change across groups emerged in
the multivariate analysis of variance (p < .07;
see Table 5 for the mean changes in these
variables by treatment). The univariate test
of change in other-orientation was significant,
F(1, 45) = 5.57, p < .05; at follow-up,
subjects held less other-oriented attitudes.

The 3 × 2 multivariate test of the two
group evaluation variables at follow-up did
not yield a significant main effect for
treatment, although the lowest ratings were still
given by the self-help participants and the
highest by consciousness raising participants.
The treatment effect was significant in the
univariate test of extent of help from the
group, F(2, 45) = 3.38, p < .05. Group
means are reported in Table 4. Examination
of the group evaluation means and standard
deviations suggested that the reduced level of
significance of treatment at follow-up was
attributable to the increased variance in ratings
in almost all cells, rather than to a
convergence of treatment means. As in the posttest
analysis of these data, the therapist and
therapist-treatment interaction effects were not
significant.

A 3 × 2 multivariate analysis of variance
was performed on the four behavioral measures
obtained at follow-up on the experimental
subjects for whom complete data were
available. The main effect for treatment was
significant, F(8, 84) = 3.14, p < .005, as
were the main effect for therapist,
F(4, 4[illegible]) = 3.47, p < .05, and the
therapist-treatment interaction, F(8, 84) = 4.45,
p < .001. Group means for these variables
appear in Table 6. Univariate tests yielded a
significant treatment effect for both quality of
life change, F(2, 45) = 3.36, p < .005, and
number of group members contacted,
F(2, 45) = 1[illegible], p < .005. Pearson’s
correlation of life change scores by the two
raters was .89 (p < .001). The most positive
life changes occurred in the consciousness
raising groups, and the least positive changes,
in the self-help groups. A content analysis of
the types of reported changes across all
subjects appears in Table 7. Changes occurred in
all spheres of life (except religion) but
predominated in the area of mental health. The
most frequently reported changes were reduced
feelings of unique experience; increased
self-confidence; more positive future outlook;
the incorporation of help from the group;
increased social contacts; return to school; and
on the negative side, contraction of personal
illness. On the average, confidant subjects
contacted the most group members; self-help
subjects contacted the fewest. At follow-up,
four of the six treatment groups planned to
continue meeting on their own.⁶

Univariate tests on the behavioral measures
substantiated the multivariate therapist
effect. Both number of members contacted,
F(1, 45) = 6.97, p < .01, and total number
of contacts, F(1, 45) = 4.87, p < .05, varied
with the therapist. Subjects in more of the
first therapist’s groups had higher rates of
contact during follow-up. (Because of the
differential starting dates, these groups had a
2-week longer follow-up.) Univariate tests for
the therapist-treatment interaction were
significant for the number of members contacted,
F(2, 45) = 8.04, p < .001, and the number
of spontaneous group meetings, F(2, 45) =
5.4, p < .01, and a trend occurred for total
number of contacts (p = .06). Examination
of the group means revealed that discrepant
outcomes occurred in the two confidant
groups. One therapist’s group members
experienced the highest levels of follow-up
contact in this treatment; the other’s experienced
the lowest levels here.


Table 5
Mean Change from Pretest to Follow-up in Selected Variables by Treatment

                              Consciousness
Variable                      raising        Confidant    Self-help

Self-esteem                   4.71           3.18         4.56
Intensity of grief            6.65           9.46         6.13
Health prediction             [illegible]    .09          -.06
Attitude toward remarriage    -.88           [illegible]  -1.13
Social role engagement        1.24           1.64         2.44
Other-orientation             -.61           -1.21        -1.81
Self-orientation              1.00           1.29         -.50
Radical versus conservative
  attitudes toward women      [illegible]    [illegible]  [illegible]


Table 6
Group Means for Behavioral Measures at Follow-up

                                    Consciousness
Variable               Therapist    raising        Confidant    Self-help

No. persons contacted  1            1.82           4.12         [illegible]
                       2            1.33           .50          .80
No. contacts           1            2.82           13.00        22.57
                       2            5.33           1.00         1.50
Group meetings         1            .82            1.25         .14
                       2            .00            .00          .90
Life change            1            4.09           2.88         3.14
                       2            4.44           4.17         3.40


Discussion 


The comparison of groups at the pretest
indicated that an approximation of random
assignment of subjects was achieved. The
treatment groups were somewhat successful 
in facilitating greater change than that ex- 
perienced in the control group. Experimental 
subjects were more likely to give up the view 
that others’ needs are more important than 
their own. That this change was not limited
to consciousness raising participants suggests 
that a group experience provided a new ref- 
erence group of widowed women that facili- 
tated new norms that were competing effec- 
tively with married group norms in the 
population. This may be one advantage of 
group as opposed to individual interventions 
with the widowed. 

The more positive health predictions among
experimental subjects are particularly noteworthy
in light of the known physical im-
pairment of the widowed (and the actual 
illness of several subjects), the fact that no 
medical intervention was offered, and the fact 
that this significant difference was based on 
change in a 4-point scale. The index of actual
physical complaints did not change, perhaps
because the ailments addressed in this
variable all commonly occur early in bereave- 
ment, whereas many subjects had been wid- 
owed for relatively long periods of time. Or 
perhaps the change in prediction of future 
health is better understood as an indication 
of morale. Dunkle (Note 6) found that changing
health was related to a change in morale
among elderly women but not among elderly 
men. Parkes (1970) claimed that widows’
global self-reports of current poor health
were related more to feelings of a[...]
irritability than to physical symptoms,
although data were not presented to document
this finding.

⁶ Subsequent correspondence documented that these
plans were carried out.


Table 7
Life Changes Reported at Follow-up by
Experimental Subjects

Area and reported change (frequency column illegible)

Physical and mental health
  Group reduced feeling that widow’s situation was unique
  More self-confidence
  More positive future outlook
  Received help from the group
  Personal illness
  Greater decision-making ability
  Reduced grief
  Increased goal directedness
  More insight into self
  Experienced feelings of inadequacy in relationships with men
  Group increased feeling of belonging
  Greater appreciation of life
  Reduced loneliness

Home and family
  Improved family relationships
  Illness in family
  Moved
  Death in family
  Reunited with family members
  Separated from family members
  Renewed interest in improving home environment
  Less concern with family problems
  Increased pride in family member

Social activities
  More social contacts (nonspecific)
  Enjoyed group meetings
  Travel
  Positive experiences with men
  Joined a club
  Doing more things (nonspecific)
  Fewer social contacts
  A friend died
  Assumed jury duty
  New involvement in charitable work
  Received community recognition award

Employment and education
  Went back to school
  Got a new job
  Quit or fired from job
  Experienced job frustration
  Received a scholarship
  Quit school

Note. n = 5 for no change. N = 51.

The treatment groups also were successful
as catalysts for social interaction from
[...]test to follow-up. High rates of contact
among group members were generated, [...]
plans for continued meetings even [...]
results are often discouraging. Yet in this
study, treatment gains were maintained for
several months.

⁷ In this regard, it is interesting that a particular
task of the confidant groups presumed to require a
fairly high level of intimacy (the sharing of
phone numbers in order to plan a mutual [...]
outside the group) became redundant, as
numbers were spontaneously exchanged by [...]
all groups by midtreatment.

The simultaneous gains in self-esteem and
intensity of grief merit discussion. Widowed
persons frequently complain that others are
unwilling to listen to their negative feelings.
Unfinished “grief work” (Caplan, 1964) may
be common in this population, despite [...]
of widowhood. Supported by a group of peers,
there may be nothing contradictory [...]
the simultaneous expression of grief and
increased positive regard for oneself. It is
possible that the increased intensity [...]
reflects a new acceptance of negative
emotional reactions to widowhood and a
l[...]ing of their denial. In any event, the
apparent increase in grief amidst an array of
positive changes casts doubt on the rationale
of treatment modalities for widowed persons
that focus primarily on the reduction of
feelings of sorrow, anger, depression, guilt, and
so on.

Of the three types of widows’ groups
pioneered in this research, the most consistently
effective method was the widows’ consciousness
raising group, and treatment outcomes
were remarkably similar across therapists. The
high degree of structure in the consciousness
raising groups, which guaranteed
opportunities for all members to participate actively,
may have contributed to their success. Alter- 
natively, the content of these groups may 
have been particularly therapeutic. The focus 
on women and sex role oppression provided 
a natural external target for anger. This emo- 
tion has long been observed to be facilitative 
in the treatment of depression. 

The self-help format on the whole was least 
effective, although no between-treatment dif- 
ferences emerged on a number of variables 
reflecting positive change. Most self-help 
groups do not have a professional leader. It 
is possible that the presence of a therapist 
in these groups was counterproductive, de- 
spite, or perhaps because of, her limited role. 

Response to the confidant groups was more 
variable. Both therapists found these groups 
the most difficult to lead, as departures in 
the session plans were often necessitated by 
absences or by a discussion of the members’ 
objections. The very idea of becoming par- 
ticularly well acquainted with only one widow 
appeared to foster apprehension. (The aver- 
age confidant subject in fact contacted more 
members during follow-up than subjects in 
either of the other treatments.) The confi- 
dant group strategy might be more effective 
if limited to widows who have no confidant 
prior to treatment. 

A major finding of this research was that 
substantial change occurred in all groups,
including the control group. The promise of an
extended small group experience in 2 months 
time may itself have had a therapeutic ef- 
fect on the waiting list controls, especially in 
view of the reduced opportunities for social 
interaction and the scarcity of helping
resources in widowhood. This interpretation is
consistent with my clinical observation that 
an enthusiastic, eager quality in the initial 
meeting of the waiting list discussion group 
replaced the more depressive tone in the in- 
itial meetings of the earlier groups. It may 
be worthwhile to explore the potential thera- 
peutic benefits of waiting periods with other 
clinical populations. 

The gains of the control group may also 
reflect a positive change bias in those who 
tolerated the waiting period and returned. 
However, this is unlikely as a total explanation,
since the percentage of control subjects
who did not return (35.8%) is not that dif- 
ferent from the percentage of subjects as- 
signed to a treatment group who never at- 
tended (29%). It could also be argued that 
waiting list subjects who improve on their 
own should be less likely to return for
treatment.

Attention should be addressed to the sub- 
ject characteristics that increase the probabil- 
ity of enrollment in a therapeutic program 
for widows and those that are associated with 
the greatest therapeutic impact. The fact that 
widows of all ages and duration of widowhood 
enrolled in the program refutes the views 
that the stresses of widowhood are limited 
to a particular age group and that only the 
recently widowed need help. Clearly, the 
strategy of placing widows varying widely in 
age and duration of widowhood⁸ in the same
group was not a hindrance to the program’s 
effectiveness, and it may have been facilita- 
tive. Experienced widows probably functioned 
as role models for the newcomers, and the 
diversity of the members probably served to 
remind the widow of available options in her 
own life-style decisions. 

Compared to widows nationally, subjects in 
this research overrepresented the Jewish faith, 
high educational achievement, and prior 
upper-income status. The location of the pro- 
gram on a university campus and the required 
fee may have contributed to the observed 
sample bias. The generalizability of the
results awaits replication with other widowed
samples.

A number of dependent variables in this
research did not evolve as effective descriptors
of the personal changes that accompany
participation in a social service for widows
(locus of control,⁹ life satisfaction, frequency
of physical complaints, attitude toward
widowhood, and the subset of items from the
radical versus conservative Attitudes Toward
Women scale). Some of the measures may
have been too global to reflect the changes
that occurred. Subsequent research might
focus on the discrete types of life changes
reported by subjects on the open-ended
question, that is, the feeling of unique experience,
attitude toward the future, self-confidence,
extent of social contacts, and so on.

⁸ [...] extremely recent widows participated.

⁹ The I-E scale may have limited applicability
to an elderly population. Subject complaints about
the irrelevance of items were common.

This research has addressed the question 
of how best to facilitate therapeutic change 
among widows using a group format. It pro- 
vides the best documentation to date that 
psychological and social change can be fos- 
tered by innovative programs for the widowed. 
Mental health professionals as well as wid- 
owed persons should be able to assume an 
attitude of hopefulness instead of resignation 
with respect to the circumstances of widow- 
hood. Some subjects experienced far-reaching 
changes in several areas of their lives during 
the 6 months from the time they read about 
the program to the follow-up after their small 
group experience. A question for the future 
is how to predict which persons will profit 
most from a widows’ group. 
Reduction of Test Anxiety Through Cognitive Restructuring


Marvin R. Goldfried 
State University of New York at Stony Brook 


Marsha M. Linehan and Jean L. Smith 
The Catholic University of America 


This collaborative clinical outcome study compared two procedures for reduc- 
ing test anxiety with a waiting list control. In the first, systematic rational 
restructuring, participants were trained to realistically reevaluate imaginally 
presented test-taking situations. In the second, a prolonged exposure condition, 
the same hierarchy items were presented but with no instructions for coping 
cognitively. On the basis of questionnaire measures of test anxiety, greater 
anxiety reduction was found in the systematic rational restructuring condition, 
followed by the prolonged exposure group, with no changes for the waiting list 
control. Only those in the rational restructuring condition reported a significant 
decrease in subjective anxiety when placed in an analogue test-taking situation.
Participants in the restructuring condition also reported greater generalized
anxiety reduction in social-evaluative situations. Within the broader context of
cognitive behavior therapy, the results of the present investigation add to the
increasing number of outcome studies indicating that the cognitive reappraisal
of anxiety-provoking situations can offer a markedly effective treatment
procedure for the reduction of anxiety.


The incorporation of cognitive variables 
within behavior therapy represents a clear 
and unmistakable trend. Much of the current 
work in this area has been based on the clinical
observation of Ellis (1962), who has argued
that modification of inappropriate expec-
tations and beliefs could lead to behavior 
change. Until recently, however, Ellis’ ra- 
tional-emotive therapy had relatively little 
impact on the behavioral movement. The dif- 
ficulty of fitting Ellis’ approach into a behav- 
ioral orientation has been due, in part, to 
the lack of clear therapeutic guidelines as
well as to the absence of an empirical data
base for its clinical effectiveness. This
situation is clearly changing, and steps are
currently being taken to incorporate many of
Ellis’ concepts and procedures into the field
of cognitive behavior therapy (Beck, 1976;
Goldfried & Davison, 1976; Goldfried,
Decenteceo, & Weinberg, 1974; Mahoney, 1974;
Meichenbaum, 1977). A number of outcome
studies have appeared in the literature,
demonstrating that speech anxiety (Meichenbaum,
Gilmore, & Fedoravicius, 1971; Trexler
& Karst, 1972), interpersonal anxiety
(DiLoreto, 1971; Kanter & Goldfried, Note 1),
unassertive behavior (Thorpe, 1975; Wolfe
& Fodor, in press; Linehan, Goldfried, &
Goldfried, Note 2), and test anxiety (Holroyd,
1976; Meichenbaum, 1972; Osarchuk,
1976) can be reduced by intervention
procedures that focus on training individuals to
modify their unrealistic belief systems.

Like much of the early behavior therapy
research in general, therapeutic attempts at
the reduction of test anxiety have focused
primarily on the use of systematic desensitiza-
tion (Wine, 1971). The relevance of a more 
cognitively oriented approach in the treat- 
ment of test anxiety is noted by Wine (1971), 
whose review suggests that anxious individ- 
uals not only experience emotional arousal 
but also engage in excessive worry about the 
adequacy of their performance. Based on the
assumption that test anxiety may be com- 
prised of both “emotionality” and “worry” 
components (Liebert & Morris, 1967; Morris 
& Liebert, 1969), Meichenbaum (1972) de- 
veloped a treatment package involving cog- 
nitive restructuring and modified systematic 
desensitization. Test-anxious subjects were 
provided with relaxation training, engaged in 
discussions of potentially unrealistic beliefs 
associated with test taking, and were then 
given practice in coping with imagined test- 
related situations by means of relaxation and 
self-instructions to focus on only the test 
itself. Compared with traditional systematic 
desensitization, this cognitive modification 
package produced greater reductions in test 
anxiety. Although Meichenbaum’s study dem- 
onstrated that a treatment approach that in- 
cludes the altering of cognitions can have an 
effect on reducing test anxiety, a more precise
interpretation of these findings is lim-
ited by the inclusion of relaxation in the cog- 
nitive treatment package. Further, the expo- 
sure times for hierarchy presentation were 
longer for the cognitive modification subjects, 
since they were instructed to maintain the 
image while attempting to cope with their 
anxiety (Meichenbaum, Note 3). 

The primary purpose of the present study 
was to determine whether a treatment procedure
using only cognitive restructuring,
incorporating the basic tenets of Ellis’ therapeutic
approach, could be successful in the
reduction of test anxiety. The therapeutic 
procedure used was originally outlined by 
Goldfried et al. (1974), whose guidelines for 
the implementation of systematic rational re- 
structuring are presented within a social learn- 
ing framework. The treatment procedure, 
which is described in greater detail elsewhere 
(Goldfried & Davison, 1976), essentially
involves the use of imaginally presented
hierarchy items to provide individuals with
practice in ferreting out unrealistic concerns and
worries, affording them the opportunity to 
place each situation into a more realistic 
perspective, and then using their newly ac- 
quired skills to reduce anxiety in real-life 
situations. In the present study, this treat- 
ment procedure was compared with an ex- 
posure-alone condition, in which the hier- 
archy items were presented without any in- 
structions or directions for coping cognitively. 
The prolonged exposure group was included 
to control for possible extinction effects. 

A second purpose of this study was to dem- 
onstrate the feasibility of conducting col- 
laborative clinical outcome research with 
investigators at more than one setting. In an 
overview of the current status and future 
direction of psychotherapy research, Bergin 
and Strupp (1970) discussed the need for 
such coordinated or collaborative research ef- 
forts. As noted by them, a major difficulty 
associated with collaborative research in psy- 
chotherapy has been a general lack of stan- 
dardization. By focusing on a specific target 
problem—test anxiety—and using compara- 
ble therapeutic techniques and assessment 
procedures, we wished to demonstrate that 
such standardization problems could be
overcome.


Method 


Participants 


Participants were 15 men and 21 women who
responded to advertisements for treatment of test
anxiety at the State University of New York at 
Stony Brook and The Catholic University of America. 
They ranged in age from 18 to 49 years and were 
not being seen in therapy elsewhere. Of the 42 sub- 
jects who originally began the study, 6 were elimi- 
nated for failure to complete the treatment. The 
attrition was distributed among the three conditions 
as follows: 2 in rational restructuring, 1 in pro- 
longed exposure, and 3 in waiting list control. 


Procedure 
i i ts, inter- 
onding to local advertisemen! yi 
esa individuals were sent a questionnaire battery, 


iption, and a consent form. Persons 
gies a consent form and questionnaire 
battery and who were available at scheduled ther- 
apy times were assigned by a within-sample match- 
ee technique (see Goldstein, Heller, & Sechrest, 
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1966) to systematic rational restructuring ("= 12), 
prolonged exposure (n= 13), or waiting list (n= 
11) conditions. Participants were then seen in per- 
son and were given a pretreatment analogue exami- 
nation and an associated assessment battery. Ther- 
apy, administered in groups over six 1-hour sessions, 
was standardized across the two treatment conditions. 
Participants were reassessed within the week fol- 
lowing the termination of therapy and at a 6-week 
follow-up. 
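
The assignment step can be illustrated with a minimal sketch of one common form of within-sample matching. It is not the procedure actually followed here, which is not described beyond the citation and which also depended on availability at scheduled times, hence the unequal group sizes. The sketch assumes that participants are ranked on a single pretest score, grouped into blocks of three adjacent ranks, and randomly split across the three conditions within each block. The function name, the score dictionary, and the seed are hypothetical.

```python
import random


def matched_assignment(scores, conditions, seed=0):
    """Assign participants to conditions within blocks of similar scores.

    scores: dict mapping participant id -> pretest score (assumed matching variable)
    conditions: list of condition labels
    """
    rng = random.Random(seed)
    # Rank participants from highest to lowest pretest score.
    ranked = sorted(scores, key=scores.get, reverse=True)
    assignment = {}
    block_size = len(conditions)
    for start in range(0, len(ranked), block_size):
        block = ranked[start:start + block_size]
        # Randomize which condition each member of the block receives.
        order = list(conditions)
        rng.shuffle(order)
        for person, condition in zip(block, order):
            assignment[person] = condition
    return assignment


# Invented pretest scores for 36 hypothetical participants.
pretest = {f"P{i:02d}": 100 - i for i in range(36)}
groups = matched_assignment(
    pretest,
    ["rational restructuring", "prolonged exposure", "waiting list"],
)
print(groups["P00"], groups["P01"], groups["P02"])
```

Blocking on the pretest score keeps the conditions comparable at the outset while leaving the assignment within each block to chance.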

The study was carried out in two waves.
Approximately half of the individuals were seen at Stony
Brook, and half participated 1 year later at
Catholic University. Detailed therapy manuals, transcripts
of therapy sessions, audiotapes of pilot sessions, and 
phone calls were exchanged to ensure maximum com- 
parability between the two locations. All other as- 
pects of the study were identical. 


Measures 


Two classes of measures were used in evaluating 
the results of the treatment procedures: a question- 
naire battery and subjective reports of anxiety pre- 
ceding an analogue examination situation. The ques- 
tionnaire battery consisted of several scales designed 
to measure test anxiety: the S-R Inventory of Anx- 
iousness (Endler, Hunt, & Rosenstein, 1962) for 
situations “taking an important exam” and “taking 
a weekly quiz”; the Suinn Test Anxiety Behavior 
Scale (Suinn, 1969); the Achievement Anxiety Test 
derived by Alpert and Haber (1960), which assesses 
facilitating as well as debilitating anxiety in ex- 
amination situations; and the Test Anxiety Ques- 
tionnaire (Mandler & Sarason, 1952), using Liebert 
and Morris’ (1967) Worry and Emotionality sub- 
scores. Measures included to assess generalization 
of therapy effects to areas other than test anxiety 
were the S-R Inventory of Anxiousness for the 
situations “giving a speech,” “going to a party,” 
and “interviewing for a job”; the Fear of Negative 
Evaluation and Social Avoidance and Distress scales 
(Watson & Friend, 1969); and the Trait scale of 
the State-Trait Anxiety Inventory (Spielberger, 
Gorsuch, & Lushene, 1970). Participants also re- 
sponded to a question measuring their expectations 
for benefit from therapy, using a 5-point scale rang- 
ing from “0% chance of success” to “100% chance 
of success.”

After completing the questionnaire, participants 
were asked to take part in an actual “examination” 
situation. The examination was administered in 
small groups and consisted of the Wonderlic Per- 
sonnel Test (Wonderlic, 1959) and the Digits For- 
ward, Digits Backward, and Digit Symbol tests taken 
from the Wechsler Adult Intelligence Scale.
Participants were told that the test they were taking was
selected from standardized IQ tests and that the re- 
sults would be posted beside their names. Immedi- 
ately following these instructions and before taking 
the exam, participants were administered the State 
scale of the State-Trait Anxiety Inventory
(Spielberger et al., 1970); the Anxiety Differential (Husek
& Alexander, 1963); and the Experiences
Questionnaire, which consists of Taylor Manifest [...]
subscores (Morris & Liebert, 1969). Posttesting
included the identical questionnaire battery and [...]
composed of alternate forms of the tests
administered during the pretest. At a 6-week follow-up,
participants in the two treatment conditions were
sent the same questionnaire battery plus forms for
evaluating the program and their respective
therapists. Because of ethical considerations, individuals
in the waiting list control were offered therapy
immediately following the posttest and were therefore
not included in the follow-up assessment.


Therapy 


Therapy sessions in the systematic rational
restructuring and prolonged exposure conditions were
conducted in groups that met weekly for six 1-hour
sessions. Participants who missed sessions listened to
audiotapes of the missed session, carried out the
procedure described on the tape, completed in-session
practice sheets, and returned these with homework
assignments to an assistant. As a means of assessing
the credibility of each treatment rationale,
participants were asked to estimate the amount of
success that they expected from therapy at the
conclusion of the first session.

[...] Davison (1976). During the first session, the
treatment rationale and therapy procedures were
outlined, and participants received practice in imagining
situations. During Sessions 2-6, the participants were
presented with a standard 15-item hierarchy
constructed on the basis of Stony Brook participants’
pretest responses to the Suinn Test Anxiety Behavior
Scale (Suinn, 1969). Three situations were
presented in each session, and participants were
instructed to imagine themselves in the situation
throughout the presentation of each item and to
attempt to reduce their anxiety by means of
rational restructuring. Each item was presented for a
total of four 1-minute trials. Immediately following
each trial participants were instructed to record
their self-defeating thoughts (e.g., “I’m going to
fail this test, and then everyone’s going to think
I’m stupid”), their rational reevaluation (e.g.,
“Chances are I probably won’t fail. And even if I
do, people probably won’t think I’m stupid. And
even if they do, that doesn’t mean that I am
stupid”), and their anxiety levels before and after
reevaluating. A brief group discussion followed the
fourth presentation of each item. Participants were
instructed to practice their reevaluation skills in vivo
and were provided with homework sheets that served
as the basis for discussion at the outset of the
following session.
Prolonged exposure condition. A social-learning-
based rationale was presented to subjects in this
condition, emphasizing the importance of habitua- 
tion and extinction in the reduction of anxiety. The 
first session of the treatment was identical in gen- 
eral format to that in the systematic rational re- 
structuring condition. The same hierarchy presen- 
tation used for the restructuring condition was used; 
during scene presentations in this condition, how- 
ever, participants were instructed to focus on their 
emotional reactions. In-session record forms were 
provided for the recording of feelings noted at the 
beginning and end of each imaginal presentation, 
and a group discussion was held after all four pres- 
entations of each item. Homework assignments, 
discussed at the beginning of each session, consisted 
of attending to their anxiety reactions in everyday 
situations.

Waiting list control. Participants in this condi- 
tion were informed that because of the limited thera- 
peutic time available, there would be a brief delay 
before they could be seen in treatment. Participants 
received the same pretest and posttest assessment 
battery as was administered to individuals in the 
two therapeutic contact conditions, and they were 
then offered therapy following the posttesting. 


Therapists 


The first and second authors served as therapists 
at their respective universities. A detailed therapy 
manual was used, hierarchy scenes were written out 
fully and were delivered verbatim, and transcripts 
and tapes of pilot sessions were exchanged to mini- 
mize any differences between the therapists. Both 
in-person meetings and frequent phone consultations 
during the progress of the experimental sessions in- 
sured that therapy and subject problems were han- 
dled in the same way by both therapists, and that 
the assessment procedures were administered in a 
comparable fashion. Each therapist had contact with 
approximately the same number of participants in 
each treatment condition. 


Results 


Separate univariate analyses of variance 
conducted on each dependent variable indi- 
cated no significant pretest differences among 
the three groups on any of the measures used 
in the study. Inasmuch as comparisons be- 
tween therapists failed to reveal any main 
effect differences, the data from the two uni-
versities were combined in all subsequent
analyses.1


Pre-Post Treatment Effects 


Questionnaire battery. The results of one-
way analyses of covariance on each of the
questionnaire variables are presented in Table
1. On variables designed to measure test anx- 
iety, there were significant treatment effects 
on the S-R Inventory of Anxiousness for both 
the “exam” and “quiz” situations, the Suinn 
Test Anxiety Behavior Scale, and the Achieve-
ment Anxiety Test Debilitating Anxiety scale.
Newman-Keuls tests for differences between 
adjusted means revealed that participants in 
the systematic rational restructuring condi- 
tion reported significantly less anxiety in 
exam situations, less anxiety on the Suinn 
scale, and less debilitating anxiety than did 
participants in the prolonged exposure con- 
dition. Both groups reported less anxiety on 
these three measures than did individuals in 
the waiting list control. Participants in the 
two therapy conditions did not differ from 
each other on S-R inventory reports of anx- 
iety in quiz situations, but both reported sig- 
nificantly less quiz anxiety than did those 
on the waiting list. There were no significant 
differences among groups on the Facilitating 
Anxiety scale or on the Worry or Emotional- 
ity subscales of the Test Anxiety Question- 
naire. 
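
The adjusted means reported for these comparisons come from analyses of covariance. As a rough illustration of how such an adjustment works, the sketch below computes covariate-adjusted posttest means from the pooled within-group regression slope. It assumes that the pretest score serves as the covariate; the function name, the data layout, and the example scores are hypothetical and are not taken from the study.

```python
from statistics import mean


def adjusted_means(groups):
    """Return covariate-adjusted posttest means.

    groups maps a condition name to a list of (pretest, posttest) pairs.
    Assumption: the pretest is the covariate and a single pooled
    within-group slope is used for all conditions.
    """
    # Grand mean of the covariate across all participants.
    grand_pre = mean(x for pairs in groups.values() for x, _ in pairs)

    # Pooled within-group slope: sum of within-group cross-products
    # divided by the sum of within-group squared covariate deviations.
    sxy = 0.0
    sxx = 0.0
    for pairs in groups.values():
        mx = mean(x for x, _ in pairs)
        my = mean(y for _, y in pairs)
        for x, y in pairs:
            sxy += (x - mx) * (y - my)
            sxx += (x - mx) ** 2
    slope = sxy / sxx  # raises ZeroDivisionError if the covariate never varies

    # Adjusted mean = observed posttest mean minus the part predicted
    # from the group's departure from the grand covariate mean.
    return {
        name: mean(y for _, y in pairs)
        - slope * (mean(x for x, _ in pairs) - grand_pre)
        for name, pairs in groups.items()
    }


# Hypothetical scores: (pretest, posttest) per participant.
example = {
    "restructuring": [(10, 5), (20, 15), (30, 25)],
    "waiting_list": [(20, 20), (30, 30), (40, 40)],
}
adjusted = adjusted_means(example)
```

In this made-up example the two groups start at different pretest levels, so the adjustment moves each posttest mean toward what it would have been at the common pretest mean before the groups are compared.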

Significant generalization effects were found 
on the S-R Inventory of Anxiousness for 
“party” and “job interview” situations and 
on the Fear of Negative Evaluation and the 
Social Avoidance and Distress scales. New- 
man-Keuls comparisons of adjusted means 
indicated that participants in systematic ra- 
tional restructuring reported less Fear of 
Negative Evaluation and Social Avoidance 
and Distress than those in either the pro- 
longed exposure or the waiting list control 
conditions, which did not differ significantly 
from each other. On the S-R Inventory of 
Anxiousness for both party and job interview 
situations, participants in the systematic ra- 
tional restructuring condition reported less 
anxiety than those on the waiting list; no dif- 
ference was found between the restructuring 
and prolonged exposure groups. There were
no significant differences among groups on
the Trait scale of the State-Trait Anxiety

1 The means and standard deviations for pretest,
posttest, and follow-up assessments are available
from the first author.
Table 1
One-Way Analyses of Covariance and Adjusted Means at Posttesting

                                                          Adjusted treatment M
                                                     Systematic
                                                     rational       Prolonged   Waiting
Measure                                     F        restructuring  exposure    list

Questionnaire battery: Test anxiety
  S-R inventory: exam                      10.10***   35.67a        41.38b      46.55c
  S-R inventory: quiz                       6.71**    30.57a        34.67a      42.40b
  Suinn Test Anxiety Behavior Scale        15.81***  116.55a       149.06b     171.86c
  Achievement Anxiety Test: debilitating   14.11***   27.23a        31.27b      37.52c
  Achievement Anxiety Test: facilitating    1.70      21.11a        21.79a      18.85a
  Test Anxiety Questionnaire: worry         2.75      37.52a        41.78a      44.96a
  Test Anxiety Questionnaire: emotionality  2.22      34.01a        38.18a      38.14a
Questionnaire battery: Generalization
  S-R inventory: speech                     1.03      33.93a        35.78a      38.34a
  S-R inventory: party                      3.49*     25.91a        31.40a,b    33.62b
  S-R inventory: job                        3.64*     31.05a        36.79a,b    39.46b
  Fear of Negative Evaluation               6.37**    10.98a        15.79b      16.55b
  Social Avoidance & Distress               8.78***    5.04a         9.66b      13.0[?]b
  Trait Anxiety scale of the STAI           1.87      41.31a        46.52a      47.15a
Preexamination anxiety
  State Anxiety scale of the STAI           2.72      35.46a        39.98a      42.43a
  Anxiety Differential                      <1        62.23a        62.41a      63.36a
  Experiences Questionnaire: worry          4.62*      7.52a         9.08a,b    11.33b
  Experiences Questionnaire: emotionality   3.08       6.91a         8.72a      11.79a

Note. Means in the same row with different subscripts differ from each other at least at the .05 level. For
analyses of covariance, df = 2, 32. STAI = State-Trait Anxiety Inventory.
*p < .05.
**p < .01.
***p < .001.


Inventory or the “speech” situation on the
S-R inventory.

Focusing specifically on the direction of
change, analyses of pre-post within-group dif-
ferences by t tests (see Table 2) indicated a
significant improvement in systematic rational
restructuring across all but one measure de-
signed to measure test anxiety (the Achieve-
ment Anxiety Test Facilitating scale) and on
all but one measure of treatment generaliza-
tion (the Trait scale of the State-Trait Anx-
iety Inventory). In contrast, the prolonged
exposure participants improved significantly
on only three variables: the Achievement
Anxiety Debilitating scale, the Worry sub-
scale of the Test Anxiety Questionnaire, and
the Social Avoidance and Distress Scale. No
pre-post differences were found on any mea-
sure for the waiting list control. Although
these within-group changes cannot be inter-
preted in terms of difference between the two
treatment groups, they are nonetheless con-
sistent with the overall superiority of the sys-
tematic rational restructuring group.
Examination situation. One-way analyses 
of covariance were carried out separately on 
posttest scores for the State scale of the State- 
Trait Anxiety Inventory, the Anxiety Differ- 
ential, and the Worry and Emotionality sub- 
scales of the Experiences Questionnaire. As 
can be seen in Table 1, a significant main
effect for treatment was found only on the
Worry subscale. Newman-Keuls individual
comparisons indicated that participants who
received rational restructuring reported less
preexam worry than those in the waiting list
condition. No significant difference was found
between the systematic rational restructuring
and the prolonged exposure conditions.
Analyses of pre-post differences, summar- 
ized in Table 2, indicated a significant re- 
duction in subjective anxiety for the sys- 
tematic rational restructuring condition on all 
but one variable, the Anxiety Differential. 
There was no significant pre-post improve- 
ment on any of the four measures for either 
the prolonged exposure or waiting list control. 


Follow-up Evaluation 


Analyses of covariance were carried out to 
test whether treatment differences were main- 
tained on the questionnaire battery measures 
at follow-up 6 weeks after the termination of 
therapy. Results indicated that systematic 
rational restructuring participants reported 
significantly less anxiety than prolonged ex-
posure participants on the Suinn Test Anx-
iety Behavior Scale, F(1, 22) = 8.25, p <
.01; the S-R Inventory of Anxiousness party
situation, F(1, 22) = 5.09, p < .05; and the
Social Avoidance and Distress Scale, F(1, 22)
= 5.71, p < .05. In addition, there was a
trend at the .10 level indicating less debilitat- 
ing anxiety on the Achievement Anxiety Test 
in the systematic rational restructuring group 
as compared to the prolonged exposure group. 
There were no other significant differences 
between the two groups. 

At the follow-up assessment, participants 
were asked to rate the therapy that they had 
received on several dimensions. Analyses by 
t tests indicated that participants in the sys- 
tematic rational restructuring condition, com- 
pared to those in the prolonged exposure con- 
dition, were generally more satisfied with 
changes in themselves, t(22) = 2.68, p < .05.


Table 2
Within-Group Mean Differences from Pretesting to Posttesting

                                                     Condition
                                          Systematic
                                          rational       Prolonged   Waiting
Measure                                   restructuring  exposure    list
Questionnaire battery: Test anxiety 

S-R inventory: exam tat a my 
S-R inventory: quiz 8. ae ae aae 
Suinn Test Anxiety Behavior Scale : ae aS PETA ut 
Achievement Anxiety Test: debilitating ži ; is ae 
Achievement Anxiety Test: facilitating - ER TENT E 
Test Anxiety Questionnaire: worry TE a vith 

Test Anxiety Questionnaire: emotionality i K 
Questionnaire battery : Generalization ji 

t . 
S-R inventory: speech ros sen -209 
S-R inventory: party 817 308 93 
S-R inventory: job ; 616° 138 18 
Fear of Negative Evaluation 659% 400° -127 
Social Avoidance & Distress 367 293 =~ 1100 
Trait Anxiety scale of the STAI 
Preexamination anxiety 

11.58** 4.23 3.63 
State Anxiety scale of the STAI A -116 my 
Anxiety Differential 3ig3e 1.00 27 
Experiences Questionnaire: worry 3125% 1.00 ‘00 


Experiences Questionnaire: emotionality 


Note. Significance of change is based on within-group t tests. STAI =
State-Trait Anxiety Inventory.
*p < .05.


The groups did not differ in their ratings of
the helpfulness of therapy for test anxiety or 
on their estimate of the generalization effects 
of the therapies. 


Expectancies and Liking of Therapist 


During the pretest, participants were asked 
to rate their expectations for becoming less 
anxious in test-taking situations. This initial 
expectancy rating was used as one of the 
variables for the within-sample matching 
procedure. A check on this matching proce- 
dure was conducted by a one-way analysis 
of variance, which failed to reveal any sig-
nificant differences among the three condi-
tions. To check whether the two therapy ra-
tionales were equally credible, participants
again rated their expectancy for success on 
the same 5-point scale after the first session. 
The absence of a significant t-test difference
suggests that the two groups were compara-
ble with regard to the nonspecific effects asso-
ciated with expectancy of improvement and
treatment demand characteristics. 

At follow-up, participants in the two 
contact conditions were asked to rate their 
therapist on the following 5-point bipolar 
scales: incompetent-competent, unlikable-lika- 
ble, not understanding-understanding, aloof- 
warm, and uncomfortable-comfortable. The 
ratings from the two conditions were all to- 
ward the positive pole and were not signifi- 
cantly different from each other on any of 
the scales. These findings suggest that differ- 
ences between the two treatments cannot be 
attributed to nonspecific therapist differences 
while conducting the two treatments.


Discussion 


The results of this study revealed a con- 
sistent pattern: Based on questionnaire mea- 
sures of test anxiety, participants in the sys- 
tematic rational restructuring condition ex- 
perienced greater anxiety reduction, followed 
by those having undergone prolonged expo- 
sure to the same hierarchy items, whereas the 
waiting list control did not change. Only par- 
ticipants in systematic rational restructuring 
reported a significant pre-post decrease in 
subjective anxiety before an analogue test- 
taking situation. In addition to the reduction 
of test anxiety, participants in the restructur-
ing condition also reported greater generaliza-
tion of anxiety reduction in social-evaluative
situations. 

Although the systematic rational restruc-
turing procedure was found to be the most
effective of the three conditions, exposure alone
also produced significant anxiety reduction.
These findings corroborate the results of sev-
eral other studies that have demonstrated
that hierarchy exposure alone, particularly if
it is prolonged or repeated, can be an effec-
tive therapeutic procedure for anxiety re-
duction (e.g., D’Zurilla, Wilson, & Nelson,
1973; Goldfried & Goldfried, 1977; Malleson,
1959). These therapy-analogue findings are
consistent with the animal literature on ex-
tinction of avoidance via exposure and re-
sponse prevention (Wilson & Davison, 1971).
As indicated in the present study, however,
active attempts to cope with one’s anxiety
by means of cognitive reappraisal add to any
anxiety reduction associated with extinction
or habituation and greatly facilitate gen-
eralization to nontreated targets.

This study has also demonstrated that col- 
laborative research with investigators at dif- 
ferent settings is possible and that it can be 
a useful approach to increasing the number 
of participants available in clinical outcome 
research and facilitating the generalizability 
of the findings. Such collaborative research 
efforts are feasible provided that close co- 
ordination and appropriate methodological 
precautions are taken, such as insuring the
standardization of treatment procedures, com-
parability of participants, and equivalence of
outcome measures.

Although Meichenbaum’s (1972) cognitive
modification procedure for reducing test
anxiety consisted of a treatment package
containing rational-emotive therapy, self-in-
structions, and relaxation components, our
results reveal that cognitive restructuring
alone is effective. It should be noted, how-
ever, that the cognitive restructuring proce-
dure used by Meichenbaum is somewhat dif-
ferent from the intervention procedure used
in the present study. Although Meichenbaum
included a rational-emotive therapy compo-
nent, the self-instructions used during hier-
archy presentation involved self-statements
to encourage participants to pay attention to
the task at hand. By contrast, participants in
rational restructuring were taught to tune in 
to any of their unrealistic concerns and wor- 
ries and to put them into a more realistic 
perspective. Thus, the results of the present 
study add to the increasing number of out- 
come studies indicating that the cognitive re- 
appraisal of anxiety-provoking situations is a 
markedly effective treatment procedure for 
the reduction of anxiety. 
Psychological Androgyny and Interpersonal Behavior


Jerry S. Wiggins and Ana Holzmuller 
University of British Columbia, Vancouver, Canada 


Bem’s measure of psychological androgyny was derived from only two relatively 
desirable dimensions of interpersonal behavior that may, or may not, implicate 
other less desirable traits that are sex role stereotyped. From an item pool of 
1,710 trait-descriptive adjectives, sets of masculinity and femininity scales were 
assembled that were comparable to “traditional” scales and to those developed 
by Bem and by Heilbrun. The pool also contained items from eight scales that 
form an interpersonal circumplex. One hundred eighty-seven college men and 
women who rated themselves on the 1,710 adjectives were classified as stereo- 
typed, near-stereotyped, or androgynous by Bem’s criteria. Bem’s measure of 
psychological androgyny appears to reflect a highly generalizable personological 
construct that implicates both desirable and undesirable dimensions of inter- 
personal behavior. Heilbrun’s scales are both empirically and conceptually sim- 
ilar to Bem’s, and both scale sets differ from traditional masculinity—femininity 
measures. There is a possibility that androgynous men are more flexible in their 
interpersonal behavior than androgynous women.


Until very recently, the psychological con- 
struct of “masculinity-femininity” has been 
characterized by a conceptual fuzziness that 
has permitted unrelated and even contradic-
tory attributes of persons to be viewed as in-
dicants of a monolithic process. Constanti-
nople’s (1973) critical review of the major
psychological tests of masculinity—femininity, 
developed during the last 40 years, cleared 
the air for new approaches to an old problem. 
Bem’s (1974) work on the construct of psy-
chological androgyny is one such new ap-
proach. Her formulations place a unique em-
phasis on certain classes of interpersonal be-
havior that are generally considered more
desirable or appropriate for one sex than the
other. 

Bem (1974) asked male and female judges 
to rate the desirability of traits for men and 
women separately. Traits that were rated by 
both sexes as more desirable for men than for 
women were called masculine, and traits that 
were rated by both sexes as more desirable 
for women than for men were called feminine. 
Traits that were rated by both sexes as
equally desirable for men and women were
called neutral. Scales were developed from 
these item pools and combined to yield a mea- 
sure of “psychological androgyny.” Specifi- 
cally, the algebraic difference between femi- 
ninity and masculinity scales was taken to 
be an index of androgyny. Individuals scoring 
high on this index are feminine stereotypes,
individuals with negative scores are masculine
stereotypes, and individuals with scores near
zero are androgynous. In a series of studies,
Bem (1975; Bem & Lenney, 1976; Bem,
Martyna, & Watson, 1976) has marshaled
evidence that androgynous persons are flexi-
ble in their social behavior and that they can
vary their behavior according to situational
demands, rather than according to sex role
stereotypes.


Copyright 1978 by the American Psychological Association, Inc. All rights of reproduction in any form reserved. 


Other than restricting her inquiry to desira- 
ble interpersonal traits, Bem did not specify 
the manner in which the universe of content 
of interpersonal behavior was defined. There 
is now considerable precedent for defining the 
universe of content of interpersonal behavior 
with reference to a two-dimensional circum- 
plex of variables of the kind illustrated in 
Figure 1 (e.g., Becker & Krug, 1964; Ben-
jamin, 1974; Carson, 1969; Foa & Foa, 1974;
Leary, 1957; Lorr & McNair, 1963; Rinn, 
1965; Schaefer, 1959; Stern, 1970; Swensen, 
1973). The system represented in Figure 1 
was developed as part of a larger project 
whose eventual aim is a comprehensive taxon- 
omy of trait-descriptive terms (Wiggins, Note 
1). Like the Leary (1957) system of inter- 
personal behavior on which it is based, the 
eight variables illustrated in Figure 1 can 
be decomposed into 16 more narrowly defined 
variables. In principle, the number of differ- 
ent variables is limited only by the reliability 
with which respondents can distinguish be- 
tween closely synonymous words and phrases. 

Inspection of the items comprising Bem’s 
Masculinity scale revealed that virtually all 
of the classifiable items fell within the domi- 
nant-ambitious vector of our interpersonal 
model (e.g., assertive, ambitious, dominant, 
forceful). Similarly, most of the classifiable 
items in Bem’s Femininity scale fell within
the warm-agreeable vector of the same cir- 
cumplex (e.g., affectionate, compassionate, 
sympathetic, tender). Lippa (1977) reported 
the results of an item analysis of the Bem 
Sex-Role Inventory (BSRI) in which the 
items dominant and assertive were the best 
markers of the Masculinity scale and the 
items compassionate and tender were the best 
markers of the Femininity scale. Bem (in
press) is well aware of the interpersonal con- 
tent of her scales and uses the terms instru- 
mental and agentic to describe the Masculin- 
ity scale and the terms expressive and com- 
munal to describe the Femininity scale. 

Since the two vectors at issue are known 
to be orthogonal, it is clear why Bem dis- 
covered a nonbipolar dimension of masculin- 
ity-femininity. But it is also clear that Bem’s 
definition of psychological androgyny is based 
on only two of the eight major dimensions 
of interpersonal behavior and that it ignores 
possible sex role stereotypes related to unde- 
sirable interpersonal behavior. It could also 
be the case that the flexibility associated with 
Bem’s definition of psychological androgyny 
is a more general personality characteristic 
that subsumes sex role stereotypes. According 
to the logic of the original Leary (1957) 
system, the self-actualized individual is one 


Figure 1. Eight-variable representation of interpersonal behavior. (Circumplex plot of the dominant-ambitious, gregarious-extraverted, warm-agreeable, unassuming-ingenuous, lazy-submissive, aloof-introverted, cold-quarrelsome, and arrogant-calculating variables.)


who is capable of responding in all inter-
personal dimensions as the situation demands.
Hence, the actualized person is one who has 
a relatively flat profile of interpersonal traits. 
Such an individual would, by definition, be 
androgynous. However, an androgynous per- 
son would not necessarily be actualized by 
Leary’s definition.

In the research to be reported here, the 
self-presentations of masculine-stereotyped, 
feminine-stereotyped, and androgynous per- 
sons are examined with reference to a much 
broader universe of personological content 
than has previously been used. Although par- 
ticular attention was paid to the interpersonal 
dimensions illustrated in Figure 1, the uni- 
verse of content sampled was rigorously rep- 
resentative of the entire domain of what All- 
port and Odbert (1936) called stable bio- 
physical traits. In addition, the relationship 
between bipolar and orthogonal definitions of 
sex differences was examined in the medium 
of self-report. 


Method 
Subjects 


The data used in the present analyses were pro- 
vided by Lewis R. Goldberg. A group of 204 stu- 
dents in an introductory personality class at the 
University of Oregon rated the self-applicability of 
1,710 trait-descriptive adjectives on a 9-place Likert 
scale ranging from “extremely inaccurate” to “ex- 
tremely accurate.” The students received course 
credit for this exercise, and they were promised 
feedback at a later date. The adjectives were par- 
titioned into 10 blocks, and the order in which these 
blocks were administered was counterbalanced across 
the sample. The students were allowed to complete 
the test forms at their own pace over a period of
weeks under specific instructions to stop when fa- 
tigued or bored. The resultant protocols were checked 
for carelessness, as indicated by semantic incon- 
sistency, and 17 subjects were eliminated on this 
basis. The total number of complete usable records 
was thus 187 (117 females and 70 males). 


Item Pool 


The pool of items used has historical roots in the 
work of Allport and Odbert (1936), who identified 
4,504 terms in Webster’s (1925) New International 
Dictionary that were considered genuine personality
traits. Norman (Note 2) expanded this list by con-
sulting Webster’s Third New International Diction-
ary, Unabridged, and, by a variety of procedures,
reduced the new list to 2,800 terms. This revised
list was, in turn, reduced by Goldberg and Norman
to the 1,710 adjectives used in the present study.
This list of 1,710 adjectives (Goldberg, Note 3) is
thus broadly representative of terms in the English
language that describe stable personality traits.


Sex Role Inventory 


Forty-three of the 60 items in the BSRI appear 
in the item pool just described. The majority of 
items that do not appear are phrases rather than 
single items. Consequently, single adjectives were 
identified, in the 1,710 pool, that were closely syn- 
onymous with the BSRI phrases (e.g., “decisive” for 
“makes decisions easily”). Synonymous adjectives 
were easily found for all but two of the adjectives 
and phrases in the BSRI: “athletic” on the Mas- 
culinity scale and “loves children” on the Feminin- 
ity scale. Thus, our revised single-adjective sex role 
inventory consisted of a 19-item Femininity scale, 
a 19-item Masculinity scale, and a 20-item Neutral 
scale.

Lacey (Note 4) administered both the original 
BSRI and our single-adjective sex role inventory to 
110 young men and women from the Vancouver, 
Canada, area. The Femininity, Masculinity, and 
Neutral scales from the two inventories correlated 
.92, .97, and .96, respectively, in that sample. Com-
parable alpha coefficients were also obtained for the
two sets of scales. Thus, the measure of psycho-
logical androgyny used in the present study is em-
pirically equivalent to Bem’s, although it is based
on slightly different items. 


Interpersonal Circumplex 


Sixteen eight-item scales were used to measure the 
interpersonal dimensions named in Figure 1 (e.g.,
dominant, ambitious, arrogant, calculating). These 
16 scales were scored as octants by combining ad- 
jacent variables (e.g., dominant-ambitious, arrogant- 
calculating). The eight-variable display in Figure 1 
is an actual empirical plot of the loadings of these 
eight scales on two hand-rotated principal com- 
ponents extracted from the intercorrelations among 
the scales in the present sample of 187 subjects. Al- 
though these scales were mainly developed on the” 
Present sample (Wiggins, Note 1), the same clear 
circumplex structure has been found in samples of 
Australian and Canadian students as well (Wiggins 
& Marston, Note 5). 
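
The octant scoring described above can be sketched in a few lines. The example below is only an illustration under stated assumptions: the 16 scale names are inferred from the hyphenated octant labels in Figure 1, "combining" is taken to mean summing the two adjacent scale scores, and the function name and example scores are hypothetical.

```python
# Adjacent scale pairs that form the eight octants (names inferred
# from the octant labels in Figure 1; the pairing is an assumption).
OCTANT_PAIRS = [
    ("dominant", "ambitious"),
    ("arrogant", "calculating"),
    ("cold", "quarrelsome"),
    ("aloof", "introverted"),
    ("lazy", "submissive"),
    ("unassuming", "ingenuous"),
    ("warm", "agreeable"),
    ("gregarious", "extraverted"),
]


def octant_scores(scale_scores):
    """Combine 16 scale scores into 8 octant scores by summing each pair.

    scale_scores maps a scale name to one respondent's score on that
    eight-item scale. Summation is assumed; the text says only that
    adjacent variables were combined.
    """
    return {
        f"{first}-{second}": scale_scores[first] + scale_scores[second]
        for first, second in OCTANT_PAIRS
    }


# Hypothetical respondent who scores 40 on every scale.
flat_profile = {name: 40 for pair in OCTANT_PAIRS for name in pair}
octants = octant_scores(flat_profile)
```

A respondent with identical scores on all 16 scales yields identical octant scores, which corresponds to the flat interpersonal profile discussed in the introduction.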


Masculinity—Femininity Scales 


We constructed two relatively pure measures of 
self-reported “gender,” as opposed to sex-related 
traits, by clustering items that almost necessarily 
would be answered differently by men and women.
A five-item Woman scale (feminine, girlish, lady-
like, unfeminine, womanly) had an alpha coefficient
of .96 in the total sample. A five-item Man scale 
(manly, masculine, unmanly, unmasculine, virile) 
had an alpha coefficient of .93. The correlation be- 
tween the two scales (r= —.91) approached their 
estimated reliabilities. Needless to say, these two 
scales, and the items that comprise them, have non- 
overlapping distributional properties in male and 
female samples. The scales are, in effect, caricatures 
of “traditional” masculinity-femininity scales based 
on items that empirically discriminate men from 
women. 
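
For readers unfamiliar with the alpha coefficient reported for these scales, the following sketch shows one common way to compute Cronbach's alpha from item-level ratings. It is offered as an illustration only. The reverse keying of negatively worded items (for example, "unfeminine" on the Woman scale) is an assumption about how such items would be handled on the 9-place scale, and the function names and ratings are hypothetical.

```python
from statistics import variance


def reverse_key(rating, low=1, high=9):
    """Reverse a rating on a low..high scale (assumed 1 to 9 here)."""
    return low + high - rating


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents' item ratings.

    rows is a list of equal-length lists, one per respondent, with
    negatively worded items already reverse keyed.
    """
    k = len(rows[0])
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    # Sample variance of the total (summed) scale score.
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical ratings on three items; the third item is negatively
# worded and is reverse keyed before scoring.
raw = [[1, 1, 9], [5, 5, 5], [9, 9, 1]]
keyed = [[r[0], r[1], reverse_key(r[2])] for r in raw]
alpha = cronbach_alpha(keyed)
```

Because the made-up items agree perfectly after reverse keying, alpha reaches its upper bound here; real scales, such as those reported above, fall somewhat below it.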

Although in some respects a traditional masculin-
ity-femininity scale (Constantinople, 1973, p. 397),
the Masculinity-Femininity scale of Gough and 
Heilbrun’s (1965) Adjective Check List (ACL) at- 
tempts “to capitalize on both biological and psy- 
chological sex differences” (Heilbrun, 1976, p. 185). 
The items were selected on the basis of their ability 
to discriminate between college males identified with 
masculine fathers and college females identified with 
feminine mothers (Cosentino & Heilbrun, 1964). In- 
spection of the 28-item masculine subscale and the 
26-item feminine subscale reveals that the scales have 
highly similar contents to Bem’s Masculinity and 
Femininity scales. The items masculine and feminine 
are included in the two scales, as they are in Bem’s, 
but in general the two scales are saturated with 
dominance and nurturance, respectively. Although 
originally scored as a single scale (masculine minus 
feminine), normative and psychometric information 
is now available for both subscales (Heilbrun, 
1976). In the present pool of 1,710 adjectives, all 
but two adjectives (“handsome” and “strong”) were
available for the masculine subscale and all but one 
(“praising”) for the feminine subscale. The self- 
applicability of the adjectives was evaluated using 
a 9-place scale rather than with the usual check- 
list format. 


Experimental Design 


In the sample of 187 subjects, we identified groups 
of stereotyped, near-stereotyped, and androgynous 


Table 1
Percentage of Subjects in the Oregon Sample Classified as Masculine, Feminine, or Androgynous

Classification                          Malesᵃ    Femalesᵇ


Feminine (t > 2.026) 3 
Near feminine (1.01 < £ 
p220) 9 28 
ndrogynous (—1.01 < £ 
< +1.01) EE 
Near masculine (—2.026 <# 1 
<= 1.01) 30 5 
Masculine (¢ < —2.026) 11 
ᵃ n = 70.
ᵇ n = 117.

men and women using Bem’s (1974) t-ratio cri-
teria. Table 1 shows the percentage of subjects in
each of the groups. The 11 “sex-reversed” subjects
were not included in the present analysis. Within a
3 × 2 analysis of variance design, subjects were clas-
sified by group (stereotyped, near-stereotyped, an-
drogynous) and by sex (men, women). Scores on
the interpersonal adjective scales were entered sep- 
arately as dependent variables. For each interpersonal 
variable, this permitted an evaluation of main effects 
for groups (stereotyped and androgynous groups 
irrespective of gender), for gender (sex differences 
in responding irrespective of groups), and for the 
interaction between these two factors.
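
The classification step can be sketched as follows, using the cutoffs printed in Table 1 (1.01 and 2.026). This is an illustration under stated assumptions, not the authors' procedure: the t ratio is computed here as an unpooled two-sample t between one respondent's femininity and masculinity item ratings, the handling of scores that fall exactly on a cutoff is arbitrary, and the function names and ratings are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def t_ratio(fem_items, masc_items):
    """Assumed form of the t ratio for one respondent.

    Positive values indicate higher femininity than masculinity
    ratings, matching the sign convention in the text. Raises
    ZeroDivisionError if both item sets have zero variance.
    """
    se = sqrt(
        variance(fem_items) / len(fem_items)
        + variance(masc_items) / len(masc_items)
    )
    return (mean(fem_items) - mean(masc_items)) / se


def classify(t, near=1.01, full=2.026):
    """Map a t ratio to one of the five groups listed in Table 1."""
    if t > full:
        return "feminine"
    if t > near:
        return "near feminine"
    if t >= -near:
        return "androgynous"
    if t >= -full:
        return "near masculine"
    return "masculine"


# Hypothetical respondent with clearly higher femininity ratings.
example_t = t_ratio([7, 8, 9], [4, 5, 6])
example_group = classify(example_t)
```

Note that a score near zero is labeled androgynous whether both scales are high or both are low, which is the limitation of the difference score acknowledged in the article's footnote.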


Results and Discussion 
Sex Roles and the Interpersonal Circumplex 


As expected, a main effect for groups was 
not found for any of the eight interpersonal 
variables. Stereotyped, near-stereotyped, and 
androgynous groups did not differ in self- 
report on the interpersonal variables when the 
sexes were combined. However, as can be seen
from Table 2, there were highly significant 
sex differences for all but one (aloof-intro- 
verted) of the interpersonal variables. Men 
scored higher on dominant-ambitious, arro-
gant-calculating, and cold-quarrelsome,
whereas women scored higher on lazy-sub- 
missive, unassuming-ingenuous, warm—agree- 
able, and gregarious-extraverted. Sex differ- 
ences in self-report are not restricted to the 
relatively desirable vectors of dominance and 
nurturance; they occur in almost all sectors 
of the interpersonal circumplex. 

To what extent are the sex differences 
on these interpersonal variables related to the 


¹ The t ratio is based on the algebraic difference
between femininity and masculinity scores for each
subject. There are a number of shortcomings to such
a measure (Strahan, 1975), not the least of which
is its inability to distinguish androgynous subjects
who score high on both masculinity and femininity
from “androgynous” subjects who score low on both
masculinity and femininity. As a consequence, Bem
(1977) now recommends a fourfold classification
based on median splits on masculinity and femininity.
However, designs such as the present one that analyze
interactions between sex role classification and
other variables require the additional “near-stereotyped”
groups yielded by the difference score method.
Within this design, it is still possible to note differences
between high-high and low-low androgynous
subjects.

Table 2
Analyses of Variance of Interpersonal Variables

                            Sex (A)           Group (B)       A × B
Variable                    F      p          F      p        F      p

Dominant-ambitious          21.3   .00001     .3     ns       30.6   .00000
Arrogant-calculating        24.6   .00000     .3     ns       3.4    .04
Cold-quarrelsome            45.4   .00000     .3     ns       2.3    ns
Aloof-introverted           2.3    ns         1.9    ns       7.5    .0008
Lazy-submissive             15.2   .0001      2.3    ns       14.6   .00000
Unassuming-ingenuous        39.1   .00000     1.3    ns       9.2    .0002
Warm-agreeable              33.2   .00000     1.3    ns       5.9    .003
Gregarious-extraverted      13.8   .0003      1.7    ns       2.6    ns


stereotyped, near-stereotyped, and androgy- 
nous classifications of the sex role inventory? 
From the final column of Table 2, it can be 
seen that significant interactions between gen- 
der and sex role classification occurred on six 
of the eight interpersonal variables. The pat- 
tern of these interactions is consistent with 
Bem’s concept of androgyny for all but one 
(again, aloof-introverted) of the six inter- 
personal variables. The greatest difference in 
self-report occurred between male and female 
stereotypes, the next greatest between male 
and female near stereotypes, and the least difference
between male and female androgynous
subjects. Despite substantial overall sex
differences on arrogant-calculating, lazy-submissive,
unassuming-ingenuous, and warm-agreeable,
there were no statistically reliable
differences between androgynous men and
women on these variables.² Although the difference
between androgynous men and women
on dominant-ambitious was statistically sig- 
nificant, it involved a “crossover” effect that 
is illustrated in Figure 2. Androgynous women 
presented themselves as significantly more 
dominant-ambitious than androgynous men. 
A similar crossover occurred on lazy-submis- 
sive, although the difference between androgy- 
nous men and women was not significant. 
These results suggest that Bem’s measure 
of androgyny reflects a highly generalizable 
personological construct. Persons classified as 
androgynous by the sex role inventory are 
nonstereotypic, not only in the realms of 
dominant-ambitious and warm-agreeable behavior
but in the realms of arrogant-calculating,
lazy-submissive, and unassuming-ingenuous
behaviors as well. Persons classified as
sex role stereotyped by the Bem inventory 
are stereotyped on the same five dimensions 
of interpersonal behavior. 

The data suggest that the undesirable dimension
of cold-quarrelsome behavior is not
the functional opposite of the desirable warm-agreeable
dimension with respect to sex role
stereotypy. The trend for cold-quarrelsome
is clearly for the largest sex differences to
occur in the stereotyped groups and the 
smallest in the androgynous groups. However, 
the interaction of sex and group classification 
was not significant, and there was no trend 
toward a crossover effect as there was with 
warm-agreeable. In general, none of the sub- 
jects presented themselves as particularly cold 
and quarrelsome, although men were more 
willing to do so than women. 

Striking exceptions to the foregoing trends
occurred with reference to aloof-introverted
and gregarious-extraverted, which represent
poles of the ubiquitous “introversion-extraversion”
dimension of personality research.
On both variables the differences between
androgynous men and women were significant,
whereas those between stereotyped men and
women were not. The results of the analysis
of aloof-introverted are particularly anomalous.
As can be seen from Figure 3, the two
stereotyped groups are indistinguishable, as
are the two near-stereotyped groups. However,
androgynous women presented them-


² All post hoc comparisons between sex role groups
were evaluated by the Newman-Keuls procedure with
alpha set at p < .05.

selves as significantly less introverted than
androgynous men, and, in fact, significantly 
less introverted than all groups. Thus, there 
is a significant interaction between sex and 
group classification in the absence of signifi- 
cant main effects for the latter two factors. 
A similar trend emerged from the analysis of 
gregarious-extraverted, although no interac- 
tion or crossover trend was evident. All fe- 
male groups reported more extraversion than 
their male counterparts, although the only 
significant difference occurred between an- 
drogynous women and men. 

It is tempting to speculate on the reasons 
for the lack of generalizability of Bem’s index 
of psychological androgyny to the domains 
of introverted and extraverted behaviors. It 
appears that introversion is not a sex stereo- 
typed role, as evidenced by the lack of over- 
all gender differences, although androgynous 
women clearly see themselves as less intro- 
verted than their male counterparts. Although 
there are significant gender differences in 
self-reported extraversion, the lack of an in- 
teraction with sex role classification raises 
the possibility that this relatively desirable 
variable is not sex stereotyped either. If one 
thinks of the gregarious-extraverted social 
role as a blend of dominance and nurturance, 
then it is understandable that both men and 
women could assume it without violating 
stereotypes.


Sex Roles and Trait-Descriptive Adjectives 


To what extent do the sex role groups dif- 
fer among themselves on variables other than 
those of the interpersonal domain? The 1,710 
adjectives to which all subjects responded 
embrace a variety of personal characteristics 
other than the strictly interpersonal. Although 
a final taxonomy of this item pool has not 
yet been achieved, a preliminary set of dis- 
tinctions has been made among the domains 
of interpersonal traits (dominant, warm), 
temperamental traits (animated, high-strung), 
character traits (moral, unprincipled), ma- 
terial traits (miserly, materialistic), attitudes 
(prejudiced, progressive), mental predicates 
(analytic, intelligent), and social roles (aristocratic,
cliquish) (Wiggins, Note 1). Differences

Figure 2. Mean scores of male and female sex role
groups on dominant-ambitious.

in self-report between stereotyped, near-stereotyped,
and androgynous groups were
examined in all of these domains by compar- 
ing the group means on all 1,710 adjectives. 
Even with highly stringent significance levels, 
this procedure produced an embarras de 
richesses that is not easily summarized nor 
confidently interpreted. Nevertheless, several 
trends were apparent from these data that 
are worth noting, if only as guides for future 
investigations. 

With remarkably few exceptions, the ad- 
jectives that differentiated sex role groups 
from each other were interpersonal in nature 
and could be easily classified within the 8- 
variable system used in the main analysis or 


[Figure 3 plot: mean scores for females and males across sex role groups (stereotyped, near stereotyped, androgynous).]


Figure 3. Mean scores of male and female sex role
groups on aloof-introverted.

within the more refined 16-variable system.
In virtually all cases, interpersonal adjectives 
that differentiated two sex role groups did so 
in a manner that was consistent with the find- 
ings of the analyses of variance reported 
earlier. In comparing androgynous with mas- 
culine males, the former presented themselves 
as warm and submissive and the latter as 
dominant and cold. In comparison with femi- 
nine females who described themselves as 
meek and submissive, androgynous females 
described themselves as dominant and extra- 
verted. The comparisons of androgynous and 
stereotyped groups on the 1,710 adjectives 
mainly confirmed the earlier analyses with 
many more interpersonal adjectives. 

The adjectives that differentiated androgy- 
nous men from androgynous women served 
to confirm the earlier analysis and to extend 
it somewhat into the domain of temperament. 
The 51 adjectives listed in Table 3 reliably 
distinguished (p < .001) the two groups. On 
the item aggressive, for example, the mean 
item response for androgynous females was 
6.54 and the mean item response for androgy- 
nous males was 4.82. The item antisocial was 
answered as significantly more self-descriptive 
by androgynous males than by androgynous 
females. The starred items are significant un- 
der an ultraconservative criterion (p = .05/ 
1,710 = .000029). In comparison with the 
androgynous male, the androgynous female 
presents herself as extraverted, warm, excita- 
ble, emotional, aggressive, and vivacious. In 


Table 3 


contrast, the androgynous male is introverted,
cold, calm, unemotional, passive, and undramatic.
In this sample of subjects, at least,
males and females achieved psychological androgyny
by different routes. Although both
androgynous groups clearly differed from
stereotyped groups of the same gender, they
also differed from each other in ways that
both support and contradict sex role stereotypes.
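
The ultraconservative criterion quoted above is a Bonferroni-style division of the familywise alpha by the number of items tested. A minimal sketch of that arithmetic follows; the function name is an assumption introduced for illustration.

```python
def bonferroni_alpha(familywise_alpha: float, n_tests: int) -> float:
    """Per-test significance level that holds the familywise rate at familywise_alpha."""
    return familywise_alpha / n_tests


N_ADJECTIVES = 1710  # number of adjectives rated by every subject

alpha_per_item = bonferroni_alpha(0.05, N_ADJECTIVES)
print(f"{alpha_per_item:.6f}")  # 0.000029

# For comparison: under the more liberal p < .001 criterion, the number of
# items expected to reach significance by chance alone if no true differences existed.
expected_by_chance = N_ADJECTIVES * 0.001
print(round(expected_by_chance, 2))  # 1.71
```

With 1,710 comparisons, fewer than two items would be expected to pass p < .001 by chance, which gives some context for the 51 items reported in Table 3.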

The present data also permitted a test of
the hypothesis that subjects who are classified
as androgynous on the basis of high and
equal scores on femininity and masculinity
(high-high) differ from subjects who achieve
that classification by low and equal scores
(low-low), primarily with respect to self-esteem
(Spence, Helmreich, & Stapp, 1975).
Like Bem (1977), we were unable to find
many examples of truly “low-low” androgynous
subjects in our sample. Nevertheless, we
were able to gain some information on this
issue by contrasting the upper and lower 25%
of each androgynous group. Table 4 displays
the adjectives that differentiated high-high
and low-low androgynous subjects under this
definition.

From among the considerable number of
adjectives that differentiated high-high from
low-low androgynous females, we have reported
only those that did so at the conservative
significance level. Overall it is clear that
the high-highs attributed positive characteristics
to themselves, whereas the low-lows


Items that Discriminated Androgynous Females from Androgynous Males

Females (39)

Aggressive*       Naive
Animated          Open-hearted
Blunt             Overexcitable
Bossy*            Overtalkative*
Dainty*           Peppy
Direct            Perky
Emotional         Rambunctious
Excitable         Solicitous
Extraverted*      Talkative
Fickle            Uncalculating
Frightenable      Unshy
Genial            Vivacious
Gullible*         Vocal

Males (33)

Antisocial        Reserved
Bashful           Shy*
Calm              Silent
Emotionless       Soft-spoken
Feelingless       Taciturn
Hard-hearted      Tough
Indirect          Unaggressive
Iron-hearted      Undramatic
Meek              Unemotional
Overquiet         Unexcitable
Passive           Unfeeling
Quiet*            Unromantic
Quiet-spoken

* p < .000029. All other items significant at p < .001.


Table 4
Items that Discriminated High-High from Low-Low Androgynous Groups

Females

High-highᵃ        Low-lowᵇ

Affectionate      Disobliging
Certain           Feelingless
Courteous         Heartless
Friendly          Hostile
Genial            Ill-tempered
Outgoing          Joyless
Polished          Peevish
Thoughtful        Pessimistic
                  Self-doubting
                  Spineless
                  Unassured
                  Unpleasable

Males

High-highᵃ        Low-lowᵇ

Assertive         Aimless
Compassionate     Compassionless
Forceful
Forthright
Heroic
Magnetic
Penetrative
Self-respecting
Self-seeking
Steadfast

Note. Differences between female groups significant at p < .000029. Differences between male groups
significant at p < .001.
ᵃ n = 10.
ᵇ n = 8.


attributed negative characteristics to themselves.
More germane to the hypothesis, the
high-highs described themselves on the rating
scale as more certain (M = 6.90) than did
the low-lows (M = 3.90). Similarly, the low-lows
described themselves as more self-doubting
(6.30 vs. 3.00) and as more unassured
(5.10 vs. 2.00).

There were fewer adjectives that differentiated
the male groups, and consequently a
less stringent significance level (p < .001) was
used to detect trends. As was true of the
females, the high-high males attributed highly
positive characteristics to themselves in contrast
to the low-low males. The high-highs
presented themselves as more self-respecting 
(8.00) than did the low-lows (5.75). The 
low-lows described themselves as more aim- 
less (4.62) than did the high-highs (1.87). 
In general, the data for both men and women 
suggest that high-high androgynous subjects 
have considerably more self-esteem than do 
low-low subjects. This trend is especially in- 
teresting in the present sample in which high- 
highs and low-lows differed only slightly 
from each other in their masculinity and femi- 
ninity scores. 

The fact that high-high and low-low an- 
drogynous subjects differed in self-esteem 
could restrict the generalizability of the con- 
clusions drawn concerning differences between 


androgynous males and androgynous females 
(Table 3). Consequently, differences in self- 
report between androgynous males and an- 
drogynous females were reanalyzed excluding 
the lower 25% of both groups, a liberal definition
of low-lows. None of the findings in
Table 3 changed when low-low androgynous
subjects were excluded. Where differences between
high-high and low-low groups occurred,
they served only to accentuate the
trends reported in Table 3.


Masculinity-Femininity Scales and 
Interpersonal Behavior 


Three different sets of masculinity and 
femininity scales were scored in the present 
sample: (a) Bem’s Masculinity and Feminin- 
ity scales from the BSRI; (b) Heilbrun’s 
Masculinity and Femininity scales from the 
ACL; and (c) the five-item Man and Woman 
scales that were constructed as markers of 
traditional masculinity-femininity scales. The 
intercorrelations among these scales, and their 
reliabilities, are presented in Table 5. The 
BSRI and ACL scales are of comparable 
homogeneity, and the Man and Woman scales 
are, as expected, highly homogeneous despite 
the small number of items involved. 

The correlations between comparable BSRI
and ACL scales suggest that the two sets of

Table 5
Intercorrelations Among Femininity, Masculinity, and Femininity Minus Masculinity Scores

Variable       1        2        3        4        5        6        7        8

1. BSRI-F     (.718)
2. BSRI-M     −.127
3. BSRI-D      .678    −.816
4. ACL-F       .731    −.121     .516    (.771)
5. ACL-M      −.257     .873    −.800    −.217    (.831)
6. ACL-D       .599    −.686     .857     .724    −.830
7. Woman       .641    −.354     .636     .627    −.409     .647    (.963)
8. Man        −.562     .382    −.611    −.578     .423    −.629    −.913    (.932)
9. WM-D        .616    −.376     .638     .617    −.425     .652     .980    −.916

Note. For N = 187, an r of .187 is significant at the .01
level. The reliabilities (Cronbach's alpha) of the masculinity and femininity scales appear in parentheses.
The correlation coefficients in italics indicate the degree of bipolarity that exists between masculinity and
femininity scales in the three sets.


measures are closely related. The masculinity 
measures are correlated .873, the femininity 
measures are correlated .731, and the differ- 
ence measures (femininity minus masculinity) 
are correlated .857. Both the BSRI and the 
ACL scale sets are substantially correlated 
with the traditional masculinity-femininity
scales. The ACL Masculinity scale was more
highly correlated with the Man scale than 
was the BSRI Masculinity scale, but the 
BSRI Femininity scale was more highly cor- 
related with the Woman scale than was the 
ACL Femininity scale. The BSRI difference 
score (femininity-masculinity) was corre- 
lated .628 with woman minus man, and the 
ACL difference score was correlated .564 with 
the same index. Although the BSRI and the 
ACL scale sets are conceptually distinct from 
the traditional masculinity-femininity mea- 
sures, they share a considerable amount of 
variance in common. The extent of these rela- 
tionships may be exaggerated by the use of a 
common method of measurement (9-place 
Likert scales) in the present study, but it is
expected that the scales would also be highly
correlated in their original formats.

The traditional Man and Woman scales are
strongly bipolar (−.913), the ACL scales are
only slightly related (−.217), and the BSRI
scales are the most independent (−.127). The
BSRI and ACL scales also differ from the traditional
scales in the extent to which masculinity
and femininity determine the difference
score (femininity minus masculinity). For the
traditional scales, the correlation between man
and woman minus man (−.976) was almost
identical to the correlation between woman
and woman minus man (.980). For the BSRI
scales, however, the correlation between masculinity
and the difference score (−.816) was
higher than the correlation between femininity
and the difference score (.678). The same is
true for the ACL scales. This suggests that,
in the present sample at least, the greatest contribution
to variance on the androgyny index
comes from the Masculinity scale.
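
Why one component can dominate a difference score follows from a standard identity for the correlation of a variable with a difference. The sketch below is illustrative only: the function name is an assumption, and the standard deviations are hypothetical values, since the article does not report the scale variances. Only the −.913 and −.127 correlations are taken from the text.

```python
import math


def difference_score_correlations(sd_f: float, sd_m: float, r_fm: float):
    """Correlations of F and of M with the difference score D = F - M.

    Uses cov(F, D) = var(F) - cov(F, M) and
    var(D) = var(F) + var(M) - 2 * cov(F, M).
    """
    sd_d = math.sqrt(sd_f ** 2 + sd_m ** 2 - 2 * r_fm * sd_f * sd_m)
    r_f_d = (sd_f - r_fm * sd_m) / sd_d
    r_m_d = (r_fm * sd_f - sd_m) / sd_d
    return r_f_d, r_m_d


# Equal (hypothetical) spreads with the reported Man-Woman correlation of -.913:
# both components relate to the difference score about equally strongly.
r_f, r_m = difference_score_correlations(1.0, 1.0, -0.913)
print(round(r_f, 3), round(r_m, 3))  # 0.978 -0.978

# Nearly independent scales (r = -.127) with a larger hypothetical spread on M:
# the difference score is then driven mostly by the masculinity component.
r_f2, r_m2 = difference_score_correlations(1.0, 1.5, -0.127)
print(abs(r_m2) > abs(r_f2))  # True
```

Under this identity, the asymmetry reported for the BSRI and ACL scales would be expected whenever masculinity scores vary more than femininity scores in the sample. Whether that was the case here is not stated in the text.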

The conceptual similarity of the ACL scales
to the BSRI scales is most evident from the
pattern of correlations of these two scale sets
with the interpersonal variables used in the
present study. From Table 6 it can be seen
that the masculinity and femininity scales
from the BSRI and ACL behave as variables
that are closely related to the circumplex positions
of dominant-ambitious and warm-agreeable,
respectively. The masculinity scales have
their highest positive correlations with dominant-ambitious
and their highest negative
correlations with the opposite vector of lazy-submissive.
The femininity scales have their
highest positive correlations with warm-agreeable
and their highest negative correlations
with the opposite vector of cold-quarrelsome.
This orthogonal pattern is also clearly evident
in the correlation of the androgyny indices
(feminine minus masculine) with the
interpersonal variables. From Table 6 it can
be seen that the ACL androgyny index has
its two highest positive correlations with lazy-submissive
and warm-agreeable and its two
highest negative correlations with dominant-ambitious
and cold-quarrelsome. The same
pattern is evident for the BSRI androgyny
index, with the slight exception that the negative
correlation with arrogant-calculating exceeds
the negative correlation with cold-quarrelsome.
Aside from this slight exception,
the patterns of correlations of the BSRI and
ACL scale sets with the interpersonal circumplex
variables are virtually identical.

The pattern of correlations of the tradi- 
tional Man and Woman scales with the inter- 
personal variables is quite different. The Man, 
Woman, and Woman minus Man scales all 
have their highest correlations with the single 
bipolar dimension of cold-quarrelsome versus
warm-agreeable. Masculinity is associated
with hostility and femininity with warmth. 
The apparent association between masculinity 
and hostility is especially interesting, since 
none of the items in the Man or Woman scales 
implicate interpersonal variables. 


Psychological Androgyny or Interpersonal 
Flexibility? 


According to Bem, the androgynous per- 
son is both dominant and nurturant and thus 
can vary his or her behavior to meet situa- 
tional demands. Stereotyped males would, 
presumably, emphasize dominance and sup- 
press nurturance, and stereotyped females 
would emphasize nurturance and suppress 
dominance. Earlier in this article we raised 
the possibility that the flexibility of androgy- 
nous persons may be part of a broader pat- 
tern of flexibility that is expressed in all or 
most dimensions of interpersonal behavior. 
By this reasoning, the androgynous person’s
profile of interpersonal variables would be 
relatively flat, and the stereotyped person’s 
profile would be both positively and nega- 
tively spiked on variables that are highly sex 


stereotyped. 
The profile variability of sex role groups 


Table 6
Correlations Between Masculinity-Femininity Scale Sets and Interpersonal Variables

Scale   Dominant-ambitious   Arrogant-calculating   Cold-quarrelsome   Aloof-introverted   Lazy-submissive   Unassuming-ingenuous   Warm-agreeable   Gregarious-extraverted

Note. BSRI-F = Bem Sex-Role Inventory Femininity scale; BSRI-M = Bem Sex-Role Inventory Masculinity scale; BSRI-D = difference between the two
scales; ACL-F = Adjective Check List Femininity scale; ACL-M = Adjective Check List Masculinity scale; ACL-D = difference between the two scales;
WM-D = difference between the Woman and Man scales. For N = 187, an r of .187 is significant at the .01 level.

was examined with reference to scores on the
full set of 16 interpersonal variables. Sixteen
variables were used to ensure an adequate
number of observations for profile analysis.
Means and standard deviations were computed
separately in the total group of men
and in the total group of women for each of
the 16 variables. All subjects’ raw scores were
then converted to standard scores based on
either male or female norms. Mean standard
scores were computed for male and female
stereotyped, near-stereotyped, and androgynous
groups. This resulted in a mean profile
of 16 interpersonal variables, for each sex role
group, that had been standardized with reference
to the total same-sex sample. The variance
of a group’s profile of mean standard
scores was used as an index of profile variability
for that group. Differences in profile
variability between sex role groups were assessed
by a variance ratio (F) with 15, 15
degrees of freedom.
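
The profile-variability index described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function names and data layout are hypothetical, the sample (n − 1) variance is assumed because it matches the 15 degrees of freedom quoted for 16 variables, and placing the larger variance in the numerator is a convention chosen here, not one stated in the text.

```python
from statistics import mean, variance


def mean_profile(group_scores, norms):
    """Mean standard score on each variable for one sex role group.

    group_scores: one list of raw scores per subject (one entry per variable).
    norms: one (mean, sd) pair per variable, computed in the total same-sex sample.
    """
    profile = []
    for j, (norm_mean, norm_sd) in enumerate(norms):
        z_scores = [(subject[j] - norm_mean) / norm_sd for subject in group_scores]
        profile.append(mean(z_scores))
    return profile


def profile_variability(profile):
    """Variance of a group's mean profile across variables (n - 1 denominator)."""
    return variance(profile)


def variance_ratio(profile_a, profile_b):
    """F ratio of two profile variances, larger over smaller (assumed convention)."""
    var_a, var_b = profile_variability(profile_a), profile_variability(profile_b)
    return max(var_a, var_b) / min(var_a, var_b)


# Toy four-variable profiles (hypothetical numbers): a flat and a spiked profile.
flat = [0.1, -0.1, 0.1, -0.1]
spiked = [1.0, -1.0, 1.0, -1.0]
print(profile_variability(flat) < profile_variability(spiked))  # True
```

With 16 variables per profile, the resulting ratio would be referred to an F distribution with 15 and 15 degrees of freedom, as in the text.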


port for male subjects. Androgynous males
had a flat profile
to the means for the total group. In contrast,
the profile of scale scores for stereotyped
males was much


spike on dominant and low spikes on warm
and submissive,


nous males.
The hypothesis that androgynous persons
would have relatively flat profiles of interpersonal
variables was strikingly disconfirmed
for female subjects. The index of profile variability
was highest for androgynous females
and next highest for stereotyped females; the
slight difference between the two groups was
not statistically significant. Near-stereotyped
females were least variable, and their relatively
flat profile was significantly (p < .001)
less variable than those of androgynous and
stereotyped females.


Stereotyped females had a profile that was
high on introverted, submissive, and unassuming,
and was low on dominant, extraverted,
and arrogant. The profile of androgynous females
was a mirror image of this pattern. The
correlation between the profile of stereotyped
females and that of androgynous females was
−.96 (p < .001). Rather than differing from
stereotyped females in profile variability, androgynous
females were equally variable, but
in ways that run counter to sex role stereotypes.
Androgynous males differ from stereotyped
males in both the mean level and variability
of their profile of interpersonal variables.
In contrast, androgynous females are,
interpersonally, the opposite of their stereotyped
counterparts. The profile of androgynous
females was significantly (p < .001)


terized as “inflexibly androgynous,” whereas 
androgynous men would be characterized as 
“flexibly androgynous.” 

To investigate the possibility that the observed
patterns of profile variability may have
been affected by the presence of low-low androgynous
subjects, the data were analyzed
with the lower 25% of both androgynous
groups excluded. This had the effect of making
the profile of androgynous males slightly
less variable and the profile of androgynous
females slightly more variable. All of the major
trends previously noted remained the
same.


Conclusions 


Despite the fact that it was derived from
only two relatively desirable dimensions of
interpersonal behavior, Bem’s measure of psychological
androgyny appears to reflect a
highly generalizable personological construct.
Persons classified as androgynous or as stereotyped
by the BSRI will tend to be androgynous
or stereotyped on five of the eight major
dimensions of interpersonal behavior. Although
there was a trend for this to be true
of the highly undesirable cold-quarrelsome
dimension, it was not a significant one. A
genuine exception to the generalizability of
BSRI sex role classifications occurred with
respect to the dimensions of introversion and
extraversion. Women were more extraverted
than men in all sex role groups, with the
greatest differences occurring between androgynous
groups. Although stereotyped
groups did not differ on introversion, androgynous
women were clearly less introverted than
androgynous men. The possibility that the
dimensions of introversion and extraversion
are not sex role stereotyped is worth investigating
in other samples of subjects.

Sex role groups defined by the BSRI dif- 
fered from each other in self-report primarily 
on interpersonal adjectives as opposed to adjectives
denoting temperamental, characterological,
material, attitudinal, or intellectual
characteristics. Androgynous groups differed
from stereotyped groups of the same gender, 
but they also differed from each other in ways 
that both support and contradict sex role 
stereotypes. Androgynous men and women 
who scored high on both masculinity and 
femininity scales differed from their same- 
gender androgynous counterparts who scored 
low on both scales. The high-high androgy- 
nous subjects characterized themselves as 
having greater self-esteem than did low-low 
androgynous subjects (Spence et al., 1975). 

The Masculinity and Femininity scales 
from Gough and Heilbrun’s (1965) ACL are 
much more closely related to the correspond- 
ing Bem scales than they are to traditional 
masculinity—femininity measures. In addition 
to being highly correlated with the Bem 
scales, the ACL Masculinity and Femininity 
scales exhibit the same pattern of correlations 
with interpersonal variables. Thus, it seems 
appropriate to characterize subjects who score 
high on both scales as psychologically an- 
drogynous (Heilbrun, 1976). 

When the 16-variable interpersonal profiles 
of sex role groups were compared on profile 
variability, sex differences were again found 
between androgynous groups. Androgynous 
males had a relatively flat profile of interper- 
sonal variables that stood in contrast to the 
profile of stereotyped males that spiked on
sex role stereotyped variables. Androgynous
females had a variable profile that was a
mirror image of the sex role stereotyped profile
for stereotyped females. Such a finding
does not necessarily warrant the conclusion 
that androgynous males are more “flexible” 
in their interpersonal behavior than androgy- 
nous females. The greater profile variability 
of androgynous females may reflect a more 
differentiated self-perception on their part. 
Whether or not such differences in self-report 
are reflected in behavior is, of course, still an 
open, empirical question.


The standard form Minnesota Multiphasic Personality Inventory (MMPI) and
two abbreviated forms, the MMPI-168 and the Faschingbauer Abbreviated
MMPI (FAM), were compared with direct measures of psychopathology obtained
from the Brief Psychiatric Rating Scale (BPRS) with psychiatric inpatients.
Each patient was interviewed using the Mental Status Schedule by one
rater while another rater observed this initial diagnostic interview behind a
one-way mirror. Thus, each patient was rated on the BPRS by two raters to
assess interrater reliability. Since MMPI scales contain more than one interpretative
factor, these scales were correlated with the means of more than one
BPRS symptom using multiple correlation coefficients. The multiple correlation
coefficients between the BPRS ratings and the corresponding MMPI and abbreviated-form
scales were significantly high and comparable. Only on Pd for
females did a significant difference occur, with the FAM correlation being significantly
higher. These findings suggest that these abbreviated forms are an
accurate substitute for the standard-form MMPI in predicting objective measures
of psychopathology.


The recent development of several abbre-
viated forms of the Minnesota Multiphasic 
Personality Inventory (MMPI) has precipi- 
tated a plethora of investigations assessing 
the practical utility of these instruments. The 
majority of these studies focused solely on 
comparisons with the standard MMPI of 
group mean data and individual profile pairs 
concerning validity, high points, general ele- 
vations, code-type correspondence, configural 
analyses, and application of various MMPI- 
derived diagnostic rules. Of the six abbre- 
viated MMPIs now available for clinical use, 
namely, Kincannon’s (1968) Mini-Mult,
Dean’s (1972) Midi-Mult, Hugo’s (1971)
Short Form, Faschingbauer’s (1973) Abbreviated
MMPI (FAM), Overall and Gomez-
Mont’s (1974) MMPI-168, and Spera and 
Robertson’s (Note 1) Maxi-Mult, only the 
FAM and the MMPI-168 have been demon- 
strated to predict accurately both individual 
and group mean data with both normal and 
psychiatric samples (Newmark, Boas, & Mes- 
servy, 1974; Newmark, Cook, Clarke, & 
Faschingbauer, 1973; Newmark, Galen, & 
Gold, 1975; Newmark & Glenn, 1974; New- 
mark, Newmark, & Cook, 1975; Newmark, 
Newmark, & Faschingbauer, 1974; Newmark, 
Owen, Newmark, Cook, & Faschingbauer, 
1975; Newmark & Raft, 1976). 

None of these investigations, however, has 
dealt with two crucial issues. First, is the interpretation
obtained from an abbreviated
form comparable to that obtained from the 
standard form? If not, regardless of whether 
individual and group mean data are compara- 
ble, the abbreviated form should not be used. 
Therefore, the comparative interpretive effi- 
cacy of the FAM and the standard MMPI 


was assessed with a sample of psychiatric inpatients
(Newmark, Conger, & Faschingbauer,
1976). Psychiatric residents evaluated the ac- 
curacy of the interpretation of both test forms 
for each of their patients. Although signifi- 
cantly higher mean ratings resulted for the 
standard MMPI, when data were collected 
across sexes, the quantity of loss was sig- 
nificantly lower than expected. That is, the 
FAM functioned as an instrument 80% as 
long as the standard MMPI, even though it 
contains only 30% of the items. 

Newmark, Falk, and Finch (1976) also 
compared the interpretive accuracy of the 
standard MMPI and three abbreviated forms 
with a sample of psychiatric inpatients. Al- 
though significantly higher mean ratings re- 
sulted from the standard MMPI when com- 
pared with either the FAM or Hugo’s Short 
Form, comparable ratings by psychiatric 
teams were obtained when comparing the 
MMPI-168 and the standard-form MMPI 
interpretations. Two major limitations of 
these latter two studies include the use of 
psychiatric judges as the validity criterion 
and the failure to have the interpretations 
done by computer to reduce the error factor 
introduced by clinical judgment. 

A second and more crucial issue concerns 
the empirical validity of the standard and 
abbreviated forms when compared with di- 
rect measures of psychopathology. In pre- 
vious investigations the utility of the abbre- 
viated form has been based on concurrent 
validity with the standard form of the MMPI.
Thus, though the FAM may predict the stan- 
dard-form MMPI more precisely than the 
MMPI-168 does, the MMPI-168 might predict
psychopathology more accurately than
either the FAM or standard-form MMPI.

The present investigation attempted to 
compare the empirical validity of two abbre- 
viated forms of the MMPI and the standard 
form with direct measures of psychopathology
using psychiatric inpatients. This investigation
seems necessary, since there is no reason
to use an alternative version of a psychometric
instrument, whether longer or shorter,
unless it possesses greater empirical validity
than the form it replaces or unless it is significantly
less expensive to administer with
no loss in empirical validity. Rand (1976)
and Overall, Higgins, and De Schweinitz 
(1976) independently concluded that using 
external criteria is a vital prerequisite for 
evaluating abbreviated forms of the MMPI. 


Method 
Subjects 


The subjects were 158 male and 217 female consecutive
admissions to either a private or a university
psychiatric inpatient facility. For numerous
reasons, primarily invalid profiles, lack of coopera- 
tion, confusion, limited intellectual ability, and 
cerebral dysfunction, 38 males and 47 females were 
eliminated from this investigation. The resultant 120 
males and 170 females were between the ages of 17 
and 65 years (M = 34.7), and their level of educa- 
tion ranged from 3 to 22 years (M = 10.2). First 
admissions to a psychiatric hospital comprised 71% 
of the sample. Although there were no significant 
educational differences as a function of sex, the 
female patients were significantly older (p < .05).


Apparatus 


The Brief Psychiatric Rating Scale (extended ver- 
sion). The rating scale that was used to record 
clinical observations in this investigation was a 42- 
item instrument devised by Pokorny, in collabora- 
tion with Overall and others, for use in a survey 
of the Texas state hospital population (Pokorny & 
Overall, 1970). It consists of the widely used 18- 
item Brief Psychiatric Rating Scale (BPRS; Overall 
& Gorham, 1962) plus 24 additional rating con- 
structs from various syndrome scales chosen to span 
the range of lesser psychopathologies. Each symp- 
tom construct in the BPRS, and in the extended 
version used here, is rated on a 7-point scale of 
severity. Approximately half of the BPRS ratings 
are based on observations of the patients, their modes 
of communication, and the organization of their 
mental processes without regard to specific content. 
Information regarding the development, scoring pro- 
cedure, reliability, and validity of this instrument 
can be found elsewhere (Overall & Gorham, 1962; 
Overall & Klett, 1972).

The BPRS was chosen as the criterion variable
in this investigation, since it has been used fruitfully
for the isolation and identification of psychopathological
syndromes and categories in numerous
studies (Auerbach & Ewing, 1964; Overall, 1974; 


chopathology has been extensively documented (Lorr,
1954; Norton, 1967; Schreier, Reznikoff, & Glueck,
1962; Wittenborn, 1972). Furthermore, the BPRS
has symptom labels similar to those found on the
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MMPI scales. The development of rating scales in 
an attempt to standardize, fractionate, and objectify 
the process of clinical decision making has been 
among the most significant achievements in psychodiagnosis
during the past decade (Goldberg, 1974).
That behavioral rating scales have largely supplanted 
psychological tests as criterion measures has been dis- 
cussed recently by Cleveland (1976). 

The Mental Status Schedule. To complete the 
BPRS, Overall and Gorham (1962) advocated the 
use of an 18-minute generally nondirective inter- 
view. However, this interview proved impractical in 
the present investigation due to its brevity and lack 
of structure and because the usefulness of rating 
scales has been shown to be limited when there is 
significant variability in the interview procedures on 
which the ratings are based (Spitzer, Fleiss, Burdock, 
& Hardesty, 1964). Therefore, the Mental Status 
Schedule (MSS), which was developed by Spitzer, 
Burdock, and Hardesty (1964) to provide a stan- 
dardized interview to assess the major dimensions 
of the mental status in which the content and order 
of questions are fixed, was used in this investigation. 

The MSS contains an interview schedule and a 
matching inventory of 248 dichotomous items de- 
scriptive of small units of psychopathological be- 
havior that the interviewer evaluates as true or false. 
The inventory was not used in this study. The interview
schedule is a series of 82 questions arranged
in a definite sequence to provide a natural pro- 
gression of topics that cover a wide range of psy- 
chopathology that the interviewer uses to elicit 
information from the patient. Fifty-one supplemen- 
tary questions were provided to clarify or probe the 
areas in which the patient's responses seemed in- 
complete. The schedule provided alternative phrase- 
ology for many questions so that the examiner could 
ask the question in the form and tense most appro- 
priate to the circumstances. Thus, even though stan- 
dardized, the procedure has enough flexibility so 
that when properly administered it seems like a 
typical clinical interview, allowing good rapport be- 
tween interviewer and patient (Spitzer, Fleiss, Kerno- 
han, Lee, & Baldwin, 1965). The items on the in- 
ventory were developed by surveying standard psy- 
chiatric texts and by interviewing several hundred 
psychiatric patients with preliminary forms of the 
MSS to obtain a comprehensive coverage of overt 
signs of psychopathology. Efforts were made to 
word the questions so that even inpatients with 
minimal education or psychological insight could 
comprehend them. The length of the interview varied 
from 20 to 60 minutes depending on the patient's 
verbal productivity, amount of psychopathology 
present, and cooperation. A detailed description of
the MSS as well as information bearing on the reliability,
validity, and administration of this instrument
can be found elsewhere (Spitzer, Endicott, &
Cohen, 1964; Spitzer, Fleiss, Burdock, & Hardesty,
1964; Spitzer, Fleiss, Endicott, & Cohen, 1967). 

The advantages of the MSS over other commonly 
used assessment procedures include the incorporation 
of a standardized interview schedule to reduce inconsistency
and oversight due to variability in interviewing
technique and coverage of psychopathology,
awareness of what questions are asked to provide 
a framework within which the patient’s responses 
can be understood by others not present at the in- 
terview, and the use of a score sheet that serves 
simultaneously as a permanent clinical record and 
as a form for automated data processing. The use 
of the same interview schedule for all patients also 
has the research advantage that differences observed 
among patients tend to reflect actual differences 
rather than artifacts caused by differences in areas 
of psychopathology explored or interviewing tech- 
niques used. 


Procedure 


Subjects were tested approximately 48 to 72 hours 
after admission as part of the routine screening pro- 
cedure. A counterbalanced design was used to offset 
evidence of decreased pathology under repeated 
MMPI administrations (Kincannon, 1968; Newton, 
1971; Windle, 1954). The subjects were assigned alternately
to one of two groups: one group initially
received the MMPI, and the other received either
the FAM or the MMPI-168. As soon as possible
after completion of this initial testing, the procedure 
was reversed so that each subject received both the 
standard and one abbreviated MMPI form. Thus, 
independently administered abbreviated forms were 
obtained, since the extracted method possessed in- 
herent flaws. Most notably, a perfect reliability of 
scores was assumed, since the same items scored 
were used on both the abbreviated and standard 
forms. Furthermore, Newmark et al. (1973) pre- 
sented evidence that the extracted form has limited 
benefits.
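
The alternating assignment described above can be sketched in a few lines of Python. This is only an illustration: the function name, the subject identifiers, and the rule that every other admission takes the standard form first are assumptions, not details reported in this study.

```python
def assign_orders(subject_ids, short_form):
    """Alternate consecutive admissions between the two administration orders.

    Hypothetical helper: even-indexed subjects take the standard MMPI first,
    odd-indexed subjects take the abbreviated form first.
    """
    orders = {}
    for index, subject_id in enumerate(subject_ids):
        if index % 2 == 0:
            orders[subject_id] = ("MMPI", short_form)
        else:
            orders[subject_id] = (short_form, "MMPI")
    return orders


# Hypothetical subject identifiers, for illustration only.
schedule = assign_orders(["S01", "S02", "S03", "S04"], "FAM")
```

Because each subject completes both forms in one of the two orders, any practice or retest effect is spread evenly across the standard and abbreviated administrations.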

The FAM was scored and converted into stan- 
dard-scale raw scores using Faschingbauer’s (1973) 
tables of conversion for each sex. Conversion of pro- 
rated raw scores to T scores was then accomplished 
following the standard procedures. The MMPI-168 
was scored and converted into standard-scale K-cor- 
rected raw scores using regression equations for each 
scale as derived by Overall and Gomez-Mont (1974). 
K-corrected T scores were used in this study in or- 
der to be consistent with the literature. The standard- 
form MMPI answer sheets were hand scored in the 
traditional manner. An invalid profile was defined as either
L or K ≥ 10 or ? ≥ 60.

Four raters, consisting of two clinical psychologists
and two psychiatric residents, were presented detailed
operationally defined explanations of the symptoms
on the BPRS as defined by Overall and Gorham
(1962) and Pokorny and Overall (1970) in
an attempt to reduce idiosyncratic biases. Whereas 
definitions can never be rigorous or complete ex- 
cept in mathematics, they nevertheless serve to de- 
marcate a concept even though its boundaries remain 
somewhat blurred (Zubin, 1967). All raters had at 
least 4 years of diagnostic experience. Such a pro- 
cedure was necessary because it has been demon- 
strated (Kreitman, 1961) that variables relating to 
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nomenclature and degree of experience were the 
greatest impediments to reliability in psychiatric diagnoses.
Clearly definable terms and equivalent diag-
nostic experience are essential. 

Initially, in an attempt to maximize interrater re- 
liability, filmed MSS interviews with three psychi- 
atric inpatients, with varied degrees and types of 
psychopathology, were presented to the four raters. 
Following each filmed presentation a detailed dis- 
cussion of the rating differences occurred. That such 
a procedure has merit in increasing rater reliability 
has been confirmed by Raskin, Schulterbrandt, and 
Reatig (1966). 

Each patient was then interviewed using the MSS 
by one of the raters while another rater observed 
this initial diagnostic interview behind a one-way 
mirror. Neither rater was familiar with the patient’s 
history or observed ward behavior. Reliability is 
definitely enhanced if two raters observe the same 
interview rather than each rater observing a sepa- 
rate interview (Wittenborn, 1972). Thus, each pa- 
tient was rated on the BPRS by two raters so that 
interrater reliability was assessed. Immediately fol- 
lowing the interview, each rater recorded his ob- 
servations independent of his colleague’s rating. 

When two raters observe the same patient, two 
distinct strategies exist for combining data (Overall,
1968). One approach has been to have both observers
discuss each rating at the conclusion of the
interview and to arrive at a consensus rating. The
other more frequently used model has been to have 
raters complete ratings independently and then to 
average the ratings on the theory that random errors
tend to cancel. This latter model has been
shown to be an advantageous procedure by Overall,
Hollister, and Dalal (1967). Consequently, the mean
ratings for each BPRS symptom were used.

Table 1

Mean T Scores, Standard Deviations, Correlations,
and the Faschingbauer Abbreviated MMPI (FAM)

Since several MMPI scales contain more than one 
interpretative factor, these scales were correlated 
with the means of more than one BPRS symptom 
rating using multiple correlation coefficients (Brun- 
ing & Kintz, 1968). Only for Scale L, which was 
correlated with only one BPRS symptom, was a 
product-moment correlation used. The MMPI scales 
and the corresponding BPRS scales with which they 
were correlated were as follows: L-Denial; F-Un- 
usual Thought Content, Conceptual Disorganization, 
Intellectual Subnormality; K-Denial, Guardedness; 
Hs-Conversion Somatization, Somatic Concern; D- 
Depressive Mood, Introjection of Blame; Hy-Con- 
version Somatization, Dramatization, Manipulative- 
ness, Denial, Emotionally Labile; Pd-Antisocial 
Trends, Impulsiveness, Projection of Blame, Hostil- 
ity, Manipulativeness; Mf-Sexual Crossed Identifi- 
cation, Passive Dependency; Pa-Suspiciousness, Pro- 
jection of Blame, Hostility; Pt-Anxiety, Tension, 
Guilt Feelings, Phobias, Compulsive Acts, Obsessive 
Thoughts; Sc-Conceptual Disorganization, Unusual 
Thought Content, Sexual Inadequacy, Emotional 
Withdrawal, Blunted Affect, Feelings of Unreality; 
and Ma-Excitement, Elevated Mood, Grandiosity, 
Tension. The BPRS ratings selected for inclusion 
were based on item analyses content and interpretive 
data of the MMPI scales as presented by Dahlstrom, 
Welsh, and Dahlstrom (1972), Fowler (1966), and 
Lachar (1974).
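
For readers unfamiliar with the multiple correlation coefficient, the following Python sketch shows the two-predictor case, such as Scale K correlated with the Denial and Guardedness ratings. It is a minimal illustration under stated assumptions: the function names and all data values are hypothetical, and the closed-form expression applies only when exactly two BPRS ratings are combined; scales paired with more ratings would require a general regression solution.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def multiple_r_two_predictors(y, x1, x2):
    """Multiple correlation of y with two predictors (closed form)."""
    r_y1 = pearson_r(y, x1)
    r_y2 = pearson_r(y, x2)
    r_12 = pearson_r(x1, x2)
    r_squared = (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)
    return math.sqrt(r_squared)


# Hypothetical T scores and mean BPRS ratings, for illustration only.
k_scale = [48, 55, 61, 52, 66, 59, 70, 45]
denial = [2.0, 3.0, 4.5, 2.5, 5.0, 3.5, 5.5, 1.5]
guardedness = [1.5, 3.5, 3.0, 3.0, 4.5, 4.0, 5.0, 2.0]

multiple_r = multiple_r_two_predictors(k_scale, denial, guardedness)
```

The multiple correlation can never be smaller in absolute value than the largest single correlation between the scale and any one of its ratings, which is worth keeping in mind when these coefficients are compared with single-criterion correlations from other studies.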
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Results 


The T-score means, standard deviations, 
Pearson product-moment correlations, and t
values of the comparable validity, clinical, 
and Mf scales of both the MMPI and the 
FAM are presented as a function of sex in 
Table 1. Note that in most cases for both 
males and females the standard deviation was 
larger for the FAM than for the comparable 
standard MMPI scales. Statistical analysis 
showed this to be a significantly reliable trend 
(p < .05). 

Paired t tests yielded significant mean differences
on L, t(59) = 2.87, p < .01, for
males, and on F, t(84) = 3.74, p < .001; Hs,
t(84) = 3.17, p < .01; Pa, t(84) = 3.29, p <
.01; and Pt, t(84) = 3.42, p < .001, for females.
In all cases the FAM significantly
overestimated the MMPI scale scores. This
finding is consistent with previously published
results (e.g., Newmark et al., 1973) demonstrating
that the FAM seemed to overestimate
MMPI scale means slightly.

For males, scale correlations ranged from
.82 for L to .93 for K and Hs (M = .88, Mdn
= .89). For females, correlations ranged from
.83 on Pd to .95 on F, Hs, D, and Ma (M =
.91, Mdn = .93). All correlations were significantly
different from zero (p < .001).

Table 2

Mean T Scores, Standard Deviations, Pearson Correlations,
MMPI and the MMPI-168

The T-score means, standard deviations, 
and Pearson product-moment correlations of 
the comparable validity, clinical, and Mf
scales of both the MMPI and the MMPI-168 
are presented as a function of sex in Table 2. 
Note that in approximately half of the cases 
for each sex, the standard deviation was larger 
for the MMPI-168 than for the comparable 
standard MMPI scale. 

Paired t tests yielded no significant mean
differences for male subjects. For females,
however, significant mean differences occurred
on F, t(84) = 4.29, p < .001; Pd, t(84) =
3.09, p < .01; Pa, t(84) = 2.96, p < .01;
and Sc, t(84) = 4.59, p < .001. In all cases
the MMPI-168 significantly overestimated 
the MMPI scale scores. The results for fe- 
males are rather surprising in view of several 
previously published reports (Newmark, New- 
mark, & Cook, 1975; Newmark & Raft, 1976) 
that found no significant mean differences for 
a sample of both male and female psychiatric 
inpatients and medical patients. 

For males, scale correlations ranged from 
.92 on L to .98 on Hs and Sc (M = .96, Mdn
= .96). For females, scale correlations ranged
from .89 on K to .96 on D (M = .93, Mdn =
.93). All correlations were significantly different
from zero (p < .001).

Table 3

Correlations Between the BPRS Ratings and the Corresponding MMPI and Short-Form
Scale Scores

                 Males              Females
Scale        MMPI    FAM        MMPI    FAM

L            .63     .63        .57     .54
F            .74     .72        .74     .73
K            .80     .81        .78     .79
Hs           .82     .82        .83     .86
D            .83     .85        .84     .81
Hy           .86     .85        .83     .83
Pd           .83     .80        .65     [illegible]*
Mf           .85     .82        .80     .78
Pa           .77     .79        .82     .80
Pt           .80     .79        .76     .76
Sc           .61     .64        .72     .69
Ma           .85     .85        .81     .82

                 Males              Females
Scale        MMPI    MMPI-168   MMPI    MMPI-168

L            .64     .63        .61     .64
F            .70     .71        .69     .68
K            .65     .67        .81     .79
Hs           .79     .80        .84     .86
D            .81     .81        .83     .83
Hy           .82     .84        .83     .86
Pd           .77     .79        .82     .79
Mf           .83     .84        .75     .77
Pa           .83     .84        .83     .81
Pt           .81     .80        .81     .81
Sc           .65     .67        .64     .66
Ma           .84     .83        .86     .84

Note. BPRS = Brief Psychiatric Rating Scale; MMPI = Minnesota Multiphasic Personality Inventory;
FAM = Faschingbauer's Abbreviated MMPI. n = 60 for males and 85 for females.

*p < .01.

Table 3 presents the multiple correlation coefficients
between the BPRS ratings and the
corresponding MMPI and abbreviated-form
scales. All of the correlations were significantly
different from zero (p < .001). For
males, correlations ranged from .61 for Sc to
.86 for Hy (M = .78, Mdn = .81) on the
standard form and from .63 for L to .85 for
D, Hy, and Ma (M = .80, Mdn = .82) on the
FAM. For females, correlations ranged from
.57 for L to .84 for D (M = .78, Mdn = .79)
on the standard form and from .54 for L to
.86 for Hs (M = .79, Mdn = .79) on the
FAM. With regard to the MMPI and the
MMPI-168, for males, correlations ranged
from .64 for L to .84 for Ma (M = .78, Mdn
= .81) on the standard form and from .63
for L to .84 for Hy, Mf, and Pa (M = .79,
Mdn = .79) on the MMPI-168. For females,
correlations ranged from .61 for L to .86 for
Ma (M = .79, Mdn = .82) on the standard
form and from .64 for L to .86 for both Hs
and Hy (M = .79, Mdn = .81) for the
MMPI-168. Only on Pd for females did a
significant difference occur between the
MMPI and the abbreviated-form correlations,
t(84) = 2.72, p < .01, with the FAM being
significantly higher. However, this significant
difference could occur by chance, since 48
comparisons were made.
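
The chance argument can be made concrete with a short calculation. The sketch below is illustrative only and assumes, for simplicity, that the 48 comparisons are independent and that each is tested at the .01 level; neither assumption is stated in this report, and the function names are hypothetical.

```python
def expected_false_positives(n_comparisons, alpha):
    """Expected number of significant results if every null hypothesis is true."""
    return n_comparisons * alpha


def prob_at_least_one(n_comparisons, alpha):
    """Probability of one or more false positives, assuming independent tests."""
    return 1.0 - (1.0 - alpha) ** n_comparisons


expected = expected_false_positives(48, 0.01)
p_any = prob_at_least_one(48, 0.01)
```

Under these assumptions about half a false positive is expected across the 48 tests, and the probability of at least one is somewhat over one in three, so a single significant difference is weak evidence of a real effect.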

Paired t tests between the multiple correlation
coefficients from the BPRS ratings and
each abbreviated form yielded significantly
higher correlations for females on L, t(84) =
3.05, p < .01; F, t(84) = 2.32, p < .05; and
Pt, t(84) = 2.07, p < .05. For L and Pt, the
MMPI-168 correlation coefficients were higher,
whereas for F the FAM correlation coefficients
were higher. For males, significantly higher
correlations occurred for K, t(59) = 2.66, p
< .01, on the FAM and Pa, t(59) = 2.19,
p < .05, on the MMPI-168.

Product-moment reliability coefficients 
across the four raters ranged from .83 to .88 
(M = .86). Mean interrater reliabilities be- 
tween the BPRS scores and the corresponding 
MMPI scale scores were as follows: L = .87,
F = .86, K = .84, Hs = .88, D = .88, Hy =
.87, Pd = .84, Mf = .87, Pa = .83, Sc = .85,
and Ma = .88. Each rater rated the same
number of patients.


Discussion 


Although both abbreviated forms signifi- 
cantly overestimated several of the compara- 
ble standard MMPI scale means, the actual 
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mean differences that occurred were extremely 
small, ranging from .3 to .9, and probably 
were not of any practical significance. Hill 
(1976) presented compelling evidence, sug- 
gesting that the concepts of statistical valid- 
ity and clinical utility were by no means 
identical and that it does not necessarily fol- 
low that one is a prerequisite for the other. 
Furthermore, related t tests are concerned
with the magnitude of the correlation so that 
when large correlations, as were obtained in 
this study, are found, the differences between 
the means can be extremely small and still 
be statistically significant. 
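
The dependence of the related t test on the correlation between forms can be illustrated with summary statistics. In the sketch below the standard deviations of 10 T-score points are assumed values chosen for illustration; the mean difference of .9 and the sample size of 85 are taken from figures reported in this study, and the function name is hypothetical.

```python
import math


def paired_t(mean_diff, sd_1, sd_2, r, n):
    """Related-samples t computed from summary statistics.

    The standard deviation of the difference scores shrinks as the
    correlation r between the two forms increases.
    """
    sd_diff = math.sqrt(sd_1 ** 2 + sd_2 ** 2 - 2 * r * sd_1 * sd_2)
    return mean_diff / (sd_diff / math.sqrt(n))


# Same mean difference and sample size; only the correlation changes.
t_high_r = paired_t(0.9, 10.0, 10.0, 0.95, 85)
t_low_r = paired_t(0.9, 10.0, 10.0, 0.50, 85)
```

With identical means and standard deviations, the higher correlation yields the larger t value, which is why very small mean differences can reach significance when the forms correlate in the .90s.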

The majority of abbreviated-form investiga- 
tions have demonstrated a tendency for these 
forms to underestimate MMPI T-score stan- 
dard deviations, leading Dean (1972) to sug- 
gest that this is somehow inherent in shorter 
tests. However, Faschingbauer (1976) con- 
cluded that his use of a substitution model 
rather than a regression model in developing 
conversion values for the FAM would overcome
this persistent tendency. Equivocal support
for this hypothesis was found in the
present investigation, as the FAM had generally
larger standard deviations than did the
standard-form MMPI, whereas the MMPI- 
168, which was developed using a regression 
model, had higher standard deviations for ap- 
proximately half of the scales for each sex. 

The correlations obtained between the stan- 
dard MMPI and the BPRS ratings were 
markedly greater than results obtained from 
other investigations (Endicott & Jortner, 
1966, 1967; Endicott, Jortner, & Abramoff, 
1969; Harris, Wittner, Koppell, & Hilf, 
1970), which correlated other objective measures
of psychopathology with various standard
MMPI scales. The higher correlations
may have occurred in the present investigation
due to the use of multiple correlation
coefficients, which combine several objective
measures of each MMPI scale rather than
using just one measure. For example, instead
of correlating Pt with objective measures of
anxiety only, a multiple correlation coefficient
was obtained from objective measures of anxiety,
tension, guilt, phobias, compulsive acts,
and obsessive thoughts. 


The correlations obtained between the abbreviated
forms and the BPRS ratings were
significantly high and comparable to the
MMPI-BPRS rating correlations. Scale L
correlations were consistently lower in all 
cases, possibly because only one objective 
measure of psychopathology was correlated 
with it. Only on Pd for females did a signifi- 
cant difference occur between the MMPI and 
the abbreviated-form correlations, with the 
FAM being significantly higher. However, this 
difference quite likely could have occurred 
by chance. Thus, the empirical validity of 
both abbreviated forms seems comparable to 
the standard-form MMPI when compared 
with direct measures of psychopathology. 

The FAM-BPRS rating correlations and 
the MMPI-168-BPRS rating correlations
were significantly different in 5 of the 24 com- 
parisons made. However, neither abbreviated 
form demonstrated any superiority. 

The interrater reliabilities obtained in this 
investigation were comparable to those ob- 
tained in the original developmental investiga- 
tion of the BPRS by Overall and Gorham 
(1962). However, the present results were 
somewhat higher than those obtained in the 
majority of recent BPRS investigations (e.g., 
Anderson, Kuehnle, & Catanzano, 1976) pos- 
sibly due to the attempt to maximize inter- 
rater reliability through the use of practice 
procedures as advocated by Raskin et al. 
(1966). 

As an additional analysis, the mean BPRS
symptom rating scores were obtained for
each comparable MMPI scale. Both abbre- 
viated- and standard-form MMPIs in each 
sample as a function of sex were then sub- 
grouped using the mean BPRS ratings as a 
moderator variable. Means between 0 and 2.5 
were included in the low-ratings group, and 
means between 3.5 and 6.0 were included in 
the high-ratings group. Means from 2.6 to 3.4 
were excluded to provide more well-defined 
groups. The mean MMPI scale scores for the 
low-ratings group and high-ratings group 
were not significantly different from the mean 
abbreviated-form scales for the low-ratings 
and high-ratings groups, respectively, within 
both samples regardless of sex. 
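
The subgrouping rule can be expressed as a short Python sketch. The cutoffs are those given above; everything else is an assumption for illustration, including the function names, the data, and the treatment of any mean falling between the stated bands (for example, 2.55) as excluded.

```python
def rating_group(mean_rating):
    """Classify a mean BPRS rating using the cutoffs stated in the text.

    Means outside both bands are excluded and returned as None.
    """
    if 0.0 <= mean_rating <= 2.5:
        return "low"
    if 3.5 <= mean_rating <= 6.0:
        return "high"
    return None


def mean_scale_score_by_group(records):
    """records: iterable of (mean_bprs_rating, scale_score) pairs."""
    totals = {"low": [0.0, 0], "high": [0.0, 0]}
    for rating, score in records:
        group = rating_group(rating)
        if group is None:
            continue  # middle band is dropped to sharpen the contrast
        totals[group][0] += score
        totals[group][1] += 1
    return {g: (s / c if c else None) for g, (s, c) in totals.items()}


# Hypothetical (mean rating, T score) pairs, for illustration only.
records = [(1.0, 52), (2.5, 58), (3.0, 63), (4.0, 71), (5.5, 79)]
means = mean_scale_score_by_group(records)
```

Applying the same function to the standard-form and abbreviated-form scores of the same patients gives the pairs of subgroup means that were compared in this analysis.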

Edinger, Kendall, Hooke, and Bogan
(1976) emphasized that in the absence of independent
validation, none of the abbreviated
forms developed can be recommended for
clinical use. However, there is evidence from 
the present investigation that both the FAM 
and MMPI-168 were comparable to the stan- 
dard-form MMPI when compared with direct 
measures of psychopathology. Since both ab- 
breviated forms are slightly less expensive to 
administer and result in more completed pro- 
files with no loss in empirical validity, they 
appear to be an adequate substitute when ad- 
ministration of the standard-form MMPI is 
not feasible. 


Reference Note 


1. Spera, J., & Robertson, M. A 104-item MMPI:
The Maxi-Mult. Paper presented at the meeting
of the American Psychological Association, New
Orleans, September 1974.
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Treatment of Female Sexual Dysfunction Through 
Symbolic Modeling 


Georgia H. Nemetz, Kenneth D. Craig, and Gunther Reith 
University of British Columbia, Vancouver, Canada 


An evaluation of treatment programs for women suffering from debilitating 
sexual anxiety is described. Attitudinal and behavioral indices of sexual adjustment
and sexual anxiety were obtained from 22 women to assess effects of individual
and group graduated symbolic modeling through videotapes, with concurrent
behavioral tasks as treatment procedures. All women serving as clients
had reported severe anxiety that precluded sexual enjoyment or activity. Sixteen


Sexual anxiety continues to be a major 
concern for therapists concerned with sexual 
dysfunction and anomalies. Although a diversity
of therapeutic techniques have been devised
to accommodate the numerous theoretical
positions (Kaplan, 1974; Masters & Johnson,
1970), there is general agreement that
anxiety toward sexual functioning is a basic
prohibitive 
y, 3 
or measured affectively, be- 
haviorally, is a highly 
learnable pattern of response, aversive sexual 
incidents would be sufficient to generate female
sexual dysfunction. These “incidents”
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tatoring as the two major inhibiting factors 
to the natural flow of stimuli leading to the 
female orgasmic response. Because anxiety is 
readily attached to previously noneliciting 
stimuli on the basis of simple contiguity, the 
full range of sexual behavior may become af- 
fected. For example, anxiety toward inter- 
course may generalize sufficiently to disrupt 
or preclude other effective sexual habits. 

A number of therapists have effectively used 
the desensitization paradigm (Wolpe, 1958) 
to reduce sexual anxiety and achieve a con- 
current increase in pleasurable sexual behavior 
(Brady, 1966; Kraft & Al-Issa, 1967; Lazarus,
1963). Lazarus (1963) treated 16 inorgasmic
women described as recalcitrant and
persistent cases. Subjects were instructed in
progressive relaxation, hierarchies with respect
to sexual functioning were constructed,
and anxiety-eliciting items were presented
verbally in a hierarchical fashion. Lazarus
reported that 9 patients became “sexually
adjusted” after a mean of 28.7 sessions over
an average period of 6 months. Control
group data were not provided. Kraft and
Al-Issa (1967) used desensitization for a
female patient whose dysfunction was attributed
partially to social anxiety in heterosexual
situations. The authors reported complete
recovery after 84 13-hour systematic
desensitization sessions. Variations in tradi- 
tional desensitization procedures have also 
proven effective. Clopton and Risbrough 
(1973) reported success in eliminating sex- 
ual anxiety and in inducing approach behav- 
ior by having patients present sexually aver- 
sive stimuli to themselves at or before mas- 
turbation-induced orgasm. Brady (1966) 
reported success for 4 out of 5 clients suffer- 
ing from “chronic severe frigidity” using 
desensitization aided by the use of subanes- 
thetic doses of methohexital sodium as a 
means of producing profound muscular re- 
laxation. 


strategies because of their effectiveness in reducing
maladaptive fears and avoidance behavior
and in concomitantly inducing desirable
behavior. Elements of vicarious extinction
(Bandura, Grusec, & Menlove, 1967; Craig, 
Best, & Ward, 1975), graduated symbolic 
modeling (Bandura, Ritter, 
1969), and videotaped desensitization 
(Wincze & Caird, 1976; Woody & Schauble, 
1969) were utilized. 
modeling therapies, previously 
(Bandura, 1969), include the acquisition of 
previously nonexistent patterns of behavior,
weakening response inhibitions, and facilitating
the occurrence of preexisting responses in
the behavior repertoire. In the instance of
sexual anxiety, modeling therapy would appear
to be particularly promising because of
its effectiveness in extinguishing 
avoidance tendencies and in inducing behavioral
and attitudinal change. Bandura et al.
(1969) found graduated i 
to be more effective than standard desensitization
in neutralizing the anxiety-arousing
properties of snakes and in inducing the greatest
behavioral and attitudinal change.

The present study used symbolic modeling
through the medium of a videotaped, graduated
hierarchy of sexual scenes. Concurrent
behavioral tasks also were utilized in the
form of homework assignments to match the
scenes portrayed in the videotapes. It was expected
that reducing sexual anxiety by vicarious
extinction would enable the client to
successfully complete the behavioral tasks, 
thus exposing them to a potentially positive 
sexual experience. The treatment procedures 
were based on those used by Wincze and 
Caird (1976), who found that videotaped de- 
sensitization was more effective than stan- 
dard desensitization in reducing heterosexual 
anxiety in those suffering from primary sexual
dysfunction. The present study extended
these treatment procedures to a group therapy 
program, used additional measurement pro- 
cedures to examine other components of the 
behavior change process, and provided a 1- 
year follow-up of treatment effectiveness. 

In addition to the potential that symbolic 
modeling techniques have for effective intervention,
they introduce economies by reducing
requirements for therapist involvement,
thus making therapy more readily available.
In the extreme, the technique can be exclu- 
sively self-administered. The procedure also 
lends itself to application to several clients 
at the same time. We could find no accounts 
of videotape treatment used in a group for- 
mat in the literature. The present study con- 
trasted individual treatment, group treatment, 
and a waiting list control group. 

A multidimensional model for evaluating 
treatment effectiveness was adopted. Changes 
in subjective affect were evaluated through 
self-report scales. Behavioral indices of 
changes in sexual activities were provided by 
clients who self-monitored during baseline, 
treatment, and follow-up phases and by the 
clients’ male partners, who also provided data 
during these phases of the treatment program. 
Repetition of the measures, especially the be- 
havioral indices, was designed to assess how 
rapidly treatment effects would manifest 
themselves. Because of the potential for pro-
ducing a reorganization of attitudes toward
personal and general sexual behavior, scales
evaluating this domain also were included. 


Method 
Subjects 


The subjects were 22 inorgasmic female clients, 15 
of whom would satisfy Masters and Johnson’s 
(1970) criteria for secondary orgasmic dysfunction
and 7 of whom would be classified as primary in-
orgasmic. All had been referred by physicians to
the Sexual Dysfunction Clinic at the Health Sci- 
ences Center Hospital at the University of British 
Columbia. Other selection criteria included presence 
of anxiety or discomfort toward sexual behavior as 
measured both by interview data and scores on the 
Sexual Anxiety Card Sort (Barlow, Leitenberg, & 
Agras, Note 1); absence of organic causal factors, 
as established through gynecological examination; 
absence of acute marital problems justifying infer- 
ence of an unstable relationship with the sexual 
partner; and availability and cooperation of the 
husband or a steady male partner. Of 29 women 
initially interviewed, 7 failed to score above a cutoff 
point of 13 out of 100 on the Sexual Anxiety Card 
Sort. The subjects ranged in age from 21 to 39 
years with a median of 26.7 years. Fifteen of the 
subjects were married, 4 were single, and 2 were 
separated. The first 16 subjects obtained by succes- 
sive admissions to the clinic were randomly assigned 
to one of the two treatment conditions. The last 6
successive admissions were assigned to the control
condition. These control subjects were advised truth-
fully that treatment was not available for approxi-
mately 6 weeks but that it would be provided on
termination of other clients. All 6 volunteered to
participate in pretreatment investigations and pro-
vided data on the measures matching those used by
experimental subjects. At the end of this period of
time, all were treated in this clinic. 


Measures 


[...] The Sexual Anxiety Card Sort con-
sists of 25 separate cards describing various sexual
[...]. The cards are sorted into five [...] according to the levels
[...] were to engage in the depicted behavior. [...] was administered at
the initial interview, prior to each of the five treat-
ment sessions, at the 3-week follow-up, and at the
1-year follow-up. (b) [...]

nine explicit categories of sexual behavior (see Table 
1). Both partners independently recorded the fre-
quency of their sexual behaviors corresponding to
these categories for complete weeks, including 1 
week prior to treatment, the intervals between each 
treatment, during the 3-week follow-up period, and 
during the week prior to the 1-year follow-up. 

Attitudinal measures examining both global and 
specific sexual attitudes were (a) Rotter’s Sex Atti- 
tude Scale (Rotter, Note 2), which contains 100 
items requiring ratings of attitudes on sexual issues 
concerning sex roles, sex education, and marital re- 
lationships. The measure was administered at the 
initial interview, prior to the fifth treatment ses- 
sion, and at the 3-week follow-up. (b) The Sexual 
Semantic Differential assesses attitudes toward spe- 
cific sexual concepts. Nine of the bipolar scales were 
based on the Marks and Sartorius (1968) measure.
Ratings were scored for two basic dimensions, evalu-
ation and anxiety. Clients rated seven concepts in-
cluding both specific sexual activities and basic
sexual descriptors (see Table 1). The semantic dif-
ferential was administered at the initial interview,
each of the five treatment sessions, the 3-week fol-
low-up, and the 1-year follow-up.


Procedure 


At an initial interview, after acceptance into the 
treatment program, both partners were advised of 
procedures and regulations. The male’s supportive
potential was established, and he was instructed to 
refrain from initiating sexual activity for the dura- 
tion of treatment. The different measures were de- 
scribed, and questionnaires were completed. The ra- 
tionale for relaxation training was explained to the 
client, and she received a tape-recorded or card-
form summary of progressive relaxation procedures.
She was instructed to practice the exercises twice a
day during sessions of approximately 15 minutes.
[...] twice a week with 3-day interses-
sion intervals. [...] were planned for members of the [...]
[...]trollable events [...]
Table 1
Treatment Effects for Initial and 3-Week Follow-up Anxiety, Attitudinal, and
Behavioral Measures

                                                 Groups (A)          Time (B)            A X B
Measure                                          F         df        F         df        F         df
Anxiety
  Sexual Anxiety Card Sort                       6.61**    [remaining entries illegible]
  Heterosexual Behavior Index                    4.30*     [remaining entries illegible]
  Semantic Differential Anxiety                  ns                  7.03**    6, 18     2.45**    12, 108
Sexual behavior index
  Female-initiated behavior                      ns        [remaining entries illegible]
  Visual sexual behavior (male seeing female
    nude or female seeing male nude)             ns        [remaining entries illegible]
  Nonsexual sensate focus (male or female
    giving body massage)                         ns        [remaining entries illegible]
  Sexual sensate focus (the foregoing includ-
    ing breast and genitalia manipulation)       6.80**    2, 18     ns                  2.24**    10, 90
  Foreplay (cumulative duration 10 minutes)      6.10**    2, 18     ns                  ns
  Foreplay (cumulative duration 20 minutes)      6.07**    2, 18     ns                  ns
  Intercourse                                    4.46**    2, 18     ns                  5.15**    6, 54
  Orgasm                                         ns                  ns                  ns
Sexual concept evaluation
  Sex                                            11.66**   2, 18     10.02**   6, 108    3.25**    12, 108
  Nudity                                         ns                  ns                  2.03**    12, 108
  Male bodies                                    ns                  3.52**    6, 108    ns
  Self-exploration                               ns                  ns                  2.46**    12, 108
  Verbalizing sexual desires                     3.75**    2, 18     4.28**    6, 108    ns
  My role in sex                                 6.41**    2, 18     8.65**    6, 108    3.22**    12, 108
  Orgasm                                         4.05**    2, 18     ns                  ns
  Rotter Sex Attitudes                           ns                  ns                  [illegible]

*p < .05.
**p < .01.
[...] were on, clients were instructed to visualize both
themselves and their partner engaging in the depicted
behaviors. During intervals between scenes, they
were to utilize whatever relaxation technique they
found beneficial. Clients were provided with the
option of terminating the scenes by signaling anx-
iety to the therapist if necessary.

Instruction was provided for the Wolpe (1958)
subjective units of distress scale, and distress was
to be signaled if the client rated her feelings at
greater than 15 units. No client exercised this op-
tion. Each of 45 vignettes was presented in brief,
with progressively longer portions and relaxation
episodes interspersed between them. The 4-min vig-
nettes were presented progressively for 15, 30, 30,
[...], and 60 sec with 30-sec pauses [...] relaxation.
Eight vignettes were shown per
session. Viewing sessions lasted approximately 1
hour. After completion, clients filled out the Sexual
Semantic Differential. At the end of the session,
they were instructed to practice the viewed activ-
ities at home with their partners.

At the 3-week follow-up, both partners were in-
terviewed, behavioral indices were collected, ques-
tionnaires were readministered, and a standard in-
quiry into the treatment and personal progress was
completed. At the 1-year follow-up, clients were
contacted by telephone. All questionnaires, exclud-
ing the Rotter Sex Attitude Scale, were readministered.

Subjects assigned to the control condition were
told that treatment would not be available for ap-
proximately 6 weeks, but they were asked to partici-
pate in pretreatment investigations. None of the six
women refused, hence all came to the clinic on the
same schedule as treatment clients and provided
complete accounts of their sexual behavior on the
various measures. Male partners in this condition
also completed all the sexual behavior indices. Be-
cause control group clients were provided with treat-
ment in a different program after time comparable
to that required by the treatment groups at the 3-
week follow-up interval had elapsed, they were not
included in the 1-year follow-up assessment.
Results


The following provides the outcome of anal- 
yses on anxiety, behavioral, and attitudinal 
measures for the duration of treatment and 
the 3-week and 1-year assessments. Basic 
analyses of initial treatment effects used a 
two-way analysis of variance with the two 
treatment groups and the control condition 
comprising the three levels of the between- 
groups factor and the baseline, five process, 
and 3-week follow-up administrations provid- 
ing the within-groups factor. Analyses of the 
1-year follow-up data used a two-way anal- 
ysis of variance contrasting the two treatment 
groups as a between-subjects factor and the 
four occasions during which data were col-
lected as a repeated measures factor. These
were baseline, the fifth treatment session, the 
3-week follow-up, and the 1-year follow-up. 
Significant main effects and interactions were 
analyzed further using Cicchetti’s (1972) ver- 
sion of the Tukey multiple range comparison 
technique (α = .05). Table 1 summarizes the
findings for the anxiety, attitudinal, and be-
havioral measures.
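
The degrees of freedom reported in Table 1 follow from the layout of this two-way design. As a rough check, the short Python sketch below computes them for a groups-by-occasions analysis with repeated measures on the second factor. The function name and the count of 21 subjects with complete data are assumptions made for illustration only; 21 was chosen because it reproduces df values such as 2, 18 and 12, 108.

```python
def mixed_anova_df(n_subjects, n_groups, n_occasions):
    """Degrees of freedom for a groups x occasions design with
    repeated measures on occasions (hypothetical helper)."""
    between_error = n_subjects - n_groups        # subjects within groups
    occasions_df = n_occasions - 1
    within_error = between_error * occasions_df  # occasions x subjects within groups
    return {
        "groups": (n_groups - 1, between_error),
        "time": (occasions_df, within_error),
        "groups_x_time": ((n_groups - 1) * occasions_df, within_error),
    }


# Three conditions and seven administrations
# (baseline, five process sessions, 3-week follow-up).
# n_subjects=21 is an assumption, not a figure stated in the text.
df = mixed_anova_df(n_subjects=21, n_groups=3, n_occasions=7)
print(df)
```

Under the same assumption, calling the helper with six occasions gives 10 and 90 for the interaction, which matches the df shown for sexual sensate focus.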


Anxiety Measures 


As noted in Table 1, significant main ef-
fects and interaction effects for groups over
time were observed for the three anxiety mea-
sures. (See Table 2 for mean group frequen-
cies at selected sessions.) Post hoc compari-
sons on the card sort data indicated that for
both treatment conditions, the 3-week follow-
up measure differed significantly from the
pretest measure as well as from treatment
measures at Sessions 1, 2, and 3. There were
no differences between measures for the con-
trol condition. Post hoc comparisons for in-
dividual assessment sessions indicated that
all three groups differed on the pretreatment
measure, but the difference disappeared with
respect to the individual and group treatment
comparison while doubling in magnitude with
respect to the differences between both treat-
ment conditions and the control condition.
As the premeasures were significantly differ-
ent, the data were subjected to an analysis
of covariance with the premeasure as the
covariate. Results indicated significant treat-
ment effects, F(2, 17) = 4.40, p < .05, a time
effect, F(5, 85) = 12.81, p < .01, and a
Treatment X Time interaction, F(10, 85) =
6.33, p < .01, indicating that the results were
due to treatment over time and not to the
initial premeasure difference.

Table 2
Mean Frequencies for Anxiety Measures

Measure                        Baseline      Session 5     3-week follow-up   1-year follow-up
Anxiety Card Sort
  Individual                   32.25         12.87         [illegible]        [illegible]
  Group                        22.14         12.42         [illegible]        [illegible]
  Control                      41.66         40.00         45.66              —
Bentler Heterosexual
Behavior Hierarchy
  Individual                   37.12         —             [illegible]        [illegible]
  Group                        29.28         [illegible]   [illegible]        [illegible]
  Control                      [illegible]   [illegible]   12.00              5.57
Anxiety dimension
(Semantic Differential)
  Individual                   −13.37        [illegible]   [illegible]        [illegible]
  Group                        [illegible]   [illegible]   [illegible]        13.75
  Control                      [illegible]   [illegible]   [illegible]        [illegible]

Data for the semantic differential anxiety
dimension, obtained by collapsing over the
sexual concepts, revealed unfavorable anx-
iety during the baseline in all groups, with
the treatment groups improving over time
until their ratings of the concepts became 
favorable. The control group initially im- 
proved but subsequently declined back to 
baseline levels. Post hoc analyses indicated 
that within the two treatment conditions, the 
follow-up measure differed significantly from 
the pretreatment and first and second treat- 
ment measures. Within each administration 
period there were no differences between con- 
ditions until Sessions 4 through follow-up, at 
which time the treatment conditions differed 
from the control condition but not from each 
other. As with the Sexual Anxiety Card Sort, 
the decline in anxiety in the treatment con- 
ditions manifested itself at the time of the 
third treatment session. 

Findings using the Bentler Heterosexual 
Behavior Hierarchy were similar, even though 
this measure was not administered during 
treatment sessions. For both treatment con- 
ditions the follow-up measure was signifi- 
cantly below the baseline measure. The con- 
trol condition showed no change. 

Analyses of the 1-year follow-up data for 
all three anxiety measures indicated that the 
two treatment conditions did not differ from 
each other (p > .05), but the effect for the
time of administration of the measure was 
significant. This was the case for the card 
sort, F(3, 12) = 25.20, p < .01, the semantic 
differential, F(3, 12) = 9.38, p < .01, and the
Bentler hierarchy, F(3, 12) = 9.39, p < .01.
Post hoc comparisons indicated that the 1- 
year follow-up data differed significantly from 
the baseline measure but not from treatment 
termination or the 3-week follow-up. Thus, 
the reduction in anxiety displayed at treat- 
ment termination and the 3-week follow-up 
was maintained. 


Behavioral Measures 


Clients’ and partners’ reports for the dif- 
ferent categories of the sexual behavior index 
were intercorrelated to evaluate reliability. 
Following z-score transformations, an over-
all mean intercorrelation of .912 was obtained,
indicating good reliability. The following pro-
vides details on the clients’ data only.
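
One way to obtain an overall mean intercorrelation is to correlate client and partner reports within each category and then average the coefficients on Fisher's z scale. The sketch below assumes that the z-score transformation mentioned above refers to Fisher's r-to-z, which the text does not confirm. The function names and the sample numbers are hypothetical and are not data from this study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mean_correlation(rs):
    """Average correlations on Fisher's z scale, then transform back to r."""
    zs = [math.atanh(r) for r in rs]  # r-to-z; requires |r| < 1
    return math.tanh(sum(zs) / len(zs))


# Hypothetical weekly frequencies reported by a client and her partner.
client = [1, 2, 3, 4]
partner = [1, 3, 2, 4]
r_example = pearson_r(client, partner)

# Hypothetical per-category correlations averaged into one index.
overall = mean_correlation([0.8, 0.9])
print(round(r_example, 2), round(overall, 3))
```

Averaging on the z scale rather than averaging the raw coefficients is a common choice because the sampling distribution of r is skewed when correlations are high.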


Treatment influenced the extent to which
the women initiated sexual behavior with their
partners, as indicated by the significant Group 
X Time interaction. Consistent with the 
means reported in Table 3, post hoc pairwise 
comparisons indicated that clients in the two 
treatment conditions reported increases in the 
number of female initiatives, whereas the no- 
treatment control condition actually declined 
over the 6 weeks of monitoring. These effects 
were maintained throughout the year preced- 
ing the 1-year follow-up, as indicated by a 
main effect for the four measurement periods, 
F(3, 12) = 6.55, p < .01, without differences
between the 3-week and 1-year follow-up data. 
The two treatment groups did not differ at 
the 1-year follow-up. 

Treatment effects over time were also ob- 
served for three of the five measures of pre- 
coital sexual interaction. With respect to vis- 
ual sexual behavior, post hoc comparisons of 
the significant Treatment X Time interaction 
indicated that clients in the two treatment 
conditions maintained stable rates of visual 
exposure to their partners, and the no-treat-
ment control group actually declined in the
frequency of visual exposure over the 6 weeks 
of behavioral monitoring. 

Analyses of long-term effects for visual 
sexual exposure revealed that neither of the 
two treatment groups differed from the rate 
of visual exposure observed at the 3-week 
follow-up. However, the two groups did differ 
significantly from each other at that time, 
with the group treatment exceeding the indi-
vidual treatment, F(1, 12) = 8.25, p < .01.

Nonsexual sensate focus effects were ap- 
parently the result of those clients receiving 
group treatment engaging in significantly 
more nonsexual sensate focus, relative to the 
baseline measure, whereas clients in the no- 
treatment control actually experienced a sig- 
nificant decrease in the monitored behavior. 
Clients in the individual treatment condition 
maintained a fairly steady rate of nonsexual 
sensate focus. 

The results of the 1-year follow-up for non- 
sexual sensate focus indicated that the two 
treatment groups did not display any signifi-
cant change over the follow-up period (p >
.10).

G. NEMETZ, K. CRAIG, AND G. REITH

Table 3
Mean Frequencies for Behavioral Index

Measure                        Baseline      Week 2        3-week follow-up   1-year follow-up
No. female initiatives
  Individual                   .87           2.62          2.39               1.85
  Group                        .57           1.42          2.30               2.80
  Control                      1.33          .66           .50                —
Visual sexual exposure
  Individual                   10.12         8.25          7.62               10.42
  Group                        9.14          13.14         12.42              16.00
  Control                      12.83         6.83          3.66               —
Nonsexual sensate focus
  Individual                   1.62          1.37          1.50               3.71
  Group                        .42           3.85          3.14               3.57
  Control                      3.50          1.83          .53                —
Sexual sensate focus
  Individual                   6.00          11.62         8.45               9.71
  Group                        4.71          9.71          10.57              12.71
  Control                      4.50          2.16          .75                —
Foreplay duration 10 min.
  Individual                   .62           1.75          1.13               1.57
  Group                        2.00          1.42          1.34               1.57
  Control                      .16           .00           .16                —
Foreplay duration 20 min.
  Individual                   .37           [illegible]   [illegible]        [illegible]
  Group                        [illegible]   [illegible]   [illegible]        [illegible]
  Control                      .33           .00           .10                —
Intercourse
  Individual                   [illegible]   2.37          2.31               2.14
  Group                        [illegible]   1.57          2.14               3.8
  Control                      1.16          .66           .50                —
Orgasm
  Individual                   [illegible]   [illegible]   [illegible]        [illegible]
  Group                        .50           .62           .37                .42
  Control                      [illegible]   [illegible]   [illegible]        [illegible]

Post hoc comparisons indicated that both
[...] 3-week follow-up,
they reported significantly lower levels than
either of the two treatment groups. The in-
creased incidence of sexual sensate focus was
maintained from the 3-week to the 1-year 
follow-up, with the two treatment groups not 
changing over this interval (p > .10) and not
differing from one another at that time. 
Analyses of clients’ reports of the incidence 
of foreplay lasting longer than 1 minute but 
less than 10 minutes indicated a significant 
main effect for treatment, with subsequent 
post hoc comparisons indicating that the two
treatment groups differed significantly from 
the control group but not from each other.
Inspection of the data indicates a very high
incidence of 10 minutes of foreplay in the 
group treatment clients’ reports during the 
baseline period. At the time of the 1-year 
follow-up, the two treatment groups main- 
tained the incidence reported at the 3-week 
follow-up. 

Analyses of the incidence of foreplay last- 
ing between 10 and 20 minutes also indicated 
significant treatment effects but no interaction 
with time. Paired comparisons of the main
effect indicated that the two treatment groups 
reported a greater incidence than the control 
group, but they did not differ from one an- 
other. No initial differences were observable 
in the baseline measure. There were no dif- 
ferences between the 3-week follow-up and 
the 1-year follow-up, indicating that the bene- 
ficial effects of treatment had been maintained. 

Analyses of the incidence of intercourse 
were restricted to differences between the 
baseline and 3-week follow-up measures be- 
cause clients and their partners had been en- 
joined to avoid intercourse throughout treat- 
ment. Results indicated significant treatment 
and Treatment X Time interaction effects. 
Post hoc comparisons indicated that the sig-
nificant Time X Treatment interaction ap-
parently was the result of an increased fre-
quency of intercourse relative to baseline for
those clients receiving group and individual
treatment as compared to a decline in inci-
dence as reported by the no-treatment con-
trols. At the time of the 1-year follow-up,
post hoc analyses of a significant interaction
between groups and time, F(3, 12) = 10.77,
p < .01, indicated that the group receiving
individual treatment maintained the level of
intercourse observed at the time of the 3- 
week follow-up. However, group treatment 
led to significant increases relative to both 
the individual treatment group and the in- 
cidence of intercourse reported at the time 
of the 3-week follow-up. 

Analyses of the incidence of orgasm re-
ported by the women yielded no significant
effects for treatment or the interaction of 
time with the treatment condition. Reports of 
the incidence of orgasm were at the zero level 
for the control group and were only mar-
ginally greater for either of the two treat-
ment groups. At the end of the 1-year fol-
low-up, there was a nonsignificant tendency
for both treatment groups to increase in the 
incidence of orgasm relative to the baseline, 
treatment termination, and 3-week measures, 
F(3, 12) = 1.23, p > .10. 


Attitudinal Measures 


The Rotter Sex Attitude Scale was admin- 
istered to evaluate global sexual attitudes, and 
a semantic differential had been constructed 
to examine the evaluative dimension with re-
spect to specific sexual concepts. Data from
the Adjusted-Maladjusted subscale of the 
Rotter measure were subjected to a two-way 
analysis of variance, with the treatment 
groups and control condition comprising the 
three levels of a between-groups factor and 
the baseline, final session, and follow-up ad- 
ministrations providing the within-groups 
factor. The analysis revealed no significant 
main effects or interactions. Mean values for 
all groups indicated adequate sexual adjust- 
ment. 

The Sexual Semantic Differential scores in- 
dicated the degree of positive or negative af- 
fect induced in the client by each of the seven 
concepts relating to either specific sexual ac- 
tivities or concepts with sexual arousal po- 
tential. In contrast to the Rotter scale, atti- 
tudinal changes and treatment effects were 
observed, as summarized in Table 1. Inter-
action effects were observed between treat-
ment and time for the following sexual con-
cepts: “sex,” “nudity,” “self-exploration,”
and “my role in sex.” Means for groups at
the different times at which the measures were 
administered appear in Table 4. With respect 
to the concept sex, post hoc comparisons in-
dicated that whereas all groups initially had
unfavorable attitudes, the two treatment 
groups progressively developed favorable at- 
titudes through the 3-week follow-up, and the 
control condition remained at the baseline 
level throughout. The concept nudity was met 
with an essentially neutral response in all 
groups at the time of the baseline measure. 
The control group did not change over time, 
but both treatment groups exhibited favora-
ble attitudes at the time of the 3-week fol-
low-up. [...] groups than in
the control group. However, the control
group differed substantially from
the two treatment groups at the time of the
baseline measure, and it would be inappro-
priate to attribute differences to treatment
effects. Similarly, all groups apparently im-
proved in their evaluation of the concept over
time, but this again could not be attributed
to treatment because the control group im-
proved as well. It is noteworthy that the
treatment groups ultimately ended up with
favorable attitudes while the control group
maintained an unfavorable evaluation of the
concept.

Table 4
Mean Frequencies for Semantic Differential

Sexual concept                 Baseline      Session 5     3-week follow-up   1-year follow-up
Sex
  Individual                   −6.28         7.14          10.71              13.00
  Group                        −5.28         10.14         10.85              13.85
  Control                      −13.16        −12.00        −12.50             —
Nudity
  Individual                   5.14          8.10          11.57              11.14
  Group                        8.42          8.10          13.57              13.71
  Control                      3.50          .00           −.50               —
Male bodies
  Individual                   4.85          8.42          11.14              9.85
  Group                        6.14          10.71         14.85              14.14
  Control                      1.16          1.00          .50                —
Self-exploration
  Individual                   2.42          5.14          8.75               6.42
  Group                        4.42          6.28          9.71               6.71
  Control                      2.83          3.50          −4.00              —
Verbalizing sexual desires
  Individual                   −.28          2.28          5.14               3.85
  Group                        −4.10         6.00          8.14               7.42
  Control                      −13.50        −8.50         −10.00             —
My role in sex
  Individual                   −11.57        5.80          9.10               10.85
  Group                        −3.28         6.00          9.80               12.00
  Control                      −15.83        −11.50        −13.66             —
Orgasm
  Individual                   [illegible]   [illegible]   [illegible]        [illegible]
  Group                        10.71         [illegible]   [illegible]        [illegible]
  Control                      [illegible]   [illegible]   14.00              13.00

Analyses of evaluative attitudes toward the 
concept orgasm indicated that the groups 
differed significantly, but there were no
changes over time nor did group effects in- 
teract with time. At the time of the baseline 
measure, those individuals about to receive 
group treatment appeared to have substan- 
tially more favorable attitudes than those 
about to receive individual treatment or those 
in the control group. The two latter groups 
were essentially neutral with respect to their 
attitude toward the concept. At the time of 
treatment termination and the 3-week fol- 
low-up, the control group maintained the es- 
sentially neutral attitude, with the two treat- 
ment groups quite positive and not different 
from one another. The three groups could not 
be differentiated in terms of their attitudes 
toward the concept male bodies at any time 
during baseline, treatment, or follow-up. How- 
ever, attitudes tended to become more favora- 
ble over time in all three groups. 

Analyses of the 1-year follow-up data in- 
dicated that for the majority of the sexual 
concepts (sex, nudity, my role in sexual ac- 
tivity, verbalizing sexual desires, and orgasm), 
treatment gains observable at the time of the 
3-week follow-up were maintained. No sig- 
nificant differences were observable between 
measures taken on the two occasions (p >
.05). In addition, there were no significant
differences between groups receiving individ- 
ual or group treatment after 1 year. With 
respect to the concept self-exploration, there 
was a nonsignificant decline in attitude 
through the 1-year follow-up, suggesting that 
after this time early benefits from self-explora- 
tion had led to other more satisfying forms 
of experience. No change appeared in char- 
acterizations of the concept male bodies. 


Discussion 


Clients receiving group and individual 
graduated symbolic modeling with concurrent 
behavioral tasks improved significantly with 
respect to sexual anxiety, attitudes, and be- 
havioral enactment, in contrast to the no- 
treatment control condition. Although no sta- 
tistically significant differences were observed 
between group and individual treatment pro-
grams, clients receiving group treatment re-
corded less anxiety on the card sort and the
anxiety dimension of the Sexual Semantic Dif-
ferential. There was also a tendency for clients
receiving the group treatment to report fa- 
vorably higher frequencies on the sexual be- 
havior index. As clients were randomly as- 
signed to the two treatment conditions, there 
was no reason to believe that those receiving 
group treatment were different from those 
receiving individual treatment. It is possible 
that components of the group treatment, such 
as exposure to people with similar problems, 
induced lower levels of anxiety and facilitated 
the incidence of sexual activity. 

There was also a tendency toward deteri- 
oration in the control group over all indices. 
Behavioral frequencies generally tended to 
decline, and in two cases, visual sexual 
exposure and nonsexual sensate focus, they 
decreased significantly. On several other mea- 
sures, there was a pattern of initial im- 
provement with subsequent return-to-baseline 
incidence. There appeared to be an initial 
enthusiasm for involvement in what was de- 
scribed at the time as a “pretreatment inves- 
tigation,” which essentially comprised self- 
monitoring over the span of about 6 weeks; 
however, the improvement waned with time. 
It seems that the demand-induced increased 
awareness of personal sexual habits provoked 
anxiety and dissatisfaction to the extent that 
the initial favorable effects of self-monitoring 
(Kanfer, 1970; Kazdin, 1974) declined rap- 
idly after the third session. It would be use- 
ful to know if clients on waiting lists for 
sexual dysfunction treatment all manifest this 
trend toward deterioration. 

Because improvement was observed around 
the third treatment session on all measures 
of behavior, attitude, and anxiety, it appears 
that the treatment effects were consolidated 
simultaneously in both covert and overt be-
havioral systems. Including behavioral mea-
sures throughout the course of treatment not
only permitted assessment of when therapeu- 
tic effects manifested themselves but indicated 
whether clients were complying with treat- 
ment instructions. 

Results from the several measures of anx- 
iety, the Sexual Anxiety Card Sort, the Bent- 
ler Heterosexual Behavior Hierarchy, and the 
Sexual Semantic Differential all indicated
that both treatment programs reduced anx-
iety throughout treatment, with reduced anx-
iety persisting through both the 3-week and
1-year follow-up. The control subjects dis- 
played a nonsignificant trend toward increased 
anxiety through the 3-week follow-up. Re- 
ductions in reported anxiety as measured by 
the Bentler were slightly less than those ob- 
served on the Sexual Anxiety Card Sort. As 
the Bentler describes more novel sexual ac- 
tivities than the Sexual Anxiety Card Sort, 
there was evidence of generalized treatment 
effects. 
Because the Marks and Sartorius (1968)
Sexual Semantic Differential was originally 
devised to assess sexual attitudes toward un- 
conventional sexual activities, it is noteworthy 
that it was of value in the present context, 
thereby attesting to its flexibility and useful- 
ness. In the initial theoretical formulation, it 
was expected that positive increases on the 
Sexual Semantic Differential would be more 
marked for those concepts conforming to the 
behavioral treatment requirements, namely, 
sex, self-exploration, verbalizing sexual de- 
sires, and my role in sex. The findings were 
partially consistent with this expectation, 
since treatment effects were particularly po- 
tent for the concepts sex and my role in sex- 
ual activity. Significant differences between 
the two treatment groups and the control 
group generally manifested themselves after 
the third treatment session. These results have
implications for time-limited therapy in that
treatment effects appear to consolidate them-
selves within the attitudinal realm fairly early
during treatment programs.
In accordance with [...] at-
titudes.

The sexual behavior index disclosed treat-
ment-induced improvement in the categories
measuring the incidence of visual sexual ex-
posure, nonsexual sensate focus, sexual sensate
focus, the number of female-initiated sexual
behaviors, and the frequency of intercourse.
These effects were maintained without excep- 
tion through the 1-year follow-up. Although 
an increased incidence of orgasm tended to be 
a personal goal for most of the clients, it was 
not affected by treatment, a finding consis- 
tent with Wincze and Caird (1976). They 
agree with Kaplan (1974), who observed that 
orgasms can be conceptualized as partially in- 
dependent of sexual pleasure. As such, achiev- 
ing orgasm is the appropriate treatment ob- 
jective of complementary, but different, 
therapeutic programs. 

Although no statistically significant differ- 
ences were observed between group and 
individual treatment conditions, there were 
tendencies for those clients receiving group 
treatment to engage in more visual sexual ex- 
posure, nonsexual sensate focus, and female- 
initiated sexual behaviors. When debriefed, 
the women receiving group treatment tended 
to express positive feelings concerning it. The 
major issues included a feeling of group co- 
hesion, an increased ability to describe per- 
sonal sexual experiences in public, and a sense 
that they wished to do well for other mem- 
bers of the group. Given that group treatment 
is far more economical, this type of program 
appears to have some decided advantages. 
The above findings indicate the viability of 
the group treatment as a highly desirable al- 
ternative to conventional individual treatment. 


Reference Notes 


1. Barlow, D., Leitenberg, H., & Agras, S. The ex- 
perimental control of sexual deviation through 
manipulation of the noxious scenes in covert sen- 
sitization. Paper presented at the meeting of the 
Eastern Psychological Association, Washington, 
[...] New Mexico

Three therapies designed to reduce [...]
were evaluated and compared. Self-referred and court-referred clients were ran-
domly assigned to one of three treatment groups: aversive counterconditioning
(AC) using self-administered [...]; behavioral self-control
[...] analysis; [...] blood alcohol awareness [...]
consisted of 10 weekly sessions. All three behavior therapies produced signifi-
[...] peak blood alcohol [...] 1 year of follow-up.
[...], although AC was [...] re-
[...] BT. A self-control [...] drinking.

[...] Naughton, and Lonnie Snowden.

The importance of alcohol abuse as a na-
tional health problem is apparent. Less ob-
vious is its appropriate remedy. Adherents of
the traditional disease model of alcoholism
have maintained that it involves an irrever-
sible loss of control over drinking and that
the resumption of normal drinking is impos-
sible for the alcoholic (Alcoholics Anony-
mous, 1955; Jellinek, 1960). The treatment
of problem drinkers, dominated for the past
30 years by this viewpoint, has focused pri-
marily on the goal of total and lifelong ab-
stinence. The general applicability of this
model has been increasingly questioned, 
however (Armor, Polich, & Stambul, 1976; 
Miller & Caddy, 1977; Sobell & Sobell, 1974), 
and treatment methods designed to produce 
controlled drinking have begun to emerge 


(e.g., Lovibond & Caddy, 1970; Miller & 


Muñoz, 1976; Sobell & Sobell, 1973b). 
The status of these controlled drinking therapies is uncertain at present
(Nathan, 1977). A wide variety of treatment methods have been proposed for
the reduction and control of drinking (Briddell & Nathan, [...]; Hamburg,
1975; Lloyd & Salzberg, [...]; P. M. Miller, 1976; Nathan, 1976). W. R.
Miller (1976) has reviewed outcome research regarding seven major
strategies: (a) aversive counterconditioning, (b) blood alcohol
discrimination training, (c) rate control training, (d) operant methods,
(e) stimulus control procedures, (f) self-monitoring, and (g) broad-spectrum
approaches. Although these techniques have been within the armamentarium of
behavior therapists for some time, their relative efficacy in the control of
drinking has not been established. To further complicate matters, outcome
studies have used varying client populations (albeit primarily inpatient
alcoholics) and have, with few exceptions (e.g., Sobell & Sobell, 1973a,
1976), failed to provide long-range follow-up data.
The present study was designed to assess 
the relative effectiveness within an outpatient 
setting of three controlled drinking therapies: 
(a) aversive counterconditioning, (b) behav- 
ioral self-control training, and (c) direct 
training in rate control and blood alcohol 
discrimination. On the basis of the existing 
literature (W. R. Miller, 1976), it was pre- 
dicted that all three therapies would produce 
significant reduction in drinking behavior over 
the course of treatment. Of interest was the 
relative effectiveness of the three methods and 
specifically whether the treatments would dif- 
fer with regard to (a) their impact on drink- 
ing behavior, (b) their impact on measures 
of psychological functioning, and (c) the 
maintenance of whatever behavior changes 
they produced. Previous research has pro- 
vided no basis on which to predict the relative 
effectiveness of the three methods. 


Method 


Clients 


A treatment program for problem drinkers who 
desired “to reduce and control their drinking with- 
out stopping altogether” was announced through 
the local news media and mental health service 
agencies of Eugene, Oregon. Clients came to the 
program through either of two routes: self-referral 
or court referral.

Self-referrals. Of 45 persons voluntarily seeking 
treatment, 1 was excluded due to advanced liver 
disease. Nine others dropped out of the program
prior to or during the initial assessment phase and 
before beginning treatment. An additional 5 clients 
dropped out of treatment prior to the third ses- 
sion, and 1 client terminated following the eighth 
session. The remaining 29 clients completed the 10- 
session treatment program. 

Court referrals. Oregon [...] alternative sentencing to a treatment program
of persons found guilty of driving while [...] Alcohol Traffic Safety Clinic
(ATSC), [...] for treating such offenders, referred selected clients to this
program. During the period of the present study, a total of 89 clients were
processed by the ATSC.

To enter the program a court-referred client had to pass through several
screening stages. Clients unwilling to sign a release permitting evaluation
of their progress (n = 10) were excluded altogether. An additional 8 clients
were incarcerated or left the community before they could be evaluated at
intake. These 18 clients were thus not studied.

The ATSC staff further excluded 35 clients as in- 
appropriate for controlled drinking. The remaining 
36 clients were considered eligible for behavioral 
treatment. Of these, 6 refused or dropped out prior 
to treatment. Four began treatment but dropped out
before the third session, and 1 terminated after the 
fourth session. All court-referred clients who dropped 
out of the program were referred back to ATSC 
for alternative therapies (primarily group and/or 
Antabuse). Eight clients who were otherwise eligi- 
ble for behavioral treatment were randomly selected 
to be assigned back to ATSC as a “random” con- 
trol group. A final total of 17 clients signed the 
release, were cleared by ATSC staff, were randomly 
assigned to and accepted treatment, and completed 
the 10 sessions. 
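
The client flow described above can be checked with simple arithmetic. The sketch below is only an illustration: it uses the counts stated in the text, while the helper name `remaining` and the short loss labels are hypothetical and not part of the original report.

```python
# Client flow for each referral route, using only counts stated in the text.
# The function name and the loss labels are illustrative assumptions.

def remaining(start, losses):
    """Subtract each (label, count) loss from `start` and return what is left."""
    total = start
    for _label, count in losses:
        total -= count
    return total

self_referred = remaining(45, [
    ("excluded, advanced liver disease", 1),
    ("dropped out before treatment began", 9),
    ("dropped out before the third session", 5),
    ("terminated after the eighth session", 1),
])

court_referred = remaining(89, [
    ("unwilling to sign release", 10),
    ("incarcerated or left the community", 8),
    ("excluded by ATSC staff", 35),
    ("refused or dropped out before treatment", 6),
    ("dropped out before the third session", 4),
    ("terminated after the fourth session", 1),
    ("assigned to random control group", 8),
])

print(self_referred, court_referred)
```

Under these counts the two routes leave 29 and 17 completers, which agrees with the totals given in the text.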


Client Assignment Procedures 


Clients were randomly assigned to one of the three 
treatment modes and to 1 of 10 paraprofessional 
therapists. All clients were informed prior to assess- 
ment and treatment of the modality to which they 
had been assigned. The nature of the treatment, in- 
cluding its potential risks and discomforts, was fully 
explained, and all clients read and signed a statement 
of informed consent for their particular treatment. 


Project Staff 


All treatment sessions were conducted by para- 
professional therapists trained in the behavioral pro- 
cedures described below. All therapists treated clients 
within each of the three treatment modalities. 

Therapists were recruited from among graduate 
and undergraduate students in psychology and al- 
lied fields and received academic credit for their 
participation. Eight therapists were women and two 
were men. By group means, the “average” therapist 
was 26.5 years of age, had completed 4.5 years of 
postsecondary education, and had had 3 years of 
prior experience in some form of human services. 
Therapists followed a detailed procedures manual 
for treatment sessions (Miller, Note 1) and received 
ongoing supervision.

All assessment procedures were administered by 
trained interviewers who had no contact with or 
responsibility for therapy. Three different teams of 
interviewers conducted assessment: at intake and termination,
at the 3-month follow-up, and at the 1-year follow-up.


Facilities and Apparatus 


A conventional interview room served as the set- 
ting for assessment and for the treatment of clients 
in the behavioral self-control training group. A 
simulated bar setting provided the environment for the other two treatment
modalities, with the therapist in the usual position of a bartender. An
Omicron Intoxilyzer, Model 4011, was installed in the bar to provide feedback
regarding blood alcohol concentration. Equipment for the administration of
electric shock included a Lafayette Master Shocker (Model 82400) with an
intensity range of 0 to 5 mA, two Lafayette Interval Timers (Model 58010),
and a Lafayette Probability Generator (Model 58019). A triple-function data
recorder (Lafayette Model 58004) was installed on the bar top as an
in-session counter for shocks, sips, and other relevant events. A quad
indicator panel (Lafayette Model 58001) was connected with the interval
timers to provide signals for the proper sequencing and spacing of aversive
conditioning trials.


Initial Assessment 


All clients completed 4 hours of assessment prior to behavioral treatment.
The first 2-hour session included a demographic questionnaire, the Minnesota
Multiphasic Personality Inventory (MMPI; Hathaway & McKinley, 1943), the
Michigan Alcoholism Screening Test (MAST; Selzer, 1971), Rotter’s
Internal-External Locus of Control (I-E) Scale (Rotter, 1966), and a Profile
of Mood States (McNair, Lorr, & Droppelman, 1971). The second 2-hour session
consisted of the [...] 1976), [...] parallel assessment (with [...] Drinking
Profile) was conducted with clients who dropped out of [...]

[...] scheduled by the probability generator. Upon pressing the [...] shock
button for 15 sec, after which a “go” light signaled the beginning of the
next trial. Between trials the client was allowed to adjust shock intensity
if desired, being encouraged to increase intensity over trials. An AC session
was terminated after a minimum of 40 or a maximum of 50 trials.

Clients in all treatment modalities were trained in 
self-monitoring and kept daily record cards of all 
drinks during each week of treatment. Clients in 
AC submitted these records to the therapist during 
weekly sessions, but no specific use was made of
them in treatment. 

Behavioral self-control training. All 10 behavioral self-control training
(BT) sessions were conducted in the interview room and lasted approximately
30 minutes each. As in the other two treatment modalities, clients began
self-monitoring after the first session. The record cards kept by clients
included columns for recording the type and amount of each drink consumed,
time of the first sip, and situational information. These cards served as the
focus of BT sessions.

During each session the client reviewed the records of her or his previous
week’s drinking. The therapist completed basic calculations regarding alcohol
consumption and blood alcohol concentration and helped the client to examine
the records for stimulus antecedents of heavy drinking. Strategies for
increasing future control were discussed. Suggested self-control methods
encompassed three major strategies: (a) awareness and alteration of
antecedents of drinking, (b) reduction of drinking rate, and (c)
identification and practice of alternatives to the use of alcohol. These
strategies have been outlined by Miller and Muñoz (1976).

Controlled drinking composite. The simulated bar setting served as the
setting for controlled drinking (CD) sessions, which lasted between 120 and
150 minutes each. Because clients consumed alcohol during these sessions,
arrangements were made for transportation to and from the clinic.

During Session 1 the client was provided with her or his most frequently
consumed beverages and was instructed to drink as usual. No client was
permitted during any session to consume drinks that would elevate the blood
alcohol concentration above 80 mg%. Training in self-monitoring was included
in this first session.

During Sessions 2-4 the client was instructed to drink in his or her normal
manner and was taught to estimate the blood alcohol concentration reached
after each drink by attending to both internal and external cues. When the
client’s blood alcohol concentration reached 65 mg% or 30 minutes before the
end of a session (whichever came first), electrodes were attached as in AC,
and discriminated aversive counterconditioning (Lovibond & Caddy, 1970) was
begun. During Session 2, 10 shocks were self-administered and were paired
with sips of a beverage. During Sessions 3-4, 40 trials were required,
pairing shocks with presipping behaviors as in AC. Conversation between
therapist and client during these and subsequent sessions focused on drinking
behavior, antecedents and consequences of alcohol use, alcohol education, and
self-control strategies (Miller & Muñoz, 1976). Daily record cards were
examined during each session to facilitate discussion of the previous week's
drinking pattern.

Session 5 served as a probe session. No shocks were administered, but the
client continued to record drinks, sips, and blood alcohol concentration
estimates.

Sessions 6-8 focused on the reduction of alcohol consumption rate (Sobell &
Sobell, 1973b). Electrodes were attached as before, but shocks were
controlled by the therapist during these sessions. A list of rules similar to
those used by Sobell and Sobell (1973b) was read to the client, indicating
the behaviors for which a shock would be delivered. Shock intensity was set
at the highest level previously tolerated by the client.

Session 9 served as a final probe, again with no shock contingency in effect.
The therapist observed rule infractions but did not punish them.

Session 10 consisted of a review of the treatment process and was conducted
in the interview room. A thorough functional analysis was constructed from
information gained during treatment, and in vivo applications of self-control
procedures were reviewed.

----

1 Sessions 1-5 had originally been intended to include blood alcohol
concentration discrimination training after the manner of Lovibond and Caddy
(1970). Malfunctioning of the Intoxilyzer, however, precluded its [...]
corrected, a procedure was substituted whereby clients were taught to
estimate their own blood alcohol concentration from number of drinks
consumed, body weight, and spacing of drinks. Other research has indicated
this to be an equally effective procedure for blood alcohol concentration
training (Huber, Karlin, & Nathan, 1976). This change necessitated the
repetition of one or two sessions for five CD clients who had been using the
Intoxilyzer.


Termination Assessment


A portion of the final session in each treatment mode was devoted to the
completion of several assessment forms, including the Profile of Mood States
and ratings of treatment [...] Significant others were contacted to assess
current drinking behavior. Clients were [...] record cards for an additional
12 weeks, mailing them weekly. Whenever possible, parallel information was
obtained from clients not receiving behavior therapies.

One half of the clients in each of the three treatment groups were randomly
chosen to receive a copy of a manual prepared as an aid in the maintenance of
treatment gains (Miller & Muñoz, Note 2).


Follow-up Assessment 


At 3 months following termination, clients were contacted and invited to
participate in a follow-up session. A fee of $5 was paid to each client for
participation. Instruments administered at this time included a questionnaire
paralleling previous measures of drinking behavior, the MMPI, Rotter’s I-E
Scale, the Profile of Mood States, and (where appropriate) a questionnaire
regarding the client's use of the manual. Significant others were again
contacted for corroborative information. Driving records were obtained for
all court-referred clients. Those who had not previously received a copy of
the manual were given one at this time.

Whenever possible, clients not treated in the pro- 
gram were also interviewed and paid for their par- 
ticipation. To determine the appropriate time for 
these follow-up interviews, all nontreated clients 
were yoked to treated clients according to the date 
of their intake with the program or with the ATSC. 

At 1 year following treatment termination, clients 
and significant others were once again interviewed 
by telephone and mail. Drinking behavior was as- 
sessed, and clients again completed the Profile of 
Mood States. 


Results 
Pretreatment Measures 


Table 1 presents demographic data for each of the six groups studied. The
groups differed on several important pretreatment measures.
Contrasting self-referred with court-referred 
clients receiving behavioral treatment, the 
former reported significantly higher income 
and education and scored significantly higher 
on the MAST and on three subscales of the 
MMPI: Depression, Psychopathic deviate, 
and Rich and Davis’ (1969) Revised Alco- 
holism Scale. Self-referred clients also reported 
a significantly higher rate of alcohol con- 
sumption at intake than did court-referred 
clients, F(1, 40) = 19.67, p < .001. No sig- 
nificant differences on pretreatment measures 
were found among the three behavioral treat- 
ment groups, consistent with random assign- 
ment. 

Within the population of traffic violators, 
clients excluded by the ATSC were found to 
be significantly older and less educated and 
to have scored significantly higher on the 
MAST and on the three MMPI subscales 
mentioned above. 
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Table 1 
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Demographic Information Regarding Client Groups 
                                      n
                               ----------------            Years of     Annual
Client group                   Males    Females    Age     education    income
Self referred
  Completed treatment            18        11      37.3      15.6       $14,703
  Refused or dropped out         11         4      32.9      14.1       $10,034
Court referred
  Completed treatment            14         3      35.8      11.9       $ 8,227
  Excluded from treatment        30         5      41.5      11.3       $ 8,384
  Refused or dropped out          9         2      27.3      12.4       $ 8,305
  Random controls                 6         2      41.3      13.5       $ 9,955


Effects of Treatment on Drinking Behavior


Self-report. Clients were interviewed at intake, at termination, and at 3-
and 12-month follow-ups. Reported alcohol consumption at each point was
converted into standard ethanol content units, with 1 standard ethanol
content unit equal to .5 ounce (15 ml) of pure ethanol. Figure 1 shows the
mean self-reported alcohol consumption of each treated group at each of the
four assessment periods. A repeated measures analysis of variance revealed a
significant decrease in drinking for all groups, F(3, 81) = 12.53, p < .001.
No significant differences among groups were observed.
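
The conversion to standard ethanol content units (SECs) can be illustrated with a short sketch. Only the definition of 1 SEC as .5 ounce of pure ethanol comes from the text; the function name and the example drinks (serving sizes and alcohol fractions) are hypothetical values chosen for illustration.

```python
# 1 standard ethanol content unit (SEC) = .5 ounce of pure ethanol (from the text).
SEC_OUNCES_ETHANOL = 0.5

def standard_ethanol_units(volume_oz, alcohol_fraction):
    """Return the number of SECs in a drink of `volume_oz` fluid ounces.

    `alcohol_fraction` is the proportion of ethanol by volume (0.05 for 5%).
    """
    return volume_oz * alcohol_fraction / SEC_OUNCES_ETHANOL

# Hypothetical example drinks; sizes and strengths are assumptions.
beer = standard_ethanol_units(12, 0.05)    # 12-oz beer at 5% alcohol
wine = standard_ethanol_units(4, 0.125)    # 4-oz wine at 12.5% alcohol

print(round(beer, 2), round(wine, 2))
```

On this definition, weekly consumption is simply the sum of the SECs for every drink recorded during the week.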

Figure 1. Mean self-report of weekly [...]

Figure 2. Mean highest estimate from significant others of clients’ weekly
alcohol consumption. (From Chapter 6 “Behavioral Self-control Training in the
Treatment of Problem Drinkers” by William R. Miller. In Behavioral
Self-management, Richard B. Stuart (Ed.). Copyright by Brunner/Mazel (1977).
Reprinted by permission.)

Reports of significant others. Mean estimates of significant others regarding
clients’ drinking are presented in Figure 2. In cases in which multiple
estimates were obtained, the highest estimate was used. Product-moment
correlations between the reports of clients and of their significant others
were calculated and yielded a puzzling pattern. At intake there was no
significant relationship between these two data sources, r(28) = .225,
p > .05. At termination there was a mild [...] whereas at the 3-month,
r(29) = .733, p < .001, and 12-month follow-ups, r(20) = .821, p < .001, the
reports were highly correlated. At each assessment period an approximately
equal number of significant others overestimated and underestimated clients’
own reports of their drinking.
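
As a rough check on the pattern of correlations, a Pearson r can be converted to a t statistic with df = n - 2. The sketch below takes the reported values as r(28) = .225, r(29) = .733, and r(20) = .821; the function name is hypothetical, and the two-tailed .05 critical values are taken from a standard t table rather than from the article.

```python
import math

def t_from_r(r, df):
    """t statistic for a Pearson correlation r with df = n - 2."""
    return r * math.sqrt(df / (1.0 - r * r))

# Two-tailed .05 critical values of t from a standard table (an assumption
# of this sketch, not something reported in the article).
T_CRIT_05 = {20: 2.086, 28: 2.048, 29: 2.045}

reported = [("intake", 0.225, 28), ("3-month", 0.733, 29), ("12-month", 0.821, 20)]
for label, r, df in reported:
    t = t_from_r(r, df)
    print(label, round(t, 2), "significant" if t > T_CRIT_05[df] else "not significant")
```

Under these assumptions the intake correlation falls short of the .05 criterion while both follow-up correlations exceed it comfortably, matching the description in the text.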

Daily record cards. Alcohol consumption and weekly peak blood alcohol
concentration as reported on record cards over the weeks of treatment and
follow-up are presented in Fig[...]. Repeated measures [...] decreases over
the course of treatment both in weekly consumption, F(8, 344) = 8.51,
p < .001, and in weekly peak blood alcohol concentration, F(8, 344) = 4.52,
p < .001. Blood alcohol concentration peaks were estimated from body weight
and consumption data (Miller & Muñoz, 1976). There were again no significant
differences among treatment modalities. In time series analyses of these data
(Glass, Willson, & Gottman, 1972), all three groups showed a significant
downward drift in consumption during treatment: for aversive
counterconditioning, t(17) = -3.93, p < .001; for behavior training,
t(17) = -8.71, p < .001; and for controlled drinking, t(17) = -13.30,
p < .001, and maintained these gains, with level slopes during follow-up.
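
The text says only that blood alcohol peaks were estimated from body weight and consumption data, citing Miller and Muñoz (1976); the estimation procedure itself is not given here. Purely as an illustration of how such an estimate can work, the sketch below uses the widely known Widmark approximation. This is an assumed stand-in, not the authors' method, and the distribution ratio, elimination rate, and example client are all hypothetical values.

```python
# Illustrative Widmark-style estimate of blood alcohol concentration in mg%.
# NOT the procedure of Miller & Munoz (1976); every constant below except the
# SEC volume (15 ml, from the text) is an assumption of this sketch.

ETHANOL_G_PER_ML = 0.789   # density of ethanol
ML_PER_SEC = 15.0          # 1 SEC = 15 ml of pure ethanol (from the text)

def estimate_bac_mg_pct(n_sec, body_weight_kg, hours, r=0.68, beta_pct_per_hr=0.015):
    """Estimate BAC (mg per 100 ml) after `n_sec` units consumed over `hours`.

    r: assumed body-water distribution ratio (about 0.68 is often used for men).
    beta_pct_per_hr: assumed elimination rate in percent BAC per hour.
    """
    grams = n_sec * ML_PER_SEC * ETHANOL_G_PER_ML
    bac_pct = grams / (body_weight_kg * 1000.0 * r) * 100.0 - beta_pct_per_hr * hours
    return max(0.0, bac_pct * 1000.0)   # percent -> mg%, floored at zero

# Hypothetical client: 80 kg, 4 SECs over 2 hours.
example = estimate_bac_mg_pct(4, 80, 2)
print(round(example))
```

An estimate of this kind could then be compared with the 70 mg% criterion or the 80 mg% in-session ceiling mentioned in the text.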

Improvement ratings. The use of improve- 
ment ratings provides an index of the status 
of individuals following treatment. Each client 
was assigned to one of six improvement cate- 
gories at termination and at 3- and 12-month 
follow-ups. 

[...] follow-up. (From Chapter 6 “Behavioral Self-control Training in the
Treatment of Problem Drinkers” by William R. Miller. In Behavioral
Self-management, Richard B. Stuart (Ed.). Copyright by Brunner/Mazel (1977).
Reprinted by permission.)

Because a number of clients (primarily court referred) were found to be
drinking very little at intake, an initial distinction was drawn between
those who had been “problem drinkers” and those who had been “controlled
drinkers” immediately prior to treatment. A problem drinker was defined as
anyone who exceeded 20 standard ethanol content units per week or a peak
blood alcohol concentration of 70 mg% or both, as reported by any data
source. Those who, by all reports, were not exceeding these limits were
classified as “controlled” at intake and were then assigned to one of two
ratings. Controlled clients were rated as improved if they showed a
noncontradicted reduction of at least 30% in alcohol consumption.
(“Noncontradicted” indicates that self-report, reports of significant others,
and daily record cards were all in agreement.) Clients who showed less than a
30% reduction or whose reports were contradicted by any data source were
rated as showing no change.

Problem drinkers were assigned to one of four improvement ratings.
Considerably improved were those clients who showed at least a 50%
noncontradicted reduction in alcohol consumption or who met the controlled
drinking criteria following treatment. (Abstainers fell into this category,
but they are indicated separately as abstinent.) Moderately improved clients
were those who showed at least a 30% noncontradicted reduction or a decrease
greater than 50% that was contradicted by another data source. Clients were
rated as slightly improved if they showed at least a 10% noncontradicted
decrease or a 30%-50% reduction that was contradicted. Clients with less than
a 10% reduction or a contradicted reduction of less than 30% were rated as
not improved.

Table 2 summarizes the number and percentage of clients assigned to each
improvement rating at termination and follow-ups.
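
The rating rules in this section amount to a small decision procedure, sketched below. The thresholds (20 standard ethanol content units per week, 70 mg%, and the 10%, 30%, and 50% reductions) come from the text; the function names, the argument names, and the choice to pass "contradicted" and "meets controlled criteria" as simple flags are assumptions made for illustration.

```python
# Sketch of the improvement-rating rules. Thresholds are from the text;
# names and the flag-based interface are illustrative assumptions.

def is_problem_drinker(weekly_sec, peak_bac_mg_pct):
    """Problem drinker at intake: exceeds 20 SECs per week or 70 mg% (or both)."""
    return weekly_sec > 20 or peak_bac_mg_pct > 70

def rate_controlled_client(reduction_pct, contradicted):
    """Rating for clients classified as controlled drinkers at intake."""
    if reduction_pct >= 30 and not contradicted:
        return "improved"
    return "no change"

def rate_problem_drinker(reduction_pct, contradicted,
                         meets_controlled_criteria=False, abstinent=False):
    """Rating for clients classified as problem drinkers at intake."""
    if abstinent:
        return "abstinent"              # reported separately in the tables
    if meets_controlled_criteria:
        return "considerably improved"
    if not contradicted:
        if reduction_pct >= 50:
            return "considerably improved"
        if reduction_pct >= 30:
            return "moderately improved"
        if reduction_pct >= 10:
            return "slightly improved"
        return "not improved"
    # Reduction contradicted by at least one data source.
    if reduction_pct > 50:
        return "moderately improved"
    if reduction_pct >= 30:
        return "slightly improved"
    return "not improved"

print(rate_problem_drinker(55, contradicted=False))
print(rate_problem_drinker(55, contradicted=True))
```

The sketch makes one feature of the scheme explicit: a contradicted report is not discarded but is demoted by one rating level.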

Table 2
Number and Percentage of Clients Assigned to Each Improvement Rating at
Termination and Follow-ups

[Table body illegible in the source scan. Rows covered problem drinkers at
intake and controlled drinkers at intake (improved, no change); columns gave
the AC, BT, and CD groups at termination, at 3-month follow-up, and at 1-year
follow-up.]

Note. Numbers in parentheses are percentages. AC = aversive
counterconditioning; BT = behavioral self-control training; CD = controlled
drinking composite. From Chapter 6 “Behavioral Self-control Training in the
Treatment of Problem Drinkers” by William R. Miller. In Behavioral
Self-management, Richard B. Stuart (Ed.). Copyright by Brunner/Mazel (1977).
Reprinted by permission.

Drinker classification. Still another shorthand method for evaluating
individual outcome is to determine how many reached an absolute criterion of
controlled drinking. The above criteria of not exceeding 20 standard ethanol
content units per week or 70 mg% defined the controlled clients, with the
additional constraint that these reports could not be contradicted. The
number of clients classified as problem, controlled, and nondrinkers at
intake, termination, and follow-ups is reported in Table 3.

Table 3
Number of Clients in Each Treatment Group Assigned to Each Drinker
Classification at Intake, Termination, and Follow-ups

                               12 months
Classification           AC        BT        CD
Abstinent              2 (14)    1 (6)     1 (7)
Controlled drinker     7 (50)    9 (53)    7 (47)
Problem drinker        2 (14)    3 (18)    3 (20)
Insufficient data      3 (21)    4 (24)    4 (27)

[Intake, termination, and 3-month columns are illegible in the source scan.]

Note. Numbers in parentheses are percentages. AC = aversive
counterconditioning; BT = behavioral self-control training; CD = controlled
drinking composite.


Drinking Behavior of “Control” Groups


For comparative purposes clients who were excluded from or dropped out of
treatment were also evaluated. Although much less information was available
for these clients, and they proved more difficult to reach for follow-up,
sufficient data were obtained to assign drinker classifications at 3- and
12-month follow-ups. These are reported in Table 4. Most notable is the high
proportion of abstainers among those excluded (and thus treated by the ATSC).
Correlations between self-report and significant others’ reports were
moderately high at both 3-month, r(29) = .775, p < .001, and 12-month,
r(37) = .480, p < .005, follow-ups.

Examination of the driving records of court-referred clients at the 12-month
follow-up revealed that fewer than 10% had repeated their offense of driving
while intoxicated. Type of treatment received (ATSC vs. behavior therapy) had
no significant effect on level of recidivism.




Effects of Treatment on Other (Nondrinking) 
Measures 



Clients treated by the ATSC and those re- 
ceiving behavior therapies showed substantial 
improvement on measures of personal adjust- 
ment. On the Profile of Mood States, all three 
behaviorally treated groups improved signifi- 
cantly (p< .01) on all six subscales over 
the course of treatment and follow-up. ATSC 
clients showed similar improvement through 
the 3-month follow-up, but by the 12-month 
follow-up they had relapsed to pretreatment 
levels on all subscales. Self-referred dropouts, 
the only untreated group, were found to be 
unimproved at all follow-up interviews. Be- 
haviorally treated clients showed significant 
decreases on 8 of 10 clinical subscales of the 
MMPI, and similar improvement was found 
among all control groups. 




Effects of a Manual on Maintenance of 
Controlled Drinking 


Clients were randomly chosen at termination to receive or not receive a
manual designed to improve maintenance of treatment gains (Miller & Muñoz,
Note 2). Of the 2[...] who received it, 17 indicated at the 3-month follow-up
that they had read all or parts of it. Alcohol consumption (from record
cards) of clients who read, did not read, or did not receive the manual is
shown in Figure 5.

Two-way analyses of variance were used to compare clients who read the manual
with those who did not receive it. At termination, when manuals were
distributed, these groups did not differ on alcohol consumption. At the
3-month follow-up, however, significant differences were found in both weekly
consumption, F(1, 16) = 4.40, p < .05, and weekly peak blood alcohol
concentration, F(1, 16) = 6.21, p < .05. Repeated measures analyses of
variance also showed large main effects of the manual (p < .001) on both
measures. A somewhat surprising aspect of these data is the low level of
alcohol use among those who received but did not read the manual. It appears
that these were clients who had already achieved a considerable degree of
control by termination, and so they “didn’t bother” to read it.

Table 4
Drinker Classification Within “Control” Groups at 3-Month and 12-Month Follow-ups

                                             Court-referred                                Self-referred
                          Excluded               Dropouts            Random controls          dropouts
Classification       3 months  12 months    3 months  12 months    3 months  12 months    3 months  12 months
Abstinent             23 (66)   17 (49)        0         0          1 (13)      0          2 (13)    2 (13)
Controlled drinker     7 (20)   10 (29)      4 (36)    4 (36)       4 (50)    4 (50)         0       3 (20)
Problem drinker        1 (3)     1 (3)       3 (27)    2 (18)         0       1 (13)       7 (47)    6 (40)
Insufficient data      4 (11)    7 (20)      4 (36)    5 (45)       3 (38)    3 (38)       6 (40)    4 (27)

Note. Numbers in parentheses are percentages.


Discussion


Treatment Outcome


The pattern [...] modalities is quite consistent [...] alcohol use. In
general [...] to those of the BT and CD groups. A possible explanation of
this lies in the self-control manual, given to half of all clients only after
the 3-month follow-up. For AC clients this manual contained new methods and
information, whereas BT and CD clients had covered this material during
treatment. Subsequent research (Miller, Gribskov, & Mortell, Note 3) has
indicated that clients using this manual even without therapist contact can
significantly reduce their drinking. The seeming reversal among AC clients
between 3 and 12 months may thus be partly attributable to use of the manual.

Although no absolute differences between 
BT and CD were found, the pragmatics of 
these two treatments would argue for the use 
of BT, given equal effectiveness. The CD 
program required the use of expensive equip- 
ment: a bar, alcohol, and electric shock. BT 
requires none of these, is conducted in a 
standard therapy room, and requires less than 
25% of the therapist time needed for CD. 
BT is also amenable to a wider variety of 
delivery modes including group therapy, class- 
room and media presentation, and self-instruc- 
tion (Miller & Muñoz, 1976). In short, BT 
is a considerably more cost-effective and 
flexible approach.
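
The therapist-time comparison can be checked against the session lengths given in the Method section. The sketch below assumes that all 10 sessions in each modality run for the stated durations (about 30 minutes for BT, 120 to 150 minutes for CD); the text does not state the length of the final CD review session, so that is an assumption, and the variable names are hypothetical.

```python
# Therapist contact time per client, from the session lengths in the Method.
# Assumes all 10 CD sessions last 120-150 minutes, which the text does not
# confirm for the final review session.

SESSIONS = 10
BT_MINUTES = 30
CD_MINUTES_RANGE = (120, 150)

bt_total = SESSIONS * BT_MINUTES
ratios = [bt_total / (SESSIONS * cd_minutes) for cd_minutes in CD_MINUTES_RANGE]

print(bt_total, ratios)
```

On these assumptions BT takes between one fifth and one quarter of the therapist time needed for CD, which is in line with the figure given in the text.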

The lack of difference between BT and CD 
modes is particularly noteworthy because the 
latter included all of the components of the 
former. CD clients received instruction in 
self-monitoring, rate reduction, and functional 
analysis of drinking behavior—the formal 
components of BT. In addition CD clients 
received discriminated aversive countercondi- 
tioning and specialized rate control training 
with in-session drinking and avoidance learn- 
ing. These additional components apparently 
made little difference.

These findings underline an important point: at least in the area of problem
drinking, more extensive therapeutic programs are not necessarily more
effective. It is often assumed that the addition of new components will
increase the overall effectiveness of a program. This issue is a timely one
with the present growing emphasis on “multimodal” and “broad spectrum”
approaches (Hamburg, 1975; W. R. Miller, 1976). The utility of adjunctive
procedures should be evaluated from a cost-effectiveness viewpoint (which may
or may not correspond to statistical significance). Expensive additional
procedures that add little to a basic treatment program can then be
discarded, and more economical and/or effective ones can be adopted.

Court-referred clients recei ving behavior 
therapies were, because of screening proce- 
dures, less severe problem drinkers at intake 
than were self-referred clients. Nevertheless, 
court clients responded to treatment in a 
manner similar to that of their heavier-drink- 
ing self-referred counterparts. Both groups 
reduced their drinking during treatment and 
maintained gains during follow-up. In no case 
did treatment mode interact with referral 
source. For these reasons all clients were com- 
bined for analyses of treatment outcome, 

A treatment adjunct evaluated within the 
present study is the self-control manual de- 
signed to improve maintenance (Miller & 
Muñoz, Note 2). Clients who received and
read the manual showed better maintenance
than did those not given the manual. This
effect was a small though significant one, but
the use of a self-control manual is a very
economical procedure and may thus be
justifiable for even small increments in
improvement and maintenance (Christensen, Miller,
& Muñoz, 1978).


Comparison with Control Groups


Although the present study did not include 
untreated controls, comparisons are possible 
with clients who dropped out or received al- 
ternative therapies. Of the 15 self-referred 
clients who refused or dropped out of treat- 
ment, only 5 were found to be abstinent or 
controlled at the 1-year follow-up. This group 
also showed no improvement on measures of 
personal adjustment. Clients treated by ATSC, 
in contrast, showed marked improvement on
both drinking and psychological measures.
The high frequency of abstinence among
ATSC clients suggests that treatment outcome
may be related to the ideology and expectations
of the treating agency.

Regarding the absolute effectiveness of
controlled drinking therapies, judgment must be
reserved pending further research. Studies to
date have not provided comparable untreated
controls. It is possible, however, to compare
the results of the present study with those
of previous outcome research. Emrick (19
surveyed the outcome literature on alcoholism 
and found that the average percentage of 
clients becoming controlled drinkers was 
5.8%, whereas an average of 33.8% became 
abstainers following treatment. This is the ap- 
proximate pattern shown by ATSC clients, 
although the clinic’s outcome was well above 
average in both categories. The pattern of out- 
come from behaviorally treated groups, how- 
ever, differs substantially from Emrick’s 
norms, particularly in the controlled category. 
The overall improvement rate of 84% at 
termination exceeded Emrick’s norms by one 
standard deviation. 

Finally, it is noteworthy that the present 
study supports the longstanding finding that 
outcome at a 3-month follow-up is reasonably 
predictive of the longer range picture with 
regard to drinking and personal adjustment. 
Of the 23 “successful” (considerably or mod- 
erately improved) cases at the 3-month fol- 
low-up for whom 12-month data were also 
available, 21 (91%) retained their success- 
ful status. Four of the six unsuccessful cases 
at 3 months were rated as successful at 1 
year. With regard to drinker classification, 17 
of 20 controlled drinkers at 3 months re- 
mained so at 1 year, whereas 12 of 17 prob- 
lem drinkers became controlled. Of the 3 ab- 
stainers at 3 months, 1 remained so, 1 was 
drinking moderately, and 1 had relapsed to 
heavy drinking. 
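
The follow-up percentages in the preceding paragraph can be recomputed from the reported counts. The sketch below is illustrative only: the helper name `percent` and the dictionary labels are assumptions introduced here, and only the counts are taken from the text.

```python
def percent(part, whole):
    """Return part/whole as a percentage rounded to the nearest integer."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return round(100 * part / whole)


# Counts reported for cases with both 3-month and 12-month data.
# The labels are hypothetical shorthand for the categories in the text.
follow_up_counts = {
    "successful at 3 months, still successful at 1 year": (21, 23),
    "unsuccessful at 3 months, successful at 1 year": (4, 6),
    "controlled at 3 months, still controlled at 1 year": (17, 20),
    "problem drinkers at 3 months, controlled at 1 year": (12, 17),
}

follow_up_percentages = {
    label: percent(part, whole)
    for label, (part, whole) in follow_up_counts.items()
}

for label, value in follow_up_percentages.items():
    print(f"{label}: {value}%")
```

Expressing every transition as a percentage of its own 3-month group makes the stability of the successful and controlled categories easier to compare with the movement out of the unsuccessful and problem-drinker categories.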


Conclusions 


Certainly behavioral self-control training 
deserves further study as a treatment method 
for problem drinkers. A relatively economical 
procedure oriented toward control rather 
than abstinence, it offers considerable prom- 
ise as an early intervention method and may 
also prove beneficial in the primary and sec- 
ondary prevention of problem drinking 
(Muñoz, 1976).

The role of aversive counterconditioning 
in the control of drinking is more questionable.
Aversion therapy, even accompanied by
self-monitoring, produced the slowest and 
smallest initial gains of the methods studied. 
A counterconditioning component apparently 
contributed little to our multimodal (CD)
program. These findings suggest that aversive
procedures can be discarded without substan- 
tially reducing effectiveness (cf. Caddy & 
Lovibond, 1976; Hamburg, 1975). 

The present study also demonstrates the 
feasibility of paraprofessionals as therapeutic 
agents for problem drinkers. Previously inex- 
perienced and nonalcoholic therapists effec- 
tively trained clients in self-control, produc- 
ing changes comparable or superior to those 
of most alcohol treatment outcome studies. 
Apparently neither an advanced degree nor a 
history of alcoholism is required for effective- 
ness, at least with these particular treatment 
methods. 

Further research is needed to determine the 
optimal combination of treatment components 
for the control of overdrinking. The present 
study supports the effectiveness of training 
clients in self-monitoring, rate control,
functional analysis, and blood alcohol
concentration discrimination, though the relative
contribution of these elements is unknown. Other
promising procedures include training in prin- 
ciples of self-reinforcement, basic education 
regarding alcohol and its effects, involvement 
of spouse or other family members in treat- 
ment, and training in alternatives to the use 
of alcohol (e.g., systematic desensitization, as- 
sertion training). The utility of these addi- 
tional components remains to be demonstrated. 


Comparison of Usual and Experimental Patients in a
Psychiatric Day Center 


Marsha Vannicelli and Stephen Washburn 
McLean Hospital, Belmont, Massachusetts 


Betty-Jane Scheff 
Concord Mental Health Center 
Concord, Massachusetts 


Richard Longabaugh
Butler Hospital
Providence, Rhode Island


In the course of a previously reported study of inpatient and day
hospitalization, 59 seriously ill female psychiatric patients were randomly
assigned to an inpatient or a day hospital setting. The present study
compares the 29 seriously ill patients randomly assigned to the day
hospital with a control group of 34 “usual” day patients. The experimental
group showed significantly more improvement from baseline to subsequent
time periods in three distinct areas: global mental status, subjective
distress, and family adjustment. The controls, on the other hand, spent
fewer nights in the hospital, used the hospital facilities significantly
less during the first 3 months, and incurred a significantly lower cost for
the same period. Two measures (number of social work contacts and amount
of time spent in the treatment milieu) indicate that experimentals
initially required more staff effort than controls, but at later time
periods the reverse was true.


A number of controlled studies, including
our own (Washburn, Vannicelli, Longabaugh,
& Scheff, 1976), of full versus partial
hospitalization have demonstrated that patients
randomized to a day hospital setting do as well
as or better than those who are placed in an
inpatient setting. Our study at McLean
Hospital of 93 patients, 59 of whom were
randomly assigned to either an inpatient or day
hospital setting, indicated that on five
measures the day patients fared better than those
who were placed in the inpatient program,
and on the remaining nine, they did just as 
well as the inpatients. These findings support 
the feasibility of treating a large number of 
patients, formerly treated as inpatients, in a
partial hospital setting at considerably
decreased cost. What is not known is how the
sicker patients, randomized to a day setting, 
fare compared to the population of patients 
normally placed there. Equally important is 
the emotional and material toll on the fam- 
ilies and the partial hospital staff in treating 
these “unusual” day patients. Is there a trade- 
off somewhere or hidden cost involved in ex- 
panding day services to a population of pa- 
tients who would normally be treated as 
inpatients? Are hospital and family resources 
stretched thin by placing this expanded popu- 
lation in the day hospital? Does this sicker 
population require greater amounts of com- 
munity support, more psychotherapy, more 
extensive use of drugs, greater frequency of 
ancillary service visits, or more intense milieu
interactions?

The purpose of this study was to try to
answer some of these questions. Specifically,
we were interested in (a) the relative out- 
come of sicker patients randomized to a par- 
tial hospital setting compared to those who 
would normally be placed there and (b) pro- 
cess differences between the two partial hos- 
pital groups in terms of (a) impact on the 
family, (b) involvement of the social worker, 
(c) quality of patient interaction with treat- 
ment milieu, and (d) ancillary therapeutic 
intervention. 


Method 
Subjects 


Description of subjects, setting, recruitment pro- 
cedures, and outcome measures have been detailed 
elsewhere (Washburn et al., 1976) and will only be 
briefly summarized here. Subjects were 93 middle-
to upper-middle-class female patients between the 
ages of 16 and 72 (M = 32.9), admitted to McLean 
Hospital with a primary diagnosis of functional 
disorder. Of these, 59 were in the randomized group 
(30 inpatient controls, 29 day center experimentals) 
and 34 were in the day center control group. 

Initial comparison of the two day center groups¹
revealed differences on two demographic and three
baseline pathology variables: Proportionately more
families of experimental patients (12/29) than
controls (5/34) lived farther than 30 miles from the
hospital, χ²(1) = 4.35, p < .05; the experimental
sample was significantly younger (M = 28.7) than
the control group (M = 42.15; t = 4.46, p < .001)
and was sicker at baseline on three measures of
pathology (see Table 1). The two groups did not
differ with respect to religion, socioeconomic status, 
or number of previous hospital admissions. 
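
The distance comparison above (12 of 29 experimental families versus 5 of 34 control families living farther than 30 miles away) can be checked with a 2 x 2 chi-square. The sketch below is a minimal illustration; the function name `chi_square_2x2` is hypothetical, and the use of a continuity (Yates) correction is an assumption, since the text does not say whether one was applied.

```python
def chi_square_2x2(a, b, c, d, yates=False):
    """Pearson chi-square (1 df) for the 2x2 table [[a, b], [c, d]].

    With yates=True, 0.5 is subtracted from each absolute deviation
    (continuity correction) before squaring.
    """
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    observed = ((a, b), (c, d))
    statistic = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            deviation = abs(observed[i][j] - expected)
            if yates:
                deviation = max(deviation - 0.5, 0.0)
            statistic += deviation ** 2 / expected
    return statistic


# Rows: experimental, control. Columns: farther than 30 miles, within 30 miles.
far_exp, near_exp = 12, 29 - 12
far_ctl, near_ctl = 5, 34 - 5

uncorrected = chi_square_2x2(far_exp, near_exp, far_ctl, near_ctl)
corrected = chi_square_2x2(far_exp, near_exp, far_ctl, near_ctl, yates=True)
print(f"uncorrected: {uncorrected:.2f}, continuity-corrected: {corrected:.2f}")
```

Under these assumptions the continuity-corrected statistic falls close to the reported value of 4.35, whereas the uncorrected statistic is noticeably larger; the small remaining gap may reflect rounding in the original computation.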


Subject Recruitment and Procedure 


Experimental day subjects were patients who, fol- 
lowing an inpatient workup, would have been as- 
signed to an inpatient setting, but they were not so 
suicidal, homicidal, or incapable of forming a treat- 
ment alliance as to make inpatient care an absolute 
necessity (i.e., they were patients who, for the
purposes of this study, were considered to be possibly
treatable in the day setting). 

Control day subjects were (a) patients from the 
community who applied for admission directly to the 
day center and (b) patients recommended for the 
day center from the inpatient units. All such
patients were judged to be (a) able to forego acting
upon suicidal and destructive impulses, (b) able 
to travel to the day center, and (c) more likely to
benefit from treatment in the day center than in
any other setting.


Setting


McLean Hospital is a private, nonprofit psychiatric 
hospital. Its modern day hospital facility offers
program options individualized to meet specific patient
needs including individual, group, drug, and
activity therapies, family counseling, and milieu meetings.
Program staff includes psychiatrists, social workers,
psychiatric nurses, and mental health workers.
Twenty-five to 40 patients attend the center daily, 
with a staff/patient ratio of about 1 to 4. Day center 
patients who, for brief periods, have difficulty man- 
aging in the community at night are housed on an 
unlocked night care unit. 


Instruments and Outcome 


Fourteen outcome measures were derived by clustering
procedures: (a) Psychiatric Status Schedule, Subject
Form (PSS-SF) global mental status, (b)
subjective distress, (c) impulse control (Measures
a-c from the PSS-SF; Spitzer, Endicott, & Cohen,
1966a), (d) Psychiatric Status Schedule, Informant
Form (PSS-IF) global mental status (from PSS-IF;
Spitzer, Endicott, & Cohen, 1966b), (e) Psychiatric
Evaluation Form (PEF) global mental
status, (f) role functioning (Measures e-f from the
PEF; Spitzer, Endicott, Mesnikoff, & Cohen, 1966),
(g) intrapsychic functioning (adapted from the
Camarillo Dynamic Assessment Scales; May & Dixon,
1969), (h) family adjustment (subject), (i) family
adjustment (informant), (j) community adjustment
(Measures h-j based on Meltzoff and Blumenthal’s,
1966, Outpatient Adjustment Rating scales), (k)
burden evaluation (a project-developed graphic rating
scale indicating the extent to which the patient’s
illness imposes a burden on the family), (l)
attempted roles (informant’s assessment of the number
of roles the patient has tried to fill during
each 6-month interval), (m) direct charges, and
(n) days of attachment (calculated at 3-month
intervals, the latter by summing the total days billed
for at least a quarter day use of the hospital
program).


Instruments and Processes 


Informant Forms I and II. Based on the work
of Freeman and Simmons (1963), Sainsbury and
Grad (1968), and Meltzoff and Blumenthal (1966),
these questionnaires were completed by the
informant (a family member or friend involved in the
day-to-day life of the patient) at 6-month intervals.

Informant Form I assessed the patient’s effect on
the family as reflected in three summary variables:
(a) project assessment of burden, (b) informant’s
subjective burden, and (c) burden imposed by the
patient’s core pathology.²


1 Data from the 30 inpatient controls have been
previously presented (Washburn et al., 1976).

2 Details regarding composite items in these scales
are available on request.

Informant Form II focused on the informant’s
ability to utilize hospital and community support
systems, as reflected by (a) willingness to contact
the hospital, (b) number of staff contacted, (c)
intensity of contact (i.e., amount of contact with the
most frequently contacted staff member), (d)
number of people contacted outside the hospital, (e)
breadth of support system (sum of variables b and
d), and (f) number of services used.

The Social Work Form, completed monthly, pro- 
vided the social worker’s view of the family as re- 
flected in (a) number of contacts about the patient, 
(b) necessity for social worker to initiate contacts, 
(c) family ability to support treatment, and (d) 
family tolerance for deviant behavior. 

Activity logs. Each patient maintained an hour- 
by-hour activity log that was collected weekly by 
the research team. Using a random numbers table,


other therapies. 

Social interaction. Quantity and quality of social
exchange were coded at one meeting per week and
were averaged over the 6-month periods to determine
number of (a) healthy acts, (b) acts directed toward
the patient from other patients, (c) acts toward the
patient from staff, (d) acts by the patient toward
other patients, (e) acts by the patient to staff, and
(f) total acts by the patient.

Staff assessment of burden. During the last 9
months of the study, the day center staff rated all
patients on a scale of 1 (no burden) to 7 (extreme
burden). An average rank was computed monthly 
for each patient and was then averaged over the 
9-month period. 
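
The two-step averaging described for the staff burden rating can be sketched as follows. This is only an illustration under stated assumptions: the function name `staff_burden_score`, the input layout (one list of staff ratings per month), and the sample ratings are hypothetical and do not come from the study.

```python
from statistics import mean


def staff_burden_score(monthly_ratings):
    """Average staff ratings within each month, then across months.

    monthly_ratings: one list of 1-7 ratings per month for a single
    patient (hypothetical layout). Months with no ratings are skipped.
    """
    monthly_means = []
    for ratings in monthly_ratings:
        if not ratings:
            continue
        if any(r < 1 or r > 7 for r in ratings):
            raise ValueError("ratings must lie on the 1-7 scale")
        monthly_means.append(mean(ratings))
    if not monthly_means:
        raise ValueError("no ratings supplied")
    return mean(monthly_means)


# Made-up ratings for one patient over three months.
example_score = staff_burden_score([[1, 3], [4], [2, 2, 5]])
print(example_score)
```

Averaging within each month first gives every month equal weight, so a month in which many staff members happened to rate the patient does not dominate the final score.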


Administration of Measures 


Baseline measures of mental status and of family 
and community functioning were obtained prior to 
official acceptance into the day center. Initial mea- 
sures on all other instruments were obtained within 
a few days after acceptance. Subsequent assessment 
occurred for up to 2 years. 


Results 
Outcome Analyses 


Fourteen two-way analyses of variance with
repeated measures were computed on the
outcome variables examining the data over time
from our two randomized groups and the day
center control group. Planned-comparison t
tests for both main effects and interaction effects
were computed using the mean square error
from the corresponding analysis of variance as
the best estimate of error variance.
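
A planned comparison that borrows its error term from an analysis of variance can be sketched as below. This is a generic illustration, not the authors' procedure: the function name `planned_contrast_t`, the cell means, and the mean square error are hypothetical, and treating the cells as independent is a simplification for a repeated measures design.

```python
import math


def planned_contrast_t(means, ns, weights, mse):
    """t statistic for a linear contrast of cell means.

    The standard error is sqrt(mse * sum(w_i**2 / n_i)), where mse is
    the error mean square taken from the corresponding ANOVA.
    """
    if not (len(means) == len(ns) == len(weights)):
        raise ValueError("means, ns, and weights must have equal length")
    if abs(sum(weights)) > 1e-12:
        raise ValueError("contrast weights must sum to zero")
    estimate = sum(w * m for w, m in zip(weights, means))
    standard_error = math.sqrt(mse * sum(w * w / n for w, n in zip(weights, ns)))
    return estimate / standard_error


# Hypothetical cells: experimental baseline, experimental posttest,
# control baseline, control posttest. The weights compare the change in
# the experimental group with the change in the control group.
cell_means = [14.0, 8.0, 11.0, 10.0]
cell_ns = [29, 29, 34, 34]
interaction_weights = [1, -1, -1, 1]
t_value = planned_contrast_t(cell_means, cell_ns, interaction_weights, mse=16.0)
print(f"t = {t_value:.2f}")
```

The same function covers a main-effect comparison by changing the weights, for example `[1, -1, 1, -1]` to compare baseline with posttest across both groups.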


Main Effects 


The planned comparisons for main effects 
were used to determine whether patients 
(across treatment groups) tend to get better 
or worse if examined at baseline and then 
again at later time periods. The t tests showed
that in general, patients improved from base- 
line to later periods. These data have been 
reported in detail (along with the interaction 
effects comparing the two randomized groups) 
in Washburn et al. (1976). 


Interaction Effects: Experimental Group 
Compared to the Controls 


Table 1 presents the means and t values
for measures that showed differences between
patients randomized to the day center
(experimentals) compared with patients who
normally would be treated in that setting
(controls). Two-tailed tests of significance were
used to compare the amount of change
occurring in the experimental group between
two given points in time with the amount of
change occurring in the control group during
the same time period.³ In addition, absolute
time comparisons were made on the two
groups at baseline, at 1 year, and for all
analyses regarding days of attachment and direct
charges.

On the PEF global mental status, the
experimental group showed significantly more
improvement from baseline to the overall posttest
period and from baseline to posttest at
3, 6, and 8 months than did the controls. The
control group started with significantly (p <
.01) less psychopathology (baseline M =
10.82) than the experimental group (baseline
M = 14.29) and improved very little over
time, whereas the experimental group
improved dramatically, becoming healthier than
the controls (though not significantly so) at
about the 5th month. This pattern of greater
improvement in the experimental group is also
[...] An even more pervasive pattern that
appears in the PEF and is reflected in nearly
all other measures is that of greater initial
pathology (baseline) in the experimental
group but less pathology at 1 year. In fact,
on 10 out of 11⁴ measures [...] (significantly
so on subjective distress).

Only in terms of number of days of
attachment and direct costs did the experimental
group fare worse than the controls. During the
first 3 months, the experimentals were
attached to the hospital for significantly more
days than the controls, and they incurred
significantly higher costs. At no other time
period, however, did either the cost or the
number of days of attachment differ.


Table 1

Significant Comparisons for Day Center Experimentals Versus Day Center Controls

Outcome measure                                   Experimental    Control      t

Significant difference on mean change
  Global mental status
    PEF; change from baseline to
      Overall post (through 24 mo.)                   4.30ᵃ          .45     2.95**
      3rd post (Mo. 5-6)                              3.47ᵃ          .58     3.17**
      6th post (Mo. 12-14)                            6.31ᵃ          .46     3.23**
      8th post (Mo. 18-24)                            5.24ᵃ          .70     2.93**
    PSS-SF; change from baseline to
      Overall post (through 18 mo.)                   8.93ᵃ          .96     2.52
      Overall first year                              9.23ᵃ         2.06     2.40*
      Second 6 months                                10.09          1.04     2.88**
  Subjective distress; change from baseline to
      Overall post (through 18 mo.)                   8.97ᵃ          .34     2.29
      Overall first year                              9.28ᵃ         1.66     2.14
      First 6 months                                 10.96ᵃ          .23     2.87

Significant differences on absolute means
  Days of attachment (1st 3 mo.)                     60.59         50.74ᵃ    2.12*
  Direct charges (1st 3 mo.)                        $5,480        $3,370ᵃ    3.27**
  Global mental status
    PSS-SF (baseline)                                48.10         43.27ᵇ    2.31*
    PEF (baseline)                                   14.29         10.82ᵇ    3.70**
  Subjective distress (1 year)                       38.30ᵃ        46.38     2.89**


3 Specific comparisons were as follows: Amount of
change between baseline and Posttest 1, Posttest 2,
mean of Posttests 1 and 2, the overall postrandomization
period (mean of all posttests), and amount of
change between Posttests 1 and 2 and Posttests 2
and 3. Where no baseline data were available, parallel
comparisons were made with Posttest 1 instead
of baseline.

4 The one exception appears to be a function of
the peculiarities of the instrument used. The PSS-IF
format is such that pathology must actually be
observed to be recorded by the interviewer. Thus, for
controls, who [...] informant, any [...]
measure under these circumstances [...]


Process Analyses 


Using the variables described under Process
Measures, 27 two-way analyses of variance
with repeated measures were computed
to examine the data over time (baseline, 6
months, 1 year, and 18 months) for our three
groups.⁵

Whereas a planned-comparison t-test strategy
was used for the outcome analyses, the
process variables were analyzed using a more
traditional analysis of variance strategy, with
t-test comparisons made on absolute scores
only when F ratios were significant.


Main Effects: Group Differences Between 
Experimentals and Controls 


Nights in the hospital was the only process
measure that revealed significant main effects
differences, F(1, 60) = 10.03, p < .01,
between the two day center groups: The usual
patients spent fewer nights (M = 7.05) in
the hospital across time than the experimentals
(M = 21.98). (The interaction effects
discussed below are helpful in understanding
this main effect.)


Interaction Effects: Experimental and Control 
Groups Compared over Time 


Significant interaction effects differentiating
control from experimental patients at specific
time periods emerged on 8 of the 27 variables
analyzed.

The meaningful⁶ significant comparisons
are summarized below for each variable.

Nights in the hospital. Both experimental 
and control patients dropped significantly in 
number of nights spent in the hospital after 
the first 3 months (F = 47.16, p < .01). The
control patients, however (who initially, and 
at all subsequent time periods, spent signifi- 
cantly fewer nights than experimentals), 
dropped off much less during later periods. 

Intensity of contacts within the hospital.
At 6 months,⁷ informants involved with
experimental patients had significantly more
contacts with one primary care agent than
informants associated with controls, F(6,
129) = 3.44, p < .01.

Number of outside people contacted. At 
18 months informants involved with experi- 
mental patients contacted significantly fewer 
people than informants involved with controls, 
F(6, 125) = 2.78, p < .05. 

Social work contacts about the patient. At 
baseline the social worker made significantly 
more contacts about experimental patients 
than about controls, whereas at 1 year the 
reverse was true, F(6, 97) = 3.35, p < .01.
Over time the control group did not change 
in terms of amount of contacts the social 
worker made, whereas the experimental group 
showed a downward trend with significant 
drops in number of social work contacts from 
baseline to each subsequent time period, and 
also between 6 months and each subsequent 
period. 

Necessity for the social worker to initiate 
contact. The social worker felt a greater 
need to initiate contacts about the experi- 
mental patients than about controls at both 
1 year and 18 months, F(6, 89) = 4.33, p <
.001.

Both social work contact about the patient 
and the necessity to initiate contact show that 
more effort was initially expended by the so- 
cial worker for experimental patients than for 
controls. With regard to the number of con- 
tacts a reversal later occurred—the social 
worker at 1 year had more contacts about the 
controls than about the experimentals. 

Family ability to support treatment. At
18 months (but not before), families of
control patients were seen by the social worker
as having more ability to support treatment
than were families of experimentals, F(6, 83)
= 6.79, p < .01.

5 On two measures, nights in the hospital and time
spent in treatment milieu, data were examined over
only three time periods (no baseline data) and for
only the two day center groups.

6 Significant differences between the two day groups
at two different times, for example, are not
discussed unless in the context of other data they add
meaning. Presentation of means, statistical
procedures, and rationale of procedures used are detailed
in a more complete manuscript, available on request.

7 For all analyses that follow, 6-month data
reflect the amount of activity between baseline and 6
months; 12-month data reflect activity between 7
and 12 months; 18-month data reflect activity
between 13 and 18 months.

Time spent in treatment milieu. Both ex- 
perimental and control groups spent signifi- 
cantly more time in the treatment milieu dur- 
ing the first 6 months than at subsequent 
periods, F(2, 122) = 7.19, p < .01. Of greater
interest, whereas experimentals initially 
(Months 1-6) spent more time in the milieu 
than controls, during both subsequent 6- 
month periods controls spent significantly 
more time than did experimentals. 

Drug and staff burden analyses. Chi-squares
examining the proportion of drug
users and nonusers at 6 months and 12 months 


Discussion 


Our outcome comparison of usual day patients
with sicker [...] significantly
greater decrease in pathology for the
experimental patients. At first glance the
greater change found in experimental patients
[...] treatment providers to the extent that
inpatient evaluation was required. There is likely
to be a lively, even turbulent, [...]
with these patients [...]
than about controls. At 18 months the social
worker still felt more pressure to initiate contact
about the experimentals although actually
having fewer contacts about them than about
controls. At that time the experimental families
were also contacting significantly fewer
people outside the hospital than were control
families. By then the families of experimentals
were apparently feeling less need for support
(perhaps due to the dramatic improvement
of their sicker member), but the social worker
remained vigilant.

The initial greater exertion by the social 
worker for experimental patients can perhaps 
be best understood in terms of the flamboyant 
illness of these patients in the early phases 
of hospitalization and their greater propensity
to change. This propensity to change may
also contribute to the social worker’s feeling
of greater need to initiate contact about
experimental patients and her assessment that
their families were less able to support
treatment. She was apparently feeling the pressure
of change to a greater extent than were the
families themselves.


[...] at no greater overall cost. It is true that
experimentals, initially, are harder to handle
(as indicated by hospital charges, nights spent
in the hospital, days of attachment, amount
of time in the treatment milieu, and pressure
for social work). But after this initial period,
[...] latter measures, with controls later requiring
more attention than experimentals. In addition,
in terms of 20 other important variables
reflecting (a) impact on the family, (b)
interaction with the treatment milieu, and (c)
need for ancillary therapeutic intervention
(drugs, psychotherapy, etc.), there were no
significant differences between the two groups.

The greater initial pressure imposed on the 
social worker by the experimental patients, 
greater number of days attached to the hos- 
pital, and corresponding higher charges dur- 
ing the first 3 months can be partially ex- 
plained in terms of our study design. At the 
time of assignment to the day center, the 
experimental group had spent up to 6 weeks 
as inpatients and were caught in the inertia
of the hospital system. In addition, community
support systems had been disrupted and
needed to be remobilized—a process compli- 
cated by the fact that families of experimental 
patients tended to live further from the hos- 
pital than those of controls. Extensive social 
work effort was often needed to establish al- 
ternative living arrangements before these 
patients could leave the hospital. 

In contrast, the community support net- 
work of the controls had been only minimally 
disrupted. Thus, there was less initial chaos, 
lower pressure on the social worker, fewer 
nights of hospitalization and lower hospital 
charges for the control group, and ultimately 
less pressure toward change or improvement. 
As systems theory would predict, a system is 
more open to change when change is already
occurring; and the more acute state of the 
experimental patient requires immediate 
change. Although positive change counters 
chaos, it also causes further change (creating 
in its wake new chaos and need for additional 
change). The control patient, on the other 
hand, is part of a more static system in which 
fewer changes are demanded, creating less 
chaos and, in turn, less pressure for change 
(and improvement). 

Other less comprehensive explanations of
the experimental group’s superior outcome are
possible but less persuasive. The gains of
experimental patients may be related to some
extent to the specific effects of their initial
inpatient status and their greater subsequent
use of night care. Alternatively, it may be that
because of their younger age and/or a
possible Hawthorne effect, the experimentals
received more attention than the controls and
responded to this greater attention with
improved performance. The social work data are
consistent with the latter explanation.
However, nursing staff in the day hospital, when
specifically questioned, were not able to
identify which patients were experimentals and
which were controls, nor were there any
differences in terms of staff interventions with
experimental and control patients. Thus, if a
Hawthorne effect were operative, it would
have been an indirect function of differences
in the behavior of the social worker (who
worked with the significant others, rather than
with the patient): the social worker’s
attention affecting family members, who, as a
consequence, acted differently to facilitate
improvement in the patient. This explanation
begins to overlap with the systems theory that
we have proposed.

Perhaps the best alternative explanation for
the observed differences between the two day 
hospital groups is that control patients were 
more chronically ill than experimentals. Al- 
though there was no difference between the 
two groups on one index of chronicity (num- 
ber of prior hospitalizations), the control pa- 
tients were significantly older. Their age in- 
creases the possibility that they may have 
been more chronically ill—perhaps due, at 
least in part, to years of insufficient systems 
demand for change. 

Only further research can sort out the ex- 
tent to which our results can be accounted 
for by these overlapping explanations. How- 
ever, whatever the explanation, our data sug- 
gest that expansion of day services to include 
patients who have normally been treated as 
inpatients may require more initial effort, but 
this effort pays off. 


The question, “Are treatment effects
effects of full-time hospitalization?” 


Efforts to evaluate different systems for de- 
livering mental health services are increasing 
(e.g., Struening & Guttentag, 1975), Program 
evaluation methodology, however, is still de- 
veloping: Investigators continue to question 
which measures reliably and validly assess 
treatment outcome, which Perspectives in 
treatment outcome are Preferred, and whether 
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assessment should concern general treatment 
effects or an individual’s attainment of spe- 
cific goals (e.g., Schulberg, Sheldon, & Baker, 
1969). 
Despite uncertainties, program evaluation
based on multiple measures obtained from another
person significant to the client in treatment
has emerged as a reliable and valid
source of data (e.g., Ellsworth, 1975). Systematic
observations of home and community
adjustment, rated by a person close to the
client or so-called “community informant,”
seem to provide an acceptable and important
perspective for inferences about treatment
effects.

The purpose of the present study was to
compare, from the community informant’s
perspective, differential effectiveness of two
approaches in psychiatric treatment: day hospital
(or part-time) treatment versus inpatient
(or full-time) psychiatric treatment.
Although day hospital and inpatient treatment
approaches share in common many programmatic
features (such as individual and group
therapy), the two approaches differ in amount 
of time a client spends at the respective treat- 
ment locale—8 hours a day for the day 
hospital setting versus 24 hours a day for the 
inpatient psychiatric setting. Supposedly, less 
time spent in the day hospital disrupts the 
client’s home and community involvement 
less, whereas more time spent in an inpatient 
unit, so the notion goes, fosters overdepen- 
dency on the institution and interferes with 
home and community social ties (e.g., Jones, 
1953). 

Clinical impressions have supported the 
view that day hospitalization is an effective 
treatment alternative to inpatient hospitalization
(e.g., Chasin, 1967). Exploratory experimental
efforts have indicated that favorable
treatment outcome is associated with the part-time
system of mental health services (e.g.,
Guy, Gross, Hogarty, & Dennis, 1969); spe- 
cifically, day hospitalization seems to disrupt 
the client’s employment status less (cf. Herz, 
Endicott, Spitzer, & Mesnikoff, 1971). Such 
findings were observed, however, in studies 
flawed by methodological shortcomings, such 
as absence of control groups or comparisons 
of samples differing in subject characteristics 
notably influential of treatment outcome. As 
program evaluation methods have improved, 
investigators have become less certain that 
differences in treatment systems contribute to 
differences in treatment outcome (e.g., Ellsworth,
Note 1; Ellsworth, Finnell, & Leuthold,
Note 2).

The present study compared day hospital 
and inpatient differential treatment effects 
under conditions in which subject characteristics
were controlled. The experimental strategy
comprised both a “fixed scale” and a
“fixed time” approach: fixed scale in the
sense that all subjects were compared for outcome
differences on the same set of multiple
measures of psychosocial functioning derived
from ratings by community informants assessing
clients’ home and community adjustment
before and after treatment, and fixed time in
the sense that pretreatment and posttreatment
time duration did not vary from subject to
subject.


Method 
Experimental Strategy 


A modified form of the pretreatment-posttreatment 
control group design served as the experimental 
strategy (cf. Campbell, 1969, Design 4). Day hos- 
pital clients comprised the experimental group; in- 
patient clients, matched for relevant demographic 
characteristics with day hospital clients, served as 
the primary control group. The matched inpatient 
sample was selected from 22% of total inpatient 
admissions during the 2-month time period of the 
study (in 1972). A second control group sample
was drawn from the remaining number (78%) of 
consecutive admissions. This second “unmatched” in- 
patient control group consisted of those inpatients 
not matched with the day hospital subjects but 
admitted to the inpatient unit during the time of 
the study. The unmatched inpatient sample furnished
an additional perspective for analyzing sample
comparability of the matched day hospital-inpatient
cohorts, particularly whether matching had
constrained the “external generalizability”
(cf. Campbell, 1969) of the findings.


Subject Selection 


[...] day hospital and inpatient samples
followed a cohortlike pairing procedure. A board-[...]
admissions who seemed similar to day hospital
admissions in age, type, and severity of personal
problems and comparable [...]flicts. The process of
subject pairing seemed free of subjective and [...]
is, the physically infirm with psychologi[...] with
severe chronic brain disorders, and those who were
acutely suicidal or [...]

[...] were followed to insure comparability [...]
day hospital and inpatient samples [...]
characteristics and current levels [...] functioning.
Each [...] was rated by his attending [...] the Brief
Psychiatric Rating Scale (BPRS; Overall & Gorham,
1962); samples did not [...] on the 18 scales. Matched
subjects did not differ on measures obtained from a
[...] (which in[...] the Shipley Institute [...]
variables). Matched [...] standardized measures [...]
did not differ on informants’ reports about
home and community resources and background
(e.g., income, living arrangements, treatment outcome
expectancy, etc.).1

The matched samples averaged 32 years of age, 
completed approximately 11½ years of education,
and scored within the average range of intelligence 
(Wechsler Adult Intelligence Scale equivalency scores 
= 108). Nearly two thirds of both samples were 
married; half were employed. Forty-eight percent of 
both samples were diagnosed as schizophrenic, and 
average length of program participation was about 
33 days. 


Program Evaluation Instrument 


The Personal Adjustment and Role Skills scale
(PARS; Ellsworth, 1975) served as the measure of
treatment outcome. The PARS consists of 57 items
and is scored for eight factor-analytically derived
measures of home and community adjustment.
Reliability and validity of PARS scales are well
established; manuals are available, and scale definition
and development have been detailed elsewhere
(Ellsworth, 1975; Ellsworth, Foster, Childers, Arthur, &
Kroeker, 1968). Seven scales were analyzed in the
present study: (a) Interpersonal Interaction, which
registers reactions to social interaction; (b)
Agitation-Depression, which measures feelings of
pessimism; (c) Anxiety, which assesses tenseness; (d)
Confusion, which evaluates attentiveness and
efficiency in mental concentration; (e) Alcohol and/or
[...] which estimates moderation [...]
(f) Social Activity, which
reflects the number of social contacts (e.g., attending
movies or visiting friends); and (g) Employment,
which gauges amount and participation in [...]
An eighth scale, [...] was omitted from anal[...]
to all subjects. The first [...]
both pre- and post-PARS, BPRS, and group psychological
screening battery scores were available.
(No significant differences were found for pre-PARS
scores when comparing the group returning
pre-PARS only with the group returning both pre-
and post-PARS.)

Informants were sought also for the unmatched 
inpatient sample. One hundred thirty inpatient ad- 
missions (or 95% of the unmatched inpatient sam- 
ple) agreed to participate: 108 informants returned 
scorable pre-PARS and 79 mailed back both pre- 
and post-PARS scales. These 84% pre-PARS and 
62% post-PARS return rates were higher than the 
expected rate of return (cf. Ellsworth, 1975;
Fontana & Dowds, 1975).


Treatment Settings 


Programmatic activity was similar in both treatment
settings. The main difference was time spent
in the unit: 8 hours a day for the day hospital
versus 24 hours a day for the inpatient unit.2

The general medical and surgical hospital that con- 
tains both units is located in a large Southwestern 
metropolitan complex. The day hospital was situ- 
ated in a single-purpose unit located on the hospital 
grounds but separate from the main complex. A 
daily structured program of individual and group 
therapy was offered along with regularly scheduled
periods for occupational and recreational therapy.
Staff ratio was 1:4; five staff members (psychia- 
trist, nurse, social worker, psychologist, and secre- 
tary) had an average daily patient load of 20. 

The inpatient unit, housed in the main general
medical and surgical complex, consisted of four wards
and four treatment teams. Staff ratio was 1:4, with
five staff positions (psychiatrist, nurse, nursing aide,
with halftime social worker, psychologist, occupational
and recreational therapists) responsible for
approximately 20 patients. The inpatient treatment
program consisted of scheduled activities similar to
those found in the day hospital, including ward
self-government meetings.


Research Questions 


Three questions were addressed from the stance
of the pretreatment-posttreatment control group design.
First, do post-PARS ratings differ from pre-PARS
ratings for the matched day hospital and inpatient
samples combined? In other words, was there
evidence of improvement following 2 months of
treatment? Second, does the pattern of improvement
(or pre- and post-PARS differences) vary between
the matched day hospital and inpatient samples? In
other words, does treatment outcome differ as a
function of treatment setting? Third, do findings
for the matched day hospital and inpatient samples
hold when compared with pre- and post-PARS
differences for the unmatched inpatient sample? In
other words, do the findings generalize?

1 Supplementary tables are available containing
descriptive and inferential statistics for psychiatrists’
ratings, psychological test scores, and informants’
ratings of home information.

2 A supplementary table describing type and
amount of time spent in treatment activity, diagnosis,
and length of stay is available from the first
author.


Results 


Patterns of Improvement for Matched 
Groups Combined 


Matched day hospital and inpatient samples
combined evidenced significant gains in correlated-means
t-test comparisons of four of
the seven PARS scores. Gains appeared for the
measures of symptoms but not instrumental role
skills (Table 1): for increased Calm-contentedness,
t(47) = 5.44, p < .001; for increased Attentiveness,
t(47) = 2.18, p < .03; for lessened
Anxiety, t(47) = 6.76, p < .001; and for
moderation in Alcohol Abuse, t(47) = 4.37,
p < .001.


Treatment Effects for Separate Programs 


Day hospital. The day hospital sample,
considered separately by correlated-means t
tests (Table 2), evidenced increased Calm-contentedness,
t(23) = 4.51, p < .001; increased
Attentiveness, t(23) = 3.03, p < .001;
lessened Anxiety, t(23) = 6.18, p < .001;
and increased moderation in Alcohol
Abuse, t(23) = 2.02, p < .05.

Matched inpatient sample. The matched
inpatient sample, comparing pretreatment and
posttreatment PARS scale differences by correlated-means
t tests, evidenced similar
changes: increased Calm-contentedness, t(23)
= 3.30, p < .003; lessened Anxiety, t(23) =
3.65, p < .001; and increased moderation in
Alcohol Abuse, t(23) = 4.22, p < .001. The
inpatient sample gain for Attentiveness, unlike
the day hospital scores, did not reach an
acceptable level of significance (α = .05).
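
As a rough check on how a correlated-means t test relates to tabled gain scores, the sketch below recomputes t from the mean and standard deviation of the gains, t = M_gain / (SD_gain / sqrt(n)), with df = n - 1. It assumes that the gain SDs given in Table 2 are sample standard deviations of the individual gain scores and that n = 24 for the day hospital sample; the function name `paired_t_from_gains` is illustrative and is not part of the original analysis.

```python
import math


def paired_t_from_gains(mean_gain, sd_gain, n):
    """Correlated-means (paired) t statistic from gain-score summaries.

    mean_gain: mean of the post-minus-pre differences
    sd_gain:   sample standard deviation of those differences
    n:         number of paired observations
    Returns (t, degrees_of_freedom).
    """
    standard_error = sd_gain / math.sqrt(n)
    return mean_gain / standard_error, n - 1


# Day hospital gain means and SDs as read from Table 2 (n = 24).
day_hospital_gains = {
    "Calm-contentedness": (8.09, 8.78),
    "Attentiveness": (6.87, 11.11),
    "Anxiety": (10.96, 8.69),
}

for scale, (mean_gain, sd_gain) in day_hospital_gains.items():
    t, df = paired_t_from_gains(mean_gain, sd_gain, 24)
    print(f"{scale}: t({df}) = {t:.2f}")
```

Under these assumptions the recomputed values round to the t(23) values reported for the day hospital sample (4.51, 3.03, and 6.18).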

Unmatched inpatient sample. The unmatched
inpatient sample evidenced less magnitude
in change than either of the two
matched groups; moreover, the unmatched
inpatient group sustained a significant loss in
Employment (Table 2). Correlated-means t-test
differences were significant for increased
Interpersonal Interaction, t(78) = 2.26, p <
.03; increased Calm-contentedness, t(78) =
5.22, p < .001; increased Attentiveness, t(78)
= 3.51, p < .001; lessened Anxiety, t(78)
= 5.17, p < .001; moderation in Alcohol
Abuse, t(78) = 3.99, p < .001; and decline
in Employment, t(78) = 2.78, p < .007.


Table 1
Pre- and Post-PARS Normative Scores for
Day Hospital and Inpatient Samples

                                      Combined
                             Pre-PARS         Post-PARS
Variable                     M       SD       M       SD
Interpersonal involvement  47.77    8.41    50.17    8.20
Calm-contentedness         39.50   10.42    47.29   11.27
Attentiveness              38.54   12.25    43.06   11.56
Anxiety                    38.31    7.94    47.52    9.11
Alcohol abuse              42.42   13.21    49.15    9.42
Social activity            43.06    7.95    42.42    7.76
Employment                 29.35   15.68    29.92   16.28

Note. PARS = Personal Adjustment and Role
Skills scale. n = 48 for both pre-PARS and post-PARS.


Differential Comparisons of Matched Samples 


The matched day hospital and inpatient
samples were comparable on six of seven pre-PARS
scales, differing only in greater moderation
in Alcohol Abuse for the day hospital
[...] independent-means t(46) = 2.38, p < [...]
independent-means t-test
comparisons, however, [...]ion between the two
matched [...] The day hospital sample was higher
in Calm-contentedness, t(46) = 1.78, p < .08; Social
Activity, t(46) = 2.93, p < .005; and Employment,
t(46) = 2.57, p < .03. The two
[...] did not differ significantly in
[...] comparisons (i.e., when comparing
[...] the difference between each
pre- and post-PARS scale score).3


3 [...] difference (or gain) scores were analyzed
following recommendations of Fontana and Dowds
(1975).
Table 2
PARS Normative Scores for Matched Inpatient and Day Hospital and Unmatched
Inpatient Samples

                                                  Matched             Unmatched
                            Day hospital(a)     inpatient(a)         inpatient(b)
Variable and time            M       SD         M       SD           M       SD
Interpersonal interaction
  Pre                      48.54    6.43      48.21    8.60        [...]   [...]
  Post                     49.96    8.66      50.38    7.88        [...]   [...]
  Gain(c)                   1.42    7.72       2.17    8.60        [...]   [...]
Calm-contentedness
  Pre                      42.04    8.94      36.96   11.34        [...]   [...]
  Post                     50.13    9.92      44.46   12.01        46.56   [...]
  Gain                      8.09    8.78       7.50   11.34        [...]   [...]
Attentiveness
  Pre                      37.17    8.58      37.42   12.21        [...]   [...]
  Post                     44.04   11.86      42.08   11.42        [...]   [...]
  Gain                      6.87   11.11       4.66   12.20        [...]   [...]
Anxiety
  Pre                      38.46    9.04      38.17    6.86        [...]   [...]
  Post                     49.42    9.62      45.63    8.33        [...]   [...]
  Gain                     10.96    8.69       6.46    6.86         5.28   [...]
Alcohol abuse
  Pre                      46.42   11.50      38.48   13.83        45.27   [...]
  Post                     50.46    9.31      47.83    9.55        50.62   [...]
  Gain                      4.04    9.89       9.41   13.83        [...]   [...]
Social activity
  Pre                      45.25    8.31      40.88    7.09        42.68   [...]
  Post                     46.46    7.45      40.38    6.95        43.32   [...]
  Gain                      1.41    8.98      -0.50    7.09         0.64   [...]
Employment
  Pre                      30.63   16.59      28.08   14.96        27.82   [...]
  Post                     35.63   17.15      24.21   13.42        23.44   [...]
  Gain                      5.00   18.43      -3.87   14.96        -3.38   [...]

Note. PARS = Personal Adjustment and Role Skills scale.
(a) n = 24.
(b) n = 79.
(c) Gain scores include both negative and positive numbers; accordingly,
standard deviation values may be larger than mean values in some instances.


Both groups achieved significant gains. Improvement,
however, does not vary widely as
a function of differences in health delivery
system. Three findings in the univariate statistical
analysis suggest greater, albeit mild,
advantages for the day hospital experience:
(a) the significant increase in Attentiveness
for the day hospital sample when differences
were analyzed separately for each setting,
(b) greater gains in all PARS scores considered
collectively but not separately, and
(c) higher post-PARS Social Activity and
Employment for the day hospital sample after
nonsignificant pre-PARS differences (except for
Alcohol Abuse). Such trends, discernible by univariate
analysis, reinforced a need for multivariate
analysis. Further analysis was justified a posteriori
by finding significant intercorrelations
among the PARS scales.4 Multiple discriminant
analyses (Klecka, 1975) were performed
separately for pre-PARS, post-PARS, and
gain scores (a) in two-group comparisons
between matched day hospital and inpatient
samples and (b) in a three-group comparison
among matched inpatient and day hospital
samples and the unmatched inpatient group.

4 A supplementary table of pre- and post-PARS
scale intercorrelations is available on request.


Multivariate Analysis of the Two 
Matched Samples 


The matched samples did not differ for
either pre-PARS or gain scores, whereas significant
differences were found for post-PARS
scores. The day hospital sample scored higher
on Social Activity, univariate F(1, 46) =
8.56, p < .001, and for Employment, F(1, 46)
= 6.60, p < .001. One discriminant function
significantly separates the two matched
groups, eigenvalue = .612; canonical correlation
= .616; Wilks’s Λ = .620; χ²(6) =
21.01, p < .002. Less depression (reverse
scoring of Calm-contentedness) and more Social
Activity and Employment differentiate
the day hospital sample from the inpatient
group. (See Table 3 for standardized discriminant
coefficients and group centroids.)
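
For a discriminant analysis, the eigenvalue, canonical correlation, and Wilks's lambda are tied together by standard identities: the canonical correlation of a function is sqrt(λ / (1 + λ)), and Wilks's lambda is the product of 1 / (1 + λ) over the functions tested. The sketch below applies these identities to the eigenvalue of .612 reported for the two-group comparison, where only one function exists. The function names are illustrative, not taken from the original analysis.

```python
import math


def canonical_correlation(eigenvalue):
    """Canonical correlation implied by one discriminant-function eigenvalue."""
    return math.sqrt(eigenvalue / (1.0 + eigenvalue))


def wilks_lambda(eigenvalues):
    """Wilks's lambda for a set of discriminant functions.

    Equals the product of 1 / (1 + eigenvalue) over the functions tested.
    """
    result = 1.0
    for eigenvalue in eigenvalues:
        result *= 1.0 / (1.0 + eigenvalue)
    return result


# Two-group comparison: a single function with eigenvalue .612.
print(f"canonical correlation = {canonical_correlation(0.612):.3f}")
print(f"Wilks's lambda        = {wilks_lambda([0.612]):.3f}")
```

With the reported eigenvalue, both identities reproduce the reported canonical correlation (.616) and Wilks's lambda (.620) to three decimals.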


Multivariate Analysis of the Three Matched 
and Unmatched Samples 


Results indicate a more favorable outcome 
for the day hospital sample when multivariate 
analysis was performed in a three-group com- 
parison, that is, matched inpatient and day 
hospital with unmatched inpatient sample. 
First, the three groups did not differ in univariate
and multivariate comparisons on pre-PARS
scores. Second, significant differences
occurred in both univariate and multivariate 
post-PARS scale comparisons; the day hos- 
pital sample obtained significantly higher 
post-PARS Employment scores, F (2, 127) = 
6.58, p < .001, and Social Activity scores, F 
(2, 127) = 3.21, p<.05. Multivariate anal- 
ysis yielded two significant discriminant func- 
tions (see Table 3 for coefficients and group 
centroids): The first obtained an eigenvalue
of .218; canonical correlation = .423; Wilks’s
Λ = .749; χ²(6) = 11.47, p < .05. The first
discriminant function varied along a con- 
tinuum from disruption in thinking (interfer- 
ences in Attentiveness) to Employment and 
lessened Anxiety; the second function varied 
from depression and social inactivity to less- 
ened Anxiety. Post-PARS group differences 
were characterized by rated increases in Em- 
ployment and Attentiveness (with which the 
day hospital sample aligned) and decreases 
in Anxiety (with which the inpatient sample 
aligned). 

The three groups differed significantly on
PARS gain scores. The day hospital sample
evidenced a greater decrease in Anxiety, univariate
F(2, 127) = 3.29, p < .05, and a
greater increase in Employment, F(2, 127) =
3.48, p < .05. The discriminant function for
gain scores, although nonsignificant (p < .06),
was comparable to the first discriminant function
for post-PARS scores; that is, the day
hospital sample was higher in posttreatment
Employment and lower in posttreatment
Anxiety.


Table 3
Discriminant Function Coefficients
Three-group comparison 


Two-group comparison 


pee) coefficient First coefficient Second coefficient 
‘ariabie 
—.051 674 
Interpersonal involvement Ast 1.43 — 882 
Calm-contentedness — 862 —.972 — 181 
Attentiveness 834 "702 1,001 
Anxiety (lessened) a —.091 —.553 
Alcohol abuse (moderation) —2 (334 —.614 
Social activity pie 882 — 054 
Employment =187 
ose centroid 766 958 are 
atched hospital T —.068 ma 
ched day hospt 166 Ber — "158 


Matched inpatient 
Unmatched inpatient 
Discussion


Informants regarded clients in both settings 
as improving in home and community adjust- 
ment. Such improvement was registered more 
in measures of symptom reduction than in 
measures of instrumental role skills (cf. Ells- 
worth, 1975; Fontana & Dowds, 1975). 

Generality of the findings was limited by 
the comparatively narrow pretreatment and 
posttreatment time interval. Previous research 
suggests, however, that most variance is ac- 
counted for by patients’ level of adjustment 
1 month after discharge (Ellsworth & Schoon- 
over, Note 3). The present findings thus may 
reflect a persisting effect of treatment outcome.

The present strategy of using two control
groups in comparing differential treatment
effectiveness indicated that the matching procedure
did not isolate a sample that was unrepresentative
of those seeking treatment; the
results may be extended to the treatment setting
in general. Other questions were raised,
however, by this research tactic. Why were
group differences more pronounced for the
unmatched group comparison than for the
“matched” group comparison? Obviously, outcome
gains were maximized in the three-group
comparison but minimized in the two-group
comparison. Although it was not possible to
identify factors associated with this phenomenon,
the observation that degree of improvement
varies as a function of differences
in background characteristics of samples demonstrated
the importance of describing sample
characteristics (cf. Campbell, 1969).5

Although treatment gains were achieved by
both the day hospital and the inpatient samples,
the evidence suggested, nevertheless, a
[...] gain score differences
[...] a similar trend. Finding
significantly less disruption in employment
for the day hospital sample is consonant
with findings in earlier studies (e.g., Herz et
al., 1971; Ellsworth et al., Note 2).
If one accepts a more prudent interpreta- 
tion—that day hospital treatment is at least 
an acceptable alternative, but not necessarily 
a better alternative, to inpatient treatment for 
most clients seeking inpatient or outpatient 
treatment—then the question about which 
modality is the treatment of choice could be 
debated from the standpoint of other considerations.
One such consideration is treatment “costs.”

The day hospital approach recommends itself
further based on its comparatively lower
cost. Direct costs for each day hospital client
[...] direct costs) compared with $43.16 a day for
each inpatient client (or $73.49 including
indirect costs).6 Cost parameters such as salaries,
housing, equipment, and so on, do not
mean necessarily that the “costs” of treatment
have been estimated in their entirety,
however. “Cost” of treatment, like the cost
of living, is a multidimensional construct
whose amount is determined not only by the
assessment of material outlay but also by
weighing the extent to which family, friends,
and client are taxed in the process of coping
with emotional distress. The use of community
informants when evaluating outcome taps
a significant dimension in determining cost.
The evidence supports the notion that at 
least from the perspective of the community 
informant, day hospitalization is an impor- 
tant treatment alternative for many people. 


5 Diagnosis (e.g., schizophrenic vs. nonschizophrenic)
did not emerge as a significant outcome variable
(cf. Erickson, 1976).

6 The authors wish to thank John Molnar-Suhajda,
Management Analyst, Veterans Administration Hospital,
Dallas, Texas, for computing average daily
costs for day hospital and inpatient psychiatry units.
Test Anxiety and the Passage of Time


Irwin G. Sarason and Rick Stoops 
University of Washington 


Three experiments were performed dealing with the relationship of test anxiety 
and achievement-oriented instructions to time perception. After being given 
either achievement-orienting or neutral instructions, subjects waited for an un- 
designated period of time and then performed an intellective task. The de- 
pendent measures were subjects’ estimates of the duration of the waiting and 
performance periods and their scores on the assigned task. High-test-anxious 
subjects’ time estimates were significantly greater than the estimates of the 
other subjects, and their performance was at a relatively low level. Evidence is 
presented supporting the hypothesis that highly anxious persons under stress
experience cognitive interference and preoccupation that makes time pass slowly
and results in poor performance. The implications of the findings are discussed
particularly in terms of the need for training programs capable of fostering
improved cognitive skills requiring self-control.


Although the generalization that time is
precious holds for many situations, it does
not always seem to have validity. There are
situations in which time drags and one wishes
it were possible to speed up the clock. For the
football team behind 30-0, time moves slowly,
whereas for the team on the way to victory,
time flies and its players desire more time to
win “big.” Similar differences in subjective
time estimates seem to hold for many types of
situations and events, a critical factor being
the character of the particular situation. For
example, waiting for what may be bad news
about a loved one who is in the hospital can
be excruciating, and there is evidence that
time passes very slowly for depressives (Bech,
1975).

Although the literature on time estimation
is sizable, much of the work done has focused
on time estimation as a function of either
personality characteristics (such as anxiety) or
experimental conditions (see Meade, 1966;
Siegman, 1962). In analyzing subjective judgments
of time duration, it seems logical to
consider simultaneously two variables: the
situation and the characteristics (hopes, fears,
etc.) that a person brings to the situation
(Buchwald & Blatt, 1974; Sarason, Smith, &
Diener, 1975). The present article reports the
results of three experiments devised from this
perspective. An important feature of the
experiments is the inclusion of two types of
data: time estimations and performance.

In these experiments subjects were told that 
they would be administered a test of intelli- 
gence but that there would first be a waiting 
period. (There were also control groups not 
given the achievement-orienting communica- 
tion.) How does time pass while awaiting the 
evaluation? Obviously the way in which the 
intellectual evaluation is construed has a bear- 
ing on the answer to this question. Many psy- 
chological instruments reflect aspects of the 
construing process and the meanings that in- 
dividuals attach to particular types of situa- 
tions. Scales dealing with well-defined situa- 
tions, such as those designed to tap test and 
speech anxiety, can be viewed as measuring 
cognitive activity and worrying, which is
stimulated by a specific event or demand 
(such as having to take a test or make a 
speech). 




The personality measure used in the three 
experiments was the Test Anxiety Scale 
(TAS; Sarason, 1972). High scorers on this 
measure have been shown to perform more 
poorly than others on difficult, complex tasks 
administered under achievement-orienting 
conditions that emphasize the evaluation of 
one’s performance (Sarason, 1975). Test anxiety
can be interpreted as a form of self-preoccupation,
characterized by self-awareness,
self-doubt, and self-depreciation, that influences
overt behavior and psychological reactivity.
Other types of anxiety may be similarly
interpreted. The self-preoccupying thoughts 
of the highly anxious individual interfere with 
adaptation at several points in the course of 
information processing. They narrow or other- 
wise influence the attentional focus on en- 
vironmental cues; distort encoding, trans- 
formation, and planning strategies; and 
influence responses that may be selected to 
cope with challenges confronting the indi- 
vidual. 

Available evidence suggests that the rela- 
tively poor performance of highly test-anxious 
persons under achievement-orienting condi- 
tions is not due to low intelligence but rather 
to the cognitive interference of a personalized,
self-centered approach to evaluational situa- 
tions. The expectations of a highly anxious 
person seem to be different from those of 
others (Doris & Sarason, 1955). When this 
person performs poorly, it may not be due 
merely to cognitive interference and self-pre- 
occupation during the test. It may also be re- 
lated to the time spent anticipating the test 
with dread. These personalized anticipations 
contribute to inefficient, ineffective preparation
for the test.

The experiments reported here were aimed 
at providing information about the way in
which persons differing in anxiety fill time. 
It was predicted that in the presence of 
achievement-orienting cues, time would pass 
more slowly for high-anxiety scorers than for 
middle- and low-anxiety scorers. When these 
cues are not present, there should not be a 
significant gap in estimates of time duration 
among groups differing with regard to test 
anxiety. Furthermore, the effects of an 
achievement orientation should be as notice- 
able while the individual is waiting to perform 
as during performance itself. The first two
experiments deal with these hypotheses and 
differed in the length of the waiting period 
preceding performance. In the third experi- 
ment, the performance period was greatly 
lengthened, and a specially prepared postex- 
perimental questionnaire was administered in 
an attempt to clarify the relationships among 
achievement orientation, test anxiety, and 
cognitive interference. It was expected that 
highly anxious persons who are underachievers 
would describe themselves as having more 
task-irrelevant thoughts than would persons 
with middle and low anxiety scores. 


Experiment 1 
Method 


Subjects. The subjects were 48 male and 48 fe- 
male students from introductory psychology classes 
at the University of Washington. All subjects were 
approximately 18-19 years of age. Prior to and inde- 
pendent of the experiment, 550 students had been 
administered the TAS (Sarason, 1972). The subjects 
were drawn from the top and bottom 15% of the
distribution of TAS scores and from a group in the 
middle of the distribution. The subjects in the high 
TAS group had scores of 26 and above; subjects in 
the low TAS group had scores of 9 and below. The 
middle TAS group had scores between these cutoff 
points. Subjects’ assignments to experimental condi- 
tions were random within the requirements of the 
experimental design. (This method of assignment to
conditions was followed in all three experiments.) 
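
The grouping rule described above can be written as a small function. This is only a sketch: the cutoffs (26 and above, 9 and below) come from the text, while the function name `tas_group` and the sample scores are hypothetical. A score between the cutoffs only makes a student eligible for the middle group; the text says the middle subjects were drawn from the middle of the distribution, so not every such student was necessarily selected.

```python
HIGH_CUTOFF = 26  # high TAS group: scores of 26 and above (from the text)
LOW_CUTOFF = 9    # low TAS group: scores of 9 and below (from the text)


def tas_group(score):
    """Classify a Test Anxiety Scale score as high, middle, or low."""
    if score >= HIGH_CUTOFF:
        return "high"
    if score <= LOW_CUTOFF:
        return "low"
    return "middle"


# Hypothetical scores, used only to illustrate the rule.
for score in (4, 9, 17, 26, 31):
    print(score, tas_group(score))
```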
After being escorted into the experimental
room, subjects were asked to put their watches
out of sight until the end of the session because,
they were told, [...] short self-description and a
[...] the task of writing a [...]. The subjects, who
[...] neutral instructions. The achievement-orienting
instructions were given as follows:

[...] about to take is part of a widely [...]
This is the most crucial part [...]
will be used to give me a [...]
intelligence. I have to get the [...]

The experimenter then left [...] the experimenter
returned [...] materials and asked the subject to write
down (in minutes and seconds) the length of time
that he or she had been sitting alone. The
experimenter then continued with the following:

[...] said, the test you are about to take is part of
[...] test. This test has been found to
predict such things as course grades, success in later
life, and, to some extent, the kind of personality
you possess. Of course, your own intelligence will
primarily determine whether you do well or poorly 
on the test. At a later date you will have an op- 
portunity to compare your IQ score with those of 
the other people in this study. You will then be 
able to determine how your abilities and capacities 
compare with other people like you. 


The subject was then given a difficult version of 
the digit symbol task (variations of the letter L) 
with the following instructions: “The purpose of this 
task is to put the symbols in the numbered boxes as 
prescribed by the code at the top of the page. Try 
the three examples.” The materials were an adapta- 
tion of those used by Sarason and Palola (1960). 
The subject then worked for 3.5 minutes on the digit 
symbol task and was asked to write down (in 
minutes and seconds) the length of time that he or 
she had been working on the test. The subject was
then debriefed and excused.

The neutral instructions were as follows: “I have 
to get the materials we need. I'll be back shortly.” 
The experimenter then left, returned in 2 minutes, 
and asked the subject to write down (in minutes and 
seconds) the length of time that the subject thought 
the experimenter had been gone. The experimenter 


were in the expected direction, F(2, 84) = 


The mean waiting time 
estimate of the high TAS subjects in the ex- 
a test to a greater extent than those in the 
high TAS control group. More low and middle 
TAS subjects overestimated the interval dur- 
ing which they believed they were waiting to 
perform on a neutral task than did low TAS 
subjects, who believed that they were waiting 
to take a test. The results for time estimates 
of the period in which subjects performed 
were in the same direction as for the waiting 
period but were not statistically significant. 
An analysis of variance of the digit symbol 
performance scores yielded one significant re- 
sult, that for the test anxiety main effect, 
F(2, 84) = 4.07, p < .05. The low TAS mean 
(92.8) was higher than the middle (82.2) and 
high (81.5) means. There were no significant 
sex differences in either Experiment 1 or 2. 


Experiment 2 


Although in several respects the results of 
Experiment 1 were consistent with expecta- 
tions, they tended to be weak and in some in- 
stances inconsistent, for example, the fact that 
the middle TAS group’s performance scores 
more closely resembled those of the high TAS 
group than those of the low TAS group. In 
the hope of uncovering more decisive relation- 
ships, a second, related experiment was per- 
formed. Two changes pertained to the tem- 
poral variable. Because the 2-minute waiting 
period in Experiment 1 might not have been 
long enough to allow for significant effects of 
the test anxiety and experimental variables 
to show up, the waiting period in Experiment 
2 was lengthened to 4 minutes. In addition, 
subjects performed the digit symbol task for 
4 instead of 3.5 minutes. 

Another change in Experiment 2 was the 
task on which subjects worked prior to per- 
forming the digit symbol task. Instead of writ- 
ing a short self-description and a description 
of one other person, subjects performed for 7 
minutes on an anagrams task. This type of 
concept-formation task was deemed somewhat 
more consistent than the writing task, since 
the experimental emphasis was on the evalua- 
tion of intellective performance. For the 
achievement-oriented group, the anagrams 
were so difficult that it was certain that no 
subject could complete the task in the allotted 
time. For the control group, the anagrams 
were relatively easy, and all subjects success- 
fully completed the task. The changes made in 
the preliminary task (particularly its diffi- 
culty level and time pressure) were designed 
to heighten stress on the evaluation of per- 
formance among subjects in the achievement- 
orientation group. 


Method 


Subjects. The subjects were 120 undergraduates 
at the University of Washington. The 60 males and 
60 females were divided into groups on the basis of 
their scores on the TAS using the same cutoff points 
as were used in Experiment 1. 

Procedure. The experiment followed a 3 × 2 × 2 
analysis of variance design. The variables were (a) 
TAS—high, middle, and low scorers; (b) conditions— 
achievement-orienting and neutral control; and (c) 
sex—male and female subjects. 

This experiment used the procedures of Experiment 
1 except for the following changes: (a) Before per- 
forming the digit symbol task, subjects worked on 
anagrams (easy ones for those in the control group, 
difficult ones for the experimental or achievement- 
orientation group), and (b) the waiting period and 
time for performance on the digit symbol test were 
4 minutes each. 
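
As an illustrative check that is not part of the original report, the degrees of freedom implied by this design can be worked out directly. The sketch below assumes a fully crossed between-subjects layout in which every subject falls in exactly one cell; the function names are hypothetical.

```python
from math import prod


def error_df(n_subjects, levels):
    """Error (within-cell) degrees of freedom: N minus the number of cells."""
    return n_subjects - prod(levels)


def effect_df(levels):
    """Degrees of freedom for a main effect (one factor) or an interaction."""
    return prod(k - 1 for k in levels)


# Experiment 2: TAS (3 levels) x conditions (2) x sex (2), 120 subjects.
exp2_error_df = error_df(120, [3, 2, 2])
tas_df = effect_df([3])
conditions_df = effect_df([2])
tas_by_conditions_df = effect_df([3, 2])

print(tas_df, conditions_df, tas_by_conditions_df, exp2_error_df)
```

Under these assumptions the test anxiety effect and the TAS by conditions interaction each have 2 degrees of freedom, the conditions effect has 1, and the error term has 108.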


Results 


The analysis of variance for the subjects’ 
estimates of the waiting period prior to per- 
forming on the digit symbol task yielded two 
significant results—the effects for test anxiety, 
F(2, 108) = 3.57, p < .05, and conditions, 
F(1, 108) = 5.03, p < .01. The test anxiety 
result reflected larger waiting period estimates 
for the high (303.8) than for the low (274.1) 
and middle TAS (269.5) groups. The larger 
high TAS estimates were mainly attributable 
to the high TAS group that received the 
achievement-orienting condition. This is 
shown by the fact that the mean for this group 
was 337.6, whereas the high TAS neutral 
group’s mean was 270.0, F(1, 38) = 4.31, p < 
.05. Table 1 presents the mean waiting time 
estimates together with the mean estimates of 
time spent on the digit symbol task and per- 
formance scores on that task. Because there 
were no significant sex differences, male and 
female results were combined in Table 1. 
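
Because both the waiting period and the digit symbol period in Experiment 2 lasted 4 minutes (240 sec), each cell mean in Table 1 can be expressed as a ratio of estimated to actual time. The following sketch is an illustration added here, not an analysis reported by the authors; the variable and function names are hypothetical.

```python
ACTUAL_SECONDS = 4 * 60  # both intervals lasted 4 minutes in Experiment 2

# Cell means from Table 1: (waiting time estimate, task time estimate), in seconds.
table1 = {
    "H-E": (337.6, 346.3),
    "H-C": (269.9, 261.9),
    "M-E": (279.0, 258.1),
    "M-C": (260.0, 259.8),
    "L-E": (285.0, 266.8),
    "L-C": (253.3, 258.5),
}


def overestimation_ratio(estimate, actual=ACTUAL_SECONDS):
    """Ratio above 1.0 means the interval was judged longer than it was."""
    return estimate / actual


ratios = {
    cell: tuple(round(overestimation_ratio(value), 2) for value in estimates)
    for cell, estimates in table1.items()
}

for cell, (waiting, task) in ratios.items():
    print(f"{cell}: waiting x{waiting:.2f}, task x{task:.2f}")
```

On this reading every group overestimated both intervals to some degree, with the high TAS achievement-orientation group standing out at roughly 1.4 times the actual duration.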

The significant effects in the waiting period 


analysis were also significant in the analysis 
of subjects’ estimates of time spent on the 


Table 1 

Mean Waiting Time and Task Time Estimates 
and Digit Symbol Performance Scores in 
Experiment 2 


Condition    Waiting timeᵃ    Task timeᵃ    Performance 

H-E              337.6           346.3          68.5 
H-C              269.9           261.9          87.8 
M-E              279.0           258.1         100.4 
M-C              260.0           259.8          98.6 
L-E              285.0           266.8         100.6 
L-C              253.3           258.5         102.6 


Note. H, M, and L refer to high, middle, and low 
levels of test anxiety, respectively; E and C refer 
to experimental and control conditions. 

ᵃ In seconds. 


digit symbol task. The TAS main effect, F(2, 
108) = 5.13, p < .01, was due to higher esti- 
mates for the high (304.1) than for the low 
(262.7) and middle (258.9) TAS groups. 
Again, the greater high TAS mean was due 
mainly to the high TAS achievement-oriented 
group. The mean for this group was 346.3, 
whereas the comparable high TAS control 
group mean was 261.9. The TAS × Condi- 
tions interaction, F(2, 108) = 7.81, p < .001, 
was attributable to differences between the 
high TAS (346.3) achievement-oriented group 
and all other groups in the experiment (com- 
bined M = 261.0). 

The analysis of digit symbol performance 
scores yielded two significant findings: for test 
anxiety, F(2, 108) = 7.82, p < .001, and Test 
Anxiety × Conditions, F(2, 108) = 3.21, p < 
.05. The main effect for test anxiety was due 
to poorer performance for the high than for 
the middle and low TAS groups. This in turn 
was explicable largely in terms of the rela- 
tively poor performance of the high-scoring 
group. The high TAS achievement-orientation 
mean was 68.5; the mean for the high TAS 
control group was 87.8; and the mean for all 
middle and low TAS groups combined was 
100.5. These results contributed to the signifi- 
cant TAS × Conditions interaction. 


Experiment 3 


The procedural changes made in Experi- 
ment 2 led to more clear-cut results than were 

Table 2 

Mean Waiting Time and Task Time Estimates, 
Anagram Performance Scores, and Cognitive 
Interference Scores in Experiment 3 


Condition    Waiting timeᵃ    Task timeᵃ    Anagrams score    Cognitive interference scoreᵇ 

H-E              357.0          1354.1           3.3                    33.2 
H-C              286.5          1114.0           4.8                    24.6 
M-E              266.3          1031.5           5.5                    18.2 
M-C              274.4          1103.5           5.7                    21.6 
L-E              266.5          1172.0           5.0                    19.8 
L-C              265.0          1140.5           5.0                    21.4 


Note. H, M, and L refer to high, middle, and low 
levels of test anxiety, respectively; E and C refer 
to experimental and control conditions. 

ᵃ In seconds. 

ᵇ Reflects the degree to which the subject reported 
experiencing interfering thoughts. 


obtained in Experiment 1. The findings of the 
two investigations support the conclusions 
that not only is the performance of high TAS 
subjects deleteriously affected by achieve- 
ment-orienting instructions, but, in addition, 
these subjects tend to overestimate both the 
duration of the test period and the period 
during which they wait to have their ability 
evaluated. This seems analogous to the ten- 
dency to exaggerate the time spent in the den- 
tist’s waiting room and in his or her office. 
Anticipating and going through unpleasant, 
frightening, or threatening experiences seem 
to take up a lot of time. If this interpretation 
is correct, the question arises: Do persons 
differing in anxiety fill time periods in similar 
or dissimilar ways? Experiment 3 was de- 
signed to provide evidence relevant to this 
question and to extend the generality of re- 
sults obtained in Experiments 1 and 2. 

In Experiment 3 the tasks used in Experi- 
ment 2 were reversed. All subjects worked on 
a digit symbol task prior to a waiting period 
and then were asked to solve a series of diffi- 
cult anagrams. The period during which they 
were occupied with the anagrams was much 
longer than was the case for the postwaiting 
task in the earlier experiments. Following per- 
formance on the anagrams task, the subjects 
responded to a questionnaire dealing with 
their cognitive activity during that task. 


IRWIN G. SARASON AND RICK STOOPS 


Method 
Subjects. The subjects were 60 female undergradu- 
ates at the University of Washington. They were 
divided into groups on the basis of their TAS scores, 
using the same cutoff points that were used in Ex- 
periments 1 and 2. 

Procedure. The experimental design encompassed 
two factors: (a) high, middle, and low TAS scores 
and (b) achievement-orienting and neutral instruc- 
tions. Each subject worked on the digit symbol task 
for 4 minutes. This was followed by a 4-minute wait- 
ing period. At the end of the waiting period, subjects 
performed for 18 minutes on a series of difficult 
anagrams. The experiment concluded with subjects 
responding to a questionnaire about cognitive activity 
while occupied with that concept-formation task. 
The questionnaire was a modified version of one 
developed by Diener and Endresen (Note 1). It dealt 
with the tendencies during performance to have task- 
irrelevant thoughts (e.g., what the experimenter 
thought about the subject, wondering about how 
others had done on the task).1 


Results 


There were two significant Fs in the analy- 
sis of waiting period time estimates, for test 
anxiety, F(2, 54) = 8.31, p < .001, and for 
Test Anxiety × Conditions, F(2, 54) = 3.31, 
p < .05. The high, middle, and low TAS 
means were 321.8 sec, 270.4 sec, and 266.3 
sec, respectively. The interaction result 
showed that the greater high TAS mean was 
attributable mostly to the high TAS group 
receiving achievement-orienting instructions. 
The mean for that group was 357.0 sec, 
whereas the high TAS control group mean was 
286.5 sec. Table 2 presents the means of the 
four dependent measures for all groups in Ex- 
periment 3. 

The analysis of estimates of duration of the 
anagrams task also yielded two significant Fs: 
for test anxiety, F(2, 54) = 3.29, p < .05, 
and for Test Anxiety × Conditions, F(2, 54) 
= 3.41, p < .05. Again, the significant results 
were explicable largely in terms of the rela- 
tively large estimates given by the high TAS 
achievement-orientation group (see Table 2). 
The mean for that group was 1,354.1 sec, 
whereas the mean for all other groups com- 
bined was 1,112.3 sec. 
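
The combined mean quoted above can be reproduced from the task time column of Table 2. The sketch below is an added illustration and assumes equal cell sizes (60 subjects spread over 6 cells, 10 per cell), which the text implies but does not state outright; the helper name is hypothetical.

```python
# Task time estimates (seconds) for the anagrams task, from Table 2.
task_time = {
    "H-E": 1354.1,
    "H-C": 1114.0,
    "M-E": 1031.5,
    "M-C": 1103.5,
    "L-E": 1172.0,
    "L-C": 1140.5,
}
CELL_SIZE = 10  # assumed: 60 subjects divided evenly over 6 cells


def pooled_mean(cell_means, cell_sizes):
    """Mean over several cells, weighting each cell mean by its size."""
    total = sum(m * n for m, n in zip(cell_means, cell_sizes))
    return total / sum(cell_sizes)


other_cells = [cell for cell in task_time if cell != "H-E"]
combined = pooled_mean(
    [task_time[cell] for cell in other_cells],
    [CELL_SIZE] * len(other_cells),
)

actual_seconds = 18 * 60  # the anagrams task lasted 18 minutes
print(round(combined, 1), round(task_time["H-E"] - actual_seconds, 1))
```

With equal cell sizes the weighting has no effect, but keeping the sizes explicit makes it easy to rerun the check if the actual group sizes differed.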


1 The questionnaire is available from Irwin G. 
Sarason. 

When an analysis was performed on the 
number of correct responses to the anagrams 
task, only the test anxiety effect was statisti- 
cally significant, F(2, 54) = 3.35, p < .05. As 
the means in the third column of Table 2 
show, this effect was due mainly to the rela- 
tively poor performance of the high TAS 
group receiving the achievement-orienting in- 
structions. 

There were two significant results in the 
analysis of interfering activity scores that 
were obtained by summing the subjects’ re- 
sponses to the questionnaire’s 11 items. These 
scores reflect the degree to which the subject 
reported experiencing interfering thoughts. 
These significant results were the F for test 
anxiety, F(2, 54) = 5.33, p < .01, and for 
Test Anxiety × Conditions, F(2, 54) = 3.27, 
p < .05. As column 4 of Table 2 shows, most 
of these significant effects were due to the high 
scores obtained by the high TAS achievement- 
orientation group, whose mean was 33.2. The 
mean for the high TAS control group was 
24.6, and the combined mean for the middle 
and low TAS groups was 20.3. Results for 
separate analyses of individual items were in 
every case in the same direction as the results 
presented for the questionnaire as a whole. 

One item appended to the questionnaire 
asked the subject to indicate on a 7-point scale 
the degree to which her mind wandered while 
working on the anagrams task. An analysis of 
variance of these scores yielded significant Fs 
for test anxiety, F(2, 54) = 3.45, p < .05, 
and for Test Anxiety × Conditions, F(2, 54) 
= 3.61, p < .05, the directions of these results 
resembling those in the other analyses. 


General Discussion 


Looking at the total picture provided by the 
findings of the three studies, it appears that 
persons for whom tests are noxious experiences 
(high TAS subjects) tend to overestimate, to 
a greater degree than do others, both the time 
during which their performance is being evalu- 
ated and the period during which they are 
waiting for the evaluation to take place. Add- 
ing to the picture is the fact that high-test- 
anxious subjects performed at significantly 
lower levels than did low and middle scorers 
when emphasis was placed on the evaluational 
implications of performance. 

The evidence from Experiment 3 concern- 
ing cognitive interference is enlightening from 
the standpoint of what persons think about 
while working on a task. High test anxious 
subjects, more so than low and middle scorers, 
attribute to themselves preoccupations about 
how poorly they are doing, how other people 
are faring, and what the examiner will think 
about the subject. These findings are in line 
with those obtained by Diener and Endresen 
(Note 1). It is difficult not to interpret these 
preoccupations as having the effect of ap- 
preciably complicating the task at hand. Al- 
though a measure of cognitive interference 
during the waiting period was not obtained, 
it seems likely that similar preoccupations 
would have especially characterized high test 
anxious subjects at that time. 

Janis (1958) has described the “work of 
worrying” as a step toward dealing effectively 
with a threatening or challenging reality situ- 
ation. Arnold (1960) has also referred to 
worrying as a preparation for action. Although 
this emphasis on the positive aspects of worry 
is commendable, sight must not be lost of the 
important fact of individual differences in 
worrying. The person who describes himself or 
herself as characteristically being a worrier 
may not be taking a positive first step in cop- 
ing with stress when he or she begins to worry. 
Rather, the individual may be creating sub- 
jectively vivid personal fictions and exaggera- 
tions that, instead of being of help in the cop- 
ing process, serve to exacerbate or create stress 
where it otherwise might not exist at all. A 
high score on a measure of trait anxiety may 
then be viewed as reflecting obsessive self- 
preoccupation and thereby the tendency to 
complicate situations that may already be 
sufficiently challenging. In the case of the 
TAS, inferences are drawn only to a defined 
domain of activity being evaluated. 

Doob (1971) has presented a cogent, wide- 
ranging survey of temporal dimensions of be- 
havior. Further research is needed on the role 
of a number of temporal variables in stress 
and anxiety. For example, although high TAS 
scorers in Experiment 3 described themselves 
as very much self-preoccupied during the 18- 
minute-long anagrams task, it may well be 
that these covert responses were not evenly 
distributed throughout that time period. It 
would be interesting to obtain measures of 
cognitive interference at several points during 
performance. Similarly it would be valuable 
to have a clearer picture of cognitive activity 
during waiting periods. Breznitz (1971) has 
called attention to a process of incubation by 
which the stress value of a stimulus or situa- 
tion is enhanced during a waiting period. The 
time interval between warning of an impend- 
ing threat (e.g., a test) and its actual occur- 
rence merits study as an independent variable. 
Of equal importance is the variable of the 
time filler: What happens, if anything, during 
the waiting interval? 

Another problem of both theoretical and 
practical significance is the matter of how to 
help people gain more control over their be- 
havior in situations requiring anticipation of, 
and later coping with, stress. The problem of 
self-preoccupation and its intrusive effects is 
not limited to the domain of anxiety. Some 
self-preoccupied persons worry, others respond 
covertly and overtly with anger, and still 
others are suspicious of potential unseen traps 
in the situations with which they must deal. 
The rapidly developing fields of cognitive 
training and cognitive therapy have much to 
contribute to the analysis, and, where de- 
sirable, to a reduction of the tendency to be 
self-preoccupied (Mahoney, 1974; Meichen- 
baum, 1972; Rimm & Masters, 1974). Train- 
ing aimed at strengthening adaptive cognitive 
skills (e.g., planning a course of action, wait- 
ing patiently, and reducing intrusive self- 
preoccupation) is especially relevant in reac- 
tions to personal threat. In challenging 
situations—either self-imposed, as in climbing 
a mountain, or unexpected, as in a sudden 
illness—the utilization of time can be of the 
utmost importance. Control over one’s 
thoughts may be the decisive factor in suc- 
cessfully meeting a situational challenge. 

The results of the series of experiments re- 
ported here lend support to the growing in- 
terest in a Persons X Situations approach to 
personality (Sarason, 1977; Sarason, Smith, 
& Diener, 1975). Two indices—estimates of 
the durations of time periods and perform- 
ance—were found to be a joint function of 
(a) a situational characteristic, whether or not 
emphasis was placed on an achievement orien- 
tation, and (b) an individual difference vari- 
able, test anxiety. Other evidence from diverse 
fields that supports the need for an interac- 
tional psychology is now available (Magnus- 
son & Endler, 1977). To understand and pre- 
dict behavior, data are needed about both the 
information provided by environmental situa- 
tions and the characteristics of persons who 
must process the information. 


Reference Note 


1. Diener, E., & Endresen, K. Task-irrelevant re- 
sponses of highly test anxious students. Unpub- 
lished manuscript, University of Washington, 1974. 


A Cognitive-Behavioral Treatment for Impulsivity: 
A Group Comparison Study 


Philip C. Kendall 


University of Minnesota 
Richmond 


From a clinic population of emotionally disturbed children, 20 children initially 
identified as impulsive were randomly assigned to either a cognitive-behavioral 
treatment group or to a control group. The treatment group received six ses- 
sions of verbal self-instructions via modeling with response-cost contingent upon 
errors during training, and controls received similar training without specific 
treatment. Although two self-report measures and teacher and staff ratings of 
locus of conflict did not show treatment effects, both an increase in the latency 
and a decrease in the error measures from the Matching Familiar Figures Test 
and improved teacher ratings of impulsive classroom behavior revealed positive 
effects due to treatment. These treatment effects remained evident at follow-up. 
The present study provides group-comparison evidence for the efficacy of the 
cognitive-behavioral treatment for modifying impulsivity. 


Children are not known for their willing- 
ness to consider alternatives when it comes to 
making a decision. Often, they want the first 
candy bar they see, the toy nearest them, or 
anything one of their friends has. Fortunately, 
most children become more cautious with age. 
However, some children never seem to stop 
and think, and reflective reasoning seems alien 
to them—these are the impulsive children. 

The cognitive dimension of reflection-im- 
pulsivity (Kagan, 1966) has been useful to 
describe differences in children’s approaches 
to problem solving. Whenever a number of 
response alternatives are simultaneously avail- 
able and uncertainty as to the correct re- 
sponse is high, some children (reflectives) de- 
lay responding until the alternatives have 
been considered, and they have a high proba- 
bility of being correct. In contrast, other chil- 
dren (impulsives) respond quickly with less 
thorough evaluation of the various possibil- 
ities and consequently make many mistakes. 
Using the latency and error measures of 
Kagan’s (1966) Matching Familiar Figures 
(MFF) Test, children can be identified as 
cognitively reflective or impulsive. 

Among the most common behavior prob- 
lems resulting in children being referred to 
mental health services is impulsiveness—that 
behavior pattern which involves a lack of in- 
hibitory control and a tendency to respond 
quickly without thorough deliberation. In 
light of the severity of this behavior, an in- 
creasing number of studies have been directed 
at modifying impulsiveness in children (see 
Finch & Kendall, in press; Messer, 1976). 
Among the various methods that have been 
used in attempting to modify impulsivity are 
forced delay (Heider, 1971; Kagan, Pearson, 
& Welch, 1966), reinforcement contingencies 
(Briggs, 1968; Finney, 1970), modeling 
(Debus, 1970; Denney, 1972), and instruc- 
tions in strategies for scanning (Egeland, 
1974; Nelson, 1969). In most of these stud- 
ies, except where scanning strategies were 
emphasized, one or the other aspects of im- 
pulsivity (latency/errors) was modified but 
not both. 

Other investigators (Finch, Wilkinson, Nel- 
son, & Montgomery, 1975; Meichenbaum & 
Goodman, 1971) have examined verbal self- 
instructions training as a program for modify- 
ing impulsivity. These researchers, using cog- 
nitive training, reported changes in the desired 
direction for both latencies and errors. 

The effects of reinforcement and response 
cost on cognitive style were investigated in 
a study by Nelson, Finch, and Hooke (1975), 
with results indicating that for impulsive sub- 
jects, the response-cost procedure was more 
effective. The utility of a response-cost pro- 
cedure with impulsive children was supported 
by Errickson, Wyne, and Routh (1973). 

Kendall and Finch (1976) combined the 
cognitive training procedures and the behav- 
ioral strategy of response cost into a cogni- 
tive-behavioral treatment and reported a suc- 
cessful case study in which both cognitive 
impulsivity and impulsive behavioral “switch- 
ing” were reduced. In addition, the results of 
the cognitive-behavioral treatment generalized 
to the classroom and across a variety of situ- 
ational tests. Even though this case study was 
suggestive of the utility of the cognitive-be- 
havioral treatment, the clinical utility of such 
a treatment procedure needed to be demon- 
strated further using a clinic population in a 
group comparison study. The purpose of the 
present study was to conduct just such an 
examination of the effectiveness of the cog- 
nitive-behavioral approach in modifying im- 
pulsivity. 


Method 


Study Setting 


The present study was conducted at the Virginia 
Treatment Center for Children, Richmond, Virginia, 
a university-affiliated children’s psychiatric hospital. 
The hospital houses four living units, each with 
sleeping and play areas for 10 children and a re- 
spective classroom with two teachers. 
Eight master’s level special education teachers and 
16 trained child care technicians provided ratings 
of behavior. 


Subjects 


The subjects for the present study came from all 
new admissions to the Virginia Treatment Center for 
Children between July 1, 1975, and March 1, 1976. 
The identification of impulsive children was based 
on their initial assessment scores on the MFF (Kagan, 
1966). These scores, latency to first response and 
number of first-response errors, were compared to 
norms developed by us based on the performance 
of 195 children. The cutoff scores were as follows: 
Impulsives required an error rate ≥ 7 and a mean 
latency < 8.5 sec. Of the 51 children who were in- 
itially assessed, 20 children were identified as im- 
pulsive and were assigned to either a treatment (n = 
10) or a control (n = 10) group according to a re- 
stricted randomization procedure.1 The mean age 
of the children in the treatment group was 10.2 
years and 11.1 years for the controls. The 20 im- 
pulsive subjects included 16 males and 4 females (8 
and 2 in each group, respectively) of whom 4 were 
black and 16 were white (2 and 8 per group, re- 
spectively).2 
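
The screening rule described above can be sketched as a simple two-part test. This is an added illustration, not the authors' scoring procedure: the error cutoff is taken here as 7 or more first-response errors, which is an assumed reading of the printed value, the latency cutoff is taken as strictly below 8.5 sec, and the class and function names are hypothetical.

```python
from dataclasses import dataclass

ERROR_CUTOFF = 7          # assumed reading of the printed error cutoff
LATENCY_CUTOFF_SEC = 8.5  # mean latency to first response, in seconds


@dataclass
class MFFScore:
    errors: int              # number of first-response errors on the MFF
    mean_latency_sec: float  # mean latency to first response


def is_impulsive(score: MFFScore) -> bool:
    """A child counts as impulsive only if fast AND inaccurate."""
    return (
        score.errors >= ERROR_CUTOFF
        and score.mean_latency_sec < LATENCY_CUTOFF_SEC
    )


# Hypothetical children, for illustration only.
fast_inaccurate = MFFScore(errors=9, mean_latency_sec=5.2)
slow_accurate = MFFScore(errors=3, mean_latency_sec=12.0)
slow_inaccurate = MFFScore(errors=9, mean_latency_sec=12.0)

print(is_impulsive(fast_inaccurate), is_impulsive(slow_accurate),
      is_impulsive(slow_inaccurate))
```

Requiring both conditions means a child who is slow but inaccurate, or fast but accurate, is not classified as impulsive under this sketch.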


Dependent Measures 


Three types of dependent measures were examined: 
patient performance, self-report, and rating scales. 
Two self-report measures and three rating scales, 
two completed by the teachers and one by the unit 
personnel, were used. 

Patient performance. Each subject’s performance 
on the MFF was examined at pretest, posttest, and 
follow-up periods. The MFF is a 12-item match-to- 
sample task that requires the child to choose from 
an array of six variants the one picture that is iden- 
tical to a standard picture. This test assesses the 
conceptual-tempo dimension of reflection-impulsivity. 


1 The restricted randomization procedure allowed 
for random assignment of subjects to either the 
treatment or control group, with the restriction that 
not more than three subjects in a row be assigned 
to any one group. This was done to prevent having 
all treatment or all control subjects involved at the 
same time. 
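
One way such a restricted randomization could be carried out is sketched below. It is an illustration under stated assumptions rather than the authors' actual procedure: each new subject is assigned at random unless the previous three all went to the same group, in which case the other group is forced. The footnote does not say how the final 10 and 10 split was reached, so this sketch does not enforce equal group sizes; all names are hypothetical.

```python
import random

GROUPS = ("treatment", "control")


def restricted_assignment(n_subjects, max_run=3, rng=None):
    """Assign subjects in arrival order, never exceeding max_run in a row."""
    rng = rng or random.Random()
    sequence = []
    for _ in range(n_subjects):
        recent = sequence[-max_run:]
        if len(recent) == max_run and len(set(recent)) == 1:
            # The last max_run subjects share a group: force the other one.
            choice = GROUPS[1] if recent[0] == GROUPS[0] else GROUPS[0]
        else:
            choice = rng.choice(GROUPS)
        sequence.append(choice)
    return sequence


def longest_run(sequence):
    """Length of the longest stretch of identical consecutive assignments."""
    longest = current = 0
    previous = None
    for item in sequence:
        current = current + 1 if item == previous else 1
        longest = max(longest, current)
        previous = item
    return longest


assignments = restricted_assignment(20, rng=random.Random(1975))
print(assignments, longest_run(assignments))
```

Passing a seeded `random.Random` instance keeps an assignment sequence reproducible, which is convenient when the allocation has to be documented.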

2 Of the 20 subjects, there were 6 adjustment re- 
actions of childhood or adolescence, 5 overanxious 
reactions, 5 neurotic conditions (3 depressive, 2 
phobic), 2 aggressive reactions, and 2 psychotic or- 
ganic brain syndrome. Relative dosages of Thorazine 
and/or Mellaril were prescribed for 30% of the 
subjects. There were no meaningful differences in the 
diagnoses or in the drugs for the two groups. In ad- 
dition, 1 subject in each group had received Ritalin. 

Latency to first response and first-response errors 
were recorded. 

Self-report scales. The two self-report scales were 
the Impulsivity Scale (IS; Sutton-Smith & Rosen- 
berg, 1959) and the Impulse Control Categorization 
Instrument (ICCI; Matsushima, 1964). 

The IS is a 25-item (Hirschfield, 1965, revision) 
true-false self-report scale that assesses impulsivity 
defined as a tendency to be restless, to indulge in 
horseplay, to lose control, and to enter activities 
with excessive vigor. Test-retest reliability was .85 
(Sutton-Smith & Rosenberg, 1959), and validity data 
(teacher ranking, IS scores, and classroom observa- 
tion were significantly related) were acceptable 
(Hirschfield, 1965). 

The ICCI contains 24 sentence situations to which 
subjects state degrees of choice between spontane- 
ously impulsive-aggressive behavior and behavior 
requiring impulse control on a 4-point continuum. 
The scale assesses self-control over immediate action 
when aroused. Odd-even reliability was .93, and ac- 
ceptable validity data according to an interview 
schedule and behavioral task persistence are pro- 
vided (Matsushima, 1964). 

Each child completed the inventories at the in- 
itial assessment (pretreatment), posttreatment, and 
follow-up periods. 

Rating scales. The two rating scales used were 
the Impulsive Classroom Behavior Scale (ICBS; 
Weinreich, 1975), which was completed by the teach- 
ers, and the Locus of Conflict (LOC) Scale (Armen- 
trout, 1971), which was completed by the unit per- 
sonnel as well as the teachers. 

The ICBS is a 9-item 5-point teacher rating scale 
that was developed by choosing the most frequently 
used descriptions for impulsive childhood behaviors 
taken from related text materials (e.g., attention 
span, work consistency). This scale was specifically 
developed for research involving the modification of 
impulsivity (Weinreich, 1975). 

“Locus of conflict” refers to the predominant mode 
of impulse modulation that is exhibited by an in- 
dividual. In internalization of conflict, the conflict 
is between impulses and their inhibition. Behaviors 
are rigidly controlled, and the individual experiences 
subjective discomfort. In contrast, externalization of 
conflict represents the conflict between the child’s 
actions and the reactions that they bring about in 
others. Impulsive emotionally disturbed children are 
more likely to be rated as exhibiting externalization 
of conflict (Montgomery & Finch, 1975). 

Each of the rating scales was completed by the 
respective rater at pretreatment, posttreatment, and 
follow-up periods. 


Training Materials 


There were six sets of training materials used in 
the present study, one for each of six therapy ses- 
sions. The six sets were (a) conceptual thinking, 
(b) attention to detail, (c) recognition of identities, 
(d) sequential recognition, (e) visual closure, and 
(f) visual-motor reproduction. 

Set 1. This was a series of 48 plates containing 
four pictures, three of which were conceptually simi- 
lar. The task was to “find the one that doesn't 
belong with the others.” 

Set 2. Four pictorial stimuli (in only a few cases 
there were nine) were presented. Two of the stimuli 
were identical, and the subjects were instructed to 
“find the pictures that match” (42 plates). 

Set 3. This series of 191 plates was directed 
toward the recognition of identities under conditions 
of conceptual similarity. Each plate of two pictures 
required the child to verbalize, “Are these the same 
or are they different?” 

Set 4. A series of figures (beads or other geo- 
metric figures) were presented in a sequential pat- 
tern. In each of the 68 sequences, the child was to 
choose from an array of alternatives “which one 
would come next.” 

Set 5. In this series of exercises, the child was 
presented an incomplete line drawing superimposed 
on a square configuration of evenly spaced dots. The 
child was given a pencil and was asked to “com- 
plete the drawing so that it looks symmetrical—the 
same on both sides. Just draw one line between any 
two dots.” There were 50 of such visual closure 
tasks. 

Set 6. This was a set of 56 visual-motor repro- 
duction tasks in which the subject was asked to re- 
produce one design on a configuration of dots on a 
similar blank configuration of dots. 


Period 1: Initial Assessment 


All children entering the Virginia Treatment Center 
for Children were screened within a 3-day period, 
10 days after their arrival at the Center. Almost 
without exception, children were taken from their 
respective classrooms between the hours of 9:00- 
11:30 a.m. or 1:00-2:30 p.m. Each child was initially 
given the opportunity to choose not to attend, but 
no one made such a choice. At times, however, 
scheduled sessions were delayed due to special events, 
misconduct, or illness. 

During the initial assessment each child was ad- 
ministered the MFF, the IS, and the ICCI in a 
random order. Subjects were merely informed that 
the examiner was collecting some information and 
needed all the children to do these tasks. Subjects 
were also informed that their scores did not go to 
their therapists or teachers but that they were in- 
cluded in a larger group of scores. Children received 
a small candy reward for their cooperation. 

Concomitant with the child’s assessment session, 
the respective classroom teacher completed the ate 
and the LOC scale, Similarly, a randomly selecte 
full-time unit person rated the child using the LO k 
scale. In order for unit ratings to be repreni 
of unit behavior, each unit staff had to consult the 
records of the child’s behavior on the unit for i 
period of time in question, Neither teachers nor E 
persons were informed as to the assignment ob et 
jects to groups, thus guaranteeing blind ratings. 

Intervention


One male therapist was the individual trainer for 
both the treatment and control groups. All children 
received six 20-minute sessions during which they 
were exposed to the training materials and after 
which they, in some fashion, received a reward for 
their participation. Except for the instructions re- 
lating to the cognitive-behavioral treatment, subjects 
in both groups were given identical task instructions 
and performance feedback. No criterion number of
tasks was set; rather, all subjects worked for 20
minutes. Thus, disregarding the cognitive-behavioral
treatment, similar conditions were achieved for both 
subject groups. 

Treatment group. Although this group received 
conditions similar to controls, these subjects also re- 
ceived specific additional training in verbal self- 
instructions and a response-cost procedure contingent 
on their errors during training. 

1. Verbal self-instructions. In relation to each set 
of training materials, the verbal self-instructions 
were practiced in the following sequence: first, the 
therapist performed the task out loud, verbalizing 
out loud about possible answers and relevant aspects 
of the stimuli; second, the subject performed the 
task items with the instruction to talk out loud as 
the trainer had (the trainer provided guidance for 
overt self-instructions); next, the trainer performed
the task while talking to himself in a whisper; then, 
the subject was told to “try the next item, and 
this time just whisper to yourself”; last, the thera- 
pist performed an item while modeling covert self- 
instructions, which, in turn, was followed by the 
child performing similarly. 

The following sample of a self-instructional pro-
cedure was that used with the visual association tasks
(Task Set 1).


Let’s see now. What am I supposed to do? I’m
supposed to find the one that doesn’t belong with
these others. I see four pictures here, so I better
look at each one carefully. Okay, the first one is
a clock, so is the second one. This one is a grand-
father clock, but this one is a cup and saucer.
So, I’ve got three clocks and one cup and saucer.
It’s the cup and saucer that doesn’t belong.


In a later item done by the trainer, a planned
error was made and corrected as follows: “Here we
have all four animals. They’re all animals . . . wait
. . . this one isn’t a dog, it’s a lion. There, now I
can correct myself before I make an error. The lion
is the one that doesn’t belong.” Verbal self-instruc-
tions were performed during each of the six therapy
sessions.

2. Response cost. At the beginning of each train-
ing session, the subject was given 10 chips and was
told that the chips were his/her to keep but that
he/she could lose 1 for making a mistake. An ex-
ample was given, and it was checked to see if
each subject understood the contingency. A reward
menu hung over the work area, and on it were sev-
eral categories of candy and gum purchasable with
9, 7, or 5 of the chips. Thus, it was explained to
each child that he/she could use the chips to pur-
chase a reward at the end of the session, and the
more chips, the bigger and better the choice of re-
ward. After each error on the training material, 1
chip was taken from the subject, and the inaccuracy
was stated as the reason for the loss of a chip.
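
The chip contingency described above can be illustrated with a short sketch. It is only a bookkeeping illustration, not part of the original procedure: the function names are hypothetical, and the floor at zero chips is an assumption, since the text does not say what happened after a tenth error.

```python
# Illustrative bookkeeping for one response-cost session (hypothetical names).
START_CHIPS = 10          # chips given at the beginning of each session
REWARD_COSTS = (9, 7, 5)  # chip prices listed on the reward menu


def chips_remaining(errors: int) -> int:
    """Chips left after losing 1 chip per error (assumed never below zero)."""
    return max(START_CHIPS - errors, 0)


def affordable_rewards(errors: int) -> list[int]:
    """Menu prices the child could still afford at the end of the session."""
    chips = chips_remaining(errors)
    return [cost for cost in REWARD_COSTS if cost <= chips]
```

Under these assumptions, a child who makes two errors keeps eight chips and can choose from the 7-chip and 5-chip categories, whereas a child with more than five errors cannot afford any category on the menu.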

Control group. Although this group received simi-
lar conditions as did the treatment group, they did
not receive verbal self-instructions nor was response 
cost contingent on errors. They were given their 
choice of rewards noncontingently (for their coopera- 
tion) at the end of each session. 


Period 2: Posttreatment Evaluation 


The posttreatment evaluation was conducted 4 
weeks after the initial assessment. Subjects were 
told that all the children were taking the tests more 
than once. This was done to reduce the potential of 
a child perceiving the second test as due to failure 
on the test the first time. As in the initial assessment, 
subjects were taken from their respective classrooms 
to perform these tasks, and ratings were obtained 
from teachers and unit persons. 


Period 3: Follow-up 


Follow-up data were collected 3 months after the 
initial assessment (2 months after the posttreatment 
evaluation). Again, each child was individually ad- 
ministered the tasks and was told that all children 
were taking the tests several times. Ratings were 
again obtained from teachers and unit persons. 


Results 
Reliabilities 


Patient performance. The reliability of the
MFF performance was calculated using scores
for all subjects who were initially assessed
except those in the treatment group and those
who were discharged during the course of the
study (n = 30).³ The test-retest correlations
for latency and errors at the different admin-
istrations were .72 and .78, .79 and .69, and
.82 and .62 (first and second administration,
second and third, and first and third, respec-
tively). All correlations were significant (p <
.001). Also, MFF latency and error measures
correlated [first value illegible], -.75, and -.73,
respectively, for the three periods.

³ This group of subjects included children who were
assessed at all periods and for whom data were com-
plete, but it did not include those subjects in the
treatment group. The control group was included.

Figure 1. Mean latency in seconds for the treatment
and control groups at the initial assessment, post-
treatment, and follow-up periods.


Self-report scales. For the IS, test-retest
reliabilities were .56 (p < .005), .70 (p <
.001), and .33 (.05 < p < .10; first and sec-
ond administrations, second and third, and
first and third, respectively). In similar re-
spective order, reliabilities for the ICCI were
.56, .63, and .52 (all ps < .005).

Rating scales. The interrater reliability of 
each of the rating scales was determined using 
Pearson product-moment correlation coeffi- 
cients. Reliabilities were assessed for teachers 
and unit personnel by assigning randomly 
selected children to each of a pair of raters. 
Each pair of raters was assigned five children, 
and there were six rater pairs. 
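
As a minimal sketch of the interrater computation just described, the Pearson product-moment coefficient for one rater pair can be written as below. The function name is hypothetical, and any ratings passed to it in an example would be invented, since the individual ratings are not reported here.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two score lists of equal length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations about the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one rater gives constant scores")
    return sxy / sqrt(sxx * syy)
```

With only five children per rater pair, a coefficient computed this way is sensitive to a single discrepant rating, which is worth keeping in mind when reading the reliabilities that follow.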

The reliability of the ICBS was .85 (p <
.01) for the sample of students rated by the
teacher pairs. The reliabilities of the three
measures from the LOC scale (internalization,
externalization, and total maladjustment)
were assessed for teacher and unit personnel
pairs both combined and separately. The com-
bined LOC scale reliabilities were .62, .93, and
.91 (ps < .005) for the three LOC scale
measures, respectively. The teacher LOC
scale reliabilities were .89, .96, and .95, and
the unit personnel LOC scale reliabilities were
.42, .82, and .84, in the same order (n = 5).
Thus, the reliabilities of the rating scales were
favorable.⁴


Group Comparisons 


To evaluate changes in the dependent mea-
sure for subjects in the treatment and con-
trol groups across the three periods, separate
2 (between-subjects) × 3 (within-subjects)
analyses of variance were conducted.⁵
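
The degrees of freedom reported in the analyses that follow are consistent with the standard partition for a two-group, three-period mixed design. The sketch below reproduces that arithmetic; the total of 20 subjects is an inference from the reported error term of 18, not a figure stated in this passage, and the function name is hypothetical.

```python
def mixed_anova_df(n_subjects: int, n_groups: int, n_periods: int) -> dict:
    """Degrees of freedom for a groups (between) x periods (within) ANOVA."""
    groups = n_groups - 1                  # between-subjects effect
    error_between = n_subjects - n_groups  # subjects within groups
    periods = n_periods - 1                # within-subjects effect
    interaction = groups * periods         # Groups x Periods
    error_within = error_between * periods # Periods x subjects within groups
    return {
        "groups": (groups, error_between),
        "periods": (periods, error_within),
        "interaction": (interaction, error_within),
    }


# Assumed sample: 20 children in 2 groups measured at 3 periods.
print(mixed_anova_df(20, 2, 3))
```

For the assumed sample this gives F(1, 18) for groups and F(2, 36) for both periods and the interaction, matching the form of the F ratios reported in the Results.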

Patient performance. Separate 2 × 3 anal-
yses of variance were conducted for latency
and error measures. The latency analysis re-
sulted in a significant groups effect, a sig-
nificant periods effect, and a significant
Groups × Periods interaction, F(1, 18) =
7.02, p < .01, F(2, 36) = 9.17, p < .001,
F(2, 36) = 11.87, p < .001, respectively. These
results are presented in Figure 1. An analysis
of simple effects (via independent t tests)⁶
indicated that although the treatment and
control groups’ latencies did not differ at the
initial assessment period, t(18) = .96, the
groups differed significantly at both posttreat-
ment and follow-up, t(18) = 3.62, p < .001,
t(18) = 2.58, p < .02, respectively. In addi-
tion, a related t test indicated that for the
treatment group, the change in latency from
initial assessment to posttreatment was signifi-
cant, t(9) = 3.69, p < .01. All other t tests
were nonsignificant.
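
For readers who want to see how a simple-effects comparison of this kind is formed, the following is a minimal sketch of an independent-groups t statistic with a pooled variance estimate. It assumes equal-variance pooling, which the text does not specify, and the function name is hypothetical; the raw latencies are not reported, so no study values are reproduced.

```python
from math import sqrt
from statistics import mean, variance


def independent_t(a, b):
    """Pooled-variance t statistic and its degrees of freedom for two groups."""
    n1, n2 = len(a), len(b)
    df = n1 + n2 - 2
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / df
    standard_error = sqrt(pooled * (1 / n1 + 1 / n2))
    return (mean(a) - mean(b)) / standard_error, df
```

With two groups of 10 children each, the degrees of freedom come to 18, the value attached to the independent t tests reported above.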

The analysis of variance of the error data
resulted in a significant groups effect, a sig-
nificant periods effect, and a significant
Groups × Periods interaction, F(1, 18) =
13.67, p < .002, F(2, 36) = 17.30, p < .001,
F(2, 36) = 6.45, p < .005, respectively.
These results are presented in Figure 2.

An analysis of simple effects (via indepen-
dent t tests) indicated that although the
treatment and control groups’ error scores did
not differ at the initial assessment period,
t(18) = .37, the groups differed significantly
at posttreatment, t(18) = 2.92, p < .01, and
the groups differed significantly at follow-up,
t(18) = 1.76, p < .05. The decrease in the
error rate for the treatment group from the


⁴ The intercorrelations of all dependent measures,
in tabular form, are available from the first author.

⁵ A table of all means and standard deviations of
all the dependent measures for the treatment and
control groups is available from the first author.

⁶ In the cases in which specific a priori predic-
tions were made, one-tailed t tests were used. As a
point of information, however, all ts (except for one)
that were either significant or not significant for
one-tailed tests would have been similarly signifi-
cant or not significant if two-tailed tests had been
used.
initial assessment to posttreatment was sig-
nificant, t(9) = 2.52, p < .05, but the de-
crease from posttreatment to follow-up for the
control group was not significant, t(9) = 1.17.
All other t tests were nonsignificant.

Self-report scales. The results of the 2 ×
3 analysis of variance for the IS resulted in
nonsignificant main effects for groups and for
periods (Fs < 1) and a nonsignificant Groups
× Periods interaction, F(2, 36) = 2.37, p >
.10. Similarly, the ICCI analysis resulted in
nonsignificant main effects and a nonsignifi-
cant interaction, F(1, 18) < 1, F(2, 36) =
1.92, p > .10, F(2, 36) < 1, respectively.

Rating scales. The results of the 2 × 3
analysis of the ICBS indicated nonsignificant
main effects for groups and for periods (Fs <
1), but the Groups × Periods interaction was
significant, F(2, 36) = 12.19, p < .001. The
nature of this interaction is presented in Fig-
ure 3. An analysis of simple effects via in-
dependent t tests indicated that the treatment
and control groups’ ICBS ratings differed at
the initial assessment, t(18) = 4.5, p < .001,
with the treatment group rated as more im-
pulsive in the classroom. Although the groups
only approached being significantly different
at the posttreatment period, t(18) = 1.74, p
< .10 (treatment group was rated as less im-
pulsive), the groups differed significantly at
follow-up, t(18) = 2.85, p < .02, with the
treated subjects rated as less impulsive in the
classroom. The change in ICBS ratings for the
treatment group from initial assessment to
posttreatment was a significant reduction, re-
lated t(9) = 3.70, p < .01. In addition, re-
lated t tests for the change in ICBS ratings
of the control subjects indicated that the in-
crease in impulsivity from initial assessment
to posttreatment was significant, t(9) = 2.61,
p < .05. The increase from posttreatment to
follow-up was not significant, t(9) = 1.43.

Three 2 × 3 analyses of the LOC scale
data were conducted for each of the raters
(unit personnel and teachers) and for each
of the three LOC scale measures (internaliza-
tion, externalization, and total maladjust-
ment). The results of the unit ratings of
internalization were nonsignificant (all Fs <
1). Unit ratings of externalization of conflict
resulted in a nonsignificant groups effect and

Figure 2. Mean errors for the treatment and control
groups at the initial assessment, posttreatment, and
follow-up periods.


a nonsignificant periods effect, F(1, 18) < 1,
F(2, 36) = 3.13, .05 < p < .10, respectively.
The Groups × Periods interaction was sig-
nificant, F(2, 36) = 4.22, p < .02. Analyses
of the simple effects via related t tests for the
treatment and control groups separately re-
sulted in nonsignificant changes across periods
for the treatment group, ts(9) < 1, and a sig-
nificant increase for the control group from
the initial assessment to posttreatment, t(9)
= 2.88, p < .02, but the posttreatment de-
crease to follow-up was not significant, t(9)
= 1.41. Independent t tests indicated that
although the groups differed significantly on
the units’ initial ratings of externalization,
t(18) = 2.62, p < .02, they did not differ at

Figure 3. Mean Impulsive Classroom Behavior Scale
(ICBS) scores for the treatment and control groups
at the initial assessment, posttreatment, and follow-
up periods.

the posttreatment period, t(18) < 1, or at
the follow-up period, t(18) = 1.25. Thus, the
significance of the above analysis is attributa-
ble to a significantly lower externalization
rating for controls at the initial assessment
period; no other means differed significantly.

The total maladjustment ratings made by
the unit personnel were analyzed using a 2 ×
3 analysis of variance, which resulted in non-
significant main effects for groups and for
periods, F(1, 18) = 1.10, F(2, 36) = 1.09
(in that order), and a nonsignificant Groups
× Periods interaction (F < 1).

The teacher ratings of locus of conflict were
similarly analyzed. The results of the anal-
ysis of the internalization data were nonsig-
nificant main effects for groups and periods
and a nonsignificant interaction, F(1, 18) <
1, F(2, 36) = 2.43, F(2, 36) = 1.14, all ps
> .10 (in that order). The teacher externali-
zation of conflict ratings produced a nonsig-
nificant groups effect, F(1, 18) = 1.16, p >
.10, a significant periods effect, F(2, 36) =
3.63, p < .05, and a nonsignificant Groups
× Periods interaction, F(2, 36) = 1.18, p >
.10. Similarly, the teacher total maladjust-
ment ratings produced a nonsignificant group
effect and a nonsignificant interaction (Fs <
1), whereas the periods main effect was sig-
nificant, F(2, 36) = 4.49, p < .02.

One additional question remained unan- 
swered by the above analyses. The question 
was, Does the actual number of response-cost 
occurrences for subjects in the treatment 
group correlate with improvement? To ex- 
amine this, the change from the initial assess- 
ment to the posttreatment period (improve- 
ment) for both MFF latencies and errors 
and for ICBS scores was correlated with the 
respective number of response-cost conse- 
quences occurring during treatment. The cor- 
relation between response-cost occurrences 
and improvement in MFF latency was not 
significant (r = -.36) nor was the correlation
with MFF errors (r = .36). The correlation
of response-cost occurrences and improvement
on the ICBS was .58. This relationship (p <
.06) suggests that the more response-cost oc-
currences, the greater the improvement in the
classroom.
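
A correlation of this size can be related to a t statistic with the usual conversion t = r * sqrt(n - 2) / sqrt(1 - r^2). The sketch below applies it to the reported r of .58; the sample size of 10 is an assumption inferred from the t(9) values reported for the treatment group, not a figure given in this paragraph, and the function name is hypothetical.

```python
from math import sqrt


def r_to_t(r: float, n: int):
    """Convert a Pearson r based on n pairs to a t statistic with n - 2 df."""
    df = n - 2
    return r * sqrt(df) / sqrt(1 - r * r), df


# Reported ICBS correlation, with an assumed n of 10 treated children.
t_value, df = r_to_t(0.58, 10)
print(round(t_value, 2), df)
```

Under that assumption the statistic is roughly 2.0 on 8 degrees of freedom, which fits the text's description of the relationship as suggestive rather than conventionally significant.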


Discussion


The reliabilities of the latency and error
measures of the MFF provided in the present
study were in the moderate to high category,
and they indicate, in contrast to Finch, Dear-
dorff, and Montgomery (1974), that the mea-
sures are reliable over a 3-month period.
Correspondingly, the reliability of the MFF
performances suggests that the reflection-im-
pulsivity dimension is relatively stable. The
teacher ratings of impulsive classroom behav-
ior (ICBS) were highly reliable. In addition,
some validational information was provided
in that the ICBS was found to be a sensitive
measure of the effects of treatment. Along
with the brevity and simplicity of format of
the ICBS, the present study provided sup-
portive reliability and validity data that
should also be considered in selecting a mea-
sure of impulsive classroom behavior.⁷

The major results of the present study sub- 
stantiate the effectiveness of a cognitive-be- 
havioral program for the modification of 
impulsivity in emotionally disturbed children 
(Kendall, 1976). In addition to the alteration 
of both latency and error measures of cog- 
nitive tempo, the present study found a favor- 
able generalization of treatment to the class- 
room. There, as evidenced by teacher ratings, 
the treatment subjects displayed significantly 
less impulsive classroom behavior. Although 
teacher ratings of locus of conflict did not re-
flect these differences, the present results
support those of Kendall and Finch (1976),
and, taken together, they are considered to
support the utility of the cognitive-behavioral
treatment for modifying impulsivity and at- 
taining generalization to the classroom. 

It should be noted that the desired gen-
eralization of the treatment effects to the
classroom is in contrast with the increase in
impulsivity of the control group. Although
all subjects received the “constant” treatment
milieu of the Virginia Treatment Center for
Children, the increase in the control subjects’
impulsiveness may be due to the theoretical
model of the treatment center as a whole, in
which the expression of feelings is emphasized.
Even though this may or may not be the
treatment model of choice for overly inhibited
children, it would not appear to be the desired
model for impulsive children. Moreover, these
findings illustrate the efficacy of examining
differential treatment effectiveness for distinct
groups of children (Kendall & Finch, in press).

⁷ A copy of the ICBS can be obtained from the
first author.

Some of the findings of the present study 
have general implications. The most outstand- 
ing of these is the fact that although neither 
of the self-report indicants of impulsivity 
changed due to treatment, two measures of 
MFF performance and ratings of classroom 
behavior reflected a desirable treatment effect. 
The implication here is that behavior change
may occur without first altering self-percep-
tions. An equally compelling implication is
that self-report scales may be of limited util-
ity in treatment research with children (rela-
tively insensitive to change).

In theorizing about the effectiveness of the 
cognitive-behavioral treatment, one must not 
ignore the training materials. Indeed, in the 
present study in which generalization to the 
classroom was attained, the tasks were of the 
psychoeducational variety. On the other hand, 
if the six treatment sessions consisted of cog- 
nitive training and response-cost dealing with 
interpersonal situations, attaining generaliza- 
tion to life situations may have been more 
likely. Thus, although it is not impossible to 
conclude that the cognitive-behavioral treat- 
ment did not generalize to the units, it is 
likely that the training tasks are relevant in 
regard to the type of generalization that will 
be obtained (see also Kendall, 1977). 

Although the present study has demon-
strated that the cognitive-behavioral treat-
ment effectively modifies impulsivity, the
treatment package contains many components:
modeling, self-instructions, and response
cost. Future research should examine the spe-
cific components of the treatment package in
an attempt to isolate the one active compo-
nent or, more likely, to uncover the relative
efficacy of the components.


This investigation was designed to determine the extent to which contextual cues
mediated the effectiveness of systematic desensitization and a plausible placebo
in alleviating public speaking anxiety. After participating in a public speaking
situation that allowed the collection of self-report, physiological, and behavioral
manifestations of anxiety, 67 subjects were randomly assigned to receive five
sessions of either desensitization, “T scope” therapy, or no treatment. Each of
these conditions was conducted in a context that either stressed the clinical
relevance of the procedure or presented the procedure as a laboratory investi-
gation of fear without therapeutic implications. Analysis of changes both be-
tween groups and within individuals indicated that desensitization reduced
public speaking anxiety in both contexts, whereas the placebo was effective only
in the therapeutic setting. The superiority of desensitization was most pro-
nounced on the physiological variables. The results are interpreted as indicating
support for a counterconditioning, rather than an expectancy, interpretation of
desensitization.


Early documentations of the effectiveness
of systematic desensitization (e.g., Lang,
Lazovik, & Reynolds, 1965; Paul, 1966)
were followed by numerous investigations
that have sought to isolate the active thera-
peutic ingredients of this procedure. One
critical area that has received much recent
attention is the influence of cognitive factors
on the outcomes engendered by desensitiza-
tion. Three general research strategies have
been used in investigating this complex
theoretical issue.

The first involves comparing desensitiza-
tion against various placebo manipulations.
The outcomes of this strategy have been in-
consistent. Paul (1966) and Davison (1968)
demonstrated the superiority of desensitiza-
tion with evidence that supported a counter-
conditioning interpretation. Other investiga-
tors, however, have failed to differentiate
the effects of desensitization and placebos in
alleviating test anxiety (Allen, 1971) or
small-animal phobias (McReynolds, Barnes,
Brooks, & Rehagen, 1973; Tori & Worell,
1973). This inconsistency may be partly due
to differences in the perceived credibility of
the treatment and placebo procedures that
have been used (Borkovec & Nau, 1972).

A variant of this first strategy involves 
providing instructional sets to augment the 
efficacy of desensitization within a thera- 
peutic context. In general, the results pro- 
duced within this framework suggest that 
such manipulations yield very modest effects 
(McGlynn & Mapp, 1970; Woy & Efran, 
1972).

The third and potentially most useful line
of investigation involves presenting desensi-
tization as either a therapeutically relevant
treatment or a nonclinical laboratory proce-
dure by means of instructional and contextual
cues. Utilizing this paradigm, several studies
(Miller, 1972; Rosen, 1974) have indicated
that desensitization is more effective when
presented as a clinically useful technique. In
addition, Tori and Worell (1973) found no
differences between desensitization conducted
in a laboratory context and a plausible pla-
cebo procedure.

Several systematic influences run through 
these three studies that limit their adequacy 
in providing a crucial test of the competing 
theoretical models of desensitization. First, 
both Miller and Rosen reported that a num- 
ber of subjects in their laboratory conditions 
evidenced awareness of the therapeutic qual- 
ities of desensitization. More importantly, all 
three investigations examined small-animal 
phobias, which lessens their relevance for two 
reasons. Bernstein (1973) has clearly demon- 
strated the influence of contextual cues in 
both the selection of animal phobias and 
their response patterns on self-report and be- 
havioral avoidance measures. Borkovec (in 
press) provides impressive evidence indicat- 
ing that exposure to fearful animal stimuli 
does not evoke extreme physiological activa- 
tion. These researchers advocate the use of 
various interpersonal anxieties in analogue 
investigations of desensitization. 

The present study examines the efficacy of
desensitization and a highly credible placebo
procedure administered within separate clin-
ical and laboratory contexts in alleviating
public speaking anxiety. The first and third
research strategies were combined to clarify
the extent to which contextual cues influence
the specific and nonspecific effects of these
procedures and to provide a test of the two
dominant models of desensitization. Speci-
fically, demonstrating that desensitization is
superior to a placebo in reducing self-report,
behavioral, and especially physiological in-
dices of anxiety across contexts would sup-
port a counterconditioning explanation,
whereas interactions between treatments and
settings would argue for a cognitive-expect-
ancy interpretation.


Method 
Subjects 


Of slightly over 1,500 students in an introductory
psychology course who were offered the opportunity
to participate in a program designed to improve
public speaking, 116 expressed interest and 75 met
several selection criteria including acknowledging a
high level of fear of public speaking on Geer’s
(1965) Fear Survey Schedule, attending a pretreat-
ment screening session, and providing a monetary
deposit as a guarantee that treatment sessions would
be attended. Complete data were collected from a
total of 67 subjects, 35 of whom participated in
the clinic phase of the experiment. The remaining
32 subjects were told that they could not be
accommodated in the therapy program because it
was filled. Approximately 5 weeks later, after the
therapy phase had been completed, these subjects
were asked to take part in an ostensibly independent
laboratory study of physiological concomitants of
anxiety.

The deposits of these latter participants were
returned when they were refused treatment, and
they were later paid for their participation as re-
search subjects. The participants averaged 18.6 years
of age and were predominantly freshmen women.


General Experimental Design 


Table 1 provides an overview of the general ex-
perimental design. The experiment was conducted in
two separate contexts in two independent phases
with different subjects studied in each. During
the first phase, subjects assigned to the clinical set-
ting were screened and received five 50-minute ses-
sions of either desensitization or a placebo therapy
administered by one of two advanced graduate stu-
dents. Immediately after completion of this phase,
the graduate students carried out identical treat-
ment procedures in a research context with those
subjects assigned to the laboratory setting. Assess-
ment sessions were conducted before and after each
phase, and two no-treatment groups were included.
These control subjects were informed that they
could not be accommodated in the program for
which they were eligible, but they were asked to
participate in the posttreatment assessment sessions.
A debriefing session for all subjects was held fol-
lowing completion of the laboratory phase. Those
subjects who received no treatment or some other
ineffective treatment were then offered therapy.
Thus, the basic design involved six independent
groups in a complete factorial crossing of three
treatment conditions in two contexts with repeated
measurement before and after intervention. The
therapeutic procedures were crossed with two ther-
apists.


Process and Outcome Measures


The screening and posttreatment evaluation ses-
sions closely followed Paul’s (1966) procedures.
Before these assessments, a measure of trait anxiety
(Spielberger, Gorsuch, & Lushene, 1970) was ob-
tained. After the program had been explained, par-
ticipants were required to give a 2.5-min speech on
an assigned topic. Immediately before the speech,
each subject completed the Anxiety Differential
(Husek & Alexander, 1963) and a questionnaire
form, devised by the present authors, which pro-
vided indications of state anxiety and somatic/
emotional turmoil, respectively. Two physiological 
measures, a 15-sec pulse rate and a palmar sweat 
print, were also collected at this time. During the 
speech, two trained observers noted behavioral 
manifestations of anxiety on Paul’s (1966) Timed 
Behavioral Checklist. Following the speech, which 
was given before an audience of 7-10, each subject 
completed the Personal Report of Confidence as a 
Speaker, as modified by Paul (1966). 

Following each of the treatment sessions, subjects
rated their perceptions of the interaction along
seven 7-point semantic differential scales (e.g., tense-
relaxed, fast-slow). Ratings of relaxation during
image visualization, anxiety evoked by images, and
clarity of images presented during the second
through fifth sessions were made along 6-point di-
mensions. A termination questionnaire was also
administered to assess subject involvement in the
program as well as perceptions of the therapist/
experimenter on nine bipolar adjectives.


Experimental Groups 


Subjects were randomly assigned to three condi- 
tions, each of which was replicated in two con- 
texts. All treated participants were attached to a 
polygraph that provided prerecorded printouts, 
which were shown at the end of each session. Sub- 
jects in the clinical context were told that the print- 
outs indicated that they were responding well to 
treatment, whereas in the laboratory context, this 
feedback was interpreted as indicating support for 
the research hypotheses. Thus, even though par- 
ticipants were led to believe that they were either 
“good” clients or good subjects, positive expecta- 
tions of the therapeutic efficacy of the procedures 
were induced only in the clinical setting. 
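
The crossing of three conditions with two contexts can be sketched as follows. This is only an illustration of a 3 × 2 layout with balanced random allocation inside one context: the function names and the seeded shuffle are hypothetical, because the actual randomization procedure is not described in the text.

```python
import random
from itertools import product

TREATMENTS = ("desensitization", "placebo", "no treatment")
CONTEXTS = ("clinical", "laboratory")


def design_cells():
    """All treatment-by-context cells of the factorial design."""
    return list(product(TREATMENTS, CONTEXTS))


def assign_within_context(subject_ids, seed=0):
    """Shuffle subjects, then deal them round-robin into the three conditions."""
    rng = random.Random(seed)  # seeded only so the illustration is repeatable
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)
    return {sid: TREATMENTS[i % len(TREATMENTS)] for i, sid in enumerate(shuffled)}
```

Calling `design_cells()` yields the six independent groups mentioned in the design, and dealing the 35 clinic-phase subjects this way would give cell sizes that differ by at most one.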

Desensitization/clinical context. Subjects assigned
to this condition came to a departmental therapy
clinic and were informed that they would be receiv-
ing [. . .] form of therapy that [. . .]
presented, and his procedures were replicated with
regard to training relaxation, [. . .] hier-
archy, and desensitization proper.

Desensitization/laboratory context. Treatment was
conducted in a laboratory setting. Following Leiten-
berg, Agras, Barlow, and Oliveau (1969), subjects
were told that the experimenter was interested in
studying the visualization of anxiety-producing im-
ages and concomitant physiological reactions. Re-
laxation was presented as a means of facilitating
visualization. The actual treatment procedures paral-
leled those used in the clinical setting.

Placebo/clinical context. This placebo manipula-
tion replicated the procedures of Tori and
Worell (1973). Subjects learned that they would be
presented with tachistoscopic scenes of individuals
gesticulating while speaking. The rationale explained
how viewing subliminal scenes while relaxed would

Table 1
General Experimental Design

Weeks of    Context
study       Therapy                    Research
---------------------------------------------------------------------------
4-5         Pretreatment assessment    Refused participation in therapy study
6-11        Treatment
12-13       Posttreatment assessment   Recontacted for laboratory research
14-15                                  Pretreatment assessment
16-21                                  Treatment
22-23                                  Posttreatment assessment

Note. Subjects were selected during Weeks 1-3.


facilitate the acquisition of more adaptive public 
speaking skills. Irrelevant images were actually used. 

Placebo/laboratory context. The same tachisto- 
scopic procedures were used with the rationale 
stressing the experimenter’s interest in the effects 
of subliminal exposure to stress-related images. It 
was intimated that such stimuli would have rela- 
tively minor impact. 

No treatment/clinical context. This group par- 
ticipated in the assessment procedures before and 
after the therapy phase. Subjects were informed 
during the pretreatment assessment that they could 
not be accommodated in the current program and 


were promised therapy at a later date. 

No treatment/laboratory context. After the
screening session, these subjects were scheduled to 
return for a second evaluation at the end of the 


research program. 
Results 


Comparability of the Experimental 
Conditions 


To ascertain whether any systematic pre- 
treatment differences between the groups 
existed, seven Treatment × Context × Therapist
analyses of variance were computed.
Since only 4 of 49 F tests were significant 
and since they did not fall into a consistent 
pattern, the essential equivalence of the 
groups was assumed. 


Reliability Considerations 


Table 2
Means of Self-report, Physiological, and Behavioral Anxiety Indices Collected Before and After Intervention

                                           Clinic                                      Laboratory
                          Desensitization   Placebo       Control       Desensitization   Placebo       Control
Measure                     Pre    Post    Pre    Post   Pre    Post      Pre    Post    Pre    Post   Pre    Post
Anxiety differential       75.5    62.6   72.5    60.4   66.3   64.9     69.6    62.9   72.6    70.2   67.1   65.4
Questionnaire form         29.8    21.2   32.5    21.0   25.9   23.8     24.7    18.9   25.0    22.6   24.3   21.2
Public report of
  confidence as a speaker  22.5    15.8     …       …    18.2   15.4     19.6    15.9   21.8    17.9   19.1   18.0
Trait anxiety              42.7    37.9   42.0    42.3   42.9   40.6     41.7    38.5   42.9    41.7   42.8   42.0
Pulse rate                 84.1    78.0   78.8    82.0   85.0   84.7     80.1    76.0   84.4    84.1   84.3   84.3
Palmar sweat test            …       …      …       …      …      …        …       …      …       …      …      …
Behavioral checklist       43.4    36.7   43.4    35.8   40.9   38.4     43.6    36.1   43.4    42.0   42.0   41.0

A pair of raters independently matched
each palmar sweat print against 15 specimens
to provide quantifiable data. Interobserver
reliabilities were .94 and .90 for judgments
made before and after treatment. The usable
score for each print was obtained by averaging
the two ratings. Each subject’s pulse
was converted into beats per minute by
multiplying by four. Three pairs of observers
were trained to rate anxiety on the Timed
Behavioral Checklist. Interrater reliabilities
of total scores were .79, .87, and .86 for the
observer dyads. The scores of both raters
were summed to provide an indication of the
behavioral manifestations of public speaking
anxiety.
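
The scoring arithmetic described in this section can be sketched as follows. This is an illustration only: the function names are hypothetical, and the 15-second counting window is an assumption implied by the multiply-by-four rule rather than something the text states.

```python
from statistics import mean


def palmar_sweat_score(rating_a, rating_b):
    """Usable print score: the average of the two raters' judgments."""
    return mean([rating_a, rating_b])


def pulse_bpm(beats_counted):
    """Convert a pulse count to beats per minute by multiplying by four.

    Assumes the count covered 15 seconds (an assumption, not stated in the text).
    """
    return beats_counted * 4


def checklist_score(rater_a_total, rater_b_total):
    """Behavioral checklist score: the two raters' totals summed."""
    return rater_a_total + rater_b_total
```

For example, a count of 21 beats would be recorded as 84 beats per minute under this rule.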


Overall Effectiveness of the Manipulations 


Means of the major dependent variables
before and after the experimental manipulations
are presented in Table 2. Treatment ×
Context × Time analyses of variance indicated
significant effects for repeated testing
on the Anxiety Differential, the questionnaire
form, the Personal Report of Confidence as a
Speaker, and the Timed Behavioral Checklist.
Treatment × Time interactions were significant
for state anxiety, F(2, 61) = 3.55, p < .05,
and the questionnaire form, F(2, 61) = 3.18,
p < .05, and showed tendencies on the Personal
Report of Confidence as a Speaker, F(2, 61) = 2.87,
p < .07, and pulse rate, F(2, 61) = 2.67,
p < .08. These interactions were
largely due to greater improvement in the
desensitized subjects. Reliable Context ×
Time interactions on the Anxiety Differential
and the questionnaire form indicated that the
clinic context was associated with greater
anxiety reduction.

The reliability of improvement within each
condition was assessed by correlated t tests,
which are presented in Table 3.

Desensitization was the only treatment that
produced significant improvement on self-reported,
behaviorally observable, and, most importantly,
physiologically measured anxiety.
The placebo led to improved self-reported and
behavioral functioning when conducted in a
clinical context, but it was no more effective
than the passage of time when administered
under the guise of research. The no-treatment
control groups generally showed little change.


Analysis of Individual Changes 


To investigate the effects of the manipulations
on the performance of individual subjects,
residual change scores were computed
by subtracting actual posttreatment scores
from the predicted score determined by
regression analysis. This procedure is analogous
to an analysis of covariance in that it equalizes
existing between-group differences and
removes changes due to statistical bias.
Reliable improvement or deterioration for
each subject was defined as having a residual
change score either greater or less than 1.96
times the standard error of measurement for
that variable (the 95% confidence interval).
Table 4 presents the percentages of subjects 
in each treatment group who manifested sig- 
nificant improvement or deterioration on each 
variable as a result of participating in the 
experiment. 
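
The residual-change procedure described above can be sketched as follows. This is a minimal illustration, not the authors' computation. It assumes a simple least-squares regression of posttreatment on pretreatment scores, that lower scores mean less anxiety, and that the standard error of measurement (`sem`) is supplied by the analyst. All names and data values are hypothetical.

```python
from statistics import mean


def fit_line(pre, post):
    """Least-squares slope and intercept for predicting post from pre scores."""
    mean_pre, mean_post = mean(pre), mean(post)
    sxx = sum((x - mean_pre) ** 2 for x in pre)
    sxy = sum((x - mean_pre) * (y - mean_post) for x, y in zip(pre, post))
    slope = sxy / sxx
    return slope, mean_post - slope * mean_pre


def residual_change_scores(pre, post):
    """Predicted posttreatment score minus actual posttreatment score."""
    slope, intercept = fit_line(pre, post)
    return [(intercept + slope * x) - y for x, y in zip(pre, post)]


def classify_change(residual, sem, z=1.96):
    """Label a residual relative to +/- z * SEM (lower score = less anxiety)."""
    cutoff = z * sem
    if residual > cutoff:
        return "improved"
    if residual < -cutoff:
        return "deteriorated"
    return "no reliable change"


# Hypothetical pre/post scores for five subjects on one anxiety measure.
pre = [70, 72, 68, 75, 66]
post = [60, 71, 67, 62, 65]
residuals = residual_change_scores(pre, post)
labels = [classify_change(r, sem=2.0) for r in residuals]
```

Because the residuals come from a least-squares fit, they sum to zero across the sample, so the classification reflects change relative to what pretreatment standing alone would predict.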

These data generally support the conclu- 
sions drawn from prior analyses. Subjects re- 
ceiving desensitization in both contexts or the 
placebo in the clinical setting showed the 
highest rate of improvement on the self-report 
measures, with higher rates of deterioration 
being noted for the remaining groups. The 
extreme variability generated by the clinical 
placebo procedure is noteworthy. Although 
the improvement figures parallel those pro- 
duced by desensitization, deterioration per- 
centages were of a higher magnitude. De- 
sensitization also produced the highest per- 
centages of improvement and the smallest 
amount of deterioration on the physiological 
and behavioral measures. 

Analysis of items measuring subjects’ per- 
ceptions of the treatment sessions were non- 
significant with the exception of one reliable 
difference relating to the level of activity between
the therapists. Responses to the termination
questionnaire revealed that subjects in
the desensitization groups believed their
visualizations to be significantly clearer, F(1, 33)
= 50.17, p < .001, and more anxiety evoking,
F(1, 33) = 12.50, p < .01, than participants
in the placebo conditions. In addition, subjects
in the clinical context viewed their therapists
as more active and organized and reported
greater expectations for therapeutic improvement,
F(1, 33) = 25.36, p < .001, than subjects
in the research phase. These therapist
differences did not affect performance on the
outcome measures, however.


Table 3
Within-Treatment Correlated t Values Between Pretreatment and Posttreatment Means for Anxiety Indices


Discussion 


Perhaps the most important conclusion to
be drawn from this study is the superiority of
desensitization in reducing self-reported and
behaviorally measured anxiety and in lowering
physiological arousal. The effects of this
treatment were robust in terms of their
insensitivity to contextual cues. Subjects who
were desensitized in a research setting manifested
therapeutically relevant gains, in spite
of the fact that they viewed the procedure as
having little utility in this regard. Setting
factors, on the other hand, exercised a powerful
effect on the outcomes produced by the
placebo manipulation, with this procedure be- 


                                                                    Public report
                             Anxiety        Questionnaire  Trait    of confidence  Pulse   Palmar  Behavioral
Treatment                    differential   form           anxiety  as a speaker   rate    sweat   checklist
Desensitization clinic       3.38**         …              3.41**   6.62***        2.10*   …       …
Placebo clinic               2.18           3.84**         −.17     …              …       …       …
Control clinic               .10            1.50           …        …              …       …       …
Desensitization laboratory   4.30**         3.03*          1.55     2.13           1.80    5.81    …
Placebo laboratory           1.23           1.09           .45      2.07           …       …       …
Control laboratory           .56            3.07*          .16      1.15           .04     …       …

* p < .05.
** p < .01.
*** p < .001.


124 JEFFREY M. SLUTSKY

Table 4
Percentages of Subjects in Each Condition Showing Reliable Improvement and Deterioration in Performance

                                      Clinic                              Laboratory
Measure                  Desensitization  Placebo  Control   Desensitization  Placebo  Control
Anxiety differential           55            60       29           30            20       33
                               18            40       57           10            70       42
Confidence as a speaker        55            40       36           30            30       16
                                9            40       21           40            60       42
Questionnaire form             64            70       29           60            30       33
                               18            20       71           20            60       42
Trait anxiety                  63            30       46           50            30       36
                                9            40       46           20            30       36
Pulse rate                     55            50       50           60            40       33
                               36            30       36           10            30       50
Palmar sweat                   55            30       22           80            40       41
                               36            70       50           10            40       33
Behavioral checklist           45            70       21           70            30       17
                               10            20       43           20            60       58

Note. The top figure of each entry and the bottom figure of each entry refer to improvement and deterioration, respectively.


ing as efficacious as desensitization in a thera- 
peutic context but not reliably better than no 
treatment in a research-related atmosphere. 

These findings seem useful in placing some 
of the previously reviewed discrepancies into 
a coherent perspective. It appears that many 
of the studies that reported differences be- 
tween desensitization and placebo treatments 
(e.g., Davison, 1968; Paul, 1966) used pseu- 
dotherapeutic manipulations of low credibility 
(Borkovec & Nau, 1972). Highly plausible 
placebos, such as the one developed by Tori 
and Worell (1973), seem to engender positive 
outcomes that match those produced by de- 
sensitization but only when administered in 
clinical contexts. Failure to find differences 
between these two types of treatment in the 
Tori and Worell study might well have been 
due to contextual differences whereby de- 
sensitization was presented only as a labora- 
tory procedure while the therapeutic relevance 
of the placebo was stressed.

Thus it appears that embedding placebo 
manipulations in a therapeutic context is a 
necessary but not a sufficient condition for in- 
suring their effectiveness. 

The theoretical import of these findings in- 


volves the support that they provide for 
Wolpe’s (1958) counterconditioning model of 
desensitization. The study provides further 
evidence that this therapeutic procedure has 
a potent effect on alleviating anxiety related 
to a clinically relevant problem and is rela- 
tively insensitive to cognitive influences. 
Modification of this conclusion awaits the 
resolution of many problems in the expectancy 
literature (Wilkins, 1973).


A New Measure of Psychological Androgyny Based on the
Personality Research Form


Juris I. Berzins, Martha A. Welling, and Robert E. Wetter 
University of Kentucky 


In line with recent reconceptualizations of psychological masculinity and fem- 
ininity as independent dimensions, the present article describes the development 
and validation of the PRF ANDRO scale whose Masculinity and Femininity 
subscales were drawn, on the basis of theoretical definitions, from the item pool 
of Jackson’s Personality Research Form. Using data obtained from over 2,000 
college students, the subscales of the PRF ANDRO scale are shown to be inde- 
pendent, reliable, minimally related to socially desirable responding, and sub- 
stantially related to the corresponding subscales of Bem’s Sex Role Inventory 
(correlations between .50 and .65) and to major personality dimensions. Further
evidence of construct validity is adduced from the score patterns in 18
different samples that include clinical and other noncollege groups. Since the 
PRF ANDRO scale can be scored from the answer sheets of the Personality 


Research Form, other investigators may reanalyze prior studies with particular 
regard to the proposition that high levels of Masculinity and Femininity, jointly 
denoting psychological androgyny, predict greater interpersonal competence and 
transsituational adaptability than do traditionally sex-typed role orientations. 


In recent years, the conceptualization and 
measurement of sex roles have undergone 
radical changes (Bem, 1974, 1976; Block, 
1973; Constantinople, 1973; Kaplan & Bean, 
1976; Pleck, 1975; Spence, Helmreich, & 
Stapp, 1975). The once self-evident assump- 
tion that high levels of sex typing promote 
psychological adjustment has come under in- 
creasingly sharp theoretical and empirical 
attack (e.g., Bem, 1976; Bem & Bem, 1970); 
behaviors thought to be “natural” correlates 
of biological sex differences, to be exemplified 
by “appropriately” masculine men and feminine
women, have lost their stereotypic potency.
Rather, members of the human liberation
movement argue that depending on the
situation, the behavioral repertoires of healthy 
adults of both sexes can encompass behaviors 
previously assigned to the sexes differentially. 
The concept of psychological androgyny—the 
integration of culturally masculine and femi- 
nine attributes into one’s self-definition and 
behavior—has become ascendant in recent 
perspectives on mental health (e.g., Bem, 
1976; Kaplan, 1976). 

Traditional measures of psychological mas- 
culinity-femininity, however, have been predi- 
cated on the assumptions that masculinity and 
femininity define the endpoints of a single 
bipolar dimension (the sexes as “opposites”) 
and that measures of this dimension are best 
constructed by aggregating various items that 
show large endorsement differences between 
the sexes or between persons of differing sexual
preferences (Constantinople, 1973). Adherence
to these assumptions has generated
measures that create virtually nonoverlapping
distributions of scores for men and women,
with the implication that the majority of persons
who score within the range typical for
their gender are “appropriately sex typed,” 
whereas the few persons who score outside 
that range presumably exemplify sex role con- 
fusion, deviance, ambivalence, or maladjust- 
ment. But it is clear that defining masculinity 
as “not femininity” and femininity as “not 
masculinity” precludes characterizing persons 
as both feminine and masculine (androgynous)
or, logically, as neither. To permit the
latter sex role categories to emerge, at least 
two independent dimensions, one for mascu- 
linity and one for femininity, are needed. If 
psychological masculinity and femininity are 
construed as independent dimensions of hu- 
man functioning (albeit dimensions whose 
components in many cultures have been re- 
garded as differentially desirable in men and 
women), then one could characterize at least 
some individuals as both masculine and femi- 
nine, instrumental and expressive (Parsons & 
Bales, 1955), assertive and yielding (Bem, 
1974), or agentic and communal (Bakan, 
1966; Block, 1973; Carlson, 1971). 

Three self-report measures that embody 
this dualistic conception of sex roles have al- 
ready been published. The Bem Sex-Role In- 
ventory (BSRI; Bem, 1974) contains sepa- 
rate Masculinity and Femininity subscales, 
each comprised of 20 adjectives (7-point re- 
sponse format) judged differentially desirable 
for men and women in our society. The BSRI 
has been used to select participants in experi- 
ments that have demonstrated the greater 
behavioral flexibility of androgynous persons, 
relative to sex-typed ones, across several situa- 
tions (see Bem, 1976). Spence et al. (1975) 
developed separate male-valued and female- 
valued subscales of a Personal Attributes 
Questionnaire, with each subscale comprised 
of 18 or more bipolar adjectives (5-point for- 
mat) representing attributes judged stereo- 
typically masculine or feminine but considered 
desirable for both sexes. Heilbrun (1976) ex- 
tracted Masculinity and Femininity subscales 
from an earlier bipolar composite index based 
on the Adjective Check List (Cosentino & 
Heilbrun, 1964); items had been selected 


originally by contrasting the endorsement proportions
of men identified with masculine
fathers and women identified with feminine
mothers. All three instruments use a quadripartite
classification of persons: high masculine/high
feminine (androgynous), high
masculine/low feminine (masculine typed), 
low masculine/high feminine (feminine 
typed), and low masculine/low feminine (un- 
differentiated or indeterminate). 
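
The fourfold classification just listed can be sketched as a small function. This is an illustration under stated assumptions: the cutoff values and example scores are hypothetical, and treating a score equal to the cutoff as "high" is a choice made here, not a rule taken from any of the three instruments.

```python
def classify_sex_role(masculinity, femininity, masc_cutoff, fem_cutoff):
    """Return the quadripartite category for one respondent.

    Scores at or above the supplied cutoff count as high (an assumption).
    """
    high_m = masculinity >= masc_cutoff
    high_f = femininity >= fem_cutoff
    if high_m and high_f:
        return "androgynous"
    if high_m:
        return "masculine typed"
    if high_f:
        return "feminine typed"
    return "indeterminate"


# Hypothetical cutoffs and (masculinity, femininity) scores, for illustration only.
examples = [(18, 19), (18, 12), (11, 19), (11, 12)]
categories = [
    classify_sex_role(m, f, masc_cutoff=15, fem_cutoff=17) for m, f in examples
]
```

Keeping the cutoffs as parameters reflects that they are sample-dependent: a median split yields different thresholds in different samples.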

The present article describes the develop- 
ment and psychometric credentials of a fourth 
instrument. It was inspired by and drew its 
theoretical rationale from Bem’s BSRI but 
was developed from the items of a widely 
used multiscale personality inventory, the 
Personality Research Form (PRF; Jackson, 
1967). Although various aspects of the valid- 
ity of the Masculinity and Femininity con- 
structs embodied in the subscales of this new 
instrument, the PRF ANDRO scale, have 
already been investigated in several contexts 
(Kelly & Worell, 1976; Berzins, Note 1; 
Welling, Note 2; Wetter, Note 3; Woods, 
Note 4), the Masculinity and Femininity sub- 
scales of the BSRI are regarded as the prin- 
cipal criteria for initial assessment of con- 
vergent validity in this article. 


Rationale for Item Selection 


The rationale outlined by Bem (1974) in 
constructing the BSRI included provisions 
for (a) separate Masculinity and Femininity 
scales, (b) items selected on the basis of sex-typed
desirability (e.g., in American society,
a masculine characteristic should be judged 
more desirable for a man than for a woman), 
and (c) items with generally positive con- 
tent. We chose to apply this rationale to the 
PRF item pool, since we had conducted ex- 
tensive research with this instrument and, 
were scale development successful, accumulated
data could be reanalyzed.

The PRF is a multitrait inventory based on 
Murray’s need theory. Form AA of the PRF 
(Jackson, 1967) contains 20 content scales 
(Abasement, Achievement, Affiliation, Aggression,
Autonomy, Change, Cognitive Structure,
Defendence, Dominance, Endurance, Exhibi- 
tion, Harmavoidance, Impulsivity, Nurtur- 
ance, Order, Play, Sentience, Social Recogni- 
tion, Succorance, Understanding) and two 
validity scales (Infrequency and Desirability).
Each scale contains 20 items, 10 keyed
true and 10 false. Reviews of the PRF attest 
to its unusually good reliability and validity 
(cf. Wiggins, 1973). 

To select potential Masculinity and Femi- 
ninity items from the PRF content scale item 
pool (400 items), the original PRF scale 
placement of the items was disregarded, but 
their content was evaluated for (a) consist- 
ency with Bem’s rationale and (b) consist- 
ency with rationally derived abstract defini- 
tions of the main content themes of Bem’s 
Masculinity and Femininity scales. In the 
latter regard, the BSRI Masculinity scale was 
held to depict a dominant-instrumental di- 
mension comprised of themes of social-intel- 
lectual ascendancy, autonomy, and orientation 
toward risk; the BSRI Femininity scale was 
judged to revolve around a nurturant-expressive
dimension, containing themes of nurturance,
affiliative-expressive concerns, and
self-subordination. PRF items consistent with
the masculine themes consequently were se- 
lected if they appeared more desirable in men 
than in women, whereas items consistent with 
the feminine themes were selected if more 
desirable in women than in men. To facilitate 
some degree of control over acquiescent re- 
sponding in the eventual scales, both true- 
and false-keyed items were considered. Sixty- 
four (32 masculine, 32 feminine) PRF items 
were provisionally selected in this manner by 
the first two authors. These items turned out
to have been drawn from 16 of the 20 PRF
content scales (no items from Change, Cognitive
Structure, Order, or Sentience), with
9 Masculinity items drawn from the Dominance […]

[…] analyses, the provisional scales were shortened
to 29 Masculinity […]
false) and 27 Femininity items (17 keyed
true, 10 false).


Sex-Typed Desirability Ratings 


To evaluate the consensuality of the au- 
thors’ judgments regarding the sex-typed de- 
sirability of the final 56 items, a separate 
study was conducted (Wicher, Note 5). When
scoring direction is adjusted appropriately
(false-keyed items reflected), Masculinity 
scale items should be judged more desirable 
“for a man” than “for a woman,” whereas 
Femininity scale items should be judged op- 
positely. Using Bem’s (1974) rating format 
(7-point scales ranging from “not at all de- 
sirable” to “extremely desirable”), 177 stu- 
dents in introductory psychology (61 men, 
116 women) rated all items; 30 men and 57 
women judged the items for a man; 31 men 
and 59 women judged them for a woman; no 
judge rated both target sexes. Each PRF item 
was presented to judges in its full form and 
was followed by the question, “In American 
society, how desirable is it for a MAN (alter- 
nately, WOMAN) to mark this item TRUE?” 
Judges were cautioned to make normative (de- 
scriptive) rather than prescriptive judgments. 

A 2 (judge sex) × 2 (target sex) analysis
of variance disclosed that the mean desirability
of Masculinity items was 5.35 when the
target was a man but was 3.29 when the target
was a woman, F(1, 173) = 497.57, p < .0001.
The mean desirability of Femininity
items, in turn, was 5.28 when the target was
a woman but 3.58 when the target was a man,
F(1, 173) = 392.01, p < .0001. In neither
analysis were effects associated with judge sex
or the Judge × Target Sex interaction significant.
Analysis of individual items showed
that all 56 target sex differences were in the 
expected direction, with 53 differences sig- 
nificant beyond the .05 level and 50 beyond 
the .001 level. 


Participants and Measures Used in
Psychometric Analyses


The main portion of this article is based
on two samples. Sample 1 (n = 1160) was
comprised largely of students in introductory
psychology at the University of Kentucky in
1974-1975. The mean age was about 20. Sample
1 members completed (a) the 85-item
Interpersonal Disposition Inventory (IDI)
comprised of the 56 items of the PRF ANDRO
scale, the 20-item PRF Desirability scale, 7
items from the PRF Infrequency scale (if
more than one of these was answered in the
keyed direction, the respondent was excluded
from analyses) and 4 fillers and (b) the 60-item
BSRI. During the second semester, participants
also completed (a) a personal attitudes
survey, which included, in alternating
order, the 33 items of the Marlowe-Crowne
Social Desirability Scale (Crowne & Marlowe,
1964) and the 33 items of a self-esteem
scale designed especially for this study as a
measure of generalized self-evaluation (Welling,
1976; Wetter, Note 3) and (b) Rotter’s
Internal-External Locus of Control Scale
(Rotter, 1966).

Table 1
Normative Data and Sex Differences on the PRF ANDRO Scale in Two College Samples

                      Males             Females
Measure             M       SD        M       SD         F
Sample 1
  Masculinity     16.70    4.33     12.86    4.76     192.29*
  Femininity      14.29    3.59     17.90    3.50        …
Sample 2
  Masculinity     16.18    4.18     11.86    4.84     195.19*
  Femininity      14.31    3.57     18.37    3.60     312.29*

Note. Sample 1 is comprised of 457 men and 703 women; Sample 2, of 434 men and 552 women.
* p < .0001.

Sample 2 (n = 986) consisted of introduc- 
tory psychology students tested in 1970. Mean 
age was 18.7. Their responses to the full PRF 
(Form AA) were scored for the Masculinity 
and Femininity subscales of the PRF 
ANDRO scale. It should be noted that the 
order of appearance and the context of PRF 
ANDRO scale items differ in the regular 


PRF and the IDI. 


Normative Data 


Table 1 presents normative data and tests 
for sex differences for the PRF Masculinity 
and Femininity scales when scored from the 
IDI (Sample 1) and the full PRF (Sample 
2). Although sex differences in item endorsement
had played no part in item selection, in
both samples men exceeded women on the
Masculinity scale and women exceeded men
on the Femininity scale (both ps < .0001);
the differences approximate or exceed one
standard deviation in each instance.¹ Analysis
of individual items revealed that 83.9%
(47/56) of these showed significant (p < .05)
differences in the expected direction; no items
were significant in the opposite direction. As 
one would expect if most individuals were sex 
typed in our culture, men also scored higher 
on the Masculinity scale than on the Femi- 
ninity scale, and women showed an even 
greater difference in the opposite direction. 

Although Table 1 also reveals some differences
between the two samples (e.g., Sample
1 women scored higher than Sample 2
women on Masculinity), such differences may 
arise with equal plausibility from differences 
in age or academic level (Sample 1 is older), 
the different contexts in which items were 
embedded (IDI vs. PRF), or cultural changes 
(1970 vs. 1974-1975). 


Scale Independence and Reliability 


With regard to the theoretical requirement 
that Masculinity and Femininity scores be 
orthogonal (rather than strongly and nega- 
tively correlated), the correlations between 
the two scales in Samples 1 and 2, respec- 
tively, were —.05 and —.11 for men and —.16 
and —.24 for women. 
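
The independence check reported here rests on the ordinary product-moment correlation between the two subscale scores, computed within each sex. A minimal sketch follows; the function name and the toy data are hypothetical and are not drawn from either sample.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length score lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Toy scores constructed so that the two lists are exactly uncorrelated.
masculinity = [1, 2, 3, 4]
femininity = [1, -1, -1, 1]
r = pearson_r(masculinity, femininity)
```

A value near zero, as with the toy data, is what orthogonal scales would produce; a bipolar masculinity-femininity continuum would instead yield a strongly negative value.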

¹ Users of the BSRI should note that on the BSRI
Masculinity scale, the 452 men obtained a M = 5.07,
SD = .65; the 703 women, M = 4.54, SD = .69;
F(1, 1153) = 172.66, p < .0001. On the BSRI Femininity
scale, men had a M = 4.56, SD = .53; women,
M = 5.14, SD = .53; F(1, 1153) = 322.09, p < .0001.
Alpha coefficients were .86 for Masculinity and .80
for Femininity.

Estimates of the retest stability of scale
scores, using an interval of approximately 3
weeks, were obtained from a separate sample
of 55 men and 82 women enrolled in the first 
author’s Personality classes; these students 
were unaware of our interest in sex roles at 
the time of testing. Rather than completing 
the same instrument twice, about half of the 
students completed the entire PRF in class 
and the IDI at home; the other half reversed 
the sequence. Given the differing item con- 
texts and conditions of administration, the 
obtained retest coefficients (r = .81 for both 
the Masculinity and Femininity scales) 
should be regarded as conservative. 

Although the explicit multidimensional defi- 
nitions of Masculinity and Femininity that 
were used tend to lower interitem correlations 
and the resulting scale homogeneities, item-total
correlations in eight independent analyses
(2 Scales × 2 Samples × 2 Sexes)
ranged between .12 and .52 (all ps < .01),
with median values ranging from .28 to .35.
The alpha coefficients for Masculinity were
.76 in Sample 1 and .79 in Sample 2. For
Femininity, the coefficients were .67 in Sample
1 and .70 in Sample 2.
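
For readers who want to recompute internal consistency from item-level responses, coefficient alpha can be sketched as below. This uses the standard formula, alpha = k/(k − 1) × (1 − sum of item variances / variance of total scores). The function name and the toy responses are hypothetical, and nothing here reproduces the reported coefficients.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Coefficient alpha for a list of respondents, each a list of k item scores."""
    k = len(responses[0])
    item_variances = [variance([row[i] for row in responses]) for i in range(k)]
    total_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Toy true/false (1/0) responses to three items that agree perfectly.
toy_responses = [[1, 1, 1], [0, 0, 0], [1, 1, 1], [0, 0, 0]]
alpha = cronbach_alpha(toy_responses)
```

With perfectly consistent items, as in the toy data, alpha reaches its ceiling of 1. Deliberately multidimensional scales such as these are expected to fall well below that.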


Sex Role Classification 


Following the procedure used by Bem
(1977) and Spence et al. […], we performed
median splits on the distributions of
scores on the Masculinity and Femininity
scales separately.
In this manner, […]
greater and 14 or less were designated as high
and low, […] Femininity scale values […]
androgynous (high masculinity/high
femininity), masculine typed (high masculinity/low
femininity), feminine typed (low
masculinity/high femininity), and indeterminate
(low on both). When Samples 1 and 2
were pooled, the resulting classification percentages […]
were androgynous, 18.7% and 20.3%; masculine
typed, 48.7% and 13.6%; feminine
typed, 10.4% and 48.1%; indeterminate,
22.1% and 17.9%. Roughly speaking, then,
1 of every 2 persons is “appropriately” sex
typed, 1 in 5 is androgynous, 1 in 5 is indeterminate,
and only 1 in 10 is “cross-typed.”


Content Dimensions in the BSRI and the
PRF ANDRO Scale


Since a rational analysis of the content 
themes of the BSRI gave us the inspiration 
to develop the PRF ANDRO scale, it is ap- 
propriate to examine empirically the extent 
to which such themes emerged in the self-descriptions
of Sample 1 participants. Using
the data of the 1,155 persons who had com- 
pleted the BSRI, the 40 BSRI items and gen- 
der (male = 0, female = 1) were intercorre- 
lated and factored by the method of principal 
components (unities in the diagonal), fol- 
lowed by varimax rotation. Application of 
Cattell’s scree test to the nine eigenvalues 
greater than 1.0 suggested that eight factors 
(56% of total variance) should be rotated. 

In terms of saliently (r > .50) loading
items (shown in parentheses), the four factors
defined exclusively by BSRI Masculinity
items refer to Social Ascendancy (acts as a 
leader, dominant, has leadership abilities, 
forceful, aggressive, assertive, strong personal- 
ity), Autonomy (independent, self-sufficient, 
self-reliant, individualistic), Intellectual As- 
cendancy (defends own beliefs, willing to take 
a stand), and Physical Boldness (athletic, 
competitive). The three factors defined by 
BSRI Femininity items describe Nurturant 
Affiliation (tender, compassionate, warm, gentle,
sympathetic, affectionate, understanding,
sensitive to the needs of others, eager to 
soothe hurt feelings), Self-subordination 
(childlike, gullible, flatterable), and Introver- 
sion (soft-spoken, shy). The eighth factor 
refers only to respondents’ gender (.88) and 
the items feminine (.85) and masculine 
(—.87), indicating that the latter items are 
synonymous with biological gender. Overall,
these results offer surprisingly good support 
for our initial conceptual analyses. 

A comparable analysis conducted with the
56 PRF ANDRO scale items (plus gender)
yielded 18 eigenvalues greater than 1.0.² Application
of the scree test suggested the advisability
of rotating only eight factors (34%
of variance), of which seven were interpreted.

² The larger number of eigenvalues for the PRF
ANDRO scale, as compared to the BSRI, is a function
of the larger number of items, the lesser item
reliabilities associated with true-false versus 7-point
scales, and possibly the lack of balancing for
acquiescence in the BSRI.
The four factors defined exclusively by
PRF Masculinity items referred to Social-Intellectual
Ascendancy (e.g., “I seek out
positions of authority”), Autonomy (e.g., “If
I have a problem, I like to work it out
alone”), Orientation Toward Risk (e.g., “I
avoid some sports and hobbies because of their
dangerous nature,” keyed false), and Individualism
(e.g., “I don’t care if my clothes
are unstylish, as long as I like them”). The
two factors defined exclusively by PRF Femininity
items addressed Nurturance (e.g.,
“When I see a baby, I often ask to hold
him”), and a blend of Affiliative Concerns
and Self-subordination (e.g., “I like to be
with people who assume a protective attitude
toward me”). The former (Nurturance) dimension
showed the highest loading for gender
(.70) among the factors. The seventh
factor included three Femininity items (e.g.,
“I am usually the first to offer a helping hand
when it is needed”) and two Masculinity
items (e.g., “When I see a new invention, I
attempt to find out how it works”), collectively
denoting Helpful Initiative.

Although both the BSRI and the PRF 
scales are clearly multidimensional, the simi- 
larities between the main themes of the re- 
spective Masculinity and Femininity subscales 
suggest that convergent validity coefficients 
should be appreciable. 


Convergent and Discriminant Validity

To determine the convergences between
the PRF and BSRI measures of Masculinity
and Femininity as well as their respective independence
from socially desirable responding
(assessed by the PRF, BSRI, and Marlowe-Crowne
Desirability scales), and to examine
their relation to measures of self-esteem and
beliefs in internal-external locus of reinforcement,
the measures administered to Sample
1 were intercorrelated and are presented in
Table 2.

Table 2
Correlations Between PRF and BSRI Sex Role Measures,
Desirability Scales, Self-esteem, and Locus of Control

[Correlation matrix not recoverable from the source. Surviving
row labels: PRF Masculinity, PRF Femininity, BSRI Masculinity,
BSRI Femininity, Self-esteem, Locus of control.]

Note. Correlations for men and women are shown above and below
the main diagonal, respectively. PRF = Personality Research Form;
BSRI = Bem Sex-Role Inventory; M-C = Marlowe-Crowne. High scores
on the Locus of control scale denote beliefs in external control.
Sample sizes given in the table footnotes: 452 men, 703 women;
191 men, 359 women; 196 men, 369 women.
* p < .05. ** p < .01. *** p < .001.

The convergent validity coefficients for
Masculinity were .60 for men and .65 for
women; for Femininity, they were .52 and
.50, respectively. (For the sexes combined,
they were .68 and .61.) These values indicate
substantial similarities between the constructs
underlying the total scores in these instruments.
Analysis of individual items, in fact,
showed that all PRF Masculinity scale items
correlated significantly (Mdn = .23, all ps <
.001) with BSRI Masculinity scores; similarly,
all PRF Femininity items correlated
significantly (Mdn = .22, all ps < .04 or better)
with BSRI Femininity scores. Conversely,
all BSRI Masculinity items and all
but one BSRI Femininity item related significantly
to their respective PRF counterpart
scores (Masculinity Mdn = .37, Femininity
Mdn = .33, all ps < .02 or better).
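
The convergent coefficients reported here are Pearson correlations between the two instruments' scale scores, computed separately within each sex. The following is a minimal sketch of that computation, not the authors' procedure: the helper names `pearson_r` and `convergent_by_sex` are introduced here for illustration, and the scores are hypothetical values rather than data from the study.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical records: (sex, PRF Masculinity, BSRI Masculinity)
records = [
    ("M", 18, 5.6), ("M", 12, 4.1), ("M", 20, 5.9), ("M", 15, 5.2), ("M", 10, 4.4),
    ("F", 14, 4.8), ("F", 9, 3.9), ("F", 17, 5.5), ("F", 11, 4.6), ("F", 13, 4.2),
]

def convergent_by_sex(rows):
    """Return {sex: r} for the PRF-BSRI correlation within each sex."""
    out = {}
    for sex in sorted({r[0] for r in rows}):
        prf = [r[1] for r in rows if r[0] == sex]
        bsri = [r[2] for r in rows if r[0] == sex]
        out[sex] = pearson_r(prf, bsri)
    return out

coefficients = convergent_by_sex(records)
```

Computing the coefficient within each sex, rather than only in the pooled sample, keeps mean differences between men and women from inflating the correlation.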

Excluding the BSRI items masculine and 
feminine (which refer to gender only), the 
high scorer on the PRF Masculinity scale is 
described by the 10 best BSRI items as acts 
as a leader, has leadership abilities, dominant, 
willing to take a stand, willing to take risks, 
independent, forceful, competitive, strong per- 
sonality, and individualistic. The high scorer 
on the PRF Femininity scale, in turn, can be 
described as sympathetic, loves children, eager 
to soothe hurt feelings, sensitive to the needs 
of others, tender, compassionate, affectionate, 
gentle, warm, and understanding.

With regard to discriminant validity, the 
correlations of the sex role measures with the 
three Desirability scales (Measures 5, 6, and 
7 in Table 2) are generally positive and low. 
For the two PRF scales, the 12 relevant co- 
efficients range from .00 to .29; for the BSRI 
scales, they range from .02 to .32. For both
instruments, consequently, socially desirable
responding explains very little variance in
scores.

Inspection of the correlations involving the 
Self-esteem scale, however, reveals that scores 
on both Masculinity scales are moderately 
related to favorable self-evaluations (for the 
PRF Masculinity scale, .36 among men and
.38 among women; for the BSRI, .39 among
men, .44 among women). In contrast, the re- 
lation of both Femininity scales to self-esteem 
is negligible, with coefficients ranging from 
-.06 to .13. Differences among sex role categories
in self-esteem (Spence et al., 1975;
Welling, 1976; Wetter, Note 3), consequently,
are more likely to be related to variations in
Masculinity than in Femininity scores when
the PRF ANDRO or the BSRI measures are
used.

Finally, it is apparent that apart from low-order
correlations suggesting a positive relation
of Masculinity scores to beliefs in internal
control, significant among men and for
the BSRI among women, little variance is 
shared between these sex role measures and 
beliefs in internal-external control. As noted 
elsewhere (Berzins, Note 6), the best pre- 
dictor of locus of control is the PRF Desira- 
bility scale, probably because it contains an 
appreciable self-esteem component (see Table 
2) and has in fact been construed as a measure
of self-esteem in a prior study (Berzins,
Barnes, Cohen, & Ross, 1971). 


Sex Role Categories and Personality 
Dimensions 


Since the members of Sample 2 had completed
the entire PRF, it was possible to
examine the relation of their PRF ANDRO
scale scores to the principal components of
the 20 PRF scales. The latter were extracted
in an earlier analysis (Berzins, Note 6), but
they require brief description. Following standardization
of PRF content scales within sex,
the matrix was factored for the sexes separately
and combined, with varimax rotation.
In all three analyses, five components showed
eigenvalues greater than 1.0 (explaining
64.4% of total variance in the combined analysis);
between-sex congruency coefficients
(Harman, 1967) ranged from .85 to .94; and
invariance coefficients (Pinneau & Newhouse,
1964), from .75 to .93. In the combined analysis
(n = 986), the components and their
principal markers (loadings > .60) were Impulsivity
(Impulsivity, .79; Cognitive Structure,
-.81; Order, -.76); Dependency (Succorance,
.86; Autonomy, -.74); Social Poise
(Exhibition, .78; Affiliation, .76; Dominance,
.65); Defensiveness (Defendence, .81; Aggression,
.76; Abasement, -.64); and Intellectuality
(Sentience, .80; Understanding,
.74). Component scores (M = 50, SD = 10)
derived from this solution by multiple regression
should render less obtrusive the overlap
between PRF ANDRO scale items and the
PRF scales from which they were drawn
originally.

Differences between the four sex role
groups, based on univariate analyses of variance
computed for the sexes separately, are
shown in Table 3. The PRF components are
presented in order of decreasing intergroup
differentiation.

Table 3
Mean Factor Scores and Univariate F Ratios Associated With …
Across Five Principal PRF Components Among …

[Subscripts and several cell entries are illegible in the source;
illegible entries are shown as ….]

PRF component      Indeterminate  Feminine typed  Masculine typed  Androgynous  F
Dependency
  (row illegible)        …              …               …             51.8      78.53***
  (row illegible)        …              …               …              …          …
Social Poise
  (row illegible)        …              …               …             57.1      51.16***
  (row illegible)        …              …               …              …          …
Intellectuality
  (row illegible)        …              …               …             55.5      20.05***
  (row illegible)        …              …               …              …          …
Defensiveness
  Men                  49.0           46.1            51.6            49.0       8.25**
  Women                50.7           48.2            56.2            50.5      12.38***
Impulsivity
  Men                  51.2           47.0            50.5            49.3       2.45*
  Women                53.3           48.6            53.7            48.5       9.46***
n
  Men                  102             52             205              …
  Women                 98            290              66              …

Note. In each row, cells with the same subscript do not differ at
the .05 level or better by Tukey's honestly significant difference
test. All factor scores have been standardized to a mean of 50 and
a standard deviation of 10. PRF = Personality Research Form.
* p < .06. ** p < .002. *** p < .0001.

Although the F ratios in 9 of the 10 rows of
Table 3 reveal highly significant differences
between groups, it is apparent that these differences
are especially pronounced on the first
two dimensions, Dependency and Social Poise.
On the Dependency dimension, feminine-typed
persons attained very high and masculine-typed
persons very low scores; the difference
between these two groups reached
two standard deviations among women. Differences
of this magnitude suggest that the
median split procedure whereby these persons
were operationally defined as masculine typed
(above the median in Masculinity, below in
Femininity) and feminine typed (the reverse
pattern) is functionally equivalent to splitting
the distribution of Dependency component
scores at its median and terming high scorers
feminine and low scorers masculine! To assess
this equivalence empirically, Androgyny difference
scores (Femininity minus Masculinity;
cf. Bem, 1974) were correlated with Dependency
component scores. The resulting
coefficients were .79 for men and .82 for
women, suggesting that whenever Masculinity
and Femininity scores differ greatly, as in
strongly sex-typed persons, one might as well
speak of individual differences along a dimension
ranging from extreme dependency to extreme
autonomy. Although the mean Dependency
scores in Table 3 suggest some Gender
× Sex Role differences (e.g., indeterminate
and androgynous men appear more dependent
than their female counterparts), the main
thrust of these results “identifies” masculine 
versus feminine typing with variations in De- 
pendency. Note that the two non-sex-typed 
groups, androgynous and indeterminate, dif- 
fer from both sex-typed groups but not from 
each other. 
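
The median split classification described above can be sketched as follows. This is an illustrative reading of the rule stated in the text (above or below the median on each scale), not the authors' scoring program: the function name `classify_sex_roles` is hypothetical, the scores are made up, and treating a score exactly at the median as "below" is an assumption, since the text does not say how ties were handled.

```python
from statistics import median

def classify_sex_roles(scores):
    """Classify (masculinity, femininity) pairs by sample median splits.

    Assumption: a score equal to the median counts as "below".
    """
    m_median = median(m for m, _ in scores)
    f_median = median(f for _, f in scores)
    labels = []
    for m, f in scores:
        high_m, high_f = m > m_median, f > f_median
        if high_m and high_f:
            labels.append("androgynous")
        elif high_m:
            labels.append("masculine typed")
        elif high_f:
            labels.append("feminine typed")
        else:
            labels.append("indeterminate")
    return labels

# Hypothetical scores for four respondents
sample = [(20, 20), (20, 5), (5, 20), (5, 5)]
labels = classify_sex_roles(sample)
```

Because the medians are taken from the sample itself, the same raw scores can fall into different categories in different samples, which is why the article later recommends normative considerations when classifying subjects.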

On the Social Poise dimension, on the other
hand, the pattern of means was very different.
Here, androgynous persons scored much
higher than did indeterminate persons, the
mean difference between these groups being
about 1½ standard deviations. Androgynous
individuals also scored significantly higher 
(and indeterminates lower) than the two sex- 
typed groups (which did not differ from each 
other). Since scores on the Social Poise di- 
mension are orthogonal to scores on the De- 
pendency dimension, differences in Social 
Poise obviously comprise an independent ma- 
jor dimension along which sex role groups 
vary. Since androgynous persons by definition 
have high scores on Masculinity and Feminin- 
ity (whereas indeterminates have low scores), 
it is not surprising that Androgyny sum scores 
(Femininity plus Masculinity; cf. Strahan,
1975) should correlate .60 (for men) and .59 
(for women) with scores on the Social Poise 
dimension. 
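
The two Androgyny indices used in this section, the difference score (Femininity minus Masculinity) and the sum score (Femininity plus Masculinity), capture different information. A small sketch with hypothetical scale scores shows why the sum, and not the difference, separates androgynous from indeterminate persons; the function names are introduced here for illustration only.

```python
def androgyny_difference(masculinity, femininity):
    """Difference score: Femininity minus Masculinity."""
    return femininity - masculinity

def androgyny_sum(masculinity, femininity):
    """Sum score: Femininity plus Masculinity."""
    return femininity + masculinity

# Hypothetical (Masculinity, Femininity) pairs
high_on_both = (18, 17)  # would be classed as androgynous
low_on_both = (6, 7)     # would be classed as indeterminate

differences = [androgyny_difference(*p) for p in (high_on_both, low_on_both)]
sums = [androgyny_sum(*p) for p in (high_on_both, low_on_both)]
```

Both respondents have difference scores near zero, so the difference index treats them as equally "balanced"; only the sum distinguishes the person who is high on both scales from the person who is low on both.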

Turning to intergroup differences on the In- 
tellectuality dimension, the patterning of 
group differences was almost identical to that 
on Social Poise: Androgynous and indeter- 
minate persons again defined the extremes, 
and androgynous persons scored significantly 
higher than at least one (and indeterminates 
lower than both) of the sex-typed groups. 
Differences on the Defensiveness dimension 
were less pronounced but resembled those 
associated with the Dependency dimension, 
in that among both sexes the masculine-typed 
persons were the most, and feminine-typed 
ones the least, defensive. Finally, differences 
on the Impulsivity dimension suggest that 
among women, feminine-typed and androgy- 
nous persons show greater impulse control 
than do masculine-typed and indeterminate 
persons. These differences did not reach sig- 
nificance among men. 


Among college students, four of the five
principal components have proven salient in
differentiating the groups. Although the resulting
characterizations are technically four
dimensional, two main patterns emerged: (a) 
Masculine-typed persons were the most in- 
dependent and defensive, whereas feminine- 
typed persons were the most dependent and 
least defensive; (b) androgynous individuals 
were the most socially poised and intellec- 
tually oriented, whereas indeterminates were 
the most socially awkward and unintellectual.
(These patterns have been fully cross-vali- 
dated in samples of psychotherapists and alco- 
holics; see Berzins, Note 1.) 


Broader Issues in Construct Validity 


Thus far, this article has involved college 
students only. To facilitate an appraisal of 
the validity of the Masculinity and Feminin- 
ity constructs across a variety of populations 
varying in age, education, occupation, and 
clinical status, Figure 1 arrays 18 different 
samples, representing over 6,000 persons, on 
both scales. Ten samples contain both men 
and women. Seven samples were scored from
the PRF, 11 from the IDI. 

The configuration of groups along the Mas- 
culinity (vertical) dimension strikingly cor- 
roborates the theoretical definition of Mascu- 
linity as an amalgam of social-intellectual 
ascendancy, autonomy, and risk taking. 
Policemen, men majoring in accounting, and 
male dental students showed the highest Mas- 
culinity mean scores; clinically depressed 
women, newlywed women, and women en- 
rolled in a hospital weight-reduction program 
showed the lowest scores. Analyses conducted
within groups (e.g., a sample of 1,109 college
students) have also revealed interesting variations
along the dominant-instrumental dimension.
For example, college men who, …
years from now, expected to make as much
money as their classmates had mean Masculinity
scores of 16.3; those who expected to
make more scored 17.8; and those who expected
to make much more scored 19.6, F(2,
383) = 15.27, p < .0001. The comparable
values for college women were 12.7, 14.3, and
16.3, F(2, 720) = 15.67, p < .0001.
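
The F ratios above come from one-way analyses of variance across the three income-expectation groups. A minimal sketch of that computation follows, assuming toy scores (not the study's data) and a hypothetical helper `one_way_anova`; it is meant only to show where the F value and its two degrees of freedom come from.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of score lists."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Between-groups sum of squares: group size times squared mean deviation
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within-groups sum of squares: squared deviations from each group's mean
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between, df_within = k - 1, n_total - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within

# Toy scores for three groups ("as much", "more", "much more")
f_ratio, df_b, df_w = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
```

The same degrees-of-freedom arithmetic applied to the reported values implies 386 men (3 groups, 383 within-group df) and 723 women (720 within-group df), which together give the 1,109 college students mentioned in the text.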

The theoretical alignment of Femininity
with themes of nurturance, affiliative-expressive
concerns, and self-subordination appears
to be corroborated by the fact that high
school, college, and “general population” 
women showed high scores, whereas most 
male groups and, among women, gay women 
and women majoring in accounting scored 
considerably lower. It is curious, however,
that both male and female psychotherapists
showed relatively low Femininity (and Masculinity)
scores in comparison to their respective
gender peers. Perhaps because they are
the most highly educated and testwise group 
sampled, and because their interpretation of 
some nurturance-related items may have been 
restricted to psychotherapeutic contexts, these 
therapists did not describe themselves more 
extremely on this dimension. Although no 
sample of unemployed housewives with chil- 
dren—who might be expected to show very 
high Femininity scores—is presented, anal- 
ysis of college women’s anticipated life-styles 
disclosed highly significant differences in 
Femininity scores as a function of the num- 
ber of children these women anticipated hav- 
ing in their own future families. Women an- 
ticipating childless versus one-, two-, three-, 
and four- (or more) child families showed 
mean Femininity scores of 15.1, …, …,
18.6, and 18.9, respectively, F(4, 718) =
17.68, p < .0001. For college men, the corresponding
values were 12.3, 15.0, 15.0, 14.6,
and 14.7, F(4, 381) = 3.18, p < .01.

The overall distribution of mean scores in 
Figure 1 clearly shows that the vast majority 
of groups in our society are sex typed. If we
were to treat the entries in Figure 1 as indi- 
viduals rather than groups, median splits on 
both scales would classify 12 of 15 male 
groups as masculine typed and 11 of 13 fe- 
male groups as feminine typed. The overall 
percentage (82%), of course, is much higher 
than that for individuals within any one of 
the groups. It does emphasize, however, the 
pervasive and highly significant sex differ- 
ences on both scales in almost every sample 
tested to date. Indeed, only the self-desig- 
nated homosexual volunteers did not show 
significant differences on these scales; both 
gay men and women emerged as quite high 
in Masculinity and moderate in Femininity 
(cf. Heilbrun & Thompson, 1977, for similar
findings with the Heilbrun scale). 
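
The overall percentage quoted above follows directly from the group counts given in the text. A short check, using only those counts (the variable names are illustrative):

```python
# Counts from the text: groups classified as sex typed, out of all groups
male_typed, male_groups = 12, 15
female_typed, female_groups = 11, 13

overall = (male_typed + female_typed) / (male_groups + female_groups)
overall_percent = round(overall * 100)  # 23 / 28 is about 0.821
```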
Figure 1. Distribution of Masculinity and Femininity
scores across 18 samples (28 groups). (Account =
college students in accounting classes, n = 105; Alco
= hospitalized Veterans Administration alcoholics,
n = 760; Coll. 1 and Coll. 2 = college students in
psychology classes in 1974-1976 and
1970, n = 2,547 and 986, respectively; Dental = dental
students, n = 54; Depres = hospitalized
clinically depressed patients, n = 20; Gay = homosexual
…; Het-VA = heterogeneous
… Veterans Administration outpatients,
n = 335; High Sch = high school students,
n = 700; Hospit = hospitalized nonpsychiatric medical
patients, n = 20; Newlywed = newlywed couples,
n = 198; Opiate = civilly committed opiate addicts,
n = 216; Police = municipal police, n = 25; Publ
sp = college students in public speaking classes,
n = 166; Schiz = schizophrenic Veterans Administration
outpatients, n = 20; Therap = psychotherapists
and trainees, n = 276; VFW = Veterans of
Foreign Wars, n = 57; Weight = participants in a
hospital weight-reducing program, n = ….)


There are exceptions of clinical interest as 
well. Male schizophrenic outpatients and alco- 
holic inpatients, for example, showed mark- 
edly “feminine” typing. Within both sexes, 
psychopathology appears to be associated 
with normatively low Masculinity scores (e.g.,
clinically depressed female inpatients, heterogeneous
nonschizophrenic male outpatients)
and, among men but not women, with some- 
what elevated Femininity scores. Whereas the 
problems that male schizophrenics have had 
with respect to gender identity are docu- 
mented in the literature (cf. LaTorre, 1976), 
male alcoholics as a group have seldom been 
depicted as feminine typed by traditional 
measures of Masculinity-Femininity. From 
the present theoretical perspective, however, 
it is entirely reasonable to expect chronic alco- 
holics (M age = 43.9 years) to score very low 
on a dominant-instrumental dimension. 
Whereas most of the groups in Figure 1 
are relatively homogeneous in age (M ages 
range from 15.7 for high school students to 
49.0 for the Veterans of Foreign Wars), 
cross-sectional comparisons among age groups 
in the more heterogeneous samples of men re- 
veal low but significant decreases in Mascu- 
linity and/or increases in Femininity scores 
as age increases. Classification of subjects 
into the four sex role categories, consequently, 
should be based on normative considerations 
of age as well as those concerning education, 
socioeconomic status, and the degree of psy- 
chopathology. 


General Discussion 


… PRF ANDRO scale are independent, reliable,
minimally related to socially desirable responding,
substantially related to their respective counterparts
in the BSRI, convergent with major
personality dimensions, and capable of mean…
…ies using the former to classify subjects
should yield re… with, studies
(e.g., Bem, 1975; Bem & … the fact that …
the sex role categories delineated in this study.
Substantively, the construal of Masculinity 
and Femininity as independent dimensions 
rather than as opposites, and the resulting
quadripartite classification of sex roles, has
afforded a rather comprehensive view of the
relation of sex roles to personality dimensions.
Although masculine- and feminine-typed per- 
sons indeed emerged as opposites on some di- 
mensions (e.g., Dependency, Defensiveness), 
they were relatively indistinguishable on 
others. On these other dimensions (e.g., Social
Poise, Intellectuality), it was the androgynous
and indeterminate persons who
emerged as “polar opposites.” Had the PRF
ANDRO scale been constructed in the tra- 
ditional manner, that is, by selecting PRF 
items that showed the largest sex differences 
in endorsement, it is obvious that neither the 
personality patterns of androgynous persons 
nor the sex-role-related differences in Social 
Poise and Intellectuality would have been 
detected.

To further appreciate the extent to which 
the present approach, like Bem’s,
differs from traditional ones, recall that one 
index of the “validity” of traditional measures 
has been the point-biserial correlation between 
gender and a measure of masculinity-femi- 
ninity; such coefficients often exceed .70 
(Constantinople, 1973). By this standard, 
both the PRF and BSRI scales did not fare 
very well: In Sample 1, gender correlated 
—.38 and —.36 with the PRF and BSRI 
Masculinity scales and .45 and .47 with the 
respective Femininity scales. But although sex 
differences may characterize both traditional 
and new measures of sex roles, their presence 
has little to do with overlapping content do- 
mains. For example, in a subset of Sample 
2 participants (n = 682), the correlations be- 
tween the (bipolar) Masculinity-Femininity 
scale of the Omnibus Personality Inventory 
(Heist & Yonge, 1968) and the PRF Mascu- 
linity scale were .04 (men), .08 (women), 
and .30 (combined); the corresponding
coefficients with the PRF Femininity scale were
—.22, —.16, and —.42. Apart from the 
higher coefficients resulting from sex differ- 
ences in the combined sample, it is clear that 
our measures share trivial content variance
with this traditionally constructed measure.
In another sample of 206 male alcoholics, the
correlations between the Minnesota Multiphasic
Personality Inventory Mf scale and our
Masculinity and Femininity scales were -.06
and .00, respectively. Since our Masculinity 
and Femininity scales are independent rather 
than negatively correlated among both men 
and women, the pervasive sex differences ob- 
served in most groups (cf. Figure 1) are bet- 
ter understood as reflecting the consequences 
of traditional socialization practices (encour- 
aging boys toward self-definitions thought 
more desirable in men than in women, vice 
versa for girls) than the psychometric “redis- 
covery” of biological gender accomplished by 
compounding items on which the sexes show 
differing endorsement frequencies. 

The multithematic definitions of our Mas- 
culinity and Femininity constructs—contrast- 
ing social-intellectual ascendancy, autonomy, 
and risk taking with nurturance, affiliative— 
expressive concerns, and self-subordination— 
were corroborated empirically by the factor 
analyses of the BSRI and PRF ANDRO scale 
item pools. The reader should note that the 
“discovery” of these content themes in no 
way precludes using these instruments in their 
present form, nor does it argue for the utility 
of fractionating the present scales into sub- 
scales; all items in both instruments corre- 
lated significantly with their respective total 
scores and are intended to emphasize band- 
width over fidelity. Because of their breadth, 
however, the Masculinity and Femininity con- 
structs can be amalgamated with related ma- 
jor conceptions in personality and interper- 
sonal theory, for example, Bakan’s (1966) 
distinction between agency and communion 
as male and female principles that require 
integration (see also Block, 1973; Carlson, 
1971), Jung’s (1956) animus/anima dialectic, 
Parsons and Bales’s (1955) conception of in- 
strumental and expressive roles, Leary’s 
(1957) and Carson’s (1969) organizing inter- 
personal behaviors around dominance-sub- 
mission and love-hate coordinates (cf. Ber- 
zins, Welling, & Wetter, Note 7), and even 
Campbell’s (1975) recent discussion of the 
tensions along a selfishness versus altruism
dimension throughout history.

Overall, the definition of psychological an- 
drogyny that has emerged from the present 
study emphasizes a relative balance of mas- 
culine- and feminine-typed attributes in the 
context of high social competence (openness 
to interpersonal and intellectual experiences).
Block (1973) and others have marshaled evi- 
dence in support of the proposition that the 
attainment of higher levels of moral develop- 
ment is coextensive with the integration of 
traditionally masculine and feminine domains 
of self-definition and behavior. Conversely, 
individuals who are neither androgynous nor 
sex typed—indeterminate persons—are char- 
acterized by lower levels of social competence 
in this study and low self-esteem in others 
(e.g., Spence et al., 1975; Welling, 1976;
Wetter, Note 3). 

Finally, the quadripartite classification of 
sex roles, insofar as it converges conceptually 
and empirically with the dimensions of the 
interpersonal behavior circle (Carson, 1969;
Leary, 1957; Berzins et al., Note 7) requires 
examination in the context of interpersonal 
communication, compatibility, and influence, 
especially in psychotherapy (Berzins, in 
press). Even though most therapists are men 
and most patients women, little research has 
been devoted to the appraisal of the condi- 
tions under which gender- and sex-role match- 
ing variables affect the processes and out- 
comes of psychotherapy. The ascendancy of
psychological androgyny as a model of mental
health suggests that such research has high
priority.
This study compared the factor structure of the Sensation-Seeking Scale (SSS)
in English and American samples, and a new form of the SSS, applicable to
both samples, was constructed. Three of the four factors showed good cross-
national and cross-sex reliability. English and American males did not differ on
the total SSS score, but American females scored higher than English females.
Males in both countries scored higher than females on the total SSS score and
on the Thrill and Adventure-Seeking and Disinhibition subscales. Significant age
declines occurred for both sexes, particularly on Thrill and Adventure Seeking
and Disinhibition.


The Sensation-Seeking Scale (SSS) was developed
in an attempt to provide an operational
measure of the construct optimal level
of stimulation (OLS). The construct is an old
one, first formulated by Wundt (1873) to explain
the curvilinear relationship between affective
reactions and intensities of stimulation.
After lying dormant for about 80 years, the
OLS resurfaced in the 1950s and early 1960s
in many theories, including those of Berlyne
(1960), Fiske and Maddi (1961), Hebb
(1955), Leuba (1955), Malmo (1959), McClelland,
Atkinson, Clark, and Lowell (1953),
and Schlosberg (1954). Berlyne, Fiske and
Maddi, and Hebb, Malmo, and Schlosberg
suggested that the idea of an optimal level of
arousal (OLA) could be substituted for OLS,
since the arousal construct could accommodate
stimulus parameters such as novelty versus
constancy, and complexity.

The first SSS (SSS II; Zuckerman, Kolin,
Price, & Zoob, 1964) was developed with the
idea of predicting responses to the experimental
situation of sensory deprivation. It consisted
of a General scale derived from factor
analyses of many diverse kinds of items reflecting
a positive reaction to or desire for
stimulating, exciting, and novel kinds of experiences.
This scale was rather heavily loaded
with the risk-taking kind of items that subsequently
became part of the Thrill and Adventure-Seeking
subscale.

The SSS II was initially applied in sensory 
deprivation experiments, and the idea of stable 
individual differences in OLS and OLA be- 
came the central postulate in the theory de- 
veloped by Zuckerman (1969) to explain dif- 
ferences in reactions to sensory deprivation. 
In the late 1960s, interest shifted from sen- 
sory deprivation to the broader construct- 
validity implications of the OLS-OLA mea- 
sure (Zuckerman, Bone, Neary, Mangelsdorff, 
& Brustman, 1972; Zuckerman & Link, 
1968). The SSS proved to have considerable 
validity for a variety of phenomena, ranging 
from design preferences to sexual and drug experiences
and volunteering for unusual experiments
or risky activities. Most of the research
has been summarized in chapters by Zucker- 
man (1974, in press). 

Studies by Farley (1967) and Zuckerman
and Link (1968) suggested that there might
be more than one factor of sensation seeking,
and factor analyses were used in an attempt
to discover what these factors might be.
Zuckerman (1971) reported the “dimensions 
of sensation seeking” found in samples of male 
and female undergraduates in the Philadelphia 
area. Four factors were identified, and three 
of them showed good reliability in their load- 
ings across the sexes. The fourth factor, 
Boredom Susceptibility, was not clearly de- 
fined in the female group. 

The first factor was called Thrill and Ad- 
venture Seeking (TAS), and it contained 
items expressing a desire to engage in sports 
or other activities involving speed or danger.
The second factor was called Experience Seek- 
ing (ES), and it represented the seeking of 
experience through the mind and senses, 
travel, and a nonconforming life-style. The 
third factor was labeled Disinhibition (Dis), 
which seemed to represent the desire for social
and sexual disinhibition as expressed in
social drinking, partying, and variety in sex- 
ual partners. The fourth factor, called Bore- 
dom Susceptibility (BS), represented an aver- 
sion to repetition, routine, and dull people, 
and restlessness when things are unchanging. 
Examples of the sensation-seeking parts of 
the forced-choice items can be seen in Table 2. 

Recently, Stewart and MacGriffith (…)
factor analyzed Form IV of the SSS, using 156
American undergraduates as subjects. The authors
did not analyze male and female data
… quarter of t… that with n…
Stewart and MacGriffith suggested that different
factors might exist in males and females. This
is a possibility, of course, but Zuckerman was
looking for factors that had broad applicability.
Although the factor scales in Form IV remained
substantially correlated, the General scale could
not be used as an adequate substitute for the …
for two reasons: (a) Since it was
based only on items that were contained in
the Experimental Form I, it did not include
an adequate sampling of items from several
factors based on new items written for Experimental
Form III. (b) In the validity
studies all of the scales, including the General
scale, related to some of the validity criteria,
but for other criteria, such as the physiological,
only specific subscales showed significant
relationships.

Before the Stewart and MacGriffith study,
two factor studies were done on Form II of
the SSS: one on an English sample (Farley,
1967) and one with a Japanese translation
given to a Japanese sample (Ohkubo, Note 1).
The general factor, defined by the unrotated
factor loadings on the first factor, correlated
.67 between English and American samples
but only .35 between both American and 
English samples and the Japanese sample. A 
translated questionnaire cannot be expected 
to show very high cross-cultural factor re- 
liability because of language and cultural dif- 
ferences. The reasonably high correlation be- 
tween English and American samples was 
encouraging. 
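
The cross-sample coefficients quoted above compare two columns of first-factor loadings, one per sample, across the same set of items. A minimal sketch of such a comparison follows, assuming a plain Pearson correlation over item loadings; the loadings are hypothetical, and the function name `loading_correlation` is introduced here for illustration rather than taken from the cited studies.

```python
import math

def loading_correlation(a, b):
    """Pearson correlation between two sets of loadings for the same items."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

# Hypothetical first-factor loadings for the same six items in two samples
english = [0.52, 0.41, 0.35, 0.60, 0.28, 0.47]
american = [0.48, 0.30, 0.42, 0.55, 0.33, 0.39]

r = loading_correlation(english, american)
```

A coefficient near 1 would mean the items that define the factor most strongly in one sample also do so in the other; lower values, as with the translated Japanese form, indicate that the factor is defined by a different mix of items.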

One of the aims of the present study was to 
examine the cross-cultural reliabilities of the 
factors in Form IV of the SSS, comparing the 
factor-analytic results from the original study 
(Zuckerman, 1971) with those from a large, 
socially heterogeneous sample of an English
population. If the four factors originally derived
from the American sample were meaningful
and had a biological basis (Zuckerman,
1974), they would show cross-cultural as well 
as cross-sex stability. 

A second purpose was to develop a shorter
form of the SSS, based on the four factor
analyses (English and American males and
females). In this form, a total score, derived
from the summation of the four factor scores,
could be substituted for the General scale in
Forms II and IV, which did not contain an
adequate sample of items from the Dis and
BS factors.

Assuming that the same factors could be
found in English and American samples, a
third question concerned cultural differences
in scores on the sensation-seeking scales.
Ohkubo (Note 1) found that both Japanese
and Thai students (Thai data from Berkowitz,
1967) scored lower than American students
on the General scale of Form II. The
differences must be interpreted cautiously,
since they were based on translated sensation- 
seeking scales. Farley and Farley (1967) 
found that a sample of English male industrial 
apprentices and civil servants scored very 
close to the mean for American college males, 
despite the differences in socioeconomic and 
educational levels. The question remains: Do 
English and American subjects resemble each 
other on new subscales developed on the basis 
of factor similarity between the national samples?

Sex differences have been found on the Gen- 
eral scale in American, Japanese, and Thai 
samples and on the factor scales in American 
college samples (Zuckerman, 1974, in press). 
A fourth purpose of the study was to see if 
similar sex differences would be found in the 
English sample. 

A number of studies summarized by Zucker- 
man (1974, in press) have reported negative 
correlations between age and the sensation- 
seeking General scale when the samples cov- 
ered a wide age range. However, none of these 
studies have studied age decline in a system- 
atic fashion in all of the sensation-seeking 
scales. A fifth aim of this study was to ex- 
amine age decline in the SSS subscales in the 
English sample. Adequate data are not yet
available to do this kind of analysis in an 
American sample. 


Method 
Subjects 


The English subjects consisted of 254 males and 
693 females from the Maudsley Twin Register, rang- 
ing in age from 16 to 70. These subjects were used 
for the following reasons: (a) This group could pro- 
vide data for a genetic analysis of the SSS; (b) they 
had previously taken the Eysenck Personality Ques- 
tionnaire (EPQ) and thus could provide comparison 
data on these two instruments; (c) they covered a 
wide age range, thus providing our first good data 
for age comparison in a normal population; and (d) 
they were an interested and cooperative group, hav- 
ing answered previous questionnaires by mail. The 
SSS Form IV was sent out by mail to twins who 
had previously taken the EPQ. The return rate was 
about 80%. 

The question can be raised as to whether or not 
twins are a special population. Previous studies 
showed that twins from the Maudsley Twin Register 
have normal patterns of scores on personality tests 
(Eysenck, 1976).

After the data from Form IV were analyzed in
the English sample, a new form (Form V) of 40 
items was constructed. This form, along with the 
EPQ, was given to 97 male and female undergradu- 
ates from two large sections of undergraduate psy- 
chology at the University of Delaware. Most of these 
students fell in the age range of 17-25, and the modal 
range was 18-20. The SSS was given first, and the 
EPQ was given second. 

The 72-item SSS Form IV item responses were in- 
tercorrelated and factor analyzed in the English male 
and female samples separately. They were rotated 
obliquely to simple structure using the promax 
method, and the loadings were compared with the 
loadings from the previous study by Zuckerman 
(1971). Product-moment correlations, the principal
components method, and oblique rotations were used 
in both studies. The previous study (Zuckerman, 
1971) factored the SSS Experimental Form III, which 
contained 113 items, whereas the current study fac- 
tored Form IV, containing 72 items. The factor struc- 
ture might have shown some differences due to the 
extra 41 items in the previous study. These 41 items 
were not included in Form IV because they showed 
no loadings of any size on the four primary factors. 
This means that they contributed little to the in- 
terpreted factor structure, so the difference between 
the two forms should not have introduced much distortion
in the comparison of factors across samples.

The sample on which the American factor analysis 
was based consisted of 160 male and 172 female un- 
dergraduates from psychology courses at Temple 
University. The English sample was much more 
heterogeneous as to age and education, and differences 
in the factor structure could have occurred because
of the difference in populations. But to repeat the 
argument used in regard to sex differences in factor 
structure: If factors are meaningful and have some 
biological basis, they should be generalizable to a 
broad range of the population. 

Our design allowed for sex and national compari- 
sons of factor reliability: English males with Ameri- 
can males, English females with American females, 
English males with English females, American males 
with American females, English males with American 
females, and English females with American males. 
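
The six comparisons listed above can be sketched as follows. This is a minimal illustration, assuming that factor reliability is taken as the product-moment correlation between the loadings of the same items on the same factor in two samples; the function names and the loading values are hypothetical and are not data from the study.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Product-moment correlation between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length sequences of at least 2 values")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def factor_reliabilities(loadings):
    """loadings maps a sample name to the loadings of each item on ONE factor.

    Returns the correlation of loadings for every pair of samples.
    """
    return {
        (a, b): pearson(loadings[a], loadings[b])
        for a, b in combinations(sorted(loadings), 2)
    }


# Hypothetical loadings of five items on a single factor in four samples.
example = {
    "Eng M": [0.68, 0.43, 0.62, 0.10, 0.05],
    "Eng F": [0.47, 0.36, 0.73, 0.12, 0.02],
    "US M": [0.44, 0.50, 0.71, 0.08, 0.11],
    "US F": [0.51, 0.40, 0.64, 0.15, 0.07],
}

pairs = factor_reliabilities(example)
for (a, b), r in pairs.items():
    print(f"{a} / {b}: r = {r:.2f}")
```

Four samples yield six distinct pairs, which is why the design produces six reliability coefficients per factor.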


Results 


Factor Reliabilities 


The rotated factor loadings on the first four factors in the four samples
were correlated over 71 items. One item from the General scale was missing
in the 113-item Form III given to the American groups, but this was not an
item contained in any of the factor scales. The matrix of correlations was
16 x 16 (4 samples x 4 factors).

The factor reliability coefficients from this matrix are shown in Table 1,
which also gives the range of other correlations between different factors
within a sample and between different factors across samples. As in the
multimethod-multitrait validity model (Campbell & Fiske, 1959), the
correlations between the same factors across samples should be significant
and exceed the correlations between different factors (traits in the
multimethod-multitrait model). Of course, reliability coefficients should
also be high as well as being significant.


Table 1
Factor Reliability Coefficients

                                  Sensation-Seeking Scale factors
Comparisons between      TAS            ES             Dis                 BS
Eng M/Eng F              .90            .68            [illeg.]            [illeg.]
Eng M/U.S. M             .87            .72            [illeg.]            [illeg.]
Eng M/U.S. F             .67            .51            .66                 [illeg.]
Eng F/U.S. M             .88            .68            [illeg.]            [illeg.]
Eng F/U.S. F             .72            [illeg.]       [illeg.]            [illeg.]
U.S. M/U.S. F            .73            .67            .79                 [illeg.]
Range of cross-factor
  correlations           .06 to -.55    .09 to -.49    [illeg.] to -.49    .44 to -.55

Note. Eng = English, U.S. = American, M = males, F = females; TAS = Thrill
and Adventure Seeking, ES = Experience Seeking, Dis = Disinhibition, BS =
Boredom Susceptibility.


Three of the four factors clearly met these criteria of factor reliability.
Thrill and Adventure Seeking, Experience Seeking, and Disinhibition all
showed significant and reasonably high resemblance between the four national
and sex samples. With the exception of only 1 of the 18 correlations for
these three scales, the coefficients all exceeded .60 and all fell above the
range of the cross-factor correlations.

The case was not as clear for the Boredom Susceptibility factor. Although
the factor was fairly similar in English males and [...] showed [...]
(< .40), if significant, resemblance. Only three of the coefficients
exceeded the range of the cross-factor correlations for this factor.


Construction of Form V


On the basis of the four factor analyses, an
attempt was made to select items for a new
form (Form V), with the aim of using 10 
items for each factor that met the criteria of 
having a primary loading on the same factor 
in all samples and loadings exceeding .30 in 
magnitude. Such scales would have the great- 
est value for the cross-cultural comparisons. 
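
The two selection criteria can be written as a small check. This is a sketch under stated assumptions: "primary loading" is read as the largest loading in absolute value, the function name is hypothetical, and the cross-loadings below are invented for illustration (only the TAS values echo item 29 of Table 2).

```python
def meets_criteria(item_loadings, factor, threshold=0.30):
    """item_loadings maps sample -> {factor name: loading} for one item.

    Returns True only if, in every sample, `factor` carries the item's
    primary (largest absolute) loading and that loading exceeds the
    threshold in magnitude.
    """
    for by_factor in item_loadings.values():
        primary = max(by_factor, key=lambda f: abs(by_factor[f]))
        if primary != factor or abs(by_factor[factor]) <= threshold:
            return False
    return True


# Illustrative item: TAS loadings as read for item 29; other values invented.
strong_item = {
    "Eng M": {"TAS": 0.62, "ES": 0.10, "Dis": 0.05, "BS": -0.02},
    "Eng F": {"TAS": 0.73, "ES": 0.08, "Dis": 0.12, "BS": 0.01},
    "US M": {"TAS": 0.71, "ES": 0.15, "Dis": 0.03, "BS": 0.06},
    "US F": {"TAS": 0.64, "ES": 0.11, "Dis": 0.09, "BS": 0.04},
}

# Same item with one sample falling just under the .30 cutoff (hypothetical).
weak_item = {s: dict(v) for s, v in strong_item.items()}
weak_item["US M"]["TAS"] = 0.26

print(meets_criteria(strong_item, "TAS"))
print(meets_criteria(weak_item, "TAS"))
```

An item such as `weak_item` would be one of the exceptions described in the text: still loading highest on the intended factor, but below the cutoff in one sample.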

There was no problem in meeting these cri- 
teria for TAS, in which all but one of the 
items (21) had loadings over .30 on the factor 
in all four samples. When a choice had to be 
made between items having similarly high 
factor loadings, an attempt was made to diversify
the content in the new scales.

For Dis, one item (47) had to be selected
that did not meet the criteria in the American
male sample and another (66) that did not in
the English male sample. The BS scale had
previously been defined by male loadings,
since the factor was not well defined in
American females. Another item (12) did not
meet the criteria in the English males. One
item (61) had to be included even though it
did not meet the criteria in the American
analyses. The most radical change was on the
ES scale, on which three new items ([...]
and 25) had to be included even though they
did not meet the criteria in all the samples. In
some cases items with loadings of slightly less
than .30 had to be used, but they had
their highest loading on the relevant factor.

In this manner, 10 items were selected for
each of the four primary factors
to comprise Form V. A total score for the
new form can be obtained by summing the
four subscale scores, but Form V contains no
General scale as did Form IV.
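
The scoring rule just described can be sketched in a few lines. The scoring key below is hypothetical (four items instead of forty, with invented item numbers and keyed choices); only the rule itself, one point per sensation-seeking choice and a total equal to the sum of the four subscale scores, comes from the text.

```python
FACTORS = ("TAS", "ES", "Dis", "BS")


def score_form_v(responses, key):
    """Score a forced-choice form.

    responses: {item id: chosen alternative}
    key:       {item id: (factor, sensation-seeking alternative)}
    Returns the four subscale scores plus their sum under "Total".
    """
    scores = {factor: 0 for factor in FACTORS}
    for item_id, (factor, keyed_choice) in key.items():
        if responses.get(item_id) == keyed_choice:
            scores[factor] += 1
    scores["Total"] = sum(scores[factor] for factor in FACTORS)
    return scores


# Hypothetical four-item key; the real form has 10 items per factor.
toy_key = {1: ("TAS", "A"), 2: ("ES", "B"), 3: ("Dis", "A"), 4: ("BS", "B")}
toy_responses = {1: "A", 2: "A", 3: "A", 4: "B"}

result = score_form_v(toy_responses, toy_key)
print(result)
```

With 10 items per factor, each subscale ranges from 0 to 10 and the total from 0 to 40, so the total weights the four factors equally.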

Table 2 shows the sensation-seeking choice 
on each item selected, with the factor loadings 
of that item on the appropriate factor in each 
of the four samples. In comparing these load- 
ings it should be remembered that the factor 
analyses were of 72 items in the English sam- 
ple and 113 in the American sample, so that 
some of the total variance may have been on 
items not directly compared across samples. 


Scale Reliabilities on Forms IV and V 


Table 3 shows internal reliabilities in the 
English and American Temple University
samples for Form IV and the English and 
American University of Delaware samples for 
Form V. The reliabilities of the English and 
American samples were quite similar for Form 
IV even though the structure of these scales 
was determined only by the American factor 
analyses. 

The reliabilities of the Form V factor scales 
were expected to be somewhat lower because 
the scales were shorter, that is, 10-item scales 
as opposed to 14- and 18-item scales in Form 
IV. Actually the only substantial drop in re- 
liability was on the ES scale where reliabilities 
fell from .7 and .8 to .6.

The most homogeneous scales, TAS and 
Dis, showed little loss of reliability in the new 
form; the ES reliability fell somewhat but 
was still within acceptable limits; the BS 
reliability remained at the borderline range of 
high .5, where it had been for American fe- 
males in Form IV. 
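
The expectation that shorter scales lose some reliability can be made concrete with the Spearman-Brown prophecy formula. This is a sketch, not the authors' computation: it assumes the retained items are comparable to those dropped, and the starting reliability of .84 for an 18-item scale is chosen only for illustration.

```python
def spearman_brown(reliability, old_length, new_length):
    """Predicted reliability after changing the number of items.

    Uses r_new = k * r / (1 + (k - 1) * r), where k = new_length / old_length.
    """
    if old_length <= 0 or new_length <= 0:
        raise ValueError("scale lengths must be positive")
    k = new_length / old_length
    return k * reliability / (1 + (k - 1) * reliability)


# Illustrative case: an 18-item scale with reliability .84 cut to 10 items.
predicted = spearman_brown(0.84, 18, 10)
print(round(predicted, 2))
```

A scale that holds its reliability better than this prediction, as reported for TAS and Dis, suggests that the items retained were the more homogeneous ones.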


Correlations Between Scales 


The correlations between the factor scales 
in Forms IV and V are shown in Table 4. 
These scale correlations had been rather high 
in Form IV, and it was hoped that the scales 
in Form V would have more independence, al- 
though some significant correlation was still 
expected. 

Table 4 shows that the correlations among
subscales, particularly among ES, Dis, and
BS, that were quite high in Form IV were reduced
in Form V. TAS continued to correlate
with ES in Form V but showed very low and
sometimes insignificant correlations with Dis
and BS.


National and Sex Differences 


Table 5 shows the comparisons of English 
and American male and female samples on 
Form V. Form V was felt to be most ap- 
propriate for these comparisons, since these 
factor scales were based on cross-national 
similarity of factors. Only the 16- to 19-year- 
old English subjects were used in these com- 
parisons, since the American college students 
were mostly within this age range. Although 
the young English and American males were 
not different on the total SSS score, the Amer- 
ican males were significantly higher on Ex- 
perience Seeking, and the English males scored 
higher on Boredom Susceptibility. The Ameri- 
can females were significantly higher than the 
English on the total score, the Thrill and Ad- 
venture-Seeking scale, and the Experience- 
Seeking scale. 

Looking at sex differences within the two 
national groups, both English and American 
males were significantly higher than the fe- 
males on the total score and on the TAS and 
Dis factor scores. The English males were 
higher than the English females on BS. There 
were no significant sex differences on the ES
scale in either country.


Age Comparisons 


Table 6 shows the mean scores of the males and females on the SSS, Form V
within each age group of the English sample. Figure 1 shows the data for the
total score, and Figure 2 shows the trends for the separate scales.


Figure 1. Changes in sensation-seeking total scores as a function of age.
(SSSV = Sensation-Seeking Scale Form V.)


Table 2
Loadings of Items Selected for Form V of the Sensation-Seeking Scale

                                                                             Loading
No. on                                                                English             American
Form IV   Item                                                        M         F         M         F

Thrill and Adventure Seeking
10        I often wish I could be a mountain climber.                 .68       .47       .44       [illeg.]
21        I sometimes like to do things that are a little
          frightening.                                                .43       .36       [illeg.]  [illeg.]
29        I would like to take up the sport of water skiing.          .62       .73       .71       .64
31        I would like to try surfboard riding.                       .64       .64       [illeg.]  .56
35        I would like to learn to fly an airplane.                   .45       .64       .63       [illeg.]
37        I would like to go scuba diving.                            .59       .58       .64       .36
43        I would like to try parachute jumping.                      .63       .69       .66       .32
53        I like to dive off the high board.                          .53       .54       [illeg.]  [illeg.]
69        I would like to sail a long distance in a small but
          seaworthy sailing craft.                                    .52       .46       .54       .38
71        I think I would enjoy the sensations of skiing very
          fast down a high mountain slope.                            [illeg.]  .58       .57       .52

Experience Seeking
11        I like some of the earthy body smells.                      .40       .39       .46       .16
15        I like to explore a strange city or section of town
          by myself, even if it means getting lost.                   .28       .56       .23       .48
18        I have tried marijuana or would like to.                    [illeg.]  .47       .60       .58
19        I would like to try some of the new drugs that
          produce hallucinations.                                     .34       .36       .60       [illeg.]
25        I like to try new foods that I have never tasted
          before.                                                     .60       [illeg.]  .23       .34
33        I would like to take off on a trip with no preplanned
          or definite routes or timetables.                           .37       .39       .40       [illeg.]
34        I would like to make friends in some of the "far-out"
          groups like artists or "hippies."                           .57       .35       .48       .45
38        I would like to meet some persons who are homosexual
          (men or women).                                             .56       [illeg.]  .56       .54
51        I often find beauty in the "clashing" colors and
          irregular form of modern painting.                          .53       .36       .50       .45
[illeg.]  People should dress in individual ways even if the
          effects are sometimes strange.                              .42       .41       .47       [illeg.]

Disinhibition
[illeg.]  I like wild "uninhibited" parties.                          [illeg.]  [illeg.]  [illeg.]  [illeg.]
[illeg.]  I enjoy the company of real [...]                           [illeg.]  [illeg.]  [illeg.]  [illeg.]
[illeg.]  I often like to get high (drinking liquor or smoking
          marijuana).                                                 .39       .59       [illeg.]  .46
[illeg.]  I like to have new and exciting experiences and
          sensations even if they are a little unconventional
          or illegal.                                                 [illeg.]  .31       .12       .32
[illeg.]  I like to date members of the opposite sex who are
          physically exciting.                                        .46       .58       .45       [illeg.]
[illeg.]  Keeping the drinks full is [...]                            [illeg.]  [illeg.]  [illeg.]  [illeg.]
59        A person should have [...] sexual experience before
          marriage.                                                   .35       .43       .42       .32
[illeg.]  I could conceive of myself seeking pleasures around
          the [...]                                                   [illeg.]  [illeg.]  [illeg.]  [illeg.]
[illeg.]  I enjoy watching many of the "sexy" scenes in movies.       [illeg.]  [illeg.]  [illeg.]  [illeg.]
66        I feel best after taking a couple of drinks.                .26       [illeg.]  .42       .33

Boredom Susceptibility
[illeg.]  I can't stand watching a movie that [...]                   [illeg.]  [illeg.]  [illeg.]  [illeg.]
[illeg.]  I get bored seeing the same old faces.                      [illeg.]  [illeg.]  [illeg.]  [illeg.]
[illeg.]  When you can predict almost everything a person will
          do and say, he or she must be a bore.                       [illeg.]  [illeg.]  [illeg.]  [illeg.]
[illeg.]  I usually don't enjoy a movie or a play where I can
          predict what will happen in advance.                        .42       .41       .56       .03
[illeg.]  Looking at someone's home movies or travel slides
          bores me tremendously.                                      [illeg.]  [illeg.]  [illeg.]  [illeg.]
[illeg.]  I prefer friends who are excitingly unpredictable.          .39       [illeg.]  [illeg.]  [illeg.]
[illeg.]  I get very restless if I have to stay around home for
          any length of time.                                         .39       .39       .31       .04
[illeg.]  The worst social sin is to be a bore.                       [illeg.]  [illeg.]  [illeg.]  [illeg.]
[illeg.]  I like people who are sharp and witty even if they do
          sometimes insult others.                                    [illeg.]  .29       .25       .05
[illeg.]  I have no patience with dull or boring persons.             [illeg.]  [illeg.]  [illeg.]  [illeg.]

Note. M = male, F = female. Only the sensation-seeking choices in the
forced-choice items are presented here; a copy of Form V can be obtained
from the first author.


The F values from simple analyses of variance
between the age groups for each sex are
shown in Table 6. The age differences in sensation
seeking were significant for the total
score and on all scales for the females. The
age change was significant for the total score
and the Thrill and Adventure-Seeking and
Disinhibition scales for males but was not
significant for Experience Seeking and Boredom
Susceptibility. Examining Figure 2, it is
apparent that the fall in SSS scores was more
pronounced for the TAS and Dis scales than
for the ES and BS scales.
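
A simple (one-way) analysis of variance of the kind reported in Table 6 can be sketched as follows. The scores are invented; only the group sizes used in the second example follow the male ns that can be read from Table 6, on the assumption that there were five age groups (which is what 4 degrees of freedom between groups implies).

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of score lists."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    means = [sum(g) / len(g) for g in groups]
    grand_mean = sum(sum(g) for g in groups) / n_total
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between, df_within = k - 1, n_total - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Tiny invented example with three groups of three scores each.
f_small, df_b_small, df_w_small = one_way_anova_f(
    [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
)
print(f_small, df_b_small, df_w_small)

# Five groups with the male group sizes read from Table 6 (scores invented).
sizes = (72, 119, 25, 26, 12)
fake_groups = [[float(j % 5) for j in range(n)] for n in sizes]
_, df_b, df_w = one_way_anova_f(fake_groups)
print(df_b, df_w)
```

The second call reproduces the degrees of freedom given in the note to Table 6 for males (4/249), which supports reading the table as five age groups.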


Discussion 


Given the differences in populations
sampled in England and America and the additional
items used in Form III on the American
sample, the amount of cross-national and
cross-sex correspondence in the SSS factors is
high. Even the Boredom Susceptibility
factor, which had not shown cross-sex reliability
in the American sample, did show such
reliability in the English one. The data argue
strongly for the meaningfulness of the factor
scales of the SSS Form IV. This is not to say
that we can generalize to other cultures. The 
status of the factors in translated forms of 
the SSS is still an open question. 

On the basis of the factor stabilities, we 
were able to construct a new shorter Form V 
of the SSS, with a total score balanced for the 
four factors. This new form has the advantage 
of reducing the interscale correlations between 
component factor scores with little loss in 
reliability of these scores. Form V should 
prove useful in further research on sensation 
seeking on both sides of the Atlantic. 

Although there were educational differences between the younger American
and English samples that were compared, education has not proved to be a
highly significant factor in the SSS (Farley & Farley, 1967; Kish & Busse,
1968). The English and American males did not differ on the total score on
Form V, but the pattern on the subscales was different, with Americans
scoring higher on Experience Seeking and the English scoring higher on
Boredom Susceptibility. Experience seeking seems to represent a style of
life common in the 1960s that is still an influence in America in the 1970s
but it is apparently not as important in England. The ES scale was the only
one that did not show any hereditary influence in the preliminary study by
Buchsbaum as reported in Zuckerman (1974). Experience seeking may have been
most influenced by the educational differences between the two samples.


Table 3
Scale Reliabilities on Sensation-Seeking Scale Forms IV and V

                          Form IV                                  Form V
                English(a)          American(b)         English(a)          American(c)
Scale           M         F         M         F         M         F         M         F
General         [illeg.]  [illeg.]  [illeg.]  .81       -         -         [illeg.]  [illeg.]
TAS             .83       .84       .85       .85       .81       .82       [illeg.]  [illeg.]
ES              [illeg.]  [illeg.]  .84       .88       .65       .67       [illeg.]  [illeg.]
Dis             [illeg.]  [illeg.]  [illeg.]  [illeg.]  [illeg.]  [illeg.]  [illeg.]  [illeg.]
BS              .62       .66       [illeg.]  .58       .65       .59       [illeg.]  [illeg.]
Total scores
  (Form V)      -         -         -         -         .83       .86       [illeg.]  [illeg.]

Note. M = males, F = females; TAS = Thrill and Adventure Seeking, ES =
Experience Seeking, Dis = Disinhibition, BS = Boredom Susceptibility. All
coefficients in the table are significant, p < .01.
(a) Alpha coefficients were used; ns = 254 males and 693 females.
(b) Split-half corrected coefficients were used; ns = 160 males and 170
females.
(c) Internal consistency coefficients (from interitem rs) were used; ns = 97
males and 122 females.


Disinhibition seemed to be less influenced by cultural differences than did
other scales. Some unpublished data on racial differences in America (Kurtz
& Zuckerman, Note 2) have shown that blacks are lower than whites on TAS and
BS but not on Dis or ES. Dis seems to be the most culture-free scale, and it
is the one most highly related to certain psychophysiological variables
(Zuckerman, 1974).

In contrast with the males, the American females were significantly higher
than the English females on the total score and on the ES and TAS subscales.
As with the males, no national differences were found on the Dis scale.


Table 4
Correlations Between Subscales on Forms IV and V

                            Form IV                                  Form V
Sensation-        English(a)          American(b)         English(a)          American(c)
Seeking scales
correlated        M         F         M         F         M         F         M         F
TAS x ES          .42       .52       .39       .21       [illeg.]  [illeg.]  [illeg.]  [illeg.]
TAS x Dis         .12       [illeg.]  .35       .21       [illeg.]  .35       .14*      [illeg.]
TAS x BS          .28       .36       .25       .28       .10*      .20       .06*      [illeg.]
ES x Dis          .54       .37       [illeg.]  [illeg.]  [illeg.]  [illeg.]  [illeg.]  [illeg.]
ES x BS           .57       [illeg.]  [illeg.]  [illeg.]  [illeg.]  [illeg.]  [illeg.]  [illeg.]
Dis x BS          .45       .50       .44       [illeg.]  [illeg.]  [illeg.]  [illeg.]  [illeg.]

Note. M = males, F = females; TAS = Thrill and Adventure Seeking, ES =
Experience Seeking, Dis = Disinhibition, BS = Boredom Susceptibility. All
correlations were significant (p < .01), unless otherwise noted.
(a) ns = 254 males and 693 females.
(b) ns = 160 males and 170 females.
(c) ns = 97 males and 122 females.
* ns.
** p < .05.


Table 5
Comparisons of American College Sample with Younger English Twin Samples
(Ages 16-19)

                          Males                                         Females                            Sex differences
          Eng(a)         U.S.(b)             Eng vs.    Eng(c)         U.S.(d)             Eng vs.    Eng        U.S.
                                             U.S.                                          U.S.
Scale     M      SD      M         SD        t          M      SD      M      SD           t          t          t
TAS       7.4    2.6     [illeg.]  2.3       1.40       5.6    3.0     6.9    2.6          3.45**     [illeg.]   [illeg.]
ES        4.1    2.1     4.8       2.0       3.03**     4.1    2.2     5.0    2.1          [illeg.]   [illeg.]   [illeg.]
Dis       6.2    [illeg.] 5.9      2.6       .86        4.1    2.7     4.7    2.7          1.71       4.59**     3.21**
BS        3.8    2.5     [illeg.]  [illeg.]  [illeg.]   2.8    2.1     3.0    1.9          .64        2.91**     .65
Total     21.5   6.7     21.6      5.7       [illeg.]   16.6   7.2     19.6   6.6          3.22**     4.58**     2.44*

Note. Eng = English, U.S. = American; TAS = Thrill and Adventure Seeking,
ES = Experience Seeking, Dis = Disinhibition, BS = Boredom Susceptibility.
(a) n = 72.
(b) n = 97.
(c) n = 106.
(d) n = 122.
* p < .05.
** p < .01.


The new Form V shows more selective sex differences than does the old Form
IV, on which males were higher than females on all of the scales (Zuckerman,
1974). On Form V the replicated (across nations) sex differences were
limited to the TAS and Dis scales. Males also scored significantly higher on
the total score in both countries. The Dis scale showed the largest sex
difference, even on Form IV. The difference on Dis can, of course, be
interpreted as reflecting different kinds of socialization experiences of
males and females in both countries. However, the finding of a relation
between gonadal hormones and Dis in a sample of American males (Daitzman,
Zuckerman, Sammelwitz, & Venkataseshu, Note 3) suggests that biological
factors may also play a role in differences on this personality dimension.


Table 6
Mean Scores of English Males and Females by Age Groups

               n               Total score         TAS                 ES                  Dis                 BS
Ages       M      F          M         F         M         F         M         F         M         F         M         F
16-19      72     106        21.5      16.6      [illeg.]  [illeg.]  [illeg.]  [illeg.]  [illeg.]  [illeg.]  [illeg.]  [illeg.]
20-29      119    250        [illeg.]  [illeg.]  [illeg.]  [illeg.]  [illeg.]  [illeg.]  [illeg.]  [illeg.]  [illeg.]  [illeg.]
30-39      25     [illeg.]   [illeg.]  [illeg.]  [illeg.]  [illeg.]  [illeg.]  [illeg.]  [illeg.]  [illeg.]  [illeg.]  [illeg.]
[illeg.]   26     89         15.8      10.7      4.3       2.6       [illeg.]  [illeg.]  4.6       2.1       3.0       2.3
[illeg.]   12     [illeg.]   [illeg.]  [illeg.]  [illeg.]  [illeg.]  [illeg.]  [illeg.]  [illeg.]  [illeg.]  [illeg.]  [illeg.]
F between
  age groups(a)              [illeg.]  [illeg.]  [illeg.]  [illeg.]  [illeg.]  [illeg.]  [illeg.]  [illeg.]  [illeg.]  [illeg.]

Note. M = males, F = females; TAS = Thrill and Adventure Seeking, ES =
Experience Seeking, Dis = Disinhibition, BS = Boredom Susceptibility.
(a) df: males = 4/249, females = 4/688.


Figure 2. Changes in sensation-seeking subscale scores as a function of age.
(SSSV = Sensation-Seeking Scale Form V.)


The decline in sensation seeking with age was predicted in the theory
formulated by Zuckerman (1969). In the chapter by Zuckerman (1974), it was
predicted that TAS and
ES would decline at faster rates than Dis and 
BS. The results showed a greater decline for 
TAS and Dis than for ES and BS. A regres- 
sion analysis showed that age accounted for 
18% of the variance for Dis, 21% for TAS, 
5% for ES, and 3% for BS. There was a 
clear linear decrease in sensation seeking with 
age on the total SSS score. 
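
The phrase "age accounted for x% of the variance" refers to the squared correlation from a regression of scale score on age. The following sketch shows that computation for a simple linear regression; the ages and scores are invented, and the study's own regression may have been specified differently.

```python
def variance_explained(x, y):
    """R squared for the least-squares line predicting y from x."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length sequences of at least 2 values")
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_residual = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_total = sum((b - my) ** 2 for b in y)
    return 1 - ss_residual / ss_total


# Invented ages and subscale scores, for illustration only.
ages = [18, 25, 33, 41, 52, 60, 67]
scores = [8, 7, 7, 5, 6, 4, 3]
print(f"{100 * variance_explained(ages, scores):.0f}% of variance explained")
```

On this reading, the reported 21% for TAS against 3% for BS means that age predicts thrill-seeking scores far better than it predicts boredom susceptibility.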

What is the basis of the increasing cautiousness and conservatism of age?
It might simply reflect the mellowing effect of accumulated experience. But
many biological changes also occur with age, including [...] activity and
diminution of gonadal hormone [...]. It can be hypothesized that the same
biological factors that are prominent in [...] the sensation-seeking
tendency.

The other part of the age postulate (Zuckerman, 1969) suggests that
sensation seeking increases from childhood to adolescence. Farley and Cox
(1971) found no increase from ages 14 to 17. A new instrument will have to
be developed for younger children in order to test this hypothesis.


Validation of the Beck Depression Inventory in a University
Population Using Psychiatric Estimate as the Criterion 


William Bumberry and J. M. Oliver 
Saint Louis University 


James N. McClure 
. Washington University 


This study investigated the utility of the Beck Depression Inventory for survey 
use in a college population by determining its concurrent validity, using psychi- 
atric rating of depth of depression as the criterion (N = 56). Interrater reli- 
ability of psychiatric estimate as measured by a Pearson product-moment cor- 
relation coefficient was .62, perhaps because the primary and secondary raters 
used different diagnostic procedures. The Pearson product-moment correlation
coefficient between the inventory and the psychiatric rating was .77. These 
findings indicate that the Beck Depression Inventory is indeed a valid instru- 
ment for use in a college population. The Pearson product-moment correlation 
coefficient between the inventory and the psychiatric estimate fell to .30 in a
second sample in which 1-14 days intervened between administration of the
inventory and the psychiatric interview (n = 27). This attenuation in subjects
who experienced a time delay is consistent with the nature of the depression
inventory as a measure of state as opposed to trait depression. The apparent
decline in measured depression additionally suggests the need for longitudinal
study to determine its course and outcome.


Depression is widely viewed as the most
frequently occurring psychic disorder among
college students. Seligman (1973) contends
that it is not only the most common of the
psychological dysfunctions among students,
but it is also increasing in frequency. Very
little information is available concerning the
prevalence of depression in college populations.

Recently, however, Oliver, Croghan, and Katz (Note 1) have estimated the
prevalence of depression in college students by administering the Beck
Depression Inventory (Beck, 1970; Beck, Ward, Mendelson, Mock, & Erbaugh,
1961) to a random sample of freshmen and sophomores at four private,
medium-sized, coeducational, urban universities. When the criterion for
depression was set at a number or intensity of symptoms associated with a
diagnosis of depression in a psychiatric population, 23% of the respondents
qualified as at least mildly depressed. These findings suggest that the rate
of depression may be as much as 50% higher in college students than in
American adults between the ages of 18 and 74, in whom the prevalence of
depression was cited as 15% in a special report on depression issued by the
National Institute of Mental Health (1973). Thus, there is a large
discrepancy between the estimates of the rate prevailing in the unselected
population reported by the National Institute of Mental Health and that
found by Oliver et al. using the Beck Depression Inventory.


The completion of this study was made possible
by the generous participation of James Halikas and
Amos Welner, both of Washington University Medical
School.

Requests for reprints should be sent to William
Bumberry, who is now at 1970 Latham, Apartment
44, Mountain View, California 94040.


The Beck Depression Inventory is a clinically
derived self-report inventory of depression
designed for use in psychiatric populations.
It was constructed to assess the current
depth of depression, whether or not depression
is viewed as the primary diagnosis, and
consists of 21 items covering affective, cognitive,
motivational, and physiological areas
of depressive symptomatology. The range of
possible scores extends from 0 to 63, with
scores of 0-9 being categorized by Beck as
not depressed, 10-15 as mildly depressed,
16-23 as moderately depressed, and 24-63
as severely depressed.
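
The cutoffs just listed translate directly into a lookup. This is a minimal sketch; the function name is hypothetical, and the category labels are those given in the text.

```python
def beck_category(score):
    """Map a Beck Depression Inventory total (0-63) to Beck's category."""
    if not 0 <= score <= 63:
        raise ValueError("total score must lie between 0 and 63")
    if score <= 9:
        return "not depressed"
    if score <= 15:
        return "mildly depressed"
    if score <= 23:
        return "moderately depressed"
    return "severely depressed"


for total in (4, 12, 20, 31):
    print(total, beck_category(total))
```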

Two validation studies were reported by
Beck et al. (1961); they were conducted in
psychiatric populations, with psychiatric assessment
of depth of depression comprising
the criterion measure with which the depression
inventory was compared. Biserial correlation
coefficients of .65 and .67 were obtained
(ns = 226 and 183, respectively).
These validation studies may be largely responsible
for the rather widespread application
that the depression inventory has recently
enjoyed, having already been used as a criterion
measure in well over 100 published
studies (Beck & Beck, 1972). In spite of its
popularity, however, it should be noted that
the depression inventory was designed for
use with psychiatric populations and has not
been demonstrated to be a valid measure of
depression for survey use in a college or any
other unselected population.

The purpose of this study was to assess 
the applicability of the Beck Depression In- 
ventory for survey use in this population. 
The method consisted of determination of the 
concurrent or diagnostic validity of the de- 
pression inventory, using psychiatric rating of 
depth of depression as the criterion measure. 


Method 
Subjects 


Subjects in this study consisted of 56 university
students (27 males and 29 females) at both undergraduate
and graduate levels from two medium-sized,
private, coeducational, urban universities (Washington
University and Saint Louis University) located
in the same midwestern city. The Scholastic Aptitude
Test scores at these two institutions averaged
550 Verbal, 500 Math and 500 Verbal, 500 Math,
respectively.

An attempt was made (although unsuccessfully)
to include an equal number of subjects from each
of the four categories of depression as designated
by Beck. Since 77% and 16% of randomly selected
lowerclassmen can be expected to score in Beck's
normal and mild categories, respectively (Oliver et
al., Note 1), subjects in these two categories (ns =
16 and 19, respectively) were obtained through a
random sampling procedure at Saint Louis University.
The university phone directory, constituting the
registrar's list, provided the pool from which subjects
were randomly selected. Each subject from
this pool was personally contacted.

Since only 6% and 1% of randomly sampled lowerclassmen
can be expected to fall in the moderately
and severely depressed categories, respectively, this 
strategy had to be supplemented to procure a suffi- 
cient number of subjects in these categories (ns = 
17 and 4, respectively). More extensive sampling was 
obtained by administering the depression inventory 
to various large undergraduate and graduate classes. 
Finally, with the cooperation of the psychiatric di- 
vision of the Samuel B. Grant Health Service at 
Washington University, students referred for psy- 
chiatric evaluation with and without depression were 
referred to the study. These subjects were adminis- 
tered both the depression inventory and the psy- 
chiatric interview for depth of depression before any 
other diagnostic or therapeutic procedures were 
introduced. 


Materials 


The Beck Depression Inventory as originally published
(Beck et al., 1961) was used. After the subjects
had completed the entire inventory, the duration
of each symptom was also obtained.

The structured interview used by the primary psy- 
chiatrist was developed and has been used for over 2 
decades by the Washington University Department 
of Psychiatry. Interrater reliability of information 
derived from this interview has been discussed by 
several investigators (Feighner, Robins, Guze, Woodruff,
Winokur, & Munoz, 1972; Helzer et al., 1977).
Data obtained from this interview form the basis 
of the forthcoming revision of the APA Diagnostic 
and Statistical Manual (APA, in press) and of nu- 
merous publications concerning a wide range of dis- 
orders from anxiety neurosis to organic brain syn- 
drome and including depression. 

Those psychiatrists providing estimates for the 
purpose of assessing interrater reliability, however, 
did not use this structured interview but proceeded 
with a diagnostic clinical interview however they
saw fit.


Procedure 


Requirements for participation in the study were 
briefly explained, informed consent was obtained, 
and confidentiality was promised. The inventory was 
then administered to all subjects, either individually 
to those students randomly sampled, in a group for 
the large classes, or self-administered to students 
referred from the Health Service, before subjects 
participated in a psychiatric interview. 
Table 1

Percentages of Interrater Pairs of Psychiatric
Ratings as a Function of Degree of Agreement
Between Psychiatrists

Degree of agreement         Beck (1970)    Current study

Complete                        56             61.5
1 degree of disparity           41             38.5
2 degrees of disparity           2              0
3 degrees of disparity           1              0

Note. The data in column 1 are from Depression:
Causes and treatment (1st ed.) by Aaron T. Beck,
1970, p. 194. Copyright 1967 by Aaron T. Beck.
Reprinted by permission.


Subjects were selected for participation in the 
psychiatric interview according to depth of depres- 
sion as measured by the depression inventory, but 
all interviews were conducted by interviewers blind 
to depression inventory scores. Structured interviews 
as described above were conducted to assess the 
current depth of depression, scored in four levels 
as none, mild, moderate, or severe according to the 
criteria developed by Beck for his validation studies 
(Beck et al., 1961). The interviews were conducted
by a board-certified psychiatrist (the third author) 
with 20 years of experience who maintains both a 
student practice through the Health Service and an 
adult practice.

Measures of interrater reliability were obtained
at the beginning and end of the study. Both measures
were obtained by comparing the primary ratings
with those made by one of two additional
board-certified psychiatrists, the first (J.H.) with
9 years of clinical experience and the second (A.W.)
with 4.


Results 


Interrater reliability between the primary
psychiatrist providing criterion ratings for
depth of depression and the two other psychiatrists,
as measured by a Pearson product-moment
correlation coefficient, was .62. The
proportion of pairs of psychiatric ratings
manifesting varying degrees of agreement between
psychiatrists, from complete agreement
through complete disagreement across the
four levels of depression, obtained in the
present study and those reported by Beck
(1970) are presented in Table 1 for the purposes
of comparison.

The Pearson product-moment correlation
coefficient between the scores on the depression
inventory and the primary psychiatrist’s
ratings of depth of depression, with no depression
scored as 0, mild as 1, moderate as
2, and severe as 3, was .77 (SE = .14, p <
.001).

For purposes of comparison with Beck’s
original validation data, the Pearson biserial
correlation coefficient was calculated on the
same data, after reducing the number of criterion
categories from four to two, normal
and mild as opposed to moderate and severe.
The corresponding coefficient was .79 (SE =
.17, p < .001). Figures summarizing the relationship
found in the present study together
with those originally reported by Beck (1970)
are provided in Table 2.

The dichotomizing of psychiatric ratings 
for the purpose of calculating a Pearson bi- 
serial correlation coefficient produced a rather 
large disparity between numbers of subjects 
in the normal and mild category on the one 
hand and the moderate and severe category 
on the other. In this case, therefore, this sta- 
tistic unfortunately yields a large standard 
error of .17. In the treatment of data origi- 
nating in this study then, the product-moment 
correlation, with its smaller standard error, is 
preferred. 

There was no significant association between
psychiatric rating of depth of depression
and either sex, χ²(3) = .95, p = .81, or
grade level, χ²(3) = .85, p = .84.

Beck Depression Inventory means and standard
deviations according to the four categories
of psychiatric ratings of depth of depression
are furnished in Table 3.


Table 2

Correlations Between the Beck Depression
Inventory and Psychiatric Rating of Depth
of Depression

Study         n      Coefficient         r       SE      p <

Beck
  Study 1     226    Biserial           .65     .068    .01*
  Study 2     183    Biserial           .67     .059    .01*
Current       56     Biserial           .79     .17     .001
              56     Product-moment     .77     .14     .001

Note. Data in the first two rows are from Depression:
Causes and treatment (1st ed.) by Aaron T. Beck,
1970, p. 197. Copyright 1967 by Aaron T. Beck.
Reprinted by permission.

* Actual computed probability is less than .001.

Use of Beck’s recommended cutting point
of 10 between normal and depressed scores
resulted in 2 false positives and 7 false negatives
(p = .18), whereas retention of his suggested
category boundaries produced 27 cases
that were inaccurately categorized, of which
15 were overestimates and 12 were underestimates
(p = .70).
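
The text does not name the test behind these two probabilities. As a rough check, and assuming a two-sided exact binomial (sign) test of whether errors fall equally in both directions, the reported values can be reproduced from the counts alone. The function name below is illustrative and not taken from the study.

```python
from math import comb


def sign_test_p(a: int, b: int) -> float:
    """Two-sided exact binomial (sign) test for counts a vs. b under p = 0.5.

    Assumption: the study does not say which test was used; this is one
    test that happens to agree with the reported probabilities.
    """
    n = a + b
    k = min(a, b)
    # Probability of a split at least as lopsided as the observed one, one tail.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# 2 false positives vs. 7 false negatives at the cutting point of 10.
p_cutoff = sign_test_p(2, 7)
# 15 overestimates vs. 12 underestimates across the category boundaries.
p_categories = sign_test_p(15, 12)
print(round(p_cutoff, 2), round(p_categories, 2))
```

Under this assumption neither split departs reliably from an even division of errors, which is the sense in which the inventory shows no systematic bias.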

A second, smaller sample (n = 27) was run
consisting of cases in which a time interval
ranging from 1 to 14 days intervened between
administration of the depression inventory
and psychiatric interview. The Pearson product-moment
correlation coefficient between
inventory score and psychiatric estimate
within this second sample with delay was only
.30 (SE = .21, p < .05). The Pearson biserial
correlation coefficient for this same second
sample was not meaningful, since the standard
error was .36.


Discussion 


The primary finding of this study is that
the Beck Depression Inventory does indeed
appear to be a valid instrument for the measurement
of depression in a university population
when psychiatric estimate of depth of
depression is considered to be the standard.
The magnitude of the correlation is, perhaps, a
matter of some surprise, since the pencil-and-paper
test was devised by Beck to assess depression
conceptualized as a heterogeneous
set of dysfunctions, while the psychiatric
estimate was rendered by a psychiatrist
who endorses an organic approach.
This suggests that depression
is of sufficient salience and stability
to be detected by various approaches
to pathology. When the depression inventory
is used solely as a measure of the current
depth of depression, the proportions of falsely
identified cases clearly indicate that the depression
inventory is not associated with any
systematic tendency to overestimate or underestimate
depression.

There was a strong association between depression
inventory scores of 15 and above and
a positive response to Item I, which requests


Table 3

Beck Depression Inventory Means and
Standard Deviations as a Function of
Psychiatric Estimate of Depth of Depression

             Psychiatric estimate of depth of depression

Statistic    Normal     Mild      Moderate    Severe

n              16        19         17           4
M             3.94     14.10      22.18       19.50
SD            4.46      5.99       8.19        7.55

Note. N = 56.


information on thoughts of suicide, χ²(1) =
33.65, p < .001, suggesting that a score of 15
or more warrants individual attention to this
item. Many subjects, not only those rated as
severely depressed by psychiatric estimate, responded
positively to the self-harm item.

A major difficulty in this study relates to 
the question of generalizability, since the cri- 
terion measure of psychiatric estimate of 
depth of depression consists of ratings pro- 
vided by the primary psychiatrist. Clearly, 
it would have been desirable to have several 
psychiatrists participating in the study. The 
obtained measures of interrater reliability 
were directed toward this issue, with the re- 
sults at a generally acceptable level. The 
somewhat modest correlations between pairs 
of psychiatric ratings perhaps derive from the 
fact that the primary and secondary psychi- 
atrists used differing interview procedures, the 
former following a structured format and the 
latter using personally determined clinical in- 
terviews. The marked contrast between the 
theoretical orientations of the primary rater 
and the originator of the depression inventory, 
however, suggests that a similar correlation 
between the inventory and the psychiatric 
estimate might well have been obtained had 
a number of psychiatrists of varying theoreti- 
cal persuasions been used to provide criterion 
ratings.

A second deficiency in this investigation was 
the small number of subjects who were judged 
as severely depressed by psychiatric estimate. 
Sampling procedures were designed to identify 
this portion of the population, but unfortu- 
nately an adequate number was not secured. 
The peculiar reversal of the size of the
mean scores on the depression inventory be- 
tween the groups judged moderate and se- 
vere by psychiatric estimate shown in Table 
3, with the mean for the moderate exceeding 
that of the severe category, suggests that both 
the depression inventory and the psychiatric 
ratings could be invalid at higher levels of 
depression. The small number of subjects in 
the severe range, however, indicates that this 
effect may be attributable only to errors in 
sampling and precludes drawing further in- 
ferences. This remains a question for further 
study. 

The relatively small number of subjects in 
the moderate and severe categories combined 
provided little data relevant to the typically 
reported finding of a significantly higher pro- 
portion of females than males undergoing 
depressions of clinical significance. According 
to this sample, undetected depressive episodes 
occurring in the unselected university popula- 
tion are equally likely to afflict both males and 
females. 

The remarkable accuracy of prediction of
psychiatric estimate of depth of depression
by the depression inventory when psychiatric
interview is administered on the same day as
the inventory indicates that the inventory
elicits little error variance when psychiatric
rating is accepted as the criterion. Since in a
significant majority of the cases comprising
the second sample with time delay (20 out
of 27, p < .025), the psychiatric rating placed
the subject in a milder category of depression
than did the depression inventory, it may be
inferred from the accuracy of the instrument
that in this subsample the depression inventory
scores had declined during the intervening
time period. This apparent tendency of
depression inventory scores to diminish over
the passage of time may reflect either spontaneous
remission or the subject’s attempt to
present a more favorable impression during
the second assessment.

With a view toward providing a measure
of the “steady state,” the duration of each
symptom of depression that was acknowledged
was obtained, with responses scored 1-4 for
less than 2 weeks, 2 weeks to less than 6
months, 6 months to any number of years,
and “always,” respectively. These duration
scores were used as weights by which the item
or symptom scores were cross-multiplied, and
the total of cross-products provided a composite
score representing both severity and
duration of depression. It was anticipated that
this composite score might not only indicate
those subjects who had been depressed but
might also predict those who would remain
depressed. Though composite scores reflecting
self-reported duration correlated very highly
with depression inventory scores (r = .94, p
< .0001), they proved less accurate than the
“current state” scores as predictors of psychiatric
estimate of depth of depression (r =
.40, p < .01). These findings indicate that
self-reported duration of symptoms does not
provide a useful prognostic index about the
future course of depression.
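
The composite score described above is a sum of cross-products of item severity and duration weight. A minimal sketch of that arithmetic follows; the function name and the example responses are hypothetical and serve only to illustrate the weighting, not to reproduce any subject's data.

```python
# Duration weights as described in the text:
# 1 = less than 2 weeks, 2 = 2 weeks to less than 6 months,
# 3 = 6 months to any number of years, 4 = "always".
VALID_WEIGHTS = {1, 2, 3, 4}


def composite_score(item_scores: list[int], duration_weights: list[int]) -> int:
    """Sum of (item severity x duration weight) across inventory items.

    Items scored 0 (symptom not acknowledged) contribute nothing,
    whatever weight is paired with them.
    """
    if len(item_scores) != len(duration_weights):
        raise ValueError("each item score needs exactly one duration weight")
    if any(w not in VALID_WEIGHTS for w in duration_weights):
        raise ValueError("duration weights must be 1, 2, 3, or 4")
    return sum(s * w for s, w in zip(item_scores, duration_weights))


# Hypothetical responses to four items (illustration only).
example_items = [2, 0, 1, 3]
example_durations = [1, 1, 3, 2]
example_total = composite_score(example_items, example_durations)
print(example_total)
```

Because every acknowledged symptom is multiplied by a weight of at least 1, the composite necessarily rises with the raw inventory score, which may help explain the very high correlation between the two.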

Although the self-reported duration of 
symptoms failed to provide an accurate mea- 
sure of steady state depression, the low cor- 
relation between scores on the depression 
inventory and the psychiatric rating for sub- 
jects for whom a time interval separated ad- 
ministration of the inventory and the inter- 
view raises the issue of the degree to which 
the state measured by the inventory is stable. 
These findings suggest the need for longi- 
tudinal observations of depressed states in 
this and other general adult populations to 
determine their course and outcome. 


Reference Note 


1. Oliver, J. M., Croghan, L. M., & Katz, N. W. 
Prevalence of depression in freshmen and sopho- 
mores at four universities as measured in a ran- 
dom sample by the Beck Depression Inventory. 
Manuscript submitted for publication, 1976. 
Projective and Repressive Styles of Processing
Aversive Information 


Alfred B. Heilbrun, Jr. 
Emory University 


The relationship between projective and repressive styles of dealing with aver- 
sive information about the self as well as sex differences in the use of these 
two defensive styles were studied in normal subjects. An inverse relation be- 
tween the use of projective attribution and repressive forgetting of negative 
traits was found independent of sex, supporting the assumption of individual 
differences in the utilization of the major defense mechanisms. Females used 
repression more than males, but no sex difference in projection was shown. Implications
of these findings for personality theory and research methodology
are presented.


Defense mechanisms, the cognitive opera- 
tions by which people protect themselves from 
psychological threat, have played a major role 
in psychodynamic formulations of psycho- 
pathology since Freud stressed the centrality 
of repression (Freud, 1957) and projection 
(Freud, 1938, 1956) within psychoanalytic 
theory. A great deal of research has been directed
toward verification of repression or
projection as clinical constructs or toward 
the testing of various theories relating to their 
occurrence or effects. However, reviews of 
the research literature on repression (Holmes, 
1974; Mischel, 1976), the selective forgetting 
of distressing information, have led to the 
same pessimistic conclusion. There is little 
firm evidence for the validity of this con- 
struct. The verdict of review concerning the 
demonstrability of projection as a defense
mechanism was somewhat less negative. 
Holmes (1968) utilized a fourfold typology 
to consider the research on the attribution of 
traits to others as a way of denying or other- 
wise ameliorating the aversive effects of an 
unacceptable self-characteristic. Was the sub- 
ject aware of the unacceptable trait or not, 
and was the attributed trait the same or different
from the unacceptable trait? Holmes
concluded that there was no experimental evidence
for unconscious projection of unacceptable
traits, the classical form of projection
proposed by Freud. He did find support
for the attribution of traits to others given
an awareness of the trait by the individual.

Requests for reprints should be sent to Alfred B.
Heilbrun, Jr., Department of Psychology, Emory
University, Atlanta, Georgia 30322.

The question remains why it has proven
so difficult to develop an impressive backlog 
of experimental evidence for such popular 
psychodynamic constructs as repression and 
projection. The problem may reside in the 
limitations of theory relevant to the defense 
mechanisms. It is also possible that the prob- 
lem is methodological. It is difficult to devise 
experimental paradigms that are convincing 
tests of these constructs and that are suffi- 
ciently precise so that the results are not vulnerable
to alternative explanation.
Yet another possible reason for the equiv- 
ocal status of repression and projection re- 
search is the nature of inference frequently 
used in such studies. There is a tendency for 
results to be considered in dichotomous terms; 
either subjects, considered collectively, dem- 
onstrate behavior defined as defensive or they 
do not. If a sufficient number do not, then 
the results fail to support the usefulness of 
the clinical construct. However, defenses as 
cognitive operations should be expected to 
occur in varying degrees; for example, some
might use repression consistently in dealing 
with threatening material, others might occasionally
engage in repression, and yet others
might never deal with threat by using selec- 
tive forgetting. Furthermore, we might expect 
individuals to have different hierarchies of 
defense against threat. One person could be 
characterized as having a primary repressive 
style but have other defenses available, 
whereas another person might maintain primarily
a projective style and less frequently
deal with threat in other ways.
These observations suggest some methodo- 
logical innovations in investigating the de- 
fense mechanisms. For one, the experimental 
method should provide the opportunity to 
assess more than one defensive operation in 
dealing with the same material. The subject
who fails to demonstrate a particular mode 
of defense (and in previous studies might be 
thought of as disconfirming that construct) 
may be shown to be neutralizing threat by a 
different tactic higher in the hierarchy of de- 
fenses. Said another way, this procedure more 
likely would allow for the demonstration that 
people have different characteristic styles of
dealing with aversive information. Second, the
experimental method should allow for the systematic
study and analysis of sex differences
in characteristic defensive behaviors. The importance
of sex differences in ego defense
is belied by the fact that they remain
virtually unexamined in children and, as far as
I can discern, in adults as well. If such basic
sex differences exist, it would not only be of
interest but should be considered
in the sampling procedures of subsequent
research involving the defense mechanisms.

The present experiment provided subjects
with the opportunity to deal with aversive
information relevant to themselves
in either a projective or repressive defensive
style. No constraints were imposed
regarding whether neither, one, or both defenses
were used. Two specific issues were
investigated: (a) Given the option of either
defending against threatening information by
projection or repression, will normal subjects
use one defensive style and not the
other? and (b) Do males and females differ
in the extent to which they use projective and
repressive defensive styles?


Method 
Subjects 


The 47 subjects in this experiment were volunteers 
from a large undergraduate class at Emory Univer- 
sity. Of this number, 30 were males and 17 were 
females. The mean ages were 18.5 years for the 
males and 18.3 years for the females. 


Measures 


The basic instrument from which the evaluative 
materials were taken for this experiment was the 
Adjective Check List (Gough & Heilbrun, 1965), a 
300-item compilation of behavioral adjectives. 

The adjectives on the Adjective Check List have 
been rated by college students as to their favora- 
bility/unfavorability if used to describe a person 
(Gough, 1955). Seventy-five adjectives have been 
identified as favorable, 75 as unfavorable, and the 
remaining 150 adjectives are considered neutral in 
an evaluative sense. A set of 22 adjectives was se- 
lected from each of these three evaluative categories, 
and these 66 words comprised the critical word list 
in the present study. Since the projection and
repression scores, to be described later, depend on the
reliability of these categories, new ratings were obtained
from the subjects in this experiment to verify
category meanings. The mean rating for each
evaluative term (given in Table 1) is based on a 
7-point scale ranging from highly unfavorable (score 
=1) through neutral (neither favorable nor un- 
favorable—score = 4) to highly favorable (score = 
7). Generally speaking, Gough’s original groupings 
held up very well. The favorable and unfavorable
categories of evaluative terms, on which both de- 
fensive scores were based, provided nonoverlapping 
values. The rating for each term within these two 
categories fell on the appropriate qualitative side of 
the neutral point, and the grand mean rating for 
the favorable terms (5.84) was over three rating intervals
from that for the unfavorable terms (2.46)
on a 6-interval scale.


Procedure 


Subjects were seen for two sessions. The first was
a small group session during which the subjects
were asked to rate the favorability or unfavorability
of the words comprising the critical word list. These
66 behavioral adjectives were randomly dispersed
throughout a rating form made up of 120 adjectives
taken from the Adjective Check List. The additional
items were added to reduce the salience of
Table 1

Mean Rating Values for Words on the Critical Word List

                           Qualitative category

Unfavorable                Neutral                  Favorable
Adjective          M       Adjective        M       Adjective         M

apathetic         2.74     assertive       4.91
arale             2.30     cautious
                   .11     cool
                           contented       4.91     capable
dodaly            2.40     determined      5.68     dependable
egotistical       2.68     excitable       4.64
fickle            2.60     formal          3.57
frivolous         3.34     forgetful       2.91     forgiving
hostile           2.06     hurried         3.32     helpful
impatient         3.11     informal        5.04     inventive
indifferent       2.94     idealistic      4.83
intolerant        2.16     initiative      5.51     insightful
moody             3.49     modest          4.96     mature
obnoxious         1.72     outspoken       4.06     organized
prudish           2.49     precise         5.40     patient
resentful                  reserved        4.23
slipshod          2.55     stubborn        3.04     sociable
selfish           1.91     serious         4.90     sincere
stingy            2.33     stolid          4.17     stable
tactless          2.04     trusting        5.89     tolerant
undependable      1.72     uninhibited     4.83     understanding
weak              1.94     wary            4.19     warm

Note. Mean ratings are based on N = 47; values range from highly unfavorable (1) to highly favorable
(7). The qualitative category is based on Gough’s original groupings.


the critical words to be subsequently used in memory
tasks.


Subjects were then seen individually in a labora- 
tory session within approximately the next week. 
During the initial part of the laboratory session, the
subject was told that a series of words taken from 
a survey of mothers who had been asked to de- 
scribe their college-age children was going to be 
presented. The subject was instructed to listen care- 
fully and to remember as many of the words on 
the list as possible. These instructions regarding maternal
evaluation, while not made specific to the
subject in question because of ethical constraints
(i.e., they were not offered as what the subject’s
mother thought of him or her), have proven suc- 
cessful in creating an evaluative set in several pre- 
vious studies by me. I (Heilbrun, 1973a) have re- 
viewed these studies elsewhere. The taped list pre- 
sented the words in repeating triads of favorable, 
unfavorable, and neutral. The words were spoken in
an adult female voice at 5-sec intervals and were
presented at a clearly audible volume. Earphones
were provided to preclude irrelevant auditory stimu- 
lation. 

As soon as the final word had been presented, the
subject was given a copy of the Adjective Check 
List and was asked to check those words that had
appeared on the list. The subject was informed that
there were 66 words on the list and that exactly
that number was to be checked even if it required
guessing. Once this was done, the female graduate
student experimenter offered the following instructions:


Some of the words used to describe college
sons (daughters) may be descriptive of you, and
some may not. I want you to go through the
words you checked and indicate for each one
whether you feel it is more descriptive of you than
of most college males (females) or whether it is
more descriptive of most college males (females)
than it is of you.


Subjects then went through the 66 checked words
and indicated by using different colored pencils (with
a color code continuously available) which of these
two alternatives obtained for each term.
Sent to a room down the hall, the subject was
met by a second experimenter. This male graduate
student administered a filler questionnaire to occupy
the subject’s time during the next 30 minutes. When
the questionnaire had been completed or 25 minutes
had elapsed (whichever came first), the subject was
told to return to the original room and wait outside
in the chair until recalled by the female experimenter.
This experimenter recalled the subject after
exactly 30 minutes had elapsed from the time the 
subject had left her research room. The purposes 
of this filler period were twofold: (a) to be sure 
that sufficient time had elapsed since last exposure 
to the critical words so that subsequent recognition 
would qualify as long term and (b) to occupy the 
subject on a distractor task for as much of this time 
as possible to avoid rehearsal of the terms previously 
encountered. 

The final part of the laboratory session required 
the subject to again recognize as many of the words 
heard on the tape from among those on a new Ad- 
jective Check List. Exactly 66 check marks were 
requested. When this was completed, the subject 
was thanked and was urged not to discuss the ex- 
periment with anyone else. 


Results 


Projection 


Projection was estimated from a score that
considered the extent to which the subject
attributed unfavorable evaluative terms, correctly
recognized from the maternal list, to
others rather than to self. The projection term
in this score was the number of correctly
recognized unfavorable terms attributed to
others minus the number of correctly recognized
unfavorable terms attributed to self.
However, this term, taken alone, would not
distinguish between the person who denies
unfavorable traits and defensively attributes
them to others and the competent individual
who realistically rates peers as more adequately
described by the unfavorable terms. Accordingly,
a correction term (the number of correctly
recognized favorable terms attributed
to self minus the number of correctly recognized
favorable terms attributed to others) was subtracted
from the projection term to compensate
for self-esteem. The more positive
the difference between the projection and
correction terms, the more the selective attribution
of negative characteristics to others
exceeds the selective endorsement of positive
characteristics. In other words, the higher
the score, the more the subject is processing
potentially self-relevant negative
information in an extraordinary way relative
to positive traits and in a style expected
of projection. Lower scores, whatever they
indicate about the person, clearly contraindicate
processing of negative traits in a projective
style.
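
The projection score just described is a difference of two difference terms. A minimal sketch of the arithmetic, with hypothetical counts and illustrative names that do not come from the study, might look like this:

```python
from dataclasses import dataclass


@dataclass
class AttributionCounts:
    """Counts of correctly recognized terms, split by valence and target."""
    unfavorable_to_others: int
    unfavorable_to_self: int
    favorable_to_self: int
    favorable_to_others: int


def projection_score(c: AttributionCounts) -> int:
    # Projection term: unfavorable terms given to others minus those given to self.
    projection_term = c.unfavorable_to_others - c.unfavorable_to_self
    # Correction term: favorable terms given to self minus those given to others.
    correction_term = c.favorable_to_self - c.favorable_to_others
    # The correction is subtracted so that generally high self-regard
    # is not by itself scored as projection.
    return projection_term - correction_term


# Hypothetical subject (illustration only).
example = AttributionCounts(
    unfavorable_to_others=12,
    unfavorable_to_self=6,
    favorable_to_self=10,
    favorable_to_others=8,
)
example_projection = projection_score(example)
print(example_projection)
```

In this made-up case the subject disowns unfavorable terms (a projection term of 6) more than favorable terms are claimed (a correction term of 2), so the score stays positive.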
This projection scoring procedure was used
in a previous study in which theoretically 
predicted relationships between maternal ex- 
perience and defensive style were obtained 
(Heilbrun, 1972). More recently, further evi- 
dence of the heuristic value for this index of 
projection was obtained by showing that valid 
predictions from a projective test (the The- 
matic Apperception Test) could be made for 
projectors but not for nonprojectors (Heil- 
brun, in press). 

A comparison between the mean projection 
score for males (1.63, SD = 4.60) and fe- 
males (3.00, SD = 4.90) revealed no statis- 
tically significant difference between the two 
sexes, t(45) = .94, p > .30. Males and females
did not differ in their utilization of a
projective style of processing aversive evalua- 
tive information. 
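
As a rough check on this comparison, the t statistic can be recomputed from the reported means, standard deviations, and group sizes (30 males, 17 females). The sketch below assumes a pooled-variance independent-samples t test, which the text does not state explicitly; it gives a value near .96, close to the reported .94, and the small gap may reflect rounding of the summary statistics or a different variance formula.

```python
from math import sqrt


def pooled_t(m1: float, sd1: float, n1: int,
             m2: float, sd2: float, n2: int) -> tuple[float, int]:
    """Independent-samples t from summary statistics, pooled variance.

    Assumption: SDs are sample standard deviations (n - 1 denominator).
    Returns (t, degrees of freedom).
    """
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m2 - m1) / se, df


# Males: M = 1.63, SD = 4.60, n = 30; females: M = 3.00, SD = 4.90, n = 17.
t_value, df = pooled_t(1.63, 4.60, 30, 3.00, 4.90, 17)
print(round(t_value, 2), df)
```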


Repression 


The repression score reflected the extent to 
which the person failed to correctly recognize 
unfavorable terms from the critical list dur- 
ing the second retention test after correctly 
recognizing them during the first retention 
test and attributing them to himself or herself. 
In other words, at an earlier time the sub- 
ject had been aware of aversive self-charac- 
teristics but subsequently failed to remember 
them. Unfavorable terms from the critical list
assigned as self-characteristic will be referred 
to as “repression relevant.” Two corrections 
were required within this score before assum- 
ing that the failure to recognize repression- 
relevant words represented a repressive style 
of processing negative information. 

1. The number of repression-relevant words
that the person fails to remember during the 
second retention test should be considered in 
terms of the original number of such words. 
The subject who has self-attributed 10 of 
the mothers’ unfavorable terms and forgets 
2 of these (80% recognition) is showing less 
forgetting than the subject who forgets 2 of 
these terms from an original number of 4 
(50% recognition). Accordingly, a percentage
(number of repression-relevant terms recognized on the
second test divided by the original number of
repression-relevant terms) was used to correct
for the size of the original pool of such terms.

2. The number of repression-relevant terms
that the person fails to remember should be
proportionally greater than would be expected 
relative to that person’s more general long- 
term recognition rate. This correction was 
achieved by using all other correctly recog- 
nized terms from the initial memory task, 
those having no theoretical relevance to re- 
pression, as a general reference group and de- 
termining their retention percentage on the 
second recognition task. This general refer- 
ence group included words that were correctly 
recognized initially and that were unfavora- 
ble and attributed to others, neutral and at- 
tributed to self or others, or favorable and 
attributed to self or others. 

The final repression score was percent re- 
tention from general reference group minus 
percent retention from repression-relevant 
group. Higher scores in a positive direction 
suggest repressive forgetting of aversive in- 
formation, whereas scores near zero or nega- 
tive scores contraindicate repression and 
might be taken to suggest sensitization to 
aversive information. 
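
The two corrections and the final difference score can be made concrete with a short computational sketch. It is illustrative only and is not the scoring program used in the study: the function names and the term labels are hypothetical, and it assumes that each subject's recognized terms are available as simple collections of labels.

```python
def retention_percent(recognized_first, recognized_second):
    """Percent of initially recognized terms that are recognized again
    on the second retention test."""
    first = set(recognized_first)
    if not first:
        raise ValueError("no initially recognized terms in this group")
    retained = first & set(recognized_second)
    return 100.0 * len(retained) / len(first)


def repression_score(relevant_first, reference_first, recognized_second):
    """Percent retention in the general reference group minus percent
    retention in the repression-relevant group (hypothetical helper)."""
    general = retention_percent(reference_first, recognized_second)
    relevant = retention_percent(relevant_first, recognized_second)
    return general - relevant


# Hypothetical subject: 4 repression-relevant terms, 20 reference terms.
relevant = ["u1", "u2", "u3", "u4"]
reference = ["r%d" % i for i in range(20)]
second_test = ["u1", "u2"] + reference[:18]  # 2 of 4 and 18 of 20 retained

print(repression_score(relevant, reference, second_test))  # 90.0 - 50.0 = 40.0
```

In this hypothetical case the subject retains 50% of the repression-relevant terms and 90% of the reference terms, so the score is 40, which under the scoring rule described above would point toward repressive forgetting.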

These percentage correction score distribu- 
tions for males and females were very similar 
and approximated normal distributions. The 
latter observation was confirmed by chi-square 
goodness-of-fit tests (Maxwell, 1961) applied 
separately to the distribution of male, χ²(2)
= 2.93, p > .20, and female, χ²(2) = 2.23,
p > .30, repression scores. Neither distribu-
tion deviated statistically from a normal curve.

The projection scores were split indepen- 
dently by sex at their median values (two for 
males and three for females), thereby defin-
ing high and low projectors. Repression scores
(see Table 2) were then analyzed by means
of a 2 × 2 factorial (Level of Projection ×
Sex) analysis of variance for unequal cell
frequencies (Winer, 1962). There were two
significant main effects. Female subjects had
higher repression scores than males, F(1, 43)
= 10.48, p < .005, and low projectors had
higher repression scores than high projectors,
F(1, 43) = 9.56, p < .005. No Sex × Pro-
jection Level interaction was obtained, F(1,
43) = 1.02, p > .30.
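
The median split that defines high and low projectors can be illustrated with a brief sketch. This is an assumption-laden illustration rather than the authors' procedure: the text does not say how scores falling exactly at the median were assigned, so the sketch arbitrarily places them in the low group, and the function name and subject labels are hypothetical.

```python
from statistics import median


def median_split(projection_scores):
    """Label each subject 'high' or 'low' relative to the group median.

    Assumption: scores equal to the median are labeled 'low'; the source
    does not state how such ties were handled.
    """
    cut = median(projection_scores.values())
    return {
        subject: ("high" if score > cut else "low")
        for subject, score in projection_scores.items()
    }


# Hypothetical projection scores for four subjects of one sex.
scores = {"s1": 1, "s2": 2, "s3": 4, "s4": 5}
print(median_split(scores))
```

Because the split was made independently by sex, a function of this kind would be applied once to the male scores and once to the female scores.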


Control Analyses 


Two control analyses of the data were con- 
ducted to lend further confidence to the meth- 
odology on which the results are based. The 
first of these concerned the effectiveness of 
the laboratory analogue that was devised to 
activate defensive styles in the subjects. Can 
it be demonstrated that the subjects were 
responding within an evaluative context in 
which arousal of defensive behavior might 
be expected? The instructions indicated that 
the evaluative terms on the tape came from 
a survey of mothers asked to describe their 
college-age sons. The same or very similar 
instructions have proven effective in eliciting 
an evaluative set in a host of previous lab- 
oratory studies (reviewed in Heilbrun, 1973a) 
as gauged by the establishment of a theo- 
retically coherent network of results. Effec- 
tiveness of the procedures could be shown if 
the defensive style relationships found in this 
study when the subjects processed terms ac-
tually from the tape (and presumably from
mothers) could not be replicated for the
terms incorrectly “recognized” on the initial
memory task and not having a maternal
source.

Table 2
Repression Scores as a Function of Level of Projection and Sex

                          Level of projection

                       High                    Low
Sex of subject     n      M       SD      n      M          SD     Combined M
Male              17   −12.13    9.62    13    −4.25      12.04      −8.71
Female             9    −3.70   14.30     8  [illegible]  12.90       3.61
Combined M            [illegible]             [illegible]

Note. The higher the scores in the positive direction, the greater the repression. Grand M = −4.24.

Projection and repression scores were de- 
rived for all subjects from the pool of adjec- 
tives that had not appeared on the maternal 
tape but had been checked by each subject 
as being on the tape. The same statistical 
analyses were performed as reported earlier 
in this section. Male (M = −1.50) and fe-
male (M = −.47) projection scores failed to
differ, t(45) = .81, p > .40, as before, and
an analysis of variance disclosed no sex dif-
ference in the repression scores of males (M
= −15.49) and females (M = −23.25), F(1,
43) = 1.29, p > .25. Furthermore, no rela-
tionship between projection and repression
was found, F(1, 43) = .05; male high pro-
jectors (M = −14.49) and low projectors
(M = −16.49) had virtually identical repres-
sion means, and females found to have low
and high levels of projection also had very
similar means (−26.05 vs. −20.76). Thus,
relationships involving defensive styles were
obtained only for those terms presented within
an evaluative context.

The second control analysis considered the
repression score, based on the difference be-
tween the retention rate for one group of
words comprised totally of unfavorable terms
and another group comprised of words that
are largely neutral or favorable. It might be
contended that the failure to retain the
key unfavorable terms for females (or low
projectors) is attributable to their more gen-
eral aversion to all the unfavorable words
included in the critical list whether they
are considered by the subject to be self-
characteristic or not. This failure to recog-
nize unfavorable evaluative terms coming from
a maternal source has been demonstrated in
a separate experiment using the same words
as the present experiment, similar instruc-
tions, and an auditory threshold discrimina-
tion task (Heilbrun, 1973b). In that
study, college males showed a specific recog-
nition deficit for negative maternal evaluative
words spoken in a female voice near threshold
volume. Examination of the mean number of
adjectives within each qualitative category on
the critical list that were correctly recognized 
on the initial memory task should shed light 
on this issue. If repression score elevations 
for females and low projectors are attributa- 
ble to a more general problem in remember- 
ing negative terms having a maternal source, 
it would be expected that this would also ap- 
pear in the recognition performance of these 
two groups immediately following the presen- 
tation of the taped list and before the words 
were sorted into those characterizing the self 
and others. 

Table 3 presents the mean number of rec- 
ognitions within each qualitative category on 
the initial memory task for the four Sex ×
Projection Level groups. A 2 × 2 × 3 (Level
of Projection × Sex × Qualitative Category)
factorial analysis of variance for unequal cell
frequencies revealed only one significant ef-
fect. Females recognized more critical words
(M = 45.24) than did males (M = 40.47),
F(1, 35) = 5.86, p < .025. However, the ab-
sence of a Level of Projection × Qualitative
Category or Sex × Qualitative Category in-
teraction offered no support for the conten-
tion that low projectors or females experienced
a more general aversion to unfavorable criti-
cal terms.


Discussion 


Two basic questions were considered in the 
present experiment. First, given the option 
of dealing with unfavorable information about 
themselves defensively by either projective or 
repressive styles or nondefensively, will there 
be a difference in the use of either style be- 
tween males and females? The results indi- 
cated that both sexes used projective attribu- 
tion of unfavorable characteristics to others 
to about the same degree. However, females
were more prone than males to use the selective
forgetting of unfavorable characteristics in-
volved in repression. Since sex differences
have received little systematic attention in
previous studies of repression, it is difficult
to say what the implications of the present
results might be for prior research results.
They do argue, however, for more careful at-
tention to subject sex in subsequent investiga-
tions of repression.

Table 3
Mean Number of Correct Recognitions on the Initial Memory Task for Males and Females
Differing in Use of a Projective Defensive Style

                         Category of evaluative term

Type of subject        Favorable    Neutral    Unfavorable
Male
  High projector         13.29       12.47        14.29
  Low projector          14.00       12.77        14.23
Female
  High projector         14.11       14.56        15.11
  Low projector          16.75       15.00        15.12

The repression results taken alone, even 
with the sex difference, could have easily taken 
their place among those of a host of other 
studies that have sought to demonstrate moti- 
vated selective forgetting but, when reviewed 
as late as 1974, led Holmes (1974) to con- 
clude that “the continued use of repression 
as an explanation for behavior does not seem 
justifiable” (p. 651). The overall rate of re-
tention of self-attributed unfavorable terms
was only about 4% less than that found for
all other combined terms; considering all sub-
jects, this is hardly robust evidence for re-
pression. However, the present investigation
went one step beyond its predecessors by of-
fering subjects alternative ways of defending
against aversive information. The second
question under investigation then became not
whether repression (or projection) can be
demonstrated but whether, given these alter-
natives, subjects can be shown to make pri-
mary use of one or the other defense. The
results suggest that the subjects did tend to
use a primary defense style, whether the sub-
ject was male or female. Those who projected
more usually repressed less, and vice versa.
It is worth noting in this context that seven
subjects showed no evidence of either a pro-
jective or repressive style of defense. This
serves as a reminder that many defensive
styles are available to cope with threatening
information and that the present methodology
assessed but two of these. It is also true that
the use of either style within this study as
a primary defense does not necessarily imply
that the same style is the dominant defense
in real life. It is the generally inverse rela-
tionship of repression and projection found
in this study that is held to be significant to
both laboratory investigation and natural be-
havior.
Holmes (1974) has challenged the inter- 
pretation of laboratory studies of repression 
that have presumably demonstrated the oc- 
currence of this defense mechanism. He con- 
siders response competition to be a more 
parsimonious explanation of the selective loss 
of threatening material, and two of his own 
studies (Holmes, 1972; Holmes & Schallow, 
1969) encourage this viewpoint. In the earlier 
of these, the retention of nonevaluative nouns 
was found to be just as impaired by the in- 
troduction of unrelated interpolated material 
between exposures as by associating word ex- 
posure with threats. In the follow-up experi- 
ment, a deficit in retention followed not only 
the association of words with threat but also 
the association of words with ego-enhancing 
instructions. It was the cognitive activity 
associated with either negative or positive 
evaluation that was thought to be distract- 
ing and to interfere with subsequent recall 
of the words. Holmes’ research does demon- 
strate quite clearly that memory deficits asso- 
ciated with ego threat can be accounted for 
in response interference terms. Furthermore, 
his assumption that these deficits are mediated 
by attentional factors is bolstered by the fact 
that the impaired recall of terms associated 
with threat, ego enhancement, or response in-
terference was indistinguishable from control
group performance after debriefing and the
refocusing of attention on the critical words.
Holmes was aware, however, of the inferential
problem of proving that the conditions of memory
loss for the repression group were the same as
(or different from) those for the response
interference or ego-enhancing groups.
The significance of this inferential prob-
lem for the present repression results is two-
fold. On the one hand, all subjects were given
the same conditions of exposure to the criti-
cal words prior to the final recognition task.
After initial recognition was measured, each
subject judged the relevance of the evaluative
terms to self and others and then was main-
tained on a buffer task for the delay period.
Furthermore, repression was assumed only
when a specific class of evaluative terms suf-
fered selective loss: those having a maternal
source that were unfavorable and assigned as
self-characteristic. A response interference
interpretation, which assumes the importance
of a source of stimulation other than threat,
would have to account for why this narrow
class of terms suffered more interference than
other terms for the same individual. This
cannot be reasonably explained in terms of
the experimental procedures that were stan-
dard for all subjects; it requires an explana-
tion based on the way in which each subject
processed the critical terms.
[illegible] differences in processing nega-
tive [illegible] information were invoked
[illegible] of selective memory loss,
we are right back to considering re-
pression as a defensive mechanism. It is here
[illegible] with Holmes in emphasizing
[illegible] and attention. At least
as far back as Dollard and Miller’s (1950)
[illegible] treatise explicating psychoanalytic
theory in learning theory terms, repression
has been considered by many [illegible]
of anxiety-generated response interference
[illegible] attentional shift from threaten-
ing [illegible] to something more neutral. Given
[illegible] stimulus, a previous aversive
[illegible] from occurring by the
[illegible] occurrence of a “stop thinking”
[illegible]. A similar explanation involving re-
sponse competition seems to fit the repres-
sion [illegible] in the present study.

[illegible] thing that might be accom-
plished by the present experiment would be
[illegible] the investigation of the defense
mechanisms closer to the mainstream of per-
sonality research. Rather than concentrating
on laboratory demonstration of the defense 
mechanisms as clinical, trauma-induced phe- 
nomena, study of defensive styles could be 
focused on the ways the normal individual 
deals with aversive information that threatens 
self-esteem or emotional well-being. This 
would broaden our inquiry to a wider range 
of interesting questions such as the social 
origins of different defensive styles and the 
conditions under which defenses are invoked, 
circumvented, or shifted. 
Comments


The Right to Treatment: Issues in the Treatment of Homosexuality 


Ellie T. Sturgis and Henry E. Adams 


University of Georgia 


The current controversy regarding the classification and modification of homo-
sexual behavior appears to have been stimulated by pressures on professional
organizations exerted by groups such as the Gay Liberation [illegible]. As a
consequence, Davison has argued against the use of sexual reorientation proce-
dures for homosexuality. Davison’s recommendation appears motivated by social and
political factors rather than psychological evidence, resulting in a neglect of
accurate assessment of the needs of the individual. We maintain that the ques-
tion of the abnormality of homosexuality is an empirical issue, but this issue
is irrelevant to the treatment of the individual. The criterion of abnormal be-
havior is not a necessary prerequisite for behavior modification, since behavior
therapists can and do intervene with problems of everyday living to enhance
personal effectiveness. The decision concerning the modification of sexual orien-
tation should be based on the life circumstances and values of the individual
rather than the value systems of the therapists. We strongly urge clinicians to
take a tolerant and impartial view of sexual problems without imposing homo-
sexual, heterosexual, or other value systems on their clients.


Davison (1976), in his discussion of the ethical
[illegible] presented by an examination of homosex-
ual behavior, has raised a number of important and
[illegible] issues. However, many of Davison’s argu-
ments appear to be based on social and political
[illegible] rather than empirical evidence. Such an
[illegible] in a neglect of the assessment of
[illegible] the individual, which allows thera-
pists [illegible] when necessary, an intervention
[illegible] to the requirements of the indi-
vidual. [illegible] an assessment is not conducted,
the rights of the client may well be violated.
Davison’s position that the homosexual be denied
[illegible] modification of sexual orientation vio-
lates the right of the individual to treatment and
[illegible] arbitrary values of the clinician in the
[illegible] the clinician who assumes that all
[illegible] receive treatment for sexual
reorientation. The purpose of this discussion is to
insure an [illegible] and scientific examination of
[illegible] and other target behaviors indepen-
dent of [illegible] social and political pressures
exerted by activist groups.
What Are Appropriate Target Behaviors?


In his discussion of threats to the ethical founda- 
tions of behavior therapy, Davison (1976) chal- 
lenges the ethics involved in the selection of target 
behaviors for modification. He cites examples of 
test anxiety and individuals who lack assertive 
skills. He questions why therapists treat the “iden-
tified client” rather than the systems that produced
the anxiety or the individuals who violated the 
rights of the nonassertive person. A therapist’s de- 
cision to modify the group behaviors of society 
rather than behaviors of an individual is legitimate. 
However, the logical consequence of such a de- 
cision implies that the same ethical considerations 
be applied to society as are applied in the case of 
an individual. Thus, the intervention must be re- 
quested by society, there must exist adequate treat- 
ment procedures, and the clinician must know the 
effectiveness of the treatment program as well as
[illegible] value system. Is society less com-
plex than the individual? The active cooperation
of the client is required for any treatment method
to be effective. Is this not also true for society?

Even though social change is often a desirable 
goal, it is not particularly evident that behavior 
therapists have a sufficient data base to initiate 
or effect such changes. With individual clients, 
therapy can be objective, empirical, and ethical; 
however, these values seem to escape us when we 
discuss society as the client. Unless behavior thera- 
pists maintain the same professional values with 
society that they maintain with individuals, they 
become political activists rather than psychologists. 
It is questionable that psychologists interested in 
social change have any more data for their opinions 
than do politicians. 


Are Therapists Neutral on the Homosexual Issue? 


Davison (1976) supports Halleck’s (1971) con- 
tention that therapists never make ethically neutral 
decisions. This is a legitimate argument, since re-
search has indicated that therapists often serve
as modeling and moral agents for their clients
(Bandura, 1969). By agreeing to a treatment goal,
the clinician conveys ethical and philosophical
approval of the change. The nonneutrality of the
therapist is an issue in any treatment procedure.
However, Davison implies that the situation is more
serious when homosexuality is the issue. Only sub-
jective opinion supports this contention. It is un-
likely that therapists are more biased about homo-
sexuality than other problems such as pedophilia,
sadism, depression, or schizophrenia. Therefore, to
propose that a client not be treated unless the
therapist is neutral is to eliminate the helping
professions. Value judgments are made about
homosexuality as well as for other behavior pat-
terns whether or not society accepts the condition
as appropriate. Halleck (1976) contends that al-
though values are always present in a therapeutic
situation, the problems are simplified when a com-
prehensive assessment of the case is conducted.
As the circumstances of the behavior pattern vary,
it is likely that clinicians will alter their opinions
and values accordingly. Halleck argues that a com-
prehensive assessment increases the likelihood that
intervention will not be haphazard. It is further
suggested that after the assessment and formulation
of any problem behavior, regardless of its nature,
therapists consult with clients, inform them of their
opinions about the problem, discuss possible treat-
ment alternatives, and discuss implications of the
alternatives. Included should be a clear statement
of the therapist’s values that the client can evaluate
in accepting or rejecting treatment alternatives.
Such an approach decreases the likelihood that a
client would unknowingly be influenced by the
value systems of the therapist and also allows the
individual to exercise greater personal control over
possible consequences of treatment procedures.


Homosexuality: Normality or Abnormality or 
Is There No Cure Without a Disease? 


The issue of abnormality is inevitably raised 
when psychologists consider the phenomenon of 
homosexuality. The research data have supported 
both the “mental health” of the homosexual (Gag-
non & Simon, 1973; Green, 1972) and the path-
ology of the individual (Bieber et al., 1962).
Davison, in a telling argument, maintains that be-
fore conclusions can be made that etiological factors
are pathological, the resultant behavior pattern
must first be labeled abnormal. Even if a homo-
sexual individual exhibits nonsexual pathological
behaviors, it remains to be demonstrated that such
behaviors are associated with or are the result of
the homosexual condition. Abnormal patterns of
behavior observed in homosexuals may be indepen-
dent of homosexuality and should not be used as
a basis for a decision about the abnormality of
homosexuality.

The abnormality of homosexual behavior, like 
the classification of any behavior pattern, is an 
issue to be resolved empirically rather than through 
verbal discourse or through the vote of a profes- 
sional body (Adams, Doster, & Calhoun, 1977). 
Nevertheless, it does not follow, as Davison sug- 
gested, that the abnormality of homosexuality or 
any other behavior is a necessary or sufficient pre- 
requisite for intervention. Many individuals who 
seek the aid of a psychologist are normally func- 
tioning individuals who experience difficulties in 
some aspects of their lives. Should the issue of 
treating individuals who report dissatisfaction with 
their pattern of sexual preferences be different from 
treating individuals who are dissatisfied with be- 
havior patterns in nonsexual response systems? 
When clinicians respond differently to homosex- 
uality than to other problems, are we not react- 
ing to social or political pressures rather than to 
the basic issue of treatment? 

For the clinician, the question of adaptive or
maladaptive response patterns should be consid- 
ered in reference to the individual who requested 
help. Psychological intervention for maladaptive 
behavior is warranted when the behavior pattern 
causes the individual discomfort or distress. How- 
ever, it does not necessarily follow that behavior 
therapists should offer assistance only to individ-
uals labeled as abnormal.

Behavioral techniques have often been used with 
nondeviant behaviors to enhance positive aspects 
of behavior or increase the personal effectiveness
of an individual. Such target behaviors have in-
cluded enhancing social skills, assertiveness, study
behavior, training in child management procedures, 
and problem-solving techniques (Goldfried & Davi- 
son, 1976). The development of modification tech- 
niques does not necessarily indicate that the ther- 
apist sees the client as having a “disease” as 
Davison inferred. 


Is Social Prejudice the Primary Agent in Distress? 


Davison stated that social prejudice contributes 
to the problems of the homosexual and causes the 
individual undue distress. Although this is a valid 
observation, he implies that social prejudice is 
unique to homosexuality. On the contrary, social 
pressures exist for most individuals who deviate 
from societal norms, regardless of the direction of 
the deviation. Such individuals are frequently sub-
jected to criticism and discrimination, including
those who have sought psychological assistance;
those of a different race, ethnic, or social group;
and even individuals who deviate positively (i.e.,
high-achieving women or creative individuals; Al-
per, 1974; Farina & Ring, 1965; Lamy, 1966).
The ability of an individual to adapt to such social
pressures and to perform adequately is appro-
priately considered one element of normal func-
tioning. Regardless of the desirability or unde-
sirability of social prejudice, to assume that homo-
sexuals experience more prejudice than others do
is questionable. The role of social prejudice in
specific deviations (positive or negative) can be
more appropriately answered through empirical in-
vestigation than through armchair speculation. Un-
til such evidence is available, one cannot conclude
that social pressure is the critical factor in the de-
velopment of distress and desire for change in the
homosexual but not in the development of distress
and the desire for change in other patterns of be-
havior.

Davison (1976) stated that society does not at- 
tribute problems to heterosexuals on the basis of 
their sexual behaviors. The validity of this state-
ment is questionable. For example, most rapists
are heterosexual, but they are viewed as having
problems related to their sexuality. Since the time
of Freud, psychological literature has been filled
with cases in which the problems or abnormal be-
havior patterns of heterosexuals have been attrib-
uted to fixations at stages of psychosexual de-
velopment or have resulted from unresolved sexual
conflicts. The validity of such hypotheses is irrele-
vant. However, the impact of Freudian thought on
society can neither be estimated nor denied.

Even if society is the causal agent of distress in
homosexuality, it does not follow that the solution
is to modify the attitudes of society, as was indi-
cated in our previous discussion of this issue. 
Although psychologists should try to educate the 
public concerning the factual issues related to 
homosexuality (or other behavior patterns), psy- 
chologists should refrain from presenting informa- 
tion that lacks a data base. Social propaganda or 
reeducation efforts that are based on personal 
opinions rather than empirical evidence can have 
drastic consequences for society and for the pro- 
fession. Excellent examples of the danger of such 
campaigns are the disease concepts of “mental 
illness” and popularized myths of alcoholism 
(Rosenhan, 1973; Sobell, Sobell, & Christelman, 
1972). 


Do Effective Modification Programs Reinforce 
Societal Beliefs? 


Begelman (1975) and Davison (1976) maintain 
that the very existence of reorientation programs 
for homosexuality constitutes a significant causal 
element in reinforcing the doctrine that homosex- 
uality is undesirable. However, the authors fail 
to substantiate the claim with objective data. A
review of recent popular publications concerning 
behavioral techniques such as Orwell's (1949) 
1984 and Burgess’ (1962) A Clockwork Orange 
indicates that the public is well aware of the pos- 
sible dangers of behavioral control, but they are 
less aware of possible benefits of behavior modi- 
fication techniques. 

A review of the work on “brainwashing” fol- 
lowing the Korean War (Brownfield, 1965) indi- 
cates that very effective programs were developed 
that elicited large behavior changes in individuals 
exposed to these procedures. However, the mere 
fact that the programs effectively changed atti- 
tudes, beliefs, and behaviors of the captives did 
not cause the public to view the captives as “ab-
normal.” To the contrary, the public outcry was
directed at the techniques and the individuals who 
developed them. More recently, similar public 
alarm has occurred regarding behavior modifica- 
tion programs in the prisons, schools, and other 
institutions. Davison has responded to these pub- 
lic criticisms by issuing statements defending be- 
havior therapy to a number of senators, the Ameri- 
can Civil Liberties Union, U.S. Department of 
Health, Education, and Welfare, as well as to a 
number of magazines and newspapers (Davison, 
1974; Davison & Stuart, 1975). If a society con- 
siders the target behavior of effective treatment 
procedures as undesirable, why does society object 
to the implementation of treatment programs that 
may be effective with behaviors clearly considered 
deviant, such as crime? Contrary to Davison’s 
position, the evidence suggests that society objects
to any procedure potentially capable of controlling 
behavior, regardless of the nature of the behavior. 
Further, it is doubtful that the availability of tech- 
niques encourages their use. The rigorous opposi- 
tion to behavior modification techniques well illus- 
trates this point. 


Are Sexual Preferences Equivalent to 
Sexual Values? 


Davison (1976) seems to assume that sexual 
preferences of individuals determine their values 
concerning sexual behavior. The literature on sex- 
ual behavior would suggest that this is not always 
the case. There are numerous examples in which
individuals report experiencing annoyance and/or 
shame concerning patterns of sexual gratification 
but are unable to bring the patterns under self- 
control. Such case and group studies involve such 
sexual patterns as impotence, frigidity, homosex- 
uality, fetishism, transvestism, pedophilia, and sad- 
ism (Davison, 1968; Kohlenberg, 1974; LoPiccolo, 
Stewart, & Watkins, 1972). The problem of in- 
congruent values and preferences is best illustrated 
by many secondary homosexuals who are married 
and/or engage in heterosexual encounters but have 
homosexual arousal. Davison’s argument that the
homosexual seeks treatment primarily because of
social pressures appears to neglect the possibility 
that there are clients who may actually wish to 
alter their preference to be congruent with their 
values rather than changing their value system. 
The decision to change preferences and/or values 
should not be made a priori but should be deter- 
mined by aspects of each case. 


A Proposal for Therapy with Homosexuals 


Davison (1976) and Silverstein (Note 1) have 
proposed that all individuals with homosexual
preferences be treated by desensitizing them to
their guilt about homosexual preferences and life-
styles. Such a standardized procedure appears
to be a return to a state of affairs in which it
was not possible to tailor treatment procedures
to the particular needs of the individual’s case. 
Indeed, one distinctive characteristic of behav- 
ior modification is that these procedures are ap- 
propriate for altering a variety of behaviors. Any 
a priori decision to use only specific techniques 
eliminates one advantage of behavior therapy 
that allows the clinician to select treatment pro- 
cedures appropriate to the individual case. Bieber 
(1973) has distinguished psychoanalysis and be- 
havior therapy on this very basis. 
Critical to the selection of any treatment plan 
is a careful assessment of the client’s current
functioning. The clinician should delineate prob- 
lem behavior(s) along with possible antecedent 
(or stimulus) events and the consequent or main- 
taining factors. A behavioral formulation views 
homosexual preference as a learned pattern of 
behaviors that is acquired by individuals through 
their experiences and that may be maintained 
or modified in the same way that other behaviors 
are maintained or modified. Possible targeted 
behaviors for intervention can include such dif- 
ficulties as negative attitudes toward the object 
of sexual preference itself, a desire for sexual 
reorientation, dissatisfaction with the “gay” life- 
style, interpersonal difficulties in homosexual and/ 
or heterosexual social exchanges, guilt or nega- 
tive reactions toward the self, sexual dysfunc- 
tions with homosexual and/or heterosexual part- 
ners, negative attitudes toward members of the 
opposite sex, or any other personal difficulty that
an individual (heterosexual or homosexual) can 
exhibit. The nature of a problem behavior is not 
determined when the client states that he or she 
is homosexual or even that he or she is dis- 
satisfied with the status of being homosexual. 
Only when a target behavior is thoroughly de- 
fined and the antecedents and consequences are 
determined can the therapist conceptualize the 
problems and suggest various treatment proce- 
dures to the client. 

In Davison’s (1976) discussion of the termina- 
tion of preference change programs, he admits 
that such a move runs the risk of denying treat- 
ment to homosexuals who desire an orientation 
change, not on the basis of social pressures but 
based on “a sincere desire for things that in our 
culture are usually part of the heterosexual pack-
age—a spouse and children” (Davison, 1976, p.
161). As one solution, he proposes that clinicians
accept such risks. This position appears to vio-
late an earlier position adopted by Davison in
defense of behavioral treatment for institutional-
ized individuals. In that article, he stated that
“we cannot deny mental patients their right for
treatment” (Cashman, 1974, p. 5). How does one
justify treatment for institutionalized individuals
whose consent, according to Davison, may some-
times be coerced and then refuse to offer treat-
ment for the purpose of sexual reorientation to
a homosexual because the decision for treatment
may have resulted from social pressure? The de-
cision for treatment should be made by the
client rather than the therapist and should be
based on an understanding of all relevant as-
pects of the situation and the client’s desires.

In summary, the position taken by Davison ap-
pears to be based on philosophical beliefs that
ignore empirical data. His position violates the
right of the individual to select treatment goals. 
The purpose of this article was not to espouse 
particular views concerning the nature and treat- 
ment of homosexuality. To the contrary, the in- 
tent was to challenge Davison’s position in order 
to stimulate relevant research and insure pro- 
tection of the client’s right to treatment. Through 
careful consideration of issues related to the 
modification of homosexuality, it is hoped that 
clinicians will arrive at their own conclusions 
based on rational, objective logic resulting from 
an evaluation of empirical evidence and the needs 
of the individual case. As clinical scientists, we 
can ill afford to make our decisions on the basis 
of social pressures or personal values. The state 
of a science cannot be decided by popular vote 
or verbal mandate. To be effective psychologists, 
we must ultimately let the data and the require-
ments of individual cases guide us in the formu-
lation of our views and treatment procedures.

Reference Note

1. Silverstein, C. Behavior modification and the gay
community. Paper presented at the meeting of the
Association for Advancement of Behavior Therapy


Not Can but Ought: The Treatment of Homosexuality


Gerald C. Davison 
State University of New York at Stony Brook 


My earlier proposal to terminate change-of-orientation programs rests on moral 
not empirical grounds. Arguments based on whether therapists can or cannot 
alter sexual preferences are irrelevant. Therapists, moreover, have no abstract 
responsibility to accede to requests from clients for certain types of treatment; 
we work within a host of personal, conceptual, and even legal constraints. Ther- 
apists are characterized better as secular priests than as professionals applying 
ethically neutral techniques. Therapists should attend to large-scale social and 
political factors in their clients’ lives as conscientiously as they attend to intra- 
psychic and interpersonal variables; our students should study philosophy and 
politics as well as learning theory and research design. Finally, to urge that
therapists desist from sex reorientation programs is not tantamount to exhorting
them not to see homosexuals in therapy; indeed, renouncing these widely used
programs can help professionals focus on the problems homosexuals (and
others) have, rather than on the so-called problem of homosexuality.


I am grateful for the invitation to respond to 
Sturgis and Adams’ (1978) critique of my pro-
posals to terminate change-of-orientation pro-
grams for homosexuals (Davison, 1976).

Given the cultural biases against homosexual- 
ity, it is problematic to assert that people who 
ask for change of orientation are expressing a 
“free wish.” We have been remiss in examining 
why people request certain kinds of treatment
(cf. Silverstein, Note 1). 

Some of the misunderstanding of my position 
may stem from a confusion in levels of discourse. 
Even though I am prepared to argue that indi-
viduals can benefit from a renunciation of change-

[...] some of the specific points
raised in the critique of Sturgis and Adams
(1978). They are entirely correct to claim that
my arguments are based on sociopolitical factors 
rather than on empirical considerations. They 
have properly grasped the thrust of my 1976 
article, but I do not share their displeasure at 
my being moved by ethical rather than empirical 
concerns. What they fail to understand is that 
the issues we are dealing with as therapists are, in- 
deed, philosophical-ethical ones, and these moral 
considerations transcend research considerations. 
To discourse on the empirical level, as they do 
very well, is simply to misperceive the essence 
of the issue. Their critique is irrelevant to my 
article. 

Those who continue to offer change-of-orienta-
tion treatment to homosexuals do not have a
monopoly on sensitivity to clients’ rights. I do
not believe that the issue can be settled by argu-
ing, as they do, that a therapist has some sort
of abstract responsibility to satisfy a client’s ex-
pressed needs. It is not that simple. As Begel-
man (1975) has pointed out, therapists constrain
themselves in many ways when clients ask for 
certain kinds of help. There is a host of client 
requests that therapists do not honor. In fact, 
the courts (cf. Kaimowitz v. Michigan Depart-
ment of Mental Health, 1973) are becoming in-
volved in denying the “voluntary” requests of pa-
tients for certain types of treatment. Requests
alone have never been a sufficient criterion for
providing therapy.

Clients, moreover, make certain requests of
some therapists and will not do so with others,
though, admittedly, it is difficult to collect good
data on this. Therapists are purveyors of social
ethics, and it is better to own up to this secular 
priest role (cf. London, 1964) than to continue 
pretending that it does not exist. 

Sturgis and Adams (1978) are correct in assert- 
ing that the normality or abnormality of a be- 
havior is irrelevant to whether therapists should 
try to change it. My earlier discussion of the 
normality status of homosexuality could have 
been omitted, but at the time it seemed worth- 
while to review both the evidence and the logic 
behind the futile attempts that have been made 
to address the issue. 

My colleagues do not show an understanding 
of the social nature of “empirical evidence.” Psy- 
chologists, like other scientists, do not merely 
go out and “gather data.” They hold precon- 
ceived ideas of what they will find and how they 
will decide they have found it (Davison & Neale, 
1978). Scientists adopt paradigmatic ways of de- 
fining the problems they will study and how they 
will study them (Kuhn, 1962). We do not, as 
Sturgis and Adams (1978) suppose, simply “ar-
rive at . . . conclusions based on rational, objec-
tive logic resulting from an evaluation of empiri-
cal evidence” (p. 169).

As noted above, their assertion that I based 
my arguments on social and political considera- 
tions is accurate, but to say that therefore I am 
neglecting a careful assessment of the needs of 
the individual does not follow, certainly not logi- 
cally or even empirically. The “psychological 
needs” of a person are seen by Sturgis and Adams 
as necessarily separate from their social-political 
needs or pressures. This separation is neither 
valid nor necessary, and my exhortation is that 
mental health professionals expand their perspec- 
tives to include those sets of variables that are 
too often overlooked. It is certainly not inherent
to a social-learning approach to ignore political 
and ethical variables; in fact, our general orienta- 
tion is well suited to this more comprehensive 
assessment enterprise, as others have creatively 
demonstrated (e.g., Bandura, 1969; London, 
1969; Ullmann & Krasner, 1975). 

Sturgis and Adams (1978) contend that “to 
propose that a client not be treated unless the
therapist is neutral is to eliminate the helping
professions” (p. 166). This is erroneous on two
grounds. First, I am not proposing that clients
not be treated unless therapists are neutral.
Rather, I am suggesting (after Halleck, 1971)
that therapists cannot be neutral and that they
should realize this. Nowhere, either in what I
have written, taught, or practiced, do I advocate
that we not do therapy because we are not neutral.
Second, to admit to one’s biases hardly elimi-
nates the helping professions. If anything, it poses
exciting challenges, not the least of which con- 
cerns the content of clinical training. Courses 
in politics, sociology, and philosophy would seem 
at least as appropriate as courses in learning and 
statistics.

Having cited Halleck (1971) extensively in my 
1976 article, it is not surprising that I generally 
agree with his comments on that effort (Halleck, 
1976). A comprehensive assessment is impos- 
sible to argue with, and for therapists to make 
clear their biases (value systems) is precisely 
what I am proposing. Along with Sturgis and 
Adams, I agree with this position. But I would 
ask only that therapists tell a homosexual that 
his or her sexual orientation is wrong whenever 
they embark on a change-of-orientation proce- 
dure. The alternative, as stated in my 1976 ar- 
ticle, is for therapists to be as vigorous in de- 
vising sexual enhancement procedures regardless 
of orientation as they have been in helping homo- 
sexuals become less homosexual. 

I have been misunderstood (not necessarily 
by Sturgis and Adams) to have said or implied 
that I advocate not treating homosexuals. This 
is hardly the case. It is one thing to say that one 
should not treat homosexuality; it is quite an- 
other to suggest that one should not treat homo- 
sexuals. Indeed, I have urged that therapists do 
finally consider the problems in living that homo- 
sexuals really have. Such problems are perhaps 
especially severe, given the prejudice against 
their sexual orientation. It would be nice if an
alcoholic homosexual, for example, could be 
helped to reduce his/her drinking without having 
his/her sexual orientation questioned. It would 
be nice if a homosexual fearful of interpersonal 
relationships, or incompetent in them, could be 
helped without the therapist assuming that homo- 
sexuality lies at the root of the problem. It 
would be nice if a nonorgasmic or impotent 
homosexual could be helped as a heterosexual 
would be rather than guiding his/her wishes to 
change-of-orientation regimens. Implicit in my 
original argument is the hope that therapists 
will concentrate their efforts on such human prob- 
lems rather than focusing on the most obvious 
“maladjustment” —loving members of one’s own 
sex. 
Perhaps some people can be hurt by my pro- 
posals. To suggest otherwise would be naive. But 
to assume that people are not being hurt by the
prevalent prejudices is at least as naive. I sup-
pose one has to be reminded of the fact that we
live in an imperfect world. I trust there are few 
therapists who deliberately harm their clients. 
But it is inherent to my comment that great 
numbers of people are being hurt by the availa- 
bility of change-of-orientation programs, and 
these include individuals who themselves are not 
seeing therapists. I believe far fewer people would 
be distressed if we eliminated that option and 
instead devoted our energies to (a) societal preju-
dices and (b) the whole range of problems
that homosexuals, heterosexuals, and those in
between have as we all negotiate our way.
I hope the debate continues. 


1 Begelman (1975) takes a different approach in 
countering the argument that these proposals against 
sex orientation change are misguided because they 
may not make people happy. He questions whether 
the right thing for therapists to do is necessarily 
what will make their clients happy. Sometimes it is 
ethical to make a client more distressed rather than 
less so, as when the client's behavior is morally 
wrong, or when, as is the issue here, the goals of 
the treatment are ethically questionable.


Brief Reports


Modality, Self-disclosure, and Gender as Determinants 
of Psychotherapeutic Attraction 


Michael R. Kowitt and John P. Garske 
Ohio University 


The present study investigated the effects of therapy modality and the self-disclosure 
tendency and gender of the subjects on therapeutic attraction. Forty high and 40 low 
scorers on a modified self-disclosure questionnaire were asked to rate audiotapes of 
simulated therapy sessions on several dimensions. The primary results were as follows: 
High self-disclosers preferred client-centered therapy, whereas low self-disclosers pre- 
ferred systematic desensitization; client-centered therapy was perceived as providing 
a greater opportunity for self-exploration, whereas systematic desensitization was per- 
ceived as more effective; low-self-disclosing males and high-self-disclosing females 
rated the therapist as attractive but ineffective; and females were more attracted to 
systematic desensitization, whereas males were more attracted to client-centered
therapy.


Clinical researchers have contended that a sub-
ject’s attraction to a specific therapeutic modality
has a significant impact on the process (Goldstein, 
1971) and outcome (Devine & Fernald, 1973) of 
psychotherapy. The present study investigated this 
interactive effect in a controlled analogue design. 
The major hypothesis was that high-self-disclosing 
subjects would be more attracted to client-centered 
therapy, whereas low-self-disclosing subjects would 
be more attracted to systematic desensitization. 
College undergraduates (94 males, 84 females) 
completed Panyard’s modified version of the Jour- 
ard Self-Disclosure Questionnaire and listened to 
one of two simulated audiotapes of excerpts of 
psychotherapy sessions. One tape depicted client-
centered therapy; the other depicted systematic
desensitization. The actors were the same in each 
(female client, male therapist). Each tape con- 
tained segments of the 2nd, 3rd, 8th, and 10th 
therapy sessions; Sessions 2 and 10 were identical 
for each to control for the client’s report of her 
problem (interpersonal difficulties) and her evalua-
tion of the effectiveness of the therapy (moderately
improved), respectively. The tapes were also
matched for duration (10 minutes). Following the 
tape presentation, each subject rated seven items 
derived from Goldstein’s (1971) Tape Rating
Scale to evaluate the therapist, the technique, and
the therapy’s effectiveness. 

A 2 × 2 × 2 factorial analysis of variance was
performed on each of the seven rating variables.
The factors were therapy type (client centered/
systematic desensitization), subject self-disclosure
(upper/lower quartile), and subject sex. Post
hoc analyses of the interactions were performed
using Cicchetti’s procedure. The major finding in-
volved the hypothesized interaction between ther-
apy type and subject self-disclosure (p < .02); the
low-self-disclosure subjects preferred the therapist
who used systematic desensitization (p < .05),
whereas the high-self-disclosure subjects tended to
prefer the client-centered therapist. Low self-dis-
closers also rated the client-centered therapist as
providing less help (p < .05). Other significant
findings were: Client-centered therapy was per-
ceived as providing a greater opportunity for self-
exploration, whereas systematic desensitization was
perceived as more effective (despite the matched
effectiveness of the tapes); low-self-disclosing males
and high-self-disclosing females rated both thera-
pists as more attractive but less effective; fe-
males generally preferred systematic desensitiza-
tion, whereas males preferred client-centered
therapy.

The results revealed that both subject and mo- 
dality variables independently and interactively 
affected subject evaluation of psychotherapy. Os- 
tensibly, high self-disclosers were attracted to a 
modality (client-centered therapy) that matched
their preference to reveal themselves, whereas low-
self-disclosers preferred a modality (systematic de-
sensitization) with structure and direction. There 
were also complex subject-gender effects. Although 
these data might be suggestive of differences in 
actual therapy outcomes (Devine & Fernald, 1973), 
they should be viewed as preliminary, pending ex- 
tensions to clinical populations. Moreover, the de- 
sign was restricted by its use of extreme groups 
of self-disclosers and a single client-therapist 
gender pairing.


Factor Analysis of the Ward Atmosphere Scale


Lynn Alden 


University of British Columbia, Vancouver, Canada 


Eighty-seven inpatient volunteers completed the Ward Atmosphere Scale (WAS) and
semantic-differential ratings of their ward. Results of a components analysis did not 
support Moos’s rational grouping of WAS items into 10 dimensions, but they sug- 
gest that a global evaluation dimension might underlie 8 of the 10 subscales. 


Clinicians are increasingly using measures of the 
treatment environment in an attempt to identify 
characteristics of psychiatric treatment programs 
related to treatment efficiency and effectiveness 
(Ellsworth & Maroney, 1972; Moos & Schwartz, 
1972; Sidman & Moos, 1971). Among the more 
widely used instruments are a series of measures 
developed by Moos and his colleagues, including 
the Ward Atmosphere Scale (WAS; Moos & 
Houts, 1968) and the Social Climate Survey 
(Moos, 1968). Because these instruments are gain- 
ing widespread acceptance in clinical settings 
(Milby, Pendergrass & Clarke, 1975; Otto & 
Moos, 1974), it is important to understand the na- 
ture of the information conveyed by the WAS. 
One question that arises is the extent to which 
each of the 10 WAS subscales conveys unique in- 
formation about the environmental unit. Although 
Moos has not found sufficiently large subscale cor- 
relations to warrant collapsing subscales, a recent 
analysis by Wilkinson (1973) suggested that a sin- 
gle global dimension might underlie environmental 
measures of this type. Further, the pattern of WAS 
subscale correlations with the Perception of Ward 
scales (Ellsworth & Maroney, 1972) and with such 
factors as patient satisfaction (Houts & Moos, 
1969) suggests that a cluster of the WAS sub- 
scales may function as a single unit. This study 
investigated the number and nature of dimensions 
tapped by the WAS.


Method
Subjects 


Subjects were 87 inpatient volunteers at a psy- 
chiatric institution for the criminally insane. Sub- 
jects ranged in age from 18 to 65, were unmarried 
(80%), were likely to have a history of previous 
psychiatric treatment (59%), and were often 
judged to be violent (71%). The average subject 
had an eighth-grade education and had been hos- 
pitalized for 2-3 years. The most common diag- 
nosis was character disorder (71%), with 8% 
diagnosed as neurotic and 21% diagnosed as psy-
chotic. Subjects were distributed over eight small
wards. Three wards were described as utilizing a
conventional milieu approach; three wards followed
a behavior modification format; and two wards
were described as existential therapy units.


Procedure 


All subjects completed the WAS and semantic-
differential ratings of their ward. Six adjective pairs
were included as part of the semantic-differential
technique, with two scales marking each of the
three affective components of meaning. The ad-
jective ratings were made on 7-point scales.


Results 


The 10 WAS subscales were intercorrelated, sub- 
jected to a principal-components analysis with 
unities in the diagonals and rotated to a varimax 
criterion of simple structure. Three factors having
eigenvalues greater than 1 emerged. All but two
subscales (Anger, Staff Control) displayed high 
positive loadings on the first factor, which ac- 
counted for 50% of the total variance. The second 
factor, accounting for 14% of total variance, was 
marked by a high positive loading on the Anger 
scale. The third factor, accounting for 10% of 
the variance, was marked by Staff Control.

The semantic-differential ratings were subjected
to a principal-components analysis following the 
procedure described above. Two components 
emerged: evaluation-activity, which accounted for 
40% of the total variance, and potency, which 
accounted for 29% of the total variance. 

Factor scores on the WAS and the semantic dif- 
ferential were computed for each subject following 
the Alberta general factor analysis program (Hak- 
stian & Bay, 1973). Pearson correlation coeffi- 
cients were computed between the WAS factor 
scores and the semantic-differential factor scores. 
Evaluation-activity displayed a moderate relation-
ship, r(87) = .48, p < .01, with the first compo-
nent of the WAS, suggesting that the first compo-
nent reflected subject evaluation of the unit. Low
but statistically significant relationships were found
between evaluation-activity and the third compo-
nent, Staff Control, r(87) = .21, p < .05, and be-
tween potency and the second component, Anger,
r(87) = .22, p < .05.
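
As a rough check on the size of these correlations, each Pearson r can be converted to a t statistic with t = r * sqrt(n - 2) / sqrt(1 - r^2). The sketch below is only illustrative. It assumes n = 87 subjects, so that df = 85; the report writes r(87), and whether 87 denotes the sample size or the degrees of freedom is not stated. The dictionary labels are hypothetical names chosen for this example.

```python
import math


def t_from_r(r: float, n: int) -> float:
    """t statistic for testing a Pearson correlation against zero (df = n - 2)."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


# Correlations as reported in the text; n = 87 is an assumption (see above).
N_SUBJECTS = 87
reported = {
    "evaluation-activity vs. first WAS component": 0.48,
    "evaluation-activity vs. Staff Control": 0.21,
    "potency vs. Anger": 0.22,
}

t_values = {label: t_from_r(r, N_SUBJECTS) for label, r in reported.items()}

for label, t in t_values.items():
    print(f"{label}: t({N_SUBJECTS - 2}) = {t:.2f}")
```

Under this assumption the first correlation gives a t of roughly 5, whereas the two smaller correlations give t values close to 2, so their significance level is sensitive to the exact degrees of freedom used.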


Discussion 


The results of the components analysis do not
support Moos’s rational grouping of WAS items
into 10 dimensions. In general, one global dimen-
sion was found to underlie subjects’ [...]
eight subscales (Involvement, Support, Spontaneity,
[...] Clarity). The other
two components appeared to be specific compo-
nents, marked by only one scale each. [...]
unit. It is worth noting that this interpretation sup-
ports the conclusions drawn by Wilkinson (1973).
It may be that the WAS functions largely as
[...] how positively a
subject feels about the ward. It might prove worth-
while to investigate the extent to which simple
client satisfaction measures can achieve the same
ends as more extensive measures such as the WAS.


Defensive Externality and Level of Aspiration


Dorothy J. Hochreich 


University of Connecticut 


This study tested the hypothesis that increasing the salience of achievement cues will 
result in higher levels of achievement-striving behavior for one subgroup of externals, 
congruent externals; defensive externals and those with an internal locus of control 
orientation will behave in a striving manner regardless of level of cue explication. 
Ninety-six male subjects, defensive externals, congruent externals, and internals, par-
ticipated in a level of aspiration task under either “game” or “test” instructions. As
predicted, congruent externals showed significantly more realistic striving behavior in
the test condition (p < .006); internals and defensive externals showed similar high
levels of realistic striving under both conditions. 


Recent research dealing with the relationship be- 
tween expectancies for internal-external control 
and achievement behavior (Hochreich, 1975) sug- 
gests the utility of using Rotter’s interpersonal trust 
measure to select two groups of externals: defen- 
sive (low-trust) externals, who appear to be am- 
bitious, achievement-oriented persons whose en- 
dorsement of external statements primarily reflects 
a verbal defense against failure; and congruent 
(high-trust) externals, who seem to be less am- 
bitious and less inclined to react to failure by means 
of blame projection. Previous studies indicate that,
depending on the particular circumstances, defen-
sive externals will sometimes behave like internals,
sometimes like extreme externals.

Lefcourt (1967) suggested that externals, as 
compared with internals, fail to perceive the avail- 
ability of achievement rewards unless achievement 
cues are made quite explicit. Using three levels of 
cue explication in instructions for a level of aspira-
tion task, he found that externals showed a marked
increase in realistic achievement-striving behavior 
as a result of greater cue explication, but internals 
did not vary across conditions. If, however, de- 
fensive externals are as achievement oriented as 
internals, they should react in an internal, striv- 
ing manner whether or not achievement cues are 
made salient. The less ambitious congruent exter-
nals, on the other hand, may require stronger
achievement instructions in order to perform suc-
cessfully.

Three groups of male subjects (32 per group)
participated: defensive externals (external and low 
in trust, determined by median splits); congruent 
externals (external and high in trust); and inter- 
nals (with trust scores representing the full range). 
Within each group, subjects were randomly as- 
signed to participate in Rotter’s level of aspiration 
board task under one of two instructional sets: 
game condition (task described as a children’s 
game) or test condition (described as a test of 
motor skills and ability to assess one’s own per- 
formance). Results were analyzed using Rotter’s 
nine patterns of level of aspiration behavior. Pat-
terns 1-4 represent relatively realistic, responsive,
and achievement-striving behaviors. Patterns 5-9 
represent less realistic behaviors (overestimation, 
underestimation, rigidity, or unresponsiveness to 
feedback). 

Overall, subjects showed more realistic patterns 
in the test condition than in the game condition, 
χ²(1) = 3.50, p < .10, two-tailed. The three groups
did not differ from each other within the test 
condition (81.5% of defensive externals, 100% 
of congruent externals, and 87.5% of internals 
had realistic patterns). A significant difference was 
found within the game condition, however (88% 
of defensive externals, 56.5% of congruent exter- 
nals, and 81% of internals had realistic patterns), 
χ²(2) = 4.66, p < .10, two-tailed. As predicted,
neither internals nor defensive externals varied 
across conditions (Fisher’s exact probability test, 
p = .66 in both cases), whereas congruent exter-
nals in the test condition had a significantly higher
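
The two chi-square values reported above can be approximately recovered from the percentages. The following is a minimal sketch, not the author's analysis. It assumes that each group of 32 was split evenly, with 16 subjects per condition, and that the reported percentages correspond to whole-number counts out of 16 (for example, 9 of 16 is 56.25%, reported as 56.5%). The counts and the function name `pearson_chi_square` are therefore assumptions made for this illustration, and no continuity correction is applied.

```python
def pearson_chi_square(table):
    """Pearson chi-square (no continuity correction) for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df


# Assumed counts [realistic, unrealistic] out of 16 per cell, in the order
# defensive externals, congruent externals, internals.
game = [[14, 2], [9, 7], [13, 3]]
test = [[13, 3], [16, 0], [14, 2]]

# Difference among the three groups within the game condition.
chi2_game, df_game = pearson_chi_square(game)

# Test versus game condition, collapsed over the three groups.
by_condition = [
    [sum(row[0] for row in test), sum(row[1] for row in test)],
    [sum(row[0] for row in game), sum(row[1] for row in game)],
]
chi2_condition, df_condition = pearson_chi_square(by_condition)

print(f"groups within game condition: chi2({df_game}) = {chi2_game:.2f}")
print(f"test vs. game condition: chi2({df_condition}) = {chi2_condition:.2f}")
```

With these assumed counts the statistics come out at about 4.67 on 2 degrees of freedom and about 3.50 on 1 degree of freedom, close to the values given in the text, which suggests that the reported tests were uncorrected Pearson chi-squares on cell sizes of 16.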


Predictors of Child Noncompliant Behavior in the Home


Rex Forehand, Karen C. Wells, and Ellie T. Sturgis 
University of Georgia 


The present study examined which data (parent behavior, child behavior, or parent 
reports) obtained in a clinic setting are the best indicators of child noncompliance 
in the home. Subjects were 18 mothers and their clinic-referred children. A stepwise 
multiple regression analysis indicated that two maternal behaviors, beta commands 
and total rewards, displayed in the clinic were the best predictors of child compli- 
ance in the home. The traditionally accepted parent report measures and child be- 
havior in the clinic were not significant predictors of compliance in the home. 


Although parent reports and observations of 
parent-child interactions in a clinic setting tra- 
ditionally have been used to assess deviant child 
behavior occurring in the home, many researchers 
now are using naturalistic home observations. Un- 
fortunately, many clinicians do not perform such 
assessments due to the time and expense involved. 
This study examined which assessment procedures 
in the clinic are the best indicators of child non- 
compliance, a frequent behavior problem in the 
home. 

Subjects were 18 mother-child pairs. The chil-
dren, ages 2-9 (M = 5.1 years), had been re-
ferred for treatment of noncompliance.

Clinic assessment procedures consisted of parent 
completion of the three scales of the Parent Atti- 
tude Test (Cowen, Huser, Beach, & Rappaport, 
1970), which pertain to child home behavior prob- 
lems, and three 20-minute parent-child observa- 
tions performed over 2 weeks in a playroom. Each 
observation consisted of 10 minutes in which the 
mother and child engaged in play activity chosen 
by the child (free play situation) and 10 minutes 
in which the mother determined the play activity 
(command situation). In both situations total 
maternal rewards, questions, and commands as well
as maternal rewards contingent on compliance,
maternal beta commands (vague or interrupted
commands to which the child could not comply),
and child compliance were recorded. Five 40-
minute home observations, performed over 2 weeks,
occurred for each mother-child pair. The mother
was instructed to adhere to her daily routine. Child
compliance was recorded in the home. Overall re- 
liability, obtained in 20% and 30% of the home 
and clinic observations, respectively, was 74%. 

The 15 independent variables (three question- 
naire scales and six behaviors in free play and 
command situations) and the criterion variable 
(compliance in the home) were submitted to a 
stepwise multiple regression analysis. Two maternal
behaviors, beta commands in the command situa-
tion and total rewards in the free play situation,
were significant predictors of child compliance in
the home, overall F(2, 15) = 9.28, p < .01. The
resulting multiple regression equation was Y =
25166 - .06299 (commands) + .18623 (rewards),
indicating a negative relationship between beta
commands and compliance and a positive relation-
ship between rewards and compliance. A multiple
correlation of .74 indicated that the two indepen-
dent variables accounted for 55% of the total
variance (R²) in home compliance. A correction
for shrinkage, applied to the data to control for
the small sample size, reduced the value of R² to
.51 (p < .01). Commands and rewards were re-
sponsible for 74% and 26%, respectively, of the
predictive ability. The best predictors of child 
compliance in the home were two clinic-observed 
maternal behaviors rather than the traditionally 
accepted parent report measures (questionnaires) 
or the child behavior (compliance) in the clinic.
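
The reported overall F and the reported multiple correlation can be checked against each other, because for a regression with k predictors R² = kF / (kF + df_residual). The sketch below is an illustration under that standard identity; the function name `r_squared_from_f` is hypothetical. The shrinkage correction is not reproduced, because the report does not state which formula was applied.

```python
def r_squared_from_f(f_value: float, df_model: int, df_residual: int) -> float:
    """Recover R^2 from the overall F test of a multiple regression."""
    explained = f_value * df_model
    return explained / (explained + df_residual)


# Values as reported in the text: F(2, 15) = 9.28 and multiple R = .74.
r2_from_f = r_squared_from_f(9.28, df_model=2, df_residual=15)
r2_from_r = 0.74 ** 2

print(f"R^2 implied by F(2, 15) = 9.28: {r2_from_f:.3f}")
print(f"R^2 implied by R = .74:        {r2_from_r:.3f}")
```

Both routes give a value near .55, in agreement with the 55% of variance reported for the two maternal behaviors.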


MMPI Correlates of Drug Addiction Based on Drug of Choice


Lee Trevithick and Harmon M. Hosch 
University of Texas at El Paso 


Previous Minnesota Multiphasic Personality Inventory (MMPI) research has focused 
on addict differences based on substance abused. These investigations have largely 
failed to detect differences using standard univariate methods. The current study used 
multivariate analysis to detect differences among groups based on drug of choice 
(amphetamines, barbiturates, or heroin). Sixty-five addicts (48 males and 17 females) 
served as subjects. Their composite MMPI profile revealed elements of distress, con- 
fusion, and depression as well as sociopathy. Multiple discriminant analysis success- 
fully generated two orthogonal functions that accounted for virtually all of the 
variance between groups. The loadings of each function were analyzed in terms of 
the behavioral components characterizing each group. The implications for differential 
treatment strategies and for theories of personality etiology among drug abusers are
discussed.


Previous research (Henriques, Arsenian, Cut- 
ter, & Samaraweera, 1972) has failed to detect 
Minnesota Multiphasic Personality Inventory 
(MMPI) differences among groups of heroin, 
barbiturate, and amphetamine abusers. This re- 
search, however, relied on a univariate analysis 
of variance that detects group differences only 
with respect to each individual scale. Since the 
MMPI scales are interrelated, the present study 
used a multiple discriminant analysis (Dixon, 
1973) to extract scale combinations that might 
discriminate among the groups. 

Subjects were 65 patients involuntarily com- 
mitted to a residential drug abuse program. The 
sample included 48 males and 17 females; mean 
age was 26.75 years, and the mean years of 
formal education was 11.6. Although virtually 
all had a history of multiple drug abuse, each 
had one preferred drug and would use it first 
when it was available. Groups thus formed included
29 heroin, 13 barbiturate, and 23 amphetamine
abusers.

The composite profile showed elevations (T
> 70) on the Pd, Sc, D, and Pt scales. This
confirms the current view of the addict per- 
sonality as containing elements of distress, con- 
fusion, and depression in addition to sociopathy. 

The analysis of MMPI profiles by addict 
group yielded two discriminant functions that 
accounted for virtually all of the between-group 
variance and resulted in 46 cases (70%) being 
correctly classified into their addict group. It 
appears then that the weighted combinations of
MMPI scale scores can successfully differentiate 
the three groups of drug abusers. 
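
A classification rate is easier to interpret when set against chance. The sketch below compares the reported 46 correct classifications with two common baselines computed from the group sizes given above. The choice of baselines (proportional chance and assignment of every case to the largest group) is an assumption made for illustration and is not part of the original analysis.

```python
# Group sizes and number of correct classifications as reported in the text.
group_sizes = {"heroin": 29, "barbiturate": 13, "amphetamine": 23}
correct = 46


def chance_rates(sizes):
    """Return (proportional chance, maximum chance) for the given group sizes."""
    total = sum(sizes.values())
    # Proportional chance: expected accuracy when cases are assigned at random
    # in proportion to the group sizes.
    proportional = sum((n / total) ** 2 for n in sizes.values())
    # Maximum chance: accuracy obtained by assigning every case to the largest group.
    maximum = max(sizes.values()) / total
    return proportional, maximum


total = sum(group_sizes.values())
hit_rate = correct / total
proportional, maximum = chance_rates(group_sizes)

print(f"hit rate            = {hit_rate:.3f}")
print(f"proportional chance = {proportional:.3f}")
print(f"maximum chance      = {maximum:.3f}")
```

Under these assumptions the observed rate is well above both baselines, although classifying the same cases used to derive the functions tends to overstate accuracy.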

Scaled vector weights for the first discriminant 
function were largely defined by the Hs and Pa 
scales (characterizing the heroin group) at one 
pole and by the Hy and Sc scales (barbiturate 
and amphetamine groups) at the other pole. 
Tentative labels for the poles of this discrimi- 
nant function might be egocentric anxiety and 
acute alienation, respectively. 

Thus, although the overall profile distinguishes 
addicts from nonaddicts, combinations of scales 
discriminate among the subgroups. 

These variations imply that differential treat- 
ment strategies might be profitably explored. 
Intervention might be quite different toward an 
addict who attempts to distract the therapist 
from his underlying problems (amphetamine 
abuser), as compared with the approach toward 
the nonconformist addict who angrily refuses 
to cooperate (barbiturate abuser). An intriguing 
question is whether these profile differences reflect
the effects of the drug abused or whether
preexisting personality differences resulted in the 
varying modes of substance abuse. Only longitudi- 
nal research can provide the answer to this 
perplexing dilemma. 
Predicting Confabulation from the Graham-Kendall
Memory-For-Designs Test 


Dan Joslyn 
Veterans Administration Hospital 
Knoxville, Iowa 


John L. Grundvig 
Veterans Administration Hospital 
Tampa, Florida, and the University of 
South Florida College of Medicine 


C. J. Chamberlain 
Veterans Administration Hospital 
Knoxville, Iowa 


The Embellishment score on the Memory-For-Designs Test (MFD) may be a useful 
predictor of confabulation in hospital settings. Embellishments are additional features 
(lines and curves) that alter the original design. Sixteen long-term hospitalized male 
psychiatric patients who were judged by staff to be habitual confabulators embellished 
their drawings on the MFD twice as frequently (p < .01) as a group of 16 patients
of similar age (40-80) and diagnosis who were judged to be nonconfabulators. This 
was true for both brain-damaged and chronic schizophrenic patients. 


The Memory-For-Designs Test (MFD; Gra- 
ham & Kendall, 1960) is a frequently used de- 
vice for the gross screening of brain damage. If 
performance on a series of standard figures falls 
below certain normative standards, impairment of 
cerebral function is inferred. As a consequence of 
studies involving alternative scoring systems for 
the MFD (Grundvig, Ajax, & Needham, 1973; 
Grundvig, Needham, & Ajax, 1970), the clinical 
impression developed that the embellishment score 
(Taylor, 1961) was related to confabulatory behavior
in patients. The purpose of this study was
to determine whether these two characteristics oc- 
curred in the same individuals. If such were the 
case, the Embellishment score on the MFD might 
be used as a predictor of verbal confabulation in 
the social world of the patient. The term con- 
fabulation has been used in a variety of ways 
ranging from the habit of talking without regard
for the truth to outright fabrication, with or without
insight. Our definition simply required that
the patient habitually tell improbable stories. No
specific attempt was made to rule out intentional 
deception, exaggeration, or delusion.

Subjects were selected from among chronic-care 
patient populations in the Nursing Home Care 
Unit and a psychiatric unit for brain-damaged and 
chronic schizophrenic patients at the Veterans Administration
Hospital, Knoxville, Iowa. Treatment
teams on the units were asked to identify patients 
who habitually told stories of doubtful validity, 
since patients and staff have had long-standing 
stable interactions. Sixteen patients about whom
there was general staff agreement were selected for 
the confabulator group (M age = 58.9 years, SD
= 9.9), and 16 patients, matched for age, were
selected for the nonconfabulator group (M age =
60.7 years, SD = 10.5). Nonconfabulators were
not chosen to be representative of nonconfabula- 
tors in the general population but of nonconfabu- 
lating patients in chronic neuropsychiatric wards. 
Although not matched for diagnosis, the diagnostic 
representations in the two groups were comparable 
in terms of the broad categories of chronic brain 
syndrome and schizophrenia. Major classifications 
were as follows: confabulators—10 chronic brain 
syndrome, 5 schizophrenics, and 1 manic-depres- 
sive; nonconfabulators—11 chronic brain syndrome 
and 5 schizophrenics. 

The MFD and Wechsler Memory Scale (Form 
I) (Wechsler & Stone, 1945) were administered 
to all patients. The authors who administered and 
scored the MFD and Wechsler Memory Scale 
were unaware of patient classification on confabu- 
lation, thus ruling out the possibility of subjective 
bias. The Wechsler Memory Scale was administered
to determine comparability of the two groups
in memory functioning. The MFD was scored by 
the conventional Graham and Kendall (1960) scor- 
ing system as well as the modified version (Grund- 
vig et al., 1970) of the Taylor (1961) system. The 
only score from the modified system that is reported
in the present study was the Embellishment
score. Each of the 15 designs was inspected
for the presence of embellishments, that is, addi- 
tional features that alter the original figure. Only 
1 point was scored for each design containing embellishment(s).
Therefore, the total possible Embellishment
score was 15. To determine the reliability
of this score, two of the authors and a
graduate student assistant scored the MFD for 
embellishments. Pairwise reliability coefficients
among the three scorers were .85, .54, and .47. 
The lower two coefficients were obtained from an 
author whose only exposure to scoring was to study 
Taylor’s (1961) thesis, which gave verbal instructions
for scoring embellishments. The first coefficient
was obtained between the second author and
a graduate assistant whom he had trained person- 
ally. Some explicit shaping of criterion classifica- 
tion beyond the reading of Taylor’s original defini- 
tions and examples seems necessary to improve 
interscorer reliability. Beyond the computation of 
reliability coefficients, only the Embellishment 
scores of the second author were used.

Results show that the confabulators embellished
their drawings on the MFD almost twice as frequently
(M = 4.9, SD = 3.2) as the nonconfabulators
(M = 2.5, SD = 1.5), t(30) = 2.79,
p < .01, two-tailed. Furthermore, this relationship was
true for both brain-damaged and schizophrenic veterans.
The mean Embellishment score for the 10
brain-damaged confabulators (M = 4.9) was significantly
greater than for the 11 brain-damaged
nonconfabulators (M = 2.4), t(19) = 1.99,
p < .05, one-tailed. Likewise, the Embellishment score
for the 5 schizophrenic confabulators (M = 5.6)
was significantly greater than that for the 5 schizophrenic
nonconfabulators (M = 2.6), t(8) = 2.83,
p < .02, one-tailed. In addition to finding these
mean differences between groups, we examined the
distribution of Embellishment scores to determine
the optimal degree of separation between confabulators
and nonconfabulators. We obtained a hit rate
of 69%.
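
The overall group comparison can be approximately reproduced from the summary statistics alone. The sketch below assumes an independent-groups t test with a pooled variance estimate, which is consistent with the reported 30 degrees of freedom but is not stated explicitly in the text. Because the published means and standard deviations are rounded, the recomputed value is expected to be close to, not identical with, the reported t(30) = 2.79.

```python
import math


def pooled_t(m1, sd1, n1, m2, sd2, n2):
    """Independent-groups t statistic from summary statistics (pooled variance)."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / se, df


# Embellishment scores: confabulators vs. nonconfabulators, 16 patients each.
t_value, df = pooled_t(4.9, 3.2, 16, 2.5, 1.5, 16)
print(f"t({df}) = {t_value:.2f}")
```

The standard deviations of the two groups differ considerably, so a separate-variance test might also be worth considering with data of this kind.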

The groups did not differ significantly in terms
of the conventional MFD error score of Graham
and Kendall (1960) (for confabulators, M = 16.7,
SD = 14.2; for nonconfabulators, M = 13.8, SD
= 9.0). These values reflect considerable perceptual-motor
impairment. This conventional MFD
error score was positively correlated with the Em- 
bellishment score, r(30) = .63, p < .001. 

The confabulators and nonconfabulators were
practically identical in overall memory functioning
as measured by the Wechsler Memory Scale. Means,
not age corrected, for the confabulators and nonconfabulators,
respectively, were 36.8 (SD =
11.0) and 36.2 (SD = 7.7). These values corresponded
to a Wechsler Memory quotient of 79 and
indicated that both groups had significant memory 
impairment. 

In this investigation we began with a somewhat 
concrete, but hopefully, operational definition of 
confabulation in the real world. Our impression of 
a relationship between the tendency to confabulate 
in the real world and to embellish abstract design 
reproductions was borne out. Two groups of patients
sharing the common characteristics of general
memory and perceptual-motor impairment were 
distinguishable on the basis of a specific type of 
error production, that is, embellishment. This was 
true for both brain-damaged and schizophrenic pa- 
tients and was related to a tendency to confabu- 
late in their real-world interactions with hospital 
staff personnel. Results suggest that confabulation, 
as defined in this study, has perceptual, spatial, 
and motor correlates as well as the well-recognized 
deficit in verbal behavior. 
Role of Self-instruction and Self-reinforcement in the
Modification of Impulsivity


Wilbur J. Nelson, Jr., and John C. Birkimer 
University of Louisville 


This study sought to determine which components of a previously successful training 
procedure are necessary in modifying children’s impulsivity. Forty-eight impulsive 
second and third graders of both sexes were used as subjects. Changes in errors and 
response latencies on Kagan’s Matching Familiar Figures Test were used as criteria. 
Subjects were divided into four groups: (a) self-instruction (SI); (b) self-instruc- 
tion/self-reinforcement (SI/SR); (c) attention control; and (d) assessment control. 
Only SI/SR subjects had significantly reduced errors and significantly increased 
latencies (p < .001). Results indicate that self-reinforcement is a necessary aspect of
training in the modification of impulsivity.


Training in self-verbalization procedures has been 
applied to a variety of target behaviors, such as 
impulsivity (Meichenbaum & Goodman, 1971). In
that study, subjects were taught to change their
private speech as a strategy for modifying their impulsive
behavior. However, little attention was
given to the actual components of the self-verbalizations
that were used. Thus, both self-instructive
(e.g., “draw the line down, down”) and self-reinforcing
(e.g., “good”) components were included.
Thoresen and Mahoney (1974) have drawn a dis- 
tinction between these categories of verbal behav- 
ior, thus producing uncertainty as to how the ob- 
tained results are to be interpreted. 

The purpose of this research was to investigate 
the effect of self-instruction alone, in addition to 
the effect of a combination of self-instruction and 
self-reinforcement, on an impulsive response style. 
Thus, the design provided a test of whether both 
components were necessary to produce a change.
Since the Matching Familiar Figures Test (MFF) 
has traditionally been used as a measure in this 
area, it was used in the present study as a pretest 
and a posttest. Treatment effects were determined 
by changes in number of errors committed and 
latencies to response. 

The 48 subjects were black children, both males 
and females, enrolled in the second and third 
grades of a public school. An MFF pretest was
used to identify the impulsive children from an
initial total of 140. Subjects were randomly assigned
to one of four training groups. For children
in the self-instruction (SI) group, the verbalizations
were composed of self-instruction alone,
whereas subjects in the self-instruction/self-rein- 
forcement (SI/SR) group received a combination 
of self-instruction and self-reinforcement. The
training techniques were similar to those used by 
Meichenbaum and Goodman (1971). The no-self- 
verbalization controls received treatment identical 
to that of the experimental subjects but without 
self-verbalization training, and the assessment con- 
trols were given only the pretest and the posttest. 

Simple main effects analyses of variance indicate
that SI/SR children made significantly fewer
errors, F(1, 44) = 15.16, p < .001, and had a
significant increase in latency to response, F(1, 44)
= 13.20, p < .001, from pretest to posttest MFF.
No significant changes were revealed for subjects
in any of the other conditions. These results are
in agreement with those obtained by Meichenbaum 
and Goodman (1971). However, the present find- 
ings demonstrate that the emission of self-instruc- 
tion alone does not produce the same results that 
were obtained in the earlier study. These data
provide clear-cut support for the inclusion of self- 
reinforcement training as a component in cogni- 
tive self-instruction packages designed to modify 
children’s impulsivity. Further research is needed 
to determine whether the self-reinforcement as- 
pect is equally critical in modification of other 
types of behavior and skills. 
The Palmar Sweat Index as a Function of Repression-Sensitization
and Fear of Dentistry 


Charles E. Early and Ronald A. Kleinknecht
Western Washington University


This study investigates the relationship between the repression-sensitization dimen- 
sion and palmar sweating in response to simulated dental threat stimuli (drill sounds). 
Sensitizers were found to be more physiologically aroused than were repressors during 
a brief relaxation period and during a presentation of drill sounds but not during 
initial or final measurements. Sensitizers also reported being more fearful of dentistry 
than repressors. These results are seen as consistent with Mischel’s social-learning 
formulation of repression-sensitization in terms of differential attention to threatening 
stimuli and contrary to repression hypotheses. 


Physiological response to threatening situations 
is highly variable and often shows low to moderate 
correlations with behavioral or self-report measures.
The present study was designed to investigate
an individual difference variable, repression-sensitization,
which is theoretically related to patterns
of response to threat and which might serve
to account for variability in physiological response. 

Lazarus and Alfert (1964) found, consistent
with repression hypotheses, that repressors, who reported
less arousal to threatening movies than did
sensitizers, showed greater levels of skin conductance
response.

The Palmar Sweat Index (PSI), shown to covary
with skin conductance, was used in this study
as the measure of physiological arousal in response
to simulated dental drill sounds (threat stimulus).
Sixty female undergraduates were individually
brought into the experimental room where they
completed the Byrne Repression-Sensitization
(R-S) scale. They were seated in a reclining chair,
and the first PSI measure was taken. Next, they
listened to a 5-min tape recording of relaxation 
instructions, designed to reduce arousal associated 
with the experimental setting, followed by the sec- 
ond PSI. Then on an alternating basis, subjects 
listened to one of two 15-sec tape recordings: one 
with sounds of an actual dentist’s drill and one
of a child’s windup toy car, which simulated the
whining sound of a dentist’s drill. Subjects were
instructed to think of the respective tapes as a
dentist’s drill or a windup toy. At the end of each
tape, the third PSI was taken. Following a brief
question period, subjects were told the session was
over, and the fourth PSI was taken. Before leaving,
subjects completed a 20-item dental fear survey
to be correlated with R-S scores.

PSI responses were analyzed using a mixed anal- 
ysis of variance, with two levels of repression- 
sensitization (median split) and two sounds (drill 
or car), with repeated measures at the four points 
designated above. A significant main effect for 
repression-sensitization was found showing sensitizers
to be more physiologically responsive than repressors,
F(1, 56) = 6.44, p < .05. Since our interest
was in the differential physiological response
of repressors and sensitizers to the various condi- 
tions, responses were analyzed at each of the four 
measurement points. This analysis showed sen- 
sitizers to be more responsive than repressors dur- 
ing relaxation and during presentations of sounds 
but not on the initial or final measurement. 

A significant trials effect, F(3, 168) = 17.64,
p < .01, and a Trials × Stimulus Group interaction,
F(3, 168) = 3.62, p < .05, were also found.
Analysis of the interaction showed that subjects 
who heard the drill gave greater responses on Trial 
3 than on the other trials (which is to be expected, 
since that was the only trial on which different 
stimuli were presented).
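
The degrees of freedom reported for this mixed design follow from the number of subjects, between-subjects cells, and repeated measurements. The sketch below is a simple bookkeeping check under the assumption that all 60 subjects contributed data at all four measurement points and that the two between-subjects factors form four cells. The function name is hypothetical.

```python
def mixed_anova_df(n_subjects, n_groups, n_trials):
    """Degrees of freedom for a mixed design with one repeated factor.

    Returns (between-subjects error df, trials df, within-subjects error df).
    """
    between_error = n_subjects - n_groups
    trials = n_trials - 1
    within_error = trials * between_error
    return between_error, trials, within_error


# 60 subjects, 2 (repression-sensitization) x 2 (sound) = 4 groups, 4 PSI measures.
between_error, trials, within_error = mixed_anova_df(60, 2 * 2, 4)
print(f"between-subjects error df = {between_error}")
print(f"trials df = {trials}")
print(f"within-subjects error df = {within_error}")
```

Under these assumptions the between-subjects effects are tested on 56 error degrees of freedom and the trials effects on 168.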

The fact that sensitizers showed greater physiological
responsiveness during both relaxation and
threat conditions than repressors is seen as contrary
to the repression hypothesis and the Lazarus
and Alfert (1964) data. Further, subjects’ self-reports
appeared to be consistent with their physiological
response as evidenced by a moderate but
significant correlation of .38 between the dental 
fear survey and repression-sensitization, suggesting 
that repressors also report being generally less fearful
of dentistry than sensitizers. These data would
seem to be in accord with Mischel’s (1976) re- 
formulation of the repression-sensitization dimen- 
sion as reflective of individual differences in learned 
patterns of attending to or avoiding potentially 
threatening stimulation, a process that could rea- 
sonably be expected to affect physiological re- 
sponsiveness. An individual difference dimension 
such as repression-sensitization might well serve
to account for some of the observed variance in 
physiological responsiveness in fear research. 
Effects of “Status Sets” on Rotter’s Locus of Control Scale


Kay M. Davidson and Kent G. Bailey 
Virginia Commonwealth University 


Subjects were given sets based on varying levels of social class to determine the 
susceptibility of Internal-External Locus of Control Scale (I-E) scores to situation- 
ally induced frames of reference. College students (N = 90) took the I-E scale twice: 
once while playing the role of a person who had just been hired for one of six 
ranked occupations and once from their own frame of reference. “In-role” I-E scores 
revealed the expected positive relationship between levels of status and internality, 
and these scores differed from the subject’s “out-of-role” normal responses. Present 
results are in accord with recent findings that the I-E scale may be subject to faking
and situational effects.


It was recently reported (Deysach, Hiers, & 
Ross, 1976) that “internal” and “external” instruc- 
tional sets affected Internal-External Locus of 
Control Scale (I-E) scores, and, further, scores 
so obtained differed from those reflecting the sub- 
ject’s own personal beliefs. Add to this the con- 
sistent positive empirical relationship between social 
status and internality (Rotter, Chance, & Phares, 
1972), and the question of “faking good” on the 
I-E scale arises. The present study investigated 
how well a group of subjects could approximate 
the empirically established relationship between 
I-E scores and status. The following hypotheses
were evaluated: (a) Groups of subjects given dif- 
ferent instructional sets based on varying levels of 
status can discriminate between items on the I-E 
scale such that scores will become progressively 
more internal as status increases; (b) I-E scores 
obtained under these circumstances will differ from 
the subject’s “normal” scores.

Ninety subjects (65 males and 25 females) 
were tested, representing six ranked status groups
of 15 subjects each. Status levels were derived
from a list of 90 occupations previously ranked
according to prestige (Hodge, Siegel, & Rossi,
1964). The six jobs, from lowest to highest status,
were Level 1, clothes presser; Level 2, grocery
checkout clerk; Level 3, bookkeeper; Level 4,
assistant personnel manager; Level 5, civil engineer;
and Level 6, U.S. representative to Congress.
A 50-word description was written about each job 
to include information about the job holder’s salary, 
Analysis of variance revealed that status sets
had a strong effect on subjects’ I-E scores “while
in the role,” F(5, 84) = 26.34, p < .001. I-E
scores were clearly inversely related to levels of 
status. Only Status Level 5 (civil engineer) de- 
viated from a perfect negative relationship, and 
even then the deviation was not great. Means 
across the six status levels, going from clothes 
presser to U.S. representative were 19.50, 17.93,
13.36, 8.53, 9.93, and 5.30. Analysis of variance 
on the “role” versus normal score differences also 
proved to be highly significant, F(5, 84) = 27.43, 
p < .0001. Difference score means were 11.24, 
9.73, 3.86, -4.60, -3.33, and -8.83 across the
six status levels, and a multiple comparisons pro- 
cedure revealed that pairwise combinations were 
significant at the extremes but not in the middle 
ranges and adjacent status categories. At extreme
status levels, then, a reverberating effect occurred 
whereby role playing an external role led to inter- 
nal self-scores and role playing an internal role 
led to external self-scores. 
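
The near-perfect inverse ordering of the in-role means can be summarized with a rank correlation. The sketch below computes Spearman's rho between status level and the six reported means. It is an illustrative summary added here, not an analysis from the original report, and the simple ranking helper assumes there are no tied values, which holds for these six means.

```python
def ranks(values):
    """Rank values from 1 (smallest) to n (largest); assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman_rho(x, y):
    """Spearman rank correlation for untied data."""
    n = len(x)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


status_levels = [1, 2, 3, 4, 5, 6]
in_role_means = [19.50, 17.93, 13.36, 8.53, 9.93, 5.30]  # as reported in the text

rho = spearman_rho(status_levels, in_role_means)
print(f"Spearman rho = {rho:.3f}")
```

The single reversal between Levels 4 and 5 is what keeps the coefficient from reaching -1.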

The above findings indicate that subjects can 
assume an assigned status role and then produce 
I-E scores congruent with that role. Further,
within-role and normal self-scores tended to differ 
in a way that suggested a role versus self social-comparison
process that raises interesting possibilities
for future research.
Effects of Model Status and Juvenile Offender Type on the Imitation
of Self-reward Criteria 


T. John Akamatsu and Parvis A. Farudi 
West Virginia University 


To explore variables that could facilitate the application of modeling to the treatment 
of juvenile delinquents, the effects of model status and offender type on the imitation 
of self-reward criteria were examined. Immature-inadequate and gang-oriented of- 
fenders viewed videotapes of models who were either stringent or liberal in self- 
reward criteria and who were either staff members or peers. Subjects who viewed 
the liberal model rewarded themselves significantly more than subjects who viewed 
the stringent model. Significant interactions involving model status and observer type
suggest that such factors should be considered in the development of treatment
programs.


In the present study the effects of both model 
status and offender type were examined in an at- 
tempt to identify variables that might be of im- 
portance in maximizing the effectiveness of model- 
ing approaches with juvenile offenders. The im- 
itation of self-reward criteria was chosen for study 
because of consistent previous results (e.g., Thelen 
& Fryrear, 1971) and because this behavior was 
thought to be relevant to the treatment of juvenile 
offenders. In addition, Kunce and Thelen (1972)
found that stringent self-reward criteria resulted 
in improvement on actual task performance, an 
intriguing finding that warrants replication. 
Forty-eight male subjects were exposed to a
videotape of either a male adult or peer model 
who adopted either stringent or liberal self-reward 
criteria for his performance on a pursuit rotor task. 
Two types of juvenile offenders were chosen for 
inclusion in the study based on the system of 
Quay and Parsons (Note 1): the Behavior Cate- 
gory 1 (BC-1, immature-inadequate) type char- 
acterized by inability to cope in a complex world, 
incompetence, and immaturity; and the Behavior 
Category 4 (BC-4, socialized-subcultural) type 
characterized by having bad companions, engaging 
in gang activities, and being accepted by a delinquent
subgroup. It was predicted that subjects
would imitate the self-reward criteria of the model 
whom they observed and that model status and 
offender type would interact. That is, subjects 
from the immature-inadequate group would be 
more influenced by the adult model, and subjects 
from the socialized-subcultural group would be 
more influenced by the peer model. It was further
predicted that exposure to the stringent model 
would result in better performance on the experi- 
mental task. 

The procedure was an adaptation of that used 
by Thelen and Fryrear (1971). Subjects were
tested individually, were instructed that they would 
be participating in an arm-coordination task, and 
were told that the easiest way to explain how the 
test worked was to show them a videotaped demon- 
stration. Subjects were then shown one of the four 
videotapes. After viewing the tape their attention
was directed to a chart containing normative information.
They were informed that they could
take a token that was worth a nickel whenever
they thought they deserved it. Subjects received
the following predetermined scores on the six test
trials: 6, 7, 8, 8, 7, and 9.

The mean number of tokens taken by subjects 
on trials in which scores of 6 or 7 were obtained 
was the dependent measure. Preliminary analysis
indicated that no significant effects could be attributed
to race of subject, so this variable was
collapsed. A 2 (stringent or liberal) × 2 (adult or
peer model) × 2 (behavior category type) analysis
of variance was carried out on these data. The
analysis revealed a main effect for reward criteria,
F(1, 40) = 41.00, p < .0001. Subjects exposed
to the stringent model took fewer rewards
(M = .625) than subjects exposed to the liberal
model (M = 2.33). 
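
The layout of this factorial design can be made explicit by enumerating its cells. The sketch below lists the eight cells of the 2 × 2 × 2 design and derives the error degrees of freedom for 48 subjects. The equal allocation of subjects to cells is an assumption made for illustration; the text reports only the total sample size. The factor labels are shorthand chosen here.

```python
from itertools import product

# Between-subjects factors as described in the text (labels are illustrative).
factors = {
    "reward_criteria": ("stringent", "liberal"),
    "model_status": ("staff", "peer"),
    "behavior_category": ("BC-1", "BC-4"),
}

cells = list(product(*factors.values()))
n_subjects = 48

# Error degrees of freedom for a fully between-subjects design.
error_df = n_subjects - len(cells)

# Subjects per cell, assuming equal allocation (not stated in the report).
per_cell = n_subjects // len(cells)

for cell in cells:
    print(cell)
print(f"cells = {len(cells)}, error df = {error_df}, assumed n per cell = {per_cell}")
```

The resulting 40 error degrees of freedom agree with the F(1, 40) tests reported for this analysis.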

The analysis of variance also revealed a significant
Reward Criteria × Model Status × Behavior
Category Type interaction, F(1, 40)
= 4.122, p < .05. Inspection of the means revealed
that, as predicted, BC-1 subjects showed
greater imitation of the staff models than peer 
models under both stringent and liberal conditions. 
The reverse was true for BC-4 subjects. Although 
the differences were in the predicted directions, 
post hoc Newman-Keuls analyses revealed that 
they were not significant. The significant interac- 
tion was a function of no difference in the imita- 
tion of stringent and liberal peer models by BC-1 
subjects. Similar comparisons for BC-4 subjects
and for BC-1 subjects who were exposed to staff 
models were all significant (p < .01). 

Mean time on target for the six pursuit rotor 
trials was analyzed with a 2 (stringent vs. liberal)
× 2 (staff vs. peer) × 2 (BC-1 vs. BC-4) × 6
(trials) repeated measures analysis of variance,
which revealed a main effect for reward criteria.
Subjects exposed to the stringent models (M = 
5.70) performed significantly better than subjects 
exposed to the liberal models (M = 4.02), F(1, 
40) = 6.91, p < .02. A main effect for behavior 
category type was also detected such that BC-4 
subjects performed better than BC-1 subjects 
(M = 5.56 vs. 4.17), F(1, 40) = 4.73, p < .03.
A significant interaction between model status and
behavior category type was also detected, F(1, 40)
= 8.122, p < .01. Newman-Keuls analyses revealed
that BC-1 subjects who observed the staff
model (M = 3.22) performed significantly more
poorly than BC-1 subjects who observed the peer
model (M = 5.11, p < .05) and BC-4 subjects
who observed the staff model (M = 6.45, p < .01).

The results of the study confirm that self-reward 
criteria are imitated by juvenile offenders even
when less stringent norms are provided. The importance
of model status and juvenile offender
type in determining the amount of imitation shown
is at least partially supported by the data. Differences
in imitation as a function of these variables
were found as indicated by the significant three-way
interaction. Although differences were in the
predicted directions, they were not significant. It
may be that the restricted range of possible responses
(0-3) resulted in a ceiling effect that obscured
differences.

A differential susceptibility to staff versus peer
models among BC-1 subjects is supported by the
data. For these subjects even the potent reward 
criteria manipulation did not affect imitation of 
the peer model. It would appear that the staff 
model was a more salient source of information for
BC-1 subjects than was the peer model. 

The results of the analyses on the pursuit rotor 
scores confirm the finding of Kunce and Thelen 
(1972) that observation of a stringent model re- 
sults in better task performance. Thus, modeling 
procedures do seem to have effects that go beyond 
the simple replication of the observed behavior. 
Greater motivation to perform well, or greater at- 
tention to the task at hand, might result from 
exposure to a stringent model. 

The finding of better performance among BC-4 
subjects is not too surprising, since on a common- 
sense level one might expect such subjects to per- 
form better on motor tasks than immature-inade- 
quate types. Involvement by BC-4 types in a 
broader variety of activities or competitive inter- 
action with peers in sports or games could result 
in superior performance. 

The interaction of model status and behavior 
category type found in the present study further 
supports the notion of differential susceptibility to 
staff and peer model influences among offender 
types. The BC-1 subjects performed more poorly 
in the staff model condition perhaps as a function 
of the perceived discrepancy in abilities between 
themselves and the model. Such a difference was 
not found among BC-4 subjects. 

In summary, the results of the present study 
have confirmed the feasibility of the application 
of modeling techniques to the treatment of juve- 
nile delinquents. Both direct observational learning 
and indirect motivational effects have been dem- 
onstrated. Results concerning the effects of model 
status and offender type, although not conclusive, 
suggest that consideration of such variables may 
be of importance in creating more individualized 
and maximally effective treatment programs. 


The purpose of the study was to investigate the factor structure and scale reliabilities 
of Gough’s Adjective Check List (ACL) and to assess their stability over time. 
Employees in a community mental health center completed the ACL twice, separated 
by a 1-year interval. After each administration, separate factor analyses were computed. 
All scales had highly significant test-retest reliabilities. Five factors emerged 
in each analysis, two of which accounted for about 55% of the common variance. 
Repetition of factor analysis at two different times resulted in a more stable factor 
structure than did the usual method of single-time analysis. 


The purpose of this research was to investi- 
gate the factor structure of Gough’s Adjective 
Check List (ACL; Gough & Heilbrun, 1965) 
and to assess the stability of this structure over 
time. Since Gough’s original article (Gough, 
1960), the ACL has been widely used in personality 
assessment. Recent studies illustrate 
the diversity of its application (e.g., Patrick, 
Zuckerman, & Masterson, 1974; Sowa & Lutter, 1974). 

In spite of its popularity, few studies have 
been done to assess the internal characteristics 
of the ACL’s 24 scales. In factor-analytic in- 
vestigations, Parker and Megargee (1967) found 
4 factors and Scarr (1966) found 10. Neither of 
the studies provided evidence about the stability 
of those factors over time. The only available 
information about scale reliabilities is provided 
by Gough and Heilbrun (1965). 

Respondents were 71 professional employees 
in a community mental health center. Of the 71, 
35 were males. Their average age was 36.2 years, 
and their average number of years of formal 
education was 16.9. Sixty-two of the participants 
were white, and the rest were distributed across 
various minority groups. As part of a larger 
project, the respondents completed the ACL 
twice, at approximately a 1-year interval. 

After each administration, separate factor 
analyses, using varimax rotations, were com- 
puted on the T scores for the 24 ACL scales. 
The two resulting factor structures were com- 
pared by calculating a coefficient of congruence 
(Harman, 1967, pp. 269-272) for each scale 
score. Test-retest reliabilities were also calcu- 
lated for each of the 24 scales. 
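
The comparison of two loading patterns described above can be sketched as follows. This is only an illustration, assuming the common definition of a coefficient of congruence (sum of cross-products divided by the square root of the product of the sums of squares); the function name and the loading values are hypothetical and are not the study's data or computations.

```python
import math


def congruence(loadings_t1, loadings_t2):
    """Coefficient of congruence between two vectors of factor loadings."""
    if len(loadings_t1) != len(loadings_t2):
        raise ValueError("loading vectors must have the same length")
    # Sum of cross-products of corresponding loadings.
    cross = sum(a * b for a, b in zip(loadings_t1, loadings_t2))
    # Square root of the product of the two sums of squares.
    norm = math.sqrt(
        sum(a * a for a in loadings_t1) * sum(b * b for b in loadings_t2)
    )
    return cross / norm


# Hypothetical loadings of four scales on one factor at two administrations.
t1 = [0.80, 0.70, -0.60, 0.50]
t2 = [0.75, 0.72, -0.65, 0.45]
print(round(congruence(t1, t2), 3))
```

Values near 1 indicate that the two administrations produced nearly proportional loading patterns; values near 0 indicate unrelated patterns.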

The scales comprising each of the five factors 
and their loadings on the factors for each administration 
are given in Table 1. To be included 
in a factor, a scale had to have a loading 
greater than an arbitrarily chosen cutoff of .40 in 
each of the two analyses. Eighty-one percent of 
the total communality was accounted for by 
the five factors in each administration. 
All of the coefficients of congruence were sig- 
nificant beyond the .001 level. 

Test-retest correlations (ranging from a low 
of .51 to a high of .86, M =.11) for each of 
the 24 ACL scales were significant beyond the 
.001 level. These reliabilities were all greater 
than those reported in the ACL manual for adult 
males (6 months between testings) and for 
medical students (54 years between testings). 

The data clearly indicate the primacy of Factors 
1 and 2. Even though Factors 3, 4, and 5 
are also replicable and interpretable, they account 
for less variance in the results than do 
Factors 1 and 2. Factor 1 seems to reflect 
“self-confidence and competency.” Factor 2 reflects 
“conventional sociability.” Factor 3 appears 
to reflect an “organized industriousness” 
coupled with desire for achievement. Factor 4 
reflects “lack of anxiety and self-doubt,” and 
Factor 5, “instability or changeability.” 


Table 1 
Scale Factor Loadings for Each Administration (t1 and t2) 

Scale                               t1             t2 
Self-confidence                    .83            .81 
Achievement                        .65            .68 
Dominance                          .87            .87 
Exhibition                         .74            .72 
Autonomy                           .87            .81 
Aggression                     [illegible]    [illegible] 
Change                             .55            .43 
Succorance                        -.72           -.70 
Abasement                         -.95           -.91 
Deference                         -.93           -.85 
Defensiveness                  [illegible]        .62 
Favorable adjectives checked       .86            .81 
Unfavorable adjectives checked [illegible]    [illegible] 
Self-control                       .59            .57 
Personal adjustment            [illegible]    [illegible] 
Intraception                       .75        [illegible] 
Nurturance                         .85            .80 
Affiliation                    [illegible]        .73 
Endurance                          .87            .79 
Order                              .78            .79 
Heterosexuality                    .60            .62 
Counseling readiness              -.64           -.67 
Lability                           .41        [illegible] 

Note. t1 = Time 1; t2 = Time 2. 

The results of the present study do not readily 
allow comparison with other factor-analytic 
studies. The samples in previous studies differ, 
and there is no report in these studies of at- 
tempts to assess stability of the factors across 
time. The Parker and Megargee (1967) study 
is informative, however, since these authors ob- 
tained four factors that are quite similar to the 
four found here, despite lack of sample simi- 
larity and method of administering the instru- 
ments. Of the 13 scales in their Factor 1, 4 
are present in our Factor 1. Seven of their 
Factor 1 scales are present in our Factor 2; 
and 2 are present in our Factor 4. Parker and 
Megargee’s Factor 2 includes 7 scales, all of 
which are found in our Factor 1. In the Parker 
and Megargee study, about 66% of the common 
variance is accounted for by the first two 
factors; in the present study, about 55% of 
the total variance is accounted for by the first 
two factors. 

The data are highly encouraging for users of 
the ACL. The fact that every scale had a high 
test-retest reliability increases one’s confidence 
in their validity potential. It appears that whatever 
the ACL scales measure, they do it consistently. 
This stability also appears in the 
factor structure, at least for this sample. If similar 
findings were reported for other samples, the 
evidence for the utility of the ACL as a reliable 
measurement device would be enhanced. 

The technique of repeated factoring at two 
different points in time results in a more stable 
factor structure than does the usual method of 
single-time analysis. One possible reason for differences 
in the results reported here and those 
reported by Parker and Megargee (1967) is that 
in this study many scales that loaded in the first 
administration failed to load in the second administration. 
It would be informative to reanalyze 
the Parker and Megargee data using a randomly 
selected hold-out sample for cross-validation. In 
light of the many scales that “drop out” during 
cross-validation, it would not be surprising to 
find that such an analysis would provide data 
more consonant with those reported here. The 
fact that certain scales did replicate for the 
same sample across time in this study lends 
strong support for continuing to cross-validate 
factor structures for this and other personality 
diagnostic instruments. 


References 


Gough, H. G. The Adjective Check List as a per- 
sonality assessment research technique. Psycho- 
logical Reports, 1960, 6, 107-122. 

Gough, H. G., & Heilbrun, A. B., Jr. The Adjective 
Check List manual. Palo Alto, Calif.: Consulting 
Psychologists Press, 1965. 




Harman, H. H. Modern factor analysis (2nd ed.). 
Chicago: University of Chicago Press, 1967. 

Parker, G. V. C., & Megargee, E. I. Factor analytic 
studies of the Adjective Check List. Proceedings 
of the 75th Annual Convention of the American 
Psychological Association, 1967,75, 211-212. (Sum- 
mary) 

Patrick, A. W., Zuckerman, M., & Masterson, F. 
An extension of the trait-state distinction from 
affects to motive measures. Psychological Reports, 
1974, 34, 1251-1258. 

Scarr, S. The Adjective Check List as a personality 
assessment technique with children. Journal of 
Consulting Psychology, 1966, 30, 122-128. 

Sowa, P. A., & Lutter, H. S. Attitudes of hospital 
staff toward alcoholics and drug addicts. Quarterly 
Journal of Studies on Alcohol, 1974, 35, 
210-214. 


Received November 22, 1976 


Journal of Consulting and Clinical Psychology 
1978, Vol. 46, No. 1, 192-193 


The Seashore Tonal Memory Test as a Neuropsychological Measure 


Carl B. Dodrill and Sureyya S. Dikmen 
Department of Neurological Surgery 


The sensitivity of the Tonal Memory Test to impaired brain function was evaluated 
and compared with that of Halstead’s Neuropsychological Battery and the Trail 
Making Test. Neurologic subjects consisted of 102 individuals with histories of head 
trauma or epilepsy, and control subjects consisted of 68 individuals without histories 
of neurological problems. In general, the Tonal Memory Test differentiated the normal 
and neurologic individuals on either a subject-by-subject or group-by-group basis as 
well as did the other neuropsychological measures, and without excessive overlap 
with them. 


Few studies have been reported that have util- 
ized the Seashore Tonal Memory Test in the in- 
vestigation of the behavioral correlates of brain 
lesions. In probably the most extensive report of 
this type, Milner (1962) used it in evaluating the 
effects of temporal lobectomy for epilepsy. Al- 
though Milner’s study and others have suggested 
that the Tonal Memory Test may be useful in 
the evaluation of brain-related conditions, none 
have attempted a formal evaluation of the measure. 
The goal of this study was to establish the 
discriminative value of the test in differentiating 
between normal and neurologic patients and to 
compare it with the other measures in the neuro- 
psychological battery originated by Halstead and 
developed by Reitan. 

The normal controls included 68 individuals on 
whom extensive neurological histories revealed no 
neurological problems or events that may have resulted 
in such problems. They were solicited from 
a variety of community agencies and not from 
any patient population. The mean age of the group 
was 27.22 years (SD = 10.07), and the mean education 
was 12.28 years (SD = 1.29). 

The neurologic group consisted of 57 individuals 
with seizure disorders and 45 individuals with head 
injuries. Each of these subgroups was diversified 
with respect to type and characteristic of the neurological 
problem in question, but a positive diagnosis 
of the problem was made in each instance. 
The mean age of the total group was 26.45 years 
(SD = 10.52), and the mean education was 12.33 
years (SD = 1.85). 

The Seashore Tonal Memory Test (Seashore, 
Lewis, & Saetveit, 1960) consists of 30 pairs of 
tonal sequences including 10 items each of three, 
four, and five note spans. One note is different in 
the two sequences of each pair, and the subject 
must identify that note by number. The final score 
represents the number of items correctly completed. 
The other tests in the neuropsychological battery 
are well known. 

The results of the study are summarized in 
Table 1. The tests are arranged in this table in 
order from that producing the largest t value to 
that producing the smallest. The classification of 
individual subjects was done on the basis of pre-established 
cutoff scores for all measures but the 
Tonal Memory Test. For this measure, a cutoff 
score between 21 and 22 correct answers produced 
the largest number of correct individual classifications. 
Using either the group-by-group or subject-by-subject 
discriminative procedures, the Tonal 
Memory Test did as well as did the typical test 
in the battery in distinguishing between normal 
and neurologic patients. 
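
The selection of a cutoff that yields the largest number of correct individual classifications can be sketched as a simple search. This is a minimal illustration under stated assumptions: higher scores are taken to indicate normal functioning, candidate cutoffs are placed halfway between adjacent whole-number scores, and the scores and the function name are hypothetical rather than taken from the study.

```python
def best_cutoff(normal_scores, neurologic_scores):
    """Return (cutoff, n_correct) for the cutoff with the most correct calls.

    A subject scoring above the cutoff is classified as normal and a subject
    scoring at or below it as impaired. When several cutoffs tie, the lowest
    one is returned.
    """
    all_scores = normal_scores + neurologic_scores
    best = (None, -1)
    for k in range(min(all_scores) - 1, max(all_scores) + 1):
        cutoff = k + 0.5  # halfway between two adjacent whole-number scores
        correct = sum(s > cutoff for s in normal_scores)
        correct += sum(s <= cutoff for s in neurologic_scores)
        if correct > best[1]:
            best = (cutoff, correct)
    return best


# Hypothetical numbers of correct items (0-30) for a few subjects per group.
normal = [22, 24, 26, 27, 25]
neurologic = [18, 20, 21, 16, 23]
cutoff, n_correct = best_cutoff(normal, neurologic)
print(cutoff, n_correct, n_correct / (len(normal) + len(neurologic)))
```

A cutoff chosen this way is fitted to the sample at hand, so the resulting classification rate would be expected to shrink somewhat in a new sample.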

Several findings from this study are worthy of 
comment. First, the Tonal Memory Test appears 
to be somewhat better than the Rhythm Test in 
differentiating between normal and neurologic subjects. 
This may be due in part to the fact that it 
is technically more adequate in construction, since 
it has a lower chance score (8) in comparison with 
the Rhythm Test (15) and therefore a greater 
range in scores. Its greater difficulty also tends to 
promote a broader range of scores and greater 
differentiation between groups. 


Table 1 
Data for Normal and Neurologic Patients on All Test Variables in Order of Efficacy 

                             Normal controls(a)   Neurologic patients(b) 
Test variable                   M        SD           M        SD          t       [illegible] 
Tapping                       54.37  [illegible] 
Impairment Index           [illegible] 
TPT-Localization               5.57     2.22         3.45     2.18       6.17**       69 
TPT-Time                      11.47     4.67        20.71    12.89      -5.66**       68 
Trail Making Test, Part B     59.06    18.35       104.08    73.70      -4.93**       59 
Tonal Memory                  24.93     4.69        20.29     7.11       4.73**       62 
Speech Sounds                  4.41     1.98         8.34     6.97      -4.52**       59 
Trail Making Test, Part A     24.38     7.16        38.17    24.85      -4.45**       57 
TPT-Memory                     8.06     1.20         7.00     1.91       4.07**       49 
Category                      32.84    18.86        47.25    27.06      -3.82*        55 
Rhythm                        27.00     2.20        25.24     3.44       3.75*        61 

Note. TPT = Tactual Performance Test. 
(a) n = 68. 
(b) n = 102. 
* p < .001 (t > 3.35). 
** p < .0001 (t > 3.99). 
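
As a rough check on the group comparisons in Table 1, a t value can be recomputed from the tabled means and standard deviations. The sketch below assumes an independent-groups t test with pooled variance (the report does not state which form was used) and takes the TPT Time row with n = 68 controls and n = 102 patients; the function name is illustrative.

```python
import math


def pooled_t(m1, sd1, n1, m2, sd2, n2):
    """Independent-groups t statistic from summary data, pooled variance."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / standard_error


# TPT Time: controls M = 11.47, SD = 4.67; patients M = 20.71, SD = 12.89.
t_time = pooled_t(11.47, 4.67, 68, 20.71, 12.89, 102)
print(round(t_time, 2))
```

Under the pooled-variance assumption the result rounds to -5.66, the value listed for TPT Time. The two standard deviations differ considerably, so a separate-variance test would give a somewhat different value.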

Although the effectiveness of the Tonal Memory 
Test in the classification of individual subjects 
was only modest, we observed that the classification 
by the other better known tests was usually 
little better and sometimes somewhat worse (e.g., 
[illegible]). In addition, in studies in which there 
was a better classification of subjects 
(e.g., Reitan, 1955), we observed that most of 
the patients had grossly discernible evidence 
of tissue damage to the cerebral hemispheres. 
[illegible] evaluation of our patients is not likely 
to routinely show such grossly discernible alterations 
of brain structure. They did show very 
[illegible] signs of impaired brain functions, however. 
Our neurologic group might be better described 
as showing signs of “brain impairment” 
rather than “brain damage,” and it is possible that 
the [illegible] discrimination by some of the measures 
found by us may be due, at least in part, 
to this difference. 

With respect to overlap between the Tonal 
Memory Test and the other tests, the median correlation 
was [illegible] for normal controls and .49 for neurologic 
patients. This appears to be no higher than 
the intercorrelations of the other tests among themselves. 


The apparently superior discriminating ability 
of the Tapping Test is explainable at least in part 
by the fact that 47 patients were taking Dilantin 
at the time of testing. Dodrill (1975) has shown 
that this drug has a fairly specific impact on 
motor performance, and the striking intergroup 
differences likely thus reflect both impaired brain 
functions and drug effects. 

Overall, the Seashore Tonal Memory Test is a 
short measure that differentiates normal from neu- 
rologically impaired individuals at levels of sta- 
tistical significance that are consonant with those 
already used in the neuropsychological battery, 
and it does so without undue overlap with these 
measures. 


Relationships Between Dimensions of Anxiety and Sensation Seeking 


Barry R. Burkhart, Raymond M. Schwarz, and Samuel B. Green 


Auburn University 


To determine more precisely the relationships between the general dimensions of 
sensation seeking and anxiety, 242 undergraduates (130 males, 112 females) completed 
the Sensation Seeking Scale (SSS) and the S-R Inventory of General Trait Anxiousness 
(S-R GTA). The intercorrelations among the five scales from the SSS and the 
four scales from the S-R GTA were computed and compared to theoretical predictions. 
In general, the empirical findings were consistent with rational and theoretical 
notions. The majority of the correlations were negative, with the strongest relationship 
existing between anxiety in physically dangerous situations and sensation-seeking 
needs. However, the marked variation in the intercorrelations, ranging from moderately 
negative to low positive, is interpreted as supporting the necessity of multidimensional 
measures of both the anxiety and sensation-seeking constructs. 


Recent research has established that a comprehensive 
theory of motivation must incorporate 
both tension reduction and stimulus seeking 
as complementary, multidimensional constructs 
(McReynolds, 1971). Development of 
theory and research in this area clearly would 
be enhanced by the specification of the empiri- 
cal relationships between the various dimensions 
of anxiety and stimulus seeking. Segal (1973) 
reported such data for the Sensation Seeking 
Scales (SSS; Zuckerman, Note 1) and the S-R 
Inventory of Anxiousness (Endler, Hunt, & 
Rosenstein, 1962). As predicted, most of the 
correlations between the dimensions of sensation 
seeking and specific anxiety stimulus situations 
were negative, although they varied greatly 
in magnitude. The strongest negative relation- 
ship occurred between anxiety responses in sit- 
uations involving physical danger and sensation 
seeking. 

The present study was conducted to extend 
the results reported by Segal (1973) by at- 
tempting to determine the relationships between 
the general dimensions of sensation seeking and 
anxiety. The instruments used were the SSS 
and an extensive revision of the S-R Inventory, 
the S-R Inventory of General Trait Anxiousness 
(S-R GTA), which was constructed to rectify 
the psychometric deficiencies and limited gen- 
eralizability of the S-R Inventory of Anxious- 
ness (Endler & Okada, 1975). Unlike the older 
version, which used 11 specific situations, the 
S-R GTA is composed of four general stimulus 
situations: interpersonal, physically dangerous, 
new or ambiguous, and routine situations. The 
SSS is composed of five separate factorially de- 
rived scales: the General scale, Thrill and Adven- 
ture Seeking (TAS), Experience Seeking (ES), 
Disinhibition (Dis), and Boredom Susceptibility 
(BS). 

The subjects for this study were 242 undergraduates, 
both males (n = 130) and females 
(n = 112), at Auburn University. Each student 
was asked to complete the SSS and the S-R 
GTA. Initially, intercorrelations were computed 
separately for males and females between the 
four scales of the S-R Inventory and the five 
SSS scales. In light of the fact that just 1 of 
the 20 pairs of correlations differed significantly 
for males and females and due to space limitations, 
only the correlational matrix for males 
and females combined is presented. 
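
The significance of a single correlation in a sample of this size can be sketched with the usual conversion of a Pearson r to a t statistic with n - 2 degrees of freedom. The report does not say how significance was computed, so this is an assumption; the r value of -.30 and the function name are illustrative only.

```python
import math


def r_to_t(r, n):
    """t statistic for testing a Pearson r against zero (df = n - 2)."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


# Illustrative correlation of -.30 in a sample of 242 subjects.
t_value = r_to_t(-0.30, 242)
print(round(t_value, 2), "with", 242 - 2, "degrees of freedom")
```

The resulting t would then be referred to a t table with 240 degrees of freedom.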

The results of the correlational analysis are 
presented in Table 1. The magnitude and the 
sign of the correlations between the SSS scales 
and the S-R Inventory subscales varied from 
moderately negative for Situation 2 (physical 
danger) to low positive for Situation 4 (innocuous-routine), 
with Situation 3 (ambiguous-new) 
and Situation 1 (interpersonal) falling between 
the two extremes. This ordering of the correlations 
is rationally appealing. Individuals who 
prefer to participate in sensation-seeking activities 
tend not to be anxious in physically dangerous 
situations. On the other hand, people 
who are anxious in innocuous, routine situations 
may be expected to seek sensation-producing experiences. 
(It should be noted that the positive 
relationship between sensation seeking and anxiety 
arousal in routine situations was significant 
only for females.) Ambiguous-new situations 
and interpersonal situations were not as clearly 
related to sensation seeking, although the former 
is more similar to the physically dangerous 
situations than the latter. The correlational pattern 
supported this continuum. 


Table 1 
Intercorrelations Between Sensation-Seeking Factors and Subscales of the S-R Inventory of 
General Trait Anxiousness 

                                           S-R Inventory subscale 
SSS factor scale               Interpersonal   Physical danger   Ambiguous   Innocuous 
General SSS                     [illegible]       -.30***         -.23***      .18** 
Thrill and adventure seeking       -.09           -.37***         -.18**       .10 
Experience seeking                 -.14*          -.31***         -.23***      .20** 
Disinhibition                      -.03           -.18**          -.09      [illegible] 
Boredom susceptibility             -.15*          -.22***         -.25***      .24 

Note. SSS = Sensation Seeking Scale. 
*p < .05. 
**p < .01. 
***p < .001. 


These results partially replicate the findings 
of Segal (1973). Again, the majority of the 
correlations were negative, with the largest ones 
associated with situations that involved physical 
danger. However, these results also suggest that 
differential relationships exist for different dimensions 
of anxiety and sensation seeking. This 
theoretically consistent variation makes a compelling 
case for the necessity of multidimensional 
measures of both constructs and provides 
valuable evidence of the construct validity of 
their measures. 


Reference Note 

1. Zuckerman, M. Manual and research report for 
the Sensation Seeking Scale. Unpublished manuscript, 
University of Delaware, 1975. 


Psychotherapy Process Variables Distinguishing the “Inherently Helpful” 
Person from the Professional Psychotherapist 


Beverly Gomes-Schwartz and Joseph M. Schwartz 
Vanderbilt University 


Therapeutic hours conducted by analytically oriented, experientially oriented, and 
nonprofessional (“inherently helpful” college professors) therapists were rated along 
eight process dimensions: therapist exploration, therapist directiveness, feeling attention, 
task orientation, therapeutic relationship, patient exploration, patient negativism, 
and patient psychic distress. Between-group differences were obtained on six of the 
eight dimensions. Only patient negativism and patient exploration failed to yield 
significant effects. These results substantially replicated the findings of previous 
analogue investigations. 


Although nonprofessional therapists often as- 
sume major responsibilities for mental health 
care, little is known about how the untrained 
but “inherently helpful” nonprofessional con- 
ducts therapy. In simulated interviews, untrained 
“therapists” tend to make directive interventions 
rather than exploring patients’ feelings or ex- 
periences (cf. D’Augelli, Danish, & Brock, 1976). 
Nonetheless, in similar mock interactions, non- 
professionals (e.g., untrained college students; 
Pope, Nudler, VonKorff, & McGee, 1974) were 
as warm, genuine, and empathic as experienced 
therapists. Indeed, the interviewees felt more 
accepted and were less anxious in interviews 
with nonprofessional, as opposed to professional, 
therapists. 

Previous discussions of professional-nonpro- 
fessional differences have not taken into account 
the effects of therapists’ theoretical orientations. 
However, evidence from surveys of therapists’ 
self-described techniques (cf. Sundland & Barker, 
1962) and from studies of therapists’ responses 
in analogue interviews (cf. Strupp, 1960) sug- 
gests that analytically oriented therapists rely 
on interpretive techniques, whereas Rogerian or 
experiential therapists focus on establishing 
warm, personal relationships. 


This research was supported by National Institute 
of Mental Health Grant 20369. The contributions 
of H. H. Strupp, principal investigator, and 
D. Hartley, S. Hadley, and G. Blackwood are gratefully 
acknowledged. 

Requests for reprints and for an extended report 
of this study should be sent to Beverly Gomes-Schwartz, 
who is now at Department of Psychology, 
McLean Hospital, Belmont, Massachusetts. 


Although the activities of novice therapists 
in simulated interactions may be similar to those 
of nonprofessionals counseling genuinely dis- 
turbed clients, generalization of findings from 
analogue investigations must be viewed with 
caution. The present study was undertaken to 
determine whether previous findings would be 
replicated in the actual therapeutic interactions 
of analytic, experiential, and untrained thera- 
pists. 

The patients were 25 unmarried, male college 
students with elevated scores (T > 60) on Minnesota 
Multiphasic Personality Inventory Scales 
2, 7, and 0 who were participating in a psychotherapy 
outcome study. On a rotational basis, 
8 patients had been assigned to analytic therapists 
(3 male psychiatrists, M experience = 29.0 
years), 7 to experiential therapists (2 male psychologists, 
M experience = 19.5 years), and 10 
to alternate therapists. The 6 alternate therapists 
were experienced male professors (M 
years since PhD = 18.0) identified by university 
administrators, faculty, and students as 
teachers who were frequently approached by 
students for personal counseling. Although the 
professors had no formal psychotherapy training, 
they were recognized in the academic community 
as inherently helpful people. 

Two advanced clinical psychology graduate 
students independently rated videotapes of the 
third therapy session on the Vanderbilt Psychotherapy 
Process Scale (VPPS; Strupp, Hartley, 
& Blackwood, Note 1). This 84-item Likert 
scale was developed from earlier work by 
Orlinsky and Howard (1967) to rate psychotherapy 
process from the perspective of a clinical 
observer. Eight subscales derived from the instrument 
were internally consistent (average coefficient 
alpha = .82) and were rated with satisfactory 
agreement (average interrater r = .82). 
Two scales tapped therapist factors: Therapist 
Exploration and Therapist Directiveness. Three 
scales measured patient dimensions: Patient Exploration, 
Patient Psychic Distress, and Patient 
Negativism. Three scales gauged combined patient 
and therapist contributions: Feeling Attention, 
Task Orientation, and Therapeutic Relationship. 

Based on previous analogue findings, the following 
differences among treatment groups were 
hypothesized and tested via univariate F tests 
and Newman-Keuls comparisons (p < .05 significance 
level): 

1. Analytic therapists and their patients were 
expected to engage in greater exploration of 
psychodynamics than either experiential or alternate 
dyads. The predicted effect was obtained 
on ratings of Therapist Exploration, F(2, 22) = 
6.00, p < .01. Analytic therapists received higher 
scores than experiential or alternate therapists. 
Although differences among the groups on Patient 
Exploration were in the predicted direction, 
they failed to reach significance, F(2, 22) = 
2.07, p = .15. 

2. Alternate therapists were expected to use 
more directive interventions (e.g., concrete suggestions) 
than either professional group. Although 
overall differences on Therapist Directiveness 
were significant, F(2, 22) = 3.56, p < .05, 
only analytic therapists were less directive than 
alternates. 

3. Analytic and experiential dyads were expected 
to focus on important therapeutic issues, 
particularly examination of patients’ feelings, to 
a greater extent than alternate dyads. On both 
Task Orientation and Feeling Attention, overall 
differences were significant, F(2, 22) = 4.76, p 
< .05, and F(2, 22) = 5.82, p < .01, respectively. 
Analytic and experiential pairs received 
higher scores than alternate pairs on both dimensions. 

4. Experiential and alternate therapists were 
expected to maintain friendlier, more open relationships 
with their patients than were analytic 
therapists. Ratings of Therapeutic Relationship 
yielded a significant overall effect, F(2, 
22) = 10.86, p < .001, and the predicted differences 
among groups. 

5. No differences among groups in patients’ 
[illegible] and expression of hostility were anticipated, 
and no differences on Patient Negativism 
were obtained (F < 1). 

6. Patients seen by alternate therapists, as 
opposed to professionals, were expected to manifest 
less anxiety in the session. Although a significant 
overall effect was obtained on ratings 
of Patient Psychic Distress, F(2, 22) = 4.29, p 
< .05, only patients seen by analytic therapists 
were more distressed during the session than 
those seen by alternate therapists. 
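
The univariate F tests reported above, each with 2 and 22 degrees of freedom, correspond to a one-way analysis of variance on three groups of 8, 7, and 10 patients. The following is a minimal sketch of that computation; the ratings are hypothetical values invented for illustration, and the function name is an assumption, not part of the study.

```python
def one_way_f(groups):
    """Return (F, df_between, df_within) for a one-way analysis of variance."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Between-groups sum of squares: group size times squared mean deviation.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means)
    )
    # Within-groups sum of squares: squared deviations from each group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Hypothetical subscale ratings for 8, 7, and 10 patients, respectively.
analytic = [4.2, 3.9, 4.5, 4.0, 3.7, 4.4, 4.1, 3.8]
experiential = [3.1, 3.4, 2.9, 3.6, 3.2, 3.0, 3.3]
alternate = [2.8, 3.0, 2.5, 3.1, 2.7, 2.9, 2.6, 3.2, 2.4, 2.8]
f_ratio, df_b, df_w = one_way_f([analytic, experiential, alternate])
print(round(f_ratio, 2), df_b, df_w)
```

A significant overall F only indicates that the three group means are not all equal; the pairwise Newman-Keuls comparisons mentioned above are a separate step and are not implemented here.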

The findings in this study are largely con- 
sistent with previous analogue results. Congru- 
ent with their theoretical preferences, analytic 
therapists explored psychodynamics, whereas ex- 
periential therapists offered warmth, empathy, 
and genuineness. In contrast to the professionals 
who tend to focus on significant therapeutic is- 
sues, alternate therapists seemed to engage in 
informal conversation and advice giving. Per- 
haps their patients’ lesser anxiety in the ses- 
sions may be attributed to the alternate thera- 
pists’ reluctance or inability to examine sensitive 
intrapsychic or interpersonal issues. 

Although the present results suggest that un- 
trained nonprofessionals should not be used 
with the expectation that they offer the same 
services as professional therapists, no conclusions 
as to the effectiveness of the nonprofessionals’ 
interventions can be drawn from these data. 
Indeed, patients in this study were equally satis- 
fied with their therapist, regardless of the thera- 
pist’s professional status. 

Results of the ongoing Vanderbilt psycho- 
therapy project from which the present data 
were derived are likely to further clarify the 
impact of observed variations in therapeutic 
approach on outcome. 


“Acceptance,” Values, and Therapeutic Change 


Larry E. Beutler 
Department of Psychiatry 
Baylor College of Medicine, Houston, Texas 


Stephen Pollack 
University of Houston 


Avis Jobe 
Texas Department of Corrections, Huntsville 


This article studied the role of therapist acceptance of patient values, patient accept- 
ance of therapist values, and value persuasion on outcome among 13 psychotherapy 
dyads. A priori assessment of value acceptance was related to patients’ perceptions of 
their therapists and ratings of improvement, with outcome being enhanced by selective 
value rejection as well as acceptance. A strong relationship (p < .01) was found 
between the patients’ acquisition of their therapists’ values and their ratings of 
improvement. 


The current project was designed as an effort 
to evaluate psychotherapy as a social persuasion 
process. Hence, self-ratings of improvement and 
value change were assessed as influenced by (a) 
the therapist’s ability to accept the values of the 
patient, (b) the patient’s ability to accept the 
therapist’s values, and (c) the amount of initial 
similarity between the patient and the therapist on 
several value and attitudinal dimensions. 

Although the importance of therapist “accept- 
ance” is well recognized in psychotherapy research, 
little attention has been given either to the role 
of the patient’s ability to accept the therapist or 
to the possibility that some values are more im- 
portant to accept than are others. In the current 
effort to remedy this lack, 13 second-year graduate 
students in clinical psychology, all having been 
trained in relationship/insight-oriented therapy, 
were used as therapists. One case was randomly 
selected from each therapist’s caseload. The pa- 
tients’ ages ranged from 17 to 25 (M = 20.5). 

Prior to beginning therapy, all therapists were 
asked to complete a series of value questionnaires 
developed and described elsewhere (Beutler, Jobe, 
& Elkins, 1974). These scales assessed values rela- 
tive to others’ approval, the threatening nature of 
the world, God, Communism, Christianity, social 
laws, and premarital sexual behavior. Each scale 
was constructed in such a way as to derive lati- 
tudes of acceptance and rejection as well as the 
respondent’s preferred attitude on each dimension. 

After the first visit and again at the end of 12 
psychotherapy sessions, patients were also assessed 
on the value questionnaire. On the latter occasion 
patients completed another questionnaire (Wilson, 
Morton, & Swanson, Note 1), which was designed 
to assess their improvement on three dimensions: 
satisfaction with therapy, satisfaction with the ther- 
apist, and global improvement. 

Patient and therapist compatibility on the value 
scale was assessed in two ways. The therapist’s 
values were considered acceptable to the patient 
if the therapist-preferred position fell within the 
patient's latitude of acceptance. A similar strategy 
was used to assess the patient’s acceptability to 
the therapist. In addition, the role of actual simi- 
larity as opposed to acceptability was assessed by 
calculating the number of statements separating 
the patient’s and therapist’s preferred positions on 
each scale. 
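
The two compatibility measures described above can be sketched in code. What follows is a minimal illustration under stated assumptions, not the authors' scoring procedure: it assumes that each scale is an ordered series of statements indexed from 1, that a latitude of acceptance is the set of statement indices a respondent endorses, and that "the number of statements separating" two preferred positions is the absolute difference of their indices. The class and function names, and the sample responses, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScaleResponse:
    """One respondent's answers on a single value scale (hypothetical layout)."""
    preferred: int         # index of the most preferred statement
    acceptable: frozenset  # statement indices inside the latitude of acceptance


def is_acceptable_to(judge: ScaleResponse, other: ScaleResponse) -> bool:
    """True if `other`'s preferred position lies in `judge`'s latitude of acceptance."""
    return other.preferred in judge.acceptable


def separation(a: ScaleResponse, b: ScaleResponse) -> int:
    """Distance between preferred positions (assumed here: index difference)."""
    return abs(a.preferred - b.preferred)


# Made-up responses on one scale, for illustration only.
patient = ScaleResponse(preferred=3, acceptable=frozenset({2, 3, 4}))
therapist = ScaleResponse(preferred=5, acceptable=frozenset({3, 4, 5, 6}))

therapist_acceptable_to_patient = is_acceptable_to(patient, therapist)
patient_acceptable_to_therapist = is_acceptable_to(therapist, patient)
distance = separation(patient, therapist)
```

In this made-up case the patient's preferred position falls inside the therapist's latitude of acceptance but not the reverse, which shows that acceptability need not be symmetrical even when the separation between preferred positions is the same for both parties.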

The influence of three major independent vari- 
ables on each of three outcome measures was 
assessed by means of a series of stepwise linear 
regression analyses. The results suggest that if 
either the therapist or the patient initially rejected 
their counterpart’s estimate of threat in the world 
while accepting their views on premarital sex, satis- 
faction with therapy was increased. The same basic 
pattern also facilitated global improvement, al- 
though only insofar as the therapist’s acceptance 
and rejection of the patient’s values were concerned. 

A significant correlation (r = .76, p < .01) was 
obtained between global improvement ratings and 
the degree to which patients came to acquire their 
therapists’ attitudes across the seven scales. Ap- 
parently, adoption of a therapist’s view of life 
facilitates a patient’s sense of positive growth. 

Finally, patients’ attitudes toward their thera- 
pists were enhanced if they rejected their thera- 
pists’ belief or disbelief in God and acquired their 
values. On the other hand, if the therapists reject 
their patients’ opinions both of Christianity and 
approval, they also become increasingly attractive 
to their clients. 

Although limited in generalizability, the findings 
have implications for ethics, training, and the prac- 
tice of psychotherapy. The concomitant initial re- 
jection of therapists’ views of the world with the 
acceptance of their views about God and sex- 
uality is of particular significance in demonstrat- 
ing that acceptance need not be complete. Selec- 
tive rejection of the therapist’s values may also 
be important, although the dynamics of this rela- 
tionship are unclear. For example, the pervasive 
relationship between patient-therapist initial dis- 
agreement about world threat and outcome is not 
clear and deserves special research attention. 

Apparently, the therapist’s attitude toward the 
patient’s values has its greatest impact on the pa- 
tient’s feelings of growth; the patient’s attitude 
toward the therapist’s values seems more strongly 
related to the development of trust and attraction, 
and acquiring the therapist’s values facilitates gen- 
eral improvement. 
Effect of Physical Attractiveness on Therapists’ Initial Judgments 
of a Person’s Self-concept 


Stevan E. Hobfoll and Louis A. Penner 
University of South Florida 


The purpose of this study was to investigate the effect of a person’s physical attrac- 
tiveness on a therapist’s initial judgment of that person’s self-concept. Videotapes and 
audiotapes were made of interviews with attractive and unattractive males and 
females. Graduate students in clinical psychology rated the person they had heard 
(seen) as to self-concept. As hypothesized, physically attractive persons of both sexes 
were rated as having better self-concepts than unattractive persons. Further, the self- 
concept ratings of attractive females increased significantly from the audiotape to the 
videotape conditions, whereas the ratings of all the other stimulus persons remained 
the same. 


Social psychological research on physical at- 
tractiveness has shown that a physically attrac- 
tive stimulus person will be seen as possessing 
more positive personality attributes than a physi- 
cally unattractive person (cf. Berscheid & Wal- 
ster, 1974). Other studies have suggested that 
this stereotype may be more pronounced for 
female than for male stimulus persons. Despite 
the growing concern with sex role stereotypes as 
they may affect clinical practices (Report of the 
Task Force, 1975), there has been relatively 
little research conducted on the extent to which 
clinicians adhere to these societal stereotypes 
in their judgments of their clients. 

The purpose of this study was to examine the 
relationship between a person’s (judged) physi- 
cal attractiveness and a therapist’s estimate of 
that person’s self-concept. It was hypothesized 
that ratings of a stimulus person’s self-concept 
would be positively related to the stimulus per- 
son’s physical attractiveness and that this rela- 
tionship would be stronger for female than for 
male stimulus persons. 

Undergraduates rated photographs of 83 of 
their classmates on an 11-point scale of physical 
attractiveness. On the basis of these ratings, the 
2 most attractive males, 2 least attractive males, 
and a similar number of females were selected 
as the stimulus persons. These 8 people were 
interviewed, and a 10-minute audiotape or video- 
tape of the interview was presented to 13 male 
and 3 female graduate students in clinical psy- 
chology, each of whom had had at least 1 year 
of clinical experience. After hearing (viewing) 
an interview with a stimulus person, the student 
clinicians rated that person on a 7-point scale of 
self-concept. Data were analyzed via two 8 × 8 
Latin squares, with mode of presentation as the 
between-subjects variable and physical attrac- 
tiveness, sex, and tapes (2 stimulus persons were 
used in each condition) as the within-subjects 
variables. 

Stimulus persons rated as most attractive were 
judged by student clinicians as having a sig- 
nificantly better self-concept than unattractive 
stimulus persons, F(1, 91) = 349.59, p < .001, 
and males were rated as having better self-con- 
cepts than females, F(1, 91) = 8.29, p < .01. 
There was a significant Mode of Presentation × 
Attractiveness interaction, F(1, 91) = 8.24, p < 
.01. Although the attractive stimulus persons 
were rated as having better self-concepts than 
the unattractive people in both the audiotape 
and videotape conditions, the self-concept ratings 
of attractive persons increased significantly from 
the audiotape to the videotape conditions. The 
ratings for unattractive stimulus persons re- 
mained the same. Finally, there was a significant 
Mode of Presentation × Sex × Attractiveness in- 
teraction, F(1, 91) = 6.309, p < .05. Attractive 
females were rated as having significantly better 
self-concepts in the videotape condition than in 
the audiotape condition, but ratings of all other 
stimulus persons did not change. 

The finding that attractive stimulus persons 
received higher self-concept ratings in the audio- 
tape condition was unexpected. Explanations of 
this finding may lie in Berscheid and Walster’s 
(1974) argument that people judged by others 
as physically attractive may have more positive 
social experiences than unattractive persons. 
Thus, the higher self-concept ratings of attrac- 
tive stimulus persons in the audiotape condi- 
tion may have been due to these people mani- 
festing more self-confidence and better social 
skills in the interview than unattractive stimu- 
lus persons. At the same time, the results of 
this study suggest that the student clinicians 
did adhere to societal stereotypes as to the rela- 
tionship between physical attractiveness and the 
possession of positive attributes. Further, it 
would appear that these stereotypes exerted their 
strongest influence when the stimulus person 
was an attractive female. 
Empathy and Imagery in Avoidance Behavior Reduction 


John T. Esse 
University of Miami 


Wallace Wilkins 
University of Maine at Orono 


This therapy analogue study was designed to assess the relative effects of therapist 
empathy and instructed imagination of hierarchy scenes on avoidance behavior re- 
duction. Although the communication of differential therapist empathy was validated, 
behavior change attributable to therapist empathy was minor in comparison to the 
effects of imagery instructions. Imagery instructions delivered in a relatively un- 
empathetic fashion produced as much avoidance reduction as imagery instructions 
delivered in an empathetic manner. Unempathetic imagery instructions also produced 
significantly greater avoidance reduction than the establishment of an empathetic 
relationship without instructed imagery exercises. 


Although the efficacy of systematic desensitiza- 
tion, as a treatment package, has been documented, 
considerable controversy exists about the specific 
elements and the theoretical mechanisms that ac- 
count for gain. Therapist empathy, defined as an 
ability to understand a client’s experiences and 
communicate that understanding to the client, is 
also a focus of considerable debate concerning the 
conditions that facilitate therapeutic gain. In this 
study, to assess the relative contributions of im- 
agery instructions and therapist empathy, imagery 
exercises were delivered with and without empathy; 
empathy was delivered with and without imagery 
instructions. 

The subjects were 30 undergraduate female 
students who indicated “much fear,” “very much 
fear,” or “terror” on Item 39 (snakes) during a 
classroom-administered Fear Survey Schedule II 
(Geer, 1965), and who were unable to touch with 
bare hands a live, 3-foot (.91 m) king snake 
during the Behavioral Avoidance Test (BAT; 
Nawas, 1971). All participants in the study re- 
ceived credit toward partial fulfillment of intro- 
ductory psychology course requirements. 

The pre-BAT and post-BAT were conducted by 
a female experimenter, who remained experimen- 
tally blind as to the hypotheses and design of the 
study. Following the pre-BAT, subjects were con- 
tacted for a treatment session that lasted for a 
maximum of 45 minutes. All sessions were admin- 
istered individually by a female graduate student 
after subjects were informed that “the procedure 
which you are about to undergo has often been 
found helpful in reducing snake fear.” Immedi- 
ately following the treatment session, each subject 
completed a set of ratings, which included the 16 
empathetic understanding items from the Barrett- 
Lennard Relationship Inventory (Barrett-Lennard, 
1962), and was tested on the post-BAT. 


Mechanical Imagery Procedure 


The 10 members of this group received a modi- 
fication of the instructed attention shift procedure, 
shown by Wilkins and Domitor (1973) to be ef- 
fective in reducing avoidance behavior in a short 
period of time. The procedure was executed in a 
mechanical fashion that allowed little deviation 
from the predetermined procedure and minimized 
therapist empathetic reactions. The hierarchy of 
scenes that was presented for instructed imagina- 
tion consisted of the 20 BAT steps read in the 
second-person present tense beginning with the 
least threatening and proceeding to the most threat- 
ening scene. 


Empathetic Imagery Procedure 


The 10 members of this group received the basic 
procedures of the mechanical imagery procedure 
(MIP) delivered in a more informal, responsive, 
and spontaneous manner. During the presentation 
of the imaginal scenes, the therapist varied the 
wording of the scene descriptions and the tone of 
her voice so as to be more responsive to the feel- 
ings that the subject appeared to be experiencing. 


Empathetic Conversational Procedure 


The 10 members of this group were asked to 
talk about feelings about snakes. There was verbal 
exploration of any methods that the subject felt 
would be helpful in reducing her fear, and, at 
some point during the session, she was also asked 
to imagine herself approaching a snake and to de- 
scribe the feelings that were aroused. Beyond this 
general suggestion, however, no directives were 
given as to what she could do to become less fear- 
ful. During the entire session the therapist at- 
tempted to be maximally sensitive to and reflective 
of emotional expressions. 


Validation of Empathy Manipulation 


Within a possible range from —48 to +48, mean 
ratings of empathetic understanding for the MIP, 
the empathetic imagery procedure (EIP), and em- 
pathetic conversational procedure (ECP) groups 
were, respectively, 1.70 (SD = 6.147), 21.90 (SD 
= 5.466), and 18.30 (SD = 7.602). An examina- 
tion of specific comparisons among group means 
showed that significantly greater therapist empathy 
was communicated during the EIP than during the 
MIP, F(1, 18) = 60.301, p < .001, and that sig- 
nificantly greater empathy was communicated dur- 
ing the ECP than during the MIP, F(1, 18) = 
28.831, p < .001. Statistically nonsignificant dif- 
ferences occurred between the EIP and the ECP 
groups. 
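
As a rough arithmetic check, the two F ratios reported above can be recomputed from the group means and standard deviations alone. The sketch below assumes that each comparison involved only the two groups named (which is what the 18 error degrees of freedom suggest), that each group contained 10 subjects, and that the reported SDs are sample standard deviations. The function name is hypothetical.

```python
def f_from_summary(mean_a, sd_a, mean_b, sd_b, n_per_group):
    """F(1, 2n - 2) for two equal-sized independent groups, from summary statistics.

    With equal group sizes the pooled variance is the mean of the two
    variances, and F is the squared two-sample t statistic.
    """
    pooled_variance = (sd_a ** 2 + sd_b ** 2) / 2
    squared_standard_error = pooled_variance * 2 / n_per_group
    return (mean_a - mean_b) ** 2 / squared_standard_error


N_PER_GROUP = 10  # 30 subjects divided among three groups

# Empathetic-understanding ratings: EIP vs. MIP and ECP vs. MIP.
f_eip_vs_mip = f_from_summary(21.90, 5.466, 1.70, 6.147, N_PER_GROUP)
f_ecp_vs_mip = f_from_summary(18.30, 7.602, 1.70, 6.147, N_PER_GROUP)

print(f"EIP vs. MIP: F(1, 18) = {f_eip_vs_mip:.1f}")
print(f"ECP vs. MIP: F(1, 18) = {f_ecp_vs_mip:.1f}")
```

Under these assumptions the recomputed values come to about 60.3 and 28.8, in agreement with the reported statistics to within rounding of the published means and standard deviations.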


Avoidance Behavior Change 


Mean pretreatment BAT scores for the MIP, 
EIP, and ECP groups were, respectively, 8.20 
(SD = 3.259), 8.10 (SD = 2.378), and 8.00 (SD 
= 3.266). Posttreatment BAT means were, re- 
spectively, 13.90 (SD = 3.573), 13.30 (SD = 
[illegible]), and 10.10 (SD = 5.259). A 2 (trials) × 
3 (treatments) repeated measures analysis of vari- 
ance performed on BAT scores resulted in a sig- 
nificant trials main effect, F(1, 27) = 52.703, p < 
.001, and a Treatments × Trials interaction, F(2, 
27) = 3.558, p < .05. Orthogonal comparisons of 
[illegible] scores among the groups showed that 
the EIP resulted in significantly greater improve- 
ment than the ECP, F(1, 27) = 4.495, p < .05, 
and that the MIP resulted in significantly greater 
improvement than the ECP, F(1, 27) = 6.062, 
p < .05. Differences between the MIP and EIP 
were statistically nonsignificant. The number of 
posttreatment MIP, EIP, and ECP subjects who 
actually touched the snake bare-handed (BAT 
Step 14; Nawas, 1971) were, respectively, 7, 4, 
and 2. 

From the pattern of results that emerged from 
this study, it is apparent that in comparison to 
the effect of the imagery procedure used here, 
therapist empathy had minimal influence on ther- 
apy outcome. Neither the inclusion nor the elimination 
of therapist behaviors rated as empathetic had a 
reliable effect on outcome beyond that produced 
by the imagery procedure. 

Using the EIP as a standard of comparison, 
the MIP group performance indicated that em- 
pathy is not a necessary condition for avoidance 
behavior reduction, whereas the ECP group per- 
formance indicated that empathy is not a sufficient 
condition for the same outcome. These findings 
are consistent with the interpretations that rela- 
tionship factors are relatively unimportant in be- 
havioral procedures and that instructed attention 
shifts toward and away from symptom-related 
stimuli are important. 
Frequent Citations in the Journal of Consulting and Clinical Psychology 
During the 1970s 


W. Miles Cox 


Indiana University 


Authors who have been cited most frequently in the Journal of Consulting and Clin- 
ical Psychology during the first half-decade of the 1970s are identified. In turn, these 
authors’ publications that have been cited most frequently are indicated. Most prom- 
inent are the work of Julian B. Rotter on the locus of control of reinforcement and 
Truax and Carkhuff’s contributions to training and practice in counseling and 
psychotherapy. 


In their recent survey, Koulack and Keselman 
(1975) found that the Journal of Consulting and 
Clinical Psychology is one of the most prestigious 
periodicals among clinical psychologists in terms 
of where they wish to publish and where they ex- 
pect important psychological material to be found. 
To determine the influence of certain contributors 
on publications in the Journal, I identified the most 
frequently cited authors and their most frequently 
cited publications during the first 5 years of the 
present decade. 

A frequency distribution was constructed to de- 
termine the number of times that any given au- 
thor was cited among the 12,893 reference entries 
in the 887 articles appearing in Volumes 34-42 
(1970-1974). Each author was assigned 1 point 
for each entry in the reference section of each ar- 
ticle in which his or her name appeared. Next, a 
frequency distribution was constructed to deter- 
mine the number of references to each publication 
by the authors who were cited most frequently. 
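
The two-step tally described above can be illustrated with a short sketch. The reference entries, author labels, and publication labels below are invented placeholders, not data from the Journal, and the cutoff of two "most frequently cited" authors is an arbitrary choice made only for the example.

```python
from collections import Counter

# Hypothetical reference entries: (authors of the cited work, publication label).
reference_entries = [
    (("Author A",), "Publication 1"),
    (("Author B", "Author C"), "Publication 2"),
    (("Author A",), "Publication 1"),
    (("Author A", "Author B"), "Publication 3"),
    (("Author D",), "Publication 4"),
]

# Step 1: one point per reference entry in which an author's name appears.
author_points = Counter()
for authors, _publication in reference_entries:
    for author in set(authors):
        author_points[author] += 1

# Step 2: for the most frequently cited authors, count references to each publication.
top_authors = {author for author, _points in author_points.most_common(2)}
publication_counts = Counter(
    publication
    for authors, publication in reference_entries
    if top_authors & set(authors)
)

print(author_points.most_common())
print(publication_counts.most_common())
```

Note that a co-authored entry awards a point to every listed author, so the author totals can sum to more than the number of reference entries.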

The names of authors who were cited most fre- 
quently are shown in Table 1, and their particular 
publications that were cited most frequently are 
indicated in Table 2. It should be recognized that 
frequency of citation might, of course, be a re- 
flection of factors other than an author’s influence 
per se (cf. Cox, 1977). It is noteworthy in this 
regard that the work of B. J. Winer heads the 
list of most frequently cited publications as a re- 
sult of the widespread use of his statistics book 
(Winer, 1962) rather than his direct contribution 
to clinical psychology. Nevertheless, from Tables 
1 and 2, we can judge what the prominent areas 
of clinical research have been during the present 
decade. It should be noted in addition that psy- 
chologists who are cited most frequently in the 
current journal literature are also those who are 
judged to be scientifically “eminent” on the basis 
of a variety of other independent criteria (cf. 
Myers, 1970). 


Table 1 
The Most Frequently Cited Authors in the 
Journal of Consulting and Clinical Psychology, 
Volumes 34-42, 1970-1974 

Author                 No. citations   Rank 
Rotter, Julian B.      131              1 
Truax, Charles B.      105 (1)          2 
Gough, Harrison G.     85 (6)           3 
Carkhuff, Robert R.    82               4 
Cowen, Emory L.        [illegible]      5 
Cohen, Jacob           77               6 
Lang, Peter J.         76               7 
Winer, Ben J.          72               8 
Bandura, Albert        70               9 
Rogers, Carl R.        68 (1)          10 
Zuckerman, Marvin      66 (6)          11 
Wolpe, Joseph          65              12 

Note. The number of self-references is in parentheses, 
including self-reference by co-authors. 
Predicting Violent Behavior from WAIS Characteristics: 
A Replication Failure 


Lois Shawver and Charles Jew 
California Medical Facility, Vacaville, California 


Kunce, Ryan, and Eckelman have reported some promising evidence on an index for 
predicting violent behavior derived from differential Wechsler Adult Intelligence Scale 
(WAIS) characteristics. The present study is an attempt to replicate their findings. 
Results of this study were unsuccessful. 


Many researchers have tried to use psychologi- 
cal tests to detect people prone to commit vio- 
lence, but their efforts have generally been disap- 
pointing. After reviewing recent negative findings, 
Kunce, Ryan, and Eckelman (1976) described 
how they analyzed Wechsler Adult Intelligence 
Scale (WAIS) profiles of a sample of violent and 
nonviolent male offenders and factored out new 
predictors of violent behavior. Their most promis- 
ing index was the ratio of the subject’s Similarities 
score to the sum of his WAIS subtests × 100. A 
low Similarities ratio score occurred significantly 
more often in the violent group in both their origi- 
nal and cross-validation samples. However, since 
identifying a patient as violent could in itself 
have devastating consequences, it is imperative 
that any such findings be thoroughly verified be- 
fore clinicians put them to use. This should be 
especially true in small cross-validational samples. 
Kahn (1959) advanced a hypothesis similar to 
the one advanced by Kunce et al. Kahn reasoned, 
as did Kunce et al., that extreme violence is the 
result of impulsivity due to poor abstract reason- 
ing ability and should be reflected in lower Simi- 
larities scores. Kahn’s data were based on Wechsler- 
Bellevue scores of individuals hospitalized for eval- 
uation of legal insanity. Although Kahn found 
some elements in the test data supporting his im- 
pulsivity hypothesis, his Similarities subtest findings 
were not as predicted. In contrast to Kunce et al., 
Kahn found that violent offenders did not have 
a significantly lower score on the Similarities sub- 
test. 
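
For concreteness, the Similarities ratio index described above (the Similarities score divided by the sum of the WAIS subtests, multiplied by 100) can be written out as follows. This is a sketch under assumptions not settled by the text: it takes the scores to be scaled scores and takes the sum to run over all 11 WAIS subtests, Similarities included. The function name and both score profiles are hypothetical.

```python
WAIS_SUBTESTS = (
    "Information", "Comprehension", "Arithmetic", "Similarities",
    "Digit Span", "Vocabulary", "Digit Symbol", "Picture Completion",
    "Block Design", "Picture Arrangement", "Object Assembly",
)


def similarities_ratio(scaled_scores):
    """Similarities score as a percentage of the summed subtest scores."""
    total = sum(scaled_scores[name] for name in WAIS_SUBTESTS)
    return 100 * scaled_scores["Similarities"] / total


# Made-up profiles: a flat profile, and the same profile with low Similarities.
flat_profile = {name: 10 for name in WAIS_SUBTESTS}
low_similarities_profile = dict(flat_profile, Similarities=6)

ratio_flat = similarities_ratio(flat_profile)
ratio_low = similarities_ratio(low_similarities_profile)

print(f"flat profile: {ratio_flat:.2f}")
print(f"low Similarities: {ratio_low:.2f}")
```

Under these assumptions a perfectly flat profile yields a ratio of about 9.1 (one subtest out of eleven), so ratios well below that value indicate a Similarities score that is low relative to the subject's own overall level rather than low in absolute terms.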


The present study was an attempt to replicate 
the Kunce et al. (1976) findings. Their sample 
“consisted of white males recently court com- 
mitted as criminally insane or undergoing pretrial 
mental examination following arrest for a felony” 
(p. 42). The sample used here was similar. All 
subjects were white males diagnosed as psychotic. 
Subjects were not included when a review of court 
transcripts left doubt as to whether the crime 
should be judged as violent or nonviolent. 

Kunce et al. reported that violent subjects 
earned lower Similarities ratio scores (M = 5.44) 
than nonviolent subjects (M = 9.42) in a cross- 
validational sample consisting of 7 violent and 7 
nonviolent cases. The present study included 16 
subjects in the violent sample and 10 in the non- 
violent sample. In the present study, the findings 
of Kunce and his colleagues were not replicated. 
Not only were the results not significant, but the 
direction of the findings was reversed, as were the 
findings reported by Kahn. In this study violent 
subjects earned higher Similarities ratio scores 
(M = 10.25) than did nonviolent subjects (M = 
9.62). 

The conclusion here is that the Similarities ratio 
score advanced by Kunce et al. has not yet been 
sufficiently validated to be used by clinicians to 
predict violent behavior in specific cases. 
An Outcome Study of Short-Term Communication Training 
with Married Couples 


Norman Epstein and Elizabeth Jackson 
State University of New York at Buffalo 


Communication training, interaction insight training, and no treatment were 
compared for changes in marital verbal interaction and spouses’ ratings of each 
other on the Barrett-Lennard Relationship Inventory. Fifteen couples were ran- 
domly assigned to the three groups. Both treatments involved five 1½-hour group 
sessions over 3 weeks, led by the same male and female cotrainers. During 
assessments, discussions of individual couples were tape recorded and coded. 
The pretest-posttest interval for waiting list controls was equal to that for the 
treatment groups. Communication training produced a significant increase in 
assertive requests, compared to insight treatment and no treatment. Both treat- 
ments reduced disagreement significantly. Communication training produced a 
greater decrease in attacks and a greater increase in spouse-rated empathy than 
the control condition, but insight training and no treatment did not differ on 
these variables. Generally, communication training led to more extensive changes 
in spouses’ verbal behavior and perceptions of marital communication than in- 
sight training. Further research is suggested to test the limits of this intervention 
for modifying various classes of maladaptive marital interaction. 


This article is based on a paper presented at the 
meeting of the Eastern Psychological Association, 
[illegible], April 1976. 

Requests for reprints should be sent to Norman Ep- 
stein, Department of Psychology, State University of 
New York at Buffalo, Buffalo, New York 14226. 


In recent years, treatments of marital dis- 
cord frequently have been designed to foster 
clear communication between intimates, with 
the expectation that such behavioral changes 
would be associated with increased marital 
satisfaction (e.g., Bolte, 1970; Ely, Guerney, 
& Stover, 1973; Lederer & Jackson, 1968; 
Wells, Figurel, & McNamee, 1975). Raush, 
Barry, Hertel, and Swain (1974) noted that 
conflict is inevitable when partners in a close 
relationship seek to satisfy their varied needs, 
and they argue that clear communication is a 
[illegible] for conflict resolution. Although 
poor communication may be a consequence of 
[illegible] conflict, the lack of information ex- 
change may itself impede resolution of differ- 
ences and interpersonal tensions (Bardill, 
1966; Raush et al., 1974). It is likely that 
poor communication and marital dissatisfac- 
tion mutually reinforce each other. 

Empirical investigations of the relationship 
between communication clarity and marital 
satisfaction provide evidence that their as- 
sociation is quite strong. Relying on self- 
report measures, Navran (1967) and Murphy 
and Mendelson (1973) found high positive 
correlations between spouses’ scores on marital 
satisfaction inventories and their scores on 
questionnaires assessing openness of marital 
communication. In Raush et al.’s (1974) 
study, spouses’ use of rejection and coercion, 
coded from actual interactions, was signifi- 
cantly related to poor resolution of conflict. 
Relationships characterized by avoidance had 
a static quality in which little problem solving 
took place. Epstein and Jackson (Note 1) 
examined the relationships between spouses’ 
ratings of empathy, congruence, and uncondi- 
tional positive regard received from each other 
and categories of their verbal interaction 
coded from tapes by independent raters. Per- 
ceived empathy was related inversely to fre- 
quencies of self-justification and disagreement 
by the spouse and related positively to the 
spouse’s use of statements that revealed feel- 
ings, offered support, or conveyed agreement. 
Ratings of the spouse’s congruence were re- 
lated inversely to the frequency of self-justifi- 
cation. These findings suggest that patterns 
of clear, supportive communication are associ- 
ated with components of marital satisfaction. 

Two basic patterns of unclear communica- 
tion commonly observed in disturbed mar- 
riages are avoidance and active escalation of 
conflict. Olson (1972) and Raush et al. 
(1974) suggested that spouses who feel threat- 
ened by potential confrontation on relation- 
ship issues often withdraw from direct com- 
munication. They rely on evasive messages 
such as denials, self-justification, and disqual- 
ification¹ rather than stating their opinions 
and feelings clearly. On the other hand, con- 
flicts may be escalated directly through issue 
expansion (e.g., “You always ”), dispar- 
agement, and other personal attacks (Raush 
et al., 1974). Both avoidance and escalation 
of attack decrease the exchange of informa- 
tion necessary for constructive problem solv- 
ing. 

Avoidance and attack closely parallel the 
two modes of behavior described by Alberti 
and Emmons (1974) as characteristic of peo- 
ple who have deficits in assertive behavior: 
nonassertion (withdrawal) and aggression. 
They outline procedures designed to improve 
an individual’s communication skills by sub- 
stituting specific assertive messages for eva- 
sive and/or attacking messages. Although 
group assertiveness training has been shown 
to increase clarity and directness of subjects’ 
communication in role playing situations, Al- 
berti and Emmons question whether the 
newly learned behaviors will generalize to 
interactions with significant others outside 
the training group. Studies by Eisler and 
Hersen (1973) and Lehman-Olson (1976) 
provide some encouraging evidence for the 
effectiveness of assertiveness training with in- 
dividuals, focusing on their marital relation- 
ships. However, both Alberti and Emmons 
and Lehman-Olson suggested that assertive- 
ness training with married persons would be 


more effective if spouses participated to- 
gether. The present study draws on tech- 
niques of assertion training for in vivo modi- 
fication of unclear communication between 
spouses. 

In the present study, a group treatment 
designed to increase clarity and assertiveness 
of communication by means of active prac- 
tice (e.g., role playing) was administered to 
couples who complained of difficulties in 
communicating. Its effects on both verbal be- 
havior and spouses’ self-reported perceptions 
of each other’s empathy, congruence, and 
positive regard were compared to those found 
with an alternate treatment focusing on in- 
sight into behavioral interaction patterns and 
with a no-treatment control group. The “in- 
teraction insight” treatment fostered spouses’ 
understanding of repetitive maladaptive be- 
havioral patterns (e.g., interruptions) in 
their relationship, without the repeated prac- 
tice of assertive messages incorporated in 
the communication treatment. Inclusion of 
this alternate treatment group allowed a test 
of the hypothesis that active practice of 
more effective communication patterns will 
produce not only more assertive behaviors 
but also a greater increase in spouse-per- 
ceived openness of communication than that 
resulting from insight training alone. 

The specific hypotheses were that in a pre- 
test-posttest design, (a) subjects who re- 
ceived communication training would exhibit 
a greater increase in specific assertive mes- 
sages than subjects in the interaction insight 
and no-treatment control groups; (b) sub- 
jects receiving communication training would 
exhibit a greater decrease in nonassertive 
messages (e.g., disqualification, self-justifica- 
tion) and aggressive messages (e.g., attack) 
than subjects in the other groups; (c) sub- 
jects who received communication training 
would report greater increases in their 
spouses’ congruence (open, direct communi- 
cation) than subjects in the other groups; 
(d) subjects in both the communication and 
interaction insight groups would report in- 
creases in empathy and positive regard re- 
ceived from their spouses, with the former 
treatment producing greater increases on 
these dimensions; and (e) improved con- 
flict resolution, evidenced by fewer disagree- 
ments, would result from both treatments, 
with communication training producing the 
stronger effect. 


¹ A disqualification is a message that invalidates 
another message sent previously or concurrently. Dis- 
qualifying one’s own message allows one to express 
an idea or feeling but not take responsibility for it. 
Disqualifying another person’s message communicates 
that one does not acknowledge the validity or even 
the existence of the other’s ideas and feelings (Olson, 
1972; Watzlawick, Beavin, & Jackson, 1967). 


Method 
Subjects 


Couples who responded to public media an-
nouncements of a research project involving free
workshops for “marital communication problems” 
initially were screened by phone in order to select 
subjects who were married at least 2 years, had 
completed high school, were not presently in 
counseling, were not judged to be in psychological 
crisis, and did not exhibit severe psychopathology. 
These criteria were used to insure a moderate de- 
gree of homogeneity in the sample. Appropriate re- 
ferrals were made for callers who did not meet the 
above criteria. 
A pretreatment interview with each couple served
as a [second] screening. Three interviewed couples
were referred for more extensive marital therapy
[based on] the experimenters’ judgments that their
problems were more severe than a communication
[problem]. The remaining 16 couples who were
[selected] for the workshops then were
randomly assigned to the communication training
group (6 couples), the interaction insight group (5
couples), and the no-treatment “waiting list” group
(5 couples). During the course of treatment, 1
couple dropped out of the insight group. Subjects
[assigned to] the waiting list received treatment at the
[conclusion] of the study.


Materials and Procedure 


In pretreatment and posttreatment assess-
ments of the couples, each subject completed
the Barrett-Lennard Relationship Inventory (Bar-
rett-Lennard, 1962), indicating the degree of em-
pathy, congruence, and unconditional positive re-
gard received from the spouse. This scale
has been used in previous studies (e.g., Quick &
Jacob, 1973; Wells, Figurel, & McNamee, 1975) to
assess marital satisfaction.
Next, each couple was instructed to discuss their
communication problems for 15 minutes with a male
and a female interviewer present. The interviewers
did not intervene in the couple’s discussion after
giving the following instructions:

In order for us to get an idea about how the
two of you talk with each other, we would like
to spend 15 minutes giving each of you an op- 
portunity to talk about what communication is 
like between you. What seems to happen when 
you want to talk to each other about important 
matters? If there is a problem with your com- 
munication process, what is it, and what seems to 
be its source? Please direct your comments to 
each other, not to us, and we will listen. 


Following this warm-up period, the couple was 
asked to continue discussing their communication 
pattern for 10 minutes without the interviewers 
present. Interviews were tape recorded with each 
couple’s written permission. 

Subjects in the two treatment groups met for 
five 1½-hour sessions over a 3-week period. Both
groups were led by the same male and female co- 
trainers, who were experienced in both modes of 
treatment and were unaware of the experimental hy- 
potheses. Although the treatment formats were dif- 
ferent for the two groups, the major topic dis- 
cussed at each session (e.g., expression of caring,
cooperation, problem solving) was the same. During 
each session, subjects in both treatments formed 
smaller groups of two or three couples to perform 
specific exercises. At the beginning and end of each 
session, the total group membership met to discuss 
common communication issues. 

The communication treatment emphasized the 
practice of specific assertive requests, opinions, and 
statements of feeling. During the first session, couples 
of guidelines for clear com- 
and Emmons’ (1974) argu- 
ments regarding “assertive rights” in interpersonal 
relationships were discussed. All sessions included 
modeling of assertive communication by cotrainers, 
practice by each couple with issues they chose from 
and behavioral feedback to 


coached each other on expressing messages precisely 
and directly. The cotrainers intervened actively and 
reiterated the communications guidelines frequently. 

The insight treatment focused on the particular 
interaction patterns of each couple that confused or 
frustrated the spouses. The cotrainers instructed 
subjects in the observation of verbal and nonverbal 
messages that exacerbate conflict. A major goal was 
to increase each subject’s awareness of the impact 
his/her behavior had on the spouse’s feelings and 
behavior. Each couple received extensive feedback 
regarding interaction patterns from the cotrainers 
and other couples. Delineation and practice of al- 
ternate modes of clear communication was minimal. 

Each couple’s posttest interview was held within 
a week of the last treatment session and was identi-
cal in procedure to the pretest. The pretest-posttest 
interval for the control group was equal to that for 
the treatment groups. All couples were given an op-
portunity for further treatment. 
Results


Eleven categories of each subject’s taped 
verbal behavior were scored by two inde- 
pendent coders from a 5-minute segment of 
each couple’s discussion while alone. Based 
on an a priori decision to allow a short time 
for each couple to adjust to the interviewers’ 
departure, the coded segment always was the 
3rd through the 7th minute of discussion. 
The coding categories and their respective 
interrater reliabilities, calculated as phi coef- 
ficients (Herbert, Note 2), were assertive re- 
quest, 1.00; self-revelation, .85; specific state- 
ment, .86; general statement, .83; asking for 
feedback, 1.00; disqualification, .90; agree- 
ment, .98; disagreement, .80; support, 1.00; 
attack, .84; and self-justification, .84.
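
The interrater reliabilities above are reported as phi coefficients. As a rough illustration only, and assuming each coder's decision for one category is recorded per speech act as present (1) or absent (0), phi can be computed from the resulting 2 × 2 agreement table. The function name and the toy data are hypothetical and are not taken from the study.

```python
import math


def phi_coefficient(coder_a, coder_b):
    """Phi coefficient between two equally long sequences of 0/1 codes."""
    if len(coder_a) != len(coder_b):
        raise ValueError("both coders must rate the same speech acts")
    pairs = list(zip(coder_a, coder_b))
    n11 = sum(1 for a, b in pairs if a == 1 and b == 1)  # both coded the category
    n10 = sum(1 for a, b in pairs if a == 1 and b == 0)  # only coder A
    n01 = sum(1 for a, b in pairs if a == 0 and b == 1)  # only coder B
    n00 = sum(1 for a, b in pairs if a == 0 and b == 0)  # neither
    denom = math.sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    if denom == 0:
        raise ValueError("phi is undefined when a coder never varies")
    return (n11 * n00 - n10 * n01) / denom


# Hypothetical codes for eight speech acts (1 = "assertive request").
coder_a = [1, 0, 0, 1, 0, 0, 1, 0]
coder_b = [1, 0, 0, 1, 0, 0, 1, 0]
print(phi_coefficient(coder_a, coder_b))
```

With identical codes, as in the toy data, phi is 1.00, the value reported for several categories.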

Pretest, posttest, and change (posttest 
minus pretest) scores were computed for each 
subject on the proportion of total speech acts 
coded into each of the 11 verbal behavior 
categories and ratings of spouse’s empathy, 
congruence, and positive regard. Since 
spouses were assessed as couples, their scores
on each dependent measure were not con- 
sidered to be independent, and the analyses 
were conducted with couples’ scores com- 
puted as the mean score for each husband 
and wife. Therefore, the ns for analyses were 
6, 4, and 5 couples in the communication, in- 
sight, and control groups, respectively. 
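
The scoring described above can be sketched as follows. This is a minimal illustration, assuming each spouse's coded speech acts are available as counts per category; the function names and the numbers are hypothetical, not the authors' data.

```python
def proportions(counts):
    """Convert speech-act counts per category into proportions of all acts."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no speech acts were coded")
    return {category: n / total for category, n in counts.items()}


def couple_change(husband_pre, wife_pre, husband_post, wife_post):
    """Change score (posttest minus pretest) on the couple mean."""
    pre = (husband_pre + wife_pre) / 2
    post = (husband_post + wife_post) / 2
    return post - pre


# Hypothetical counts for one spouse at pretest.
pretest = proportions({"assertive request": 1, "attack": 6, "agreement": 13})
print(pretest["attack"])

# Hypothetical proportions of assertive requests for one couple.
print(couple_change(0.00, 0.02, 0.10, 0.08))
```

Averaging within the couple before analysis makes the couple, not the individual, the unit of analysis, which is why the group sizes are 6, 4, and 5.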

Analyses of variance for pretest scores in- 
dicated no significant initial differences among 
the three groups on the dependent variables.
The fact that only 1 of the 14 pretest F
values reached the level of p = .10, which
would be expected by chance alone, and only
2 reached p < .20, served as evidence that
the random assignment of couples to groups 
produced adequate matching. 

Further one-way analyses of variance and
planned (one-tailed) t tests were used to
compare change scores for the groups. The
potential problem of highly correlated de-
pendent measures, which would result in
redundant tests in the major analyses, was
assessed by computing the intercorrelations
of all the dependent variables. The mean ab-
solute value of these 91 correlations, cal-
culated with z transformations (Edwards,
1976), was .22 (ns) and represents a mean
of less than 5% shared variance. Although 3
of the 7 correlations exceeding .40 (16%
overlap) were those among the three Rela-
tionship Inventory subscales, the important
conceptual distinctions inherent in the sub-
scales and the differential predictions made
for them in this study were considered to be
sufficient grounds for analyzing them sepa-
rately.
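
The arithmetic behind these figures can be checked with a short sketch. It assumes the 14 dependent variables are the 11 verbal categories plus the 3 Relationship Inventory ratings, and that the averaging was done on Fisher z-transformed absolute correlations; the exact procedure is not spelled out in the text, so the averaging function is only one plausible reading, and all names are hypothetical.

```python
import math


def n_pairs(k):
    """Number of distinct pairwise correlations among k variables."""
    return k * (k - 1) // 2


def shared_variance(r):
    """Proportion of variance two measures share, given their correlation."""
    return r * r


def mean_abs_r_via_fisher_z(correlations):
    """Average absolute correlations on the Fisher z scale, then back-transform."""
    zs = [math.atanh(abs(r)) for r in correlations]
    return math.tanh(sum(zs) / len(zs))


print(n_pairs(14))            # pairwise correlations among 14 measures
print(shared_variance(0.22))  # below .05, i.e., less than 5% shared variance
print(shared_variance(0.40))  # the overlap implied by a correlation of .40
```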

Of the one-way analyses of variance for
the three subject groups, the strongest effects
were found for the following dependent vari-
ables: assertive request, F(2, 12) = 27.65, p
< .001; disagreement, F(2, 12) = 3.42, p =
.067; attack, F(2, 12) = 2.08, p = .16;
spouse-rated empathy, F(2, 12) = 1.75, p =
.215; asking for feedback, F(2, 12) = 1.69,
p = .225; and self-justification, F(2, 12) =
1.37, p = .290. No other variable produced
an F value exceeding 1.00.
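
When the numerator has 2 degrees of freedom, the upper-tail probability of F has a closed form, p = (1 + 2F/df2)^(-df2/2), so the reported p values can be checked without statistical tables. The sketch below applies that formula to two of the F values given above; the function name is hypothetical.

```python
def p_value_f_two_numerator_df(f, df2):
    """Upper-tail p for an F statistic with 2 and df2 degrees of freedom."""
    if f < 0 or df2 <= 0:
        raise ValueError("f must be non-negative and df2 positive")
    return (1 + 2 * f / df2) ** (-df2 / 2)


# Values reported in the text for F(2, 12).
print(round(p_value_f_two_numerator_df(3.42, 12), 3))  # disagreement
print(round(p_value_f_two_numerator_df(1.75, 12), 3))  # spouse-rated empathy
print(p_value_f_two_numerator_df(27.65, 12) < 0.001)   # assertive request
```

The formula holds only for 2 numerator degrees of freedom; other designs need the general F distribution.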

Planned t tests indicated a significant in-
crease in assertive requests for the communi-
cation group, relative to the insight group,
t(12) = 6.51, p < .001, and relative to the
control group, t(12) = 6.03, p < .001. Change
in assertive requests did not differ for the in-
sight and control groups, t(12) = −.82. Sig-
nificantly greater reduction in percentage of
disagreements was found for the communica-
tion group, t(12) = −1.84, p < .05, and the
insight group, t(12) = −2.52, p < .05, rela-
tive to the control group. Contrary to the
hypothesis, the decrease in disagreement was
not significantly different for the insight and
communication groups, t(12) = .89.

As predicted, communication training pro-
duced a significantly greater decrease in at-
tacks than the no-treatment control, t(12)
= −2.03, p < .05. Insight training produced
an intermediate decrease in attacks that did
not differ significantly from the other two groups.
There were nonsignificant trends toward a de-
crease in use of self-justification in the com-
munication group, relative to the control
group, t(12) = −1.41, p < .10, and in the
insight group, relative to the control group,
t(12) = −1.46, p < .10. Similarly, there
were nonsignificant trends toward an increase
in asking for feedback in the communication
group, relative to the control group, t(12) =
1.42, p < .10, and in the insight group, rela-
tive to the control group, t(12) = 1.72, p
< .10. No other significant differences were
found among the three groups for the re-
maining verbal coding categories.

Analyses of subjects’ change scores on the 
Relationship Inventory indicated a signifi- 
cant increase in spouse-perceived empathy in 
the communication group, relative to the 
control group, t(12) = 1.79, p < .05. The
trend toward greater increase in empathy for 
the communication group than for the in- 
sight group was not significant, t(12) = 1.25. 
The trend toward greatest increase in spouse- 
rated congruence in the communication group 
also was not significant. 

Table 1 presents the pretest, posttest, and
change-score group means for the significant
effects.


Table 1
Pretest, Posttest, and Change Score Group
Means for Significant Effects

                                          Variable
                     Assertive                                  Spouse-rated
Group and score      request       Attack        Disagreement   empathy

Communication (a)
  Pretest            .00           [illegible]   [illegible]    [illegible]
  Posttest           [illegible]   .18           .09            [illegible]
  Change             [illegible]   [illegible]   [illegible]    [illegible]
Insight (b)
  Pretest            .01           .28           .24            −7.38
  Posttest           [illegible]   [illegible]   [illegible]    [illegible]
  Change             [illegible]   [illegible]   [illegible]    [illegible]
Control (c)
  Pretest            .01           .15           .10            3.40
  Posttest           .01           .19           [illegible]    [illegible]
  Change             .00           .04           [illegible]    [illegible]

(a) n = 6.
(b) n = 4.
(c) n = 5.


Discussion


The significant decrease in disagreements
of subjects in the two treatment groups sug-
gests that they were moderately successful in
reducing overt conflict in this sample of mar-
ried couples. Although communication train-
ing did not produce a greater decrease in
overt disagreement than interaction insight
training, the communication training was
more effective in reducing attack behaviors. 
The somewhat smaller decrease in state- 
ments of disagreement in the communication 
group might be due to the fact that the 
assertiveness training encouraged direct ex- 
pression of feelings and opinions, including 
disagreement. An unanswered question is 
whether or not training in the assertive ex- 
pression of disagreement along with decreased 
attacking will lead to greater conflict resolu- 
tion than insight training in the long run. 
The significant increases in assertive re- 
quests and spouse-perceived empathy in the 
communication group indicate that a short- 
term structured intervention involving both 
partners can have a measurable impact on 
both overt interactions and spouses’ experi- 
ences of each other’s communication. The
overall findings provide evidence that com- 
munication training produces greater changes 
in some categories of subjects’ behavior than 
interaction insight training. The lack of sig- 
nificant change in other categories of verbal 
behavior (e.g., disqualifications, self-revela-
tion) and in spouse-perceived congruence 
and unconditional positive regard suggests 
that the effectiveness of this communication 
training for couples should be tested further. 
It is important to determine whether treat- 
ment of longer duration would produce com- 
parable change across the various communi- 
cation categories or whether certain classes 
of behavior are particularly resistant to 
change with this intervention. Since the im- 
pact of communication training on spouses’ 
perceptions of change was less extensive, it 
appears that it may be easier to implement 
behavioral changes than attitudinal changes 
in close interpersonal relationships. A ques- 
tion for future research is whether spouses’ 
perceptions of behavioral change will follow 
in time. 
Rating Scales for the Identification and Treatment
of Hyperkinesis 


Patricia Goldring Zukow, Arnold H. Zukow, and P. M. Bentler 
University of California, Los Angeles 


In response to the need for a simple instrument to aid the psychologist, edu-
cator, or physician in the identification and treatment of hyperkinesis, parent
and teacher rating scales were developed. Multivariate analysis of the parent
ratings of clinical and control subjects indicated three factors: Factor 1, Ex-
citability; Factor 2, Motor Coordination; and Factor 3, Directed Attention. A
factor analysis of the teacher form yielded two similar factors, Attention/Excit-
ability and Motor Coordination. Analysis of variance of each factor score re-
vealed highly significant differences between clinical and control subjects. Cutoff
scores were developed to aid in diagnostic decision making. These scores cor-
rectly identified a large percentage of clinical and control subjects.


Many concerned professionals have ex-
pressed well-founded reservations about the
use of medication (Cantwell, 1975; Fish,
1975; Myers & Pless, 1976; Rie, 1975; Rie,
Rie, Stewart, & Ambuel, 1976; Sroufe &
Stewart, 1973; Grossman, Note 1) and the
effectiveness of various therapeutic ap-
proaches (Gittelman-Klein et al., 1976; Ke-
ogh & Margolis, 1976) in the management
of hyperactive children. Several complete
reviews and analyses of both the positive
effects and the drawbacks of drug
therapy have recently appeared (Rie et al.,
1976; Whalen & Henker, 1976). In spite
of the fortunate fact that medication can
yield behavioral improvement in a large
proportion of cases of hyperkinesis (Con-
ners, 1972; Conners, Rothschild, Eisenberg,
Schwartz, & Robinson, 1969; Conners,
Taylor, Meo, Kurtz, & Fournier, 1972;
Sleator & von Neumann, 1974) or that be-
havior modification techniques effect short-
term or situational improvement in academic
performance (Gittelman-Klein et al., 1976;
O’Leary, Pelham, Rosenbaum, & Price,
1976; Varni, 1976), the specific biologi-
cal and behavioral bases and diagnostic cri-
teria for hyperkinesis remain uncertain.
There is common agreement that hyper-
kinesis contains elements of high activity
level associated with learning and behavioral
disorders (Laufer & Denhoff, 1957; Milli-
chap, 1968; Stewart, Pitts, Craig, & Dieruf,
1966; Werry, 1968), but the practical issue
of identification remains unresolved (Rie,
1975; Grossman, Note 1).

Many physical and behavioral disorders 
can be reliably classified on the basis of first- 
hand observations by a single professional.
However, the identification of hyperkinesis 
must involve the active cooperation of both 
parents and teachers. One study reported, 
for example, that only 10 of 46 subjects 
could have been correctly identified on the 
basis of a single examination conducted by 
one individual (Sleator & von Neumann, 
1974). As Rie (1975) has noted, satisfactory
assessment of a child’s behavior must be 
based on evaluations by several persons well 
acquainted with many facets of the child’s 
everyday life. It follows that the correctness 
of classification, consequent assignment of 
treatment condition or therapy, and assess- 
ment of behavioral changes will depend 
heavily on the adequacy of behavioral meas- 
urement provided by teachers and parents. 
Teachers are acquainted with a broad range
of children engaged in a great number of 
school activities and can thus compare chil- 
dren’s behavior to standards for their age 
and community; parents, of course, can pro- 
vide observations from their daily contact 
with their own children and others in a va- 
riety of naturalistic settings. 

There is as yet little consensus regarding 
the perceived behavioral bases of the hyper- 
kinetic syndrome. Several comprehensive be- 
havior rating scales designed to be used by 
parents and teachers in evaluating problem 
behavior succeed in differentiating between 
hyperkinetic and normal children (Blunden, 
Spring, & Greenberg, 1974; Conners, 1969, 
1970; Kupietz, Bialer, & Winsberg, 1972).
However, these instruments do not specify 
the facets of behavior involved in hyper- 
kinesis. So far, the instruments designed for 
the specific purpose of identifying hyper- 
kinesis have been inadequately researched 
(Davids, 1971) or they evaluate behavior too 
specific in nature to capture the clinical com- 
plexity of hyperkinetic syndrome (Werry & 
Sprague, 1970). Blunden et al. (1974) iden- 
tified a single hyperactivity factor in their 
teacher rating data of normal subjects. As- 
pects of hyperactivity included restlessness, 
impulsiveness, and low perseverance, in ad-
dition to the concepts of distractibility and 
low concentration. Thus in the Blunden et al. 
study, the presence of one aspect of hyper- 
activity implied the presence of the others. 
Conners (1969), in contrast, has found that 
distractibility and concentration difficulties, 
along with poor coordination, represented a
daydreaming/inattentiveness factor that was
clearly distinct from hyperactivity in teacher
rating scales. According to teachers, a hyper-
active child might, or might not, also be 
shown to have coordination and concentration 
problems. Finally, Conners (1970) demon- 
strated that parents who rated behavioral 
symptoms clustered their symptoms into sev- 
eral distinct dimensions, but that it was pri- 
marily one dimension—aggressive conduct 
disorder—that proved to differentiate normal 
versus clinic patients and neurotic versus 
hyperkinetic children. Although this factor
makes sense as an identifier of hyperkinesis
because it includes behaviors such as rest-
lessness and temper, the dimension seems to 
include irrelevant behaviors such as having 
problems with peers and siblings and ex- 
cludes such relevant behaviors as poor co- 
ordination and inattentiveness that may lead 
to problems in school. It is apparent that 
even though some differential diagnosis is
possible with existing instruments, the be- 
havioral basis for the disorder remains un- 
certain. 

A recent study by Langhorne, Loney, Pa- 
ternite, and Bechtoldt (1976) analyzed cer- 
tain chart, teacher, parent, and psychiatrist 
ratings of a variety of variables considered 
to be relevant to the diagnosis of hyper- 
kinesis. Using factor analysis they found that 
chart raters and psychiatrists were using the 
same basis for judgments of hyperactivity. 
Chart raters and teachers tended to assess 
excitability, but only teachers agreed regard- 
ing information about attention difficulties. 
Although Langhorne et al. interpreted their 
results as reflecting source of data rather than 
intrinsic characteristics of hyperkinesis, it is
difficult to evaluate their results. All their 
analyses were based on only a very small, in- 
adequately rationalized subset of variables, 
which were also subjected to unnecessary 
transformation prior to analysis. It is clear 
that additional studies of this issue are called 
for. 

The current research is addressed to two 
interrelated goals. First, it aims at clarifying 
the behavioral bases of hyperkinesis as seen 
in parent and teacher reports. Second, it aims 
to provide a measure of hyperkinetic behavior
that is relevant to the objective classification 
and treatment of hyperkinesis and to the as- 
sessment of significant behavioral changes. 


Parent Form 
Method 


Initial item pool. Since it seemed possible that
hyperkinesis as perceived by the parent might in-
volve several interrelated, but distinct, areas of
behavior, items were written by the second author
to cover these areas. The first cluster of items was
intended to assess “classical” hyperkinetic behavior,
including such behaviors as being quick-tempered
and explosive. Items and descriptions found in
other scales and observed in clinical rounds, con-
ferences, and journals were modified and adopted. 
The second group of items was directed toward as- 
sessing motor coordination, such as the ability to 
button without difficulty. These items were included 
to clarify the observation that poor coordination 
seems to accompany daydreaming and inattentive 
behavior more closely than hyperactivity per se 
(Conners, 1969). The third type of item was con- 
cerned with the inability to sustain continuous par- 
ticipation in, and completion of, structured activities 
related to learning and school problems, as per- 
ceived by the parent. This includes behavior such 
as daydreaming, which had been found in previous 
work to be possibly distinct from the first item type 
but which was clearly implicated in the hyperkinetic 
syndrome (Conners, 1969). A total of 28 items were 
written. Instructions required the parent, usually 
the mother, to “circle the answer that most applies 
to your child.” Possible responses were yes or no,
although one item was a three-category item col- 
lapsed to yes/no for analysis.* 
Subjects. The questionnaire was administered to 
mothers of children attending cooperating public 
elementary schools and private preschools and, after 
initial screening for hyperkinetic behavior prob- 
lems, to mothers of a subset of the children re- 
ferred to a private pediatric practice. At the time of 
evaluation the children were not receiving medica- 
tion for hyperkinetic behavior. Mothers of the pub- 
lic school children received the questionnaire from 
teachers who were selected by the principal or di- 
rector of the schools. Completed forms were re- 
turned to the teachers. The schools were chosen to 
be representative of the socioeconomic class of the 
children predominantly in this private practice, that 
is, skewed slightly toward upper-middle class. Vari- 
ous private physicians, mental health practitioners, 
school personnel, and concerned nonprofessionals re- 
ferred children to a private pediatric practice for 
evaluation of possible hyperkinetic disorder. The 
judgments of the examining pediatrician, the second 
author, were based on physical examination, medi- 
cal and family histories, plus behavioral evidence 
reported by parents and teachers. Inclusion in the
clinical group depended on consistent evidence of 
behavioral disturbance at home and at school along 
with a persistent learning disorder associated with 
an inappropriately high activity level. The clinical 
group children were judged not to possess an overt 
neurological syndrome, deafness, visual disability, or
global mental retardation. Other rare but possible
causes of hyperactivity such as hypoglycemia or
hyperthyroidism were excluded by appropriate lab-
oratory tests.
Data were available from mothers of 136 control
and [illegible] clinical children, out of 278 questionnaires
distributed. The control sample was composed of
[the same] number of males as females. All control
male subjects were retained, and a subset of
female subjects was randomly selected from the
female sample to best approximate the ratio of
males to females in the clinical group (3:1). Final
sample sizes at 2½-5½ years of age were 63 control
and 27 clinical subjects, and at ages 5½-11, 19 con-
trol and 51 clinical children.

Method of analysis. There were three separate 
steps to the analysis. First, items were grouped into 
three clusters on the basis of their statistical con- 
sistency with other items. An overall score was then 
obtained for each group of items, yielding three 
combined scores and a fourth total score across all 
items. Finally, analyses of variance were performed
to evaluate whether the four combined scores 
yielded significant differences among clinical, con- 
trol, and age-varying children. 

Only 26 out of the 28 items on the questionnaire 
proved analyzable due to an error in data collec- 
tion. To determine the interrelations among these 
items, in the total sample of subjects, monotonicity 
analysis was deemed appropriate, since the item re- 
sponses were binary. This procedure determines la- 
tent dimensions of variation for rank-order data, 
analogous to principal components analysis for in- 
terval data (Bentler, 1970). Three-, four-, and
five-factor solutions were obtained and rotated to an
orthogonal, simple structure orthosim criterion
(Bentler, 1977). Each item was then clustered
into one of three groups on the basis of which factor
it most adequately assessed. Since the three-factor
solution proved to be most clearly interpretable, a
target rotation was performed (Bentler, 1971) and
the orthogonal loadings were interpreted (Bentler,
1968).


Results 


Factor computations. Factor loadings for 
the three-dimensional solution are presented 
in Table 1; these loadings summarize the ex- 
tent to which each item measures each di- 
mension. Items have been grouped into three 
clusters on the basis of which factor they 
assess best.

Table 1
Parent Form: Dimension Loadings

                                                         Factor loading
Item                                 M      1             2             3

Factor 1: Excitability
Irritable                            .45    [illegible]   [illegible]   [illegible]
Quick-tempered, explosive            .60    .73           .02           .05
Overly sensitive                     .49    .71           .01           .30
Emotionally high strung              .44    [illegible]   .11           .24
Unpredictable, unmanageable          .43    .63           .20           .27
Panics easily                        .30    .56           .28           .21
Poor coordination                    .24    .55           .15           .24
Can’t seem to keep from touching
  everything and everyone
  around him                         .45    [illegible]   [illegible]   [illegible]
Fidgets                              .63    [illegible]   [illegible]   [illegible]
Reacts adversely to changes
  in routine                         .29    .41           .27           .07

Factor 2: Motor Coordination
Has trouble drawing, writing         .19    .03           .80           .17
Has trouble buttoning                .21    −.19          .78           −.03
Eyes and hands don’t seem to
  function together                  .09    [illegible]   .70           .14
Trouble with bicycle                 .13    .08           .69           .24
Exceptionally clumsy                 .10    .35           .65           .12
Was slow learning to walk            .16    −.02          [illegible]   [illegible]
Speech development has been slow     .15    .28           [illegible]   [illegible]

Factor 3: Directed Attention
Is child lazy - not trying to do
  well in school?                    .19    .18           [illegible]   [illegible]
Daydreams while doing homework
  assignments                        .24    .20           [illegible]   [illegible]
Not learning in school although
  seems “bright”                     .24    .14           [illegible]   [illegible]
Knows work orally at home - gets
  to school and has to write it
  down - fails.                      .14    .16           [illegible]   [illegible]
Short attention span                 .53    .31           [illegible]   [illegible]
Tolerance for failure and
  frustration is low                 .64    .40           [illegible]   [illegible]
Jumps from one activity to another   .45    .24           [illegible]   [illegible]
Unusually hyperactive                .48    .45           [illegible]   [illegible]

Note. Those factor loadings that are considered to define a given factor appear in italics.

1 Requests for copies of the rating scales should
be sent to Arnold H. Zukow, 5363 Balboa Boule-
vard, Encino, California 91316.

Three clear-cut factors emerged, showing
a strong correspondence to the hypothesized
dimensions. Thus, parent perceptions of nor-
mal and hyperkinetic children vary in at least
three ways, so that a particular child may
show any combination of behaviors from the
three dimensions; within a dimension, how-
ever, if a child tends to exhibit one of the
behaviors, the child tends to exhibit the
others. On the whole, items tend to measure
a given factor quite well and are not relevant
to other factors. (Contrast the low loadings
on irrelevant factors.) In some cases, items
appear to be falling in between two factors,
as indicated by a high loading on two fac-
tors; such items are ignored in factor in-
terpretation.

The items defining the first factor appear
similar to those marking hyperactivity as de-
scribed by others. It is an excitability factor
concerned with both the quality and quan-
tity of behavior, including perpetual motion
accompanied by explosive outbursts. The sec-
ond factor refers to motor coordination ex-
clusively. Items include general descriptions
of poor coordination and more specific delays
and difficulties with buttoning, walking, draw-
ing, speech, and so on. Finally, the third
factor refers to directed attention or sustained
participation in goal-directed behaviors. The
items deal with motivation, attention/con-
trol, and learning as perceived by the par-
ent. All item means are given in Table 1,
which shows the relative proportion of chil-
dren considered to exhibit a given behavior.

Factor scores. Once the three factors had
been extracted, a score was computed for
each child on each factor. The items whose
loadings are italicized in Table 1 were used
to compute three factor scores, with Factor 1
being measured by nine items, Factor 2 by
seven items, and Factor 3 by seven items.
Scores were obtained by adding a point for
every yes response. The three factor scores
proved to be adequately internally con-
sistent, as measured by coefficient alpha, an in-
dex representing the extent to which items
are measuring the same dimension. Obtained
alphas were .84, .74, and .82 for the three
factors, respectively. To assess whether the
three scores measured somewhat different as-
pects of behavior, the intercorrelations among
the scales were obtained. In the combined
sample of 160 children, the correlations were
.30, .58, and .30 among Factors 1 and 2, 1 and 3,
and 2 and 3, respectively. These correlations
are sufficiently low relative to the internal
consistencies, so one can conclude that the
three factors are reliably measuring distinct
aspects of behavior. On the other hand, the
significant and positive correlations among all
three scores suggest that all items measure
something in common, which, according to
our experimental procedures, should involve
the differentiation of hyperkinetic children
from controls. Consequently, a fourth score
was obtained as the sum of the previous three
scores.
ae the two items of Table 1 not 
Pie ines a the three separate factor scores 
EA E ed in this total score. Since “poor 
E me ion” was deemed inconsistent with 
ie. ace of the other items marking 
ae that item was excluded from the 
ihe a of the factor total score. In view 
62) Bae nature of Coordination (Fac- 
the “es the absence of a high loading of 
ten ae item on Factor 2, it is 
TI at parents interpreted this item 
aetivit ae more globally on the quality of 
à es er than to a specific motor task 
they thi be expected, nor did parents inter- 
iad pan in terms of inattentiveness as 
Usually} y Conners (1969). The item “un- 
a Gee loaded quite similarly 
cluded <a 1 and 3 and therefore was not in- 
ay either factor score. Even though 
ae items do not clearly define a given 
lo z do contribute to the overall de- 
K fe hyperkinetic behavior. Thus these 
a $ oa included along with all italicized 
ae ‘able 1 in the computation of the 
» total score. Its internal consistency 
Table 2
Parent Form: Analyses of Variance

                               Source of variance
                         Subject
                         status       Age
Factor                   (A)^a        (B)^a        A x B^a      Error^b
1: Excitability
   MS                    547.6        3.4          10.6         3.8
   F ratio               143.0**      .9           2.8
2: Motor Coordination
   MS                    31.3         0            1.6          2.2
   F ratio               14.3**       0            [illegible]
3: Directed Attention
   MS                    250.4        10.5         26.8         2.2
   F ratio               [illegible]  [illegible]  [illegible]
Total score
   MS                    2,600.3      19.7         [illegible]  [illegible]
   F ratio               230.7**      1.8          0
a df = 1.
b df = 156.
* p < .05.
** p < .001.


was .89, higher than those of the three separate
scores, so that this score provides the most reliable
overall description for these children. There
are thus four scores for each child. The scores
were examined for group mean differences.
Group comparisons. The parent form discriminates
between clinical and control status
at statistically significant levels. The statistical
analyses are presented in Table 2. Each
of the three factors, as well as the total
score, shows significantly higher means
for clinical subjects than
for control subjects. As expected, the total
score yields the most reliable differentiation.
In only one case was there a significant effect
of age of subjects on the scores. This occurred
in the third scale, concerned with directed
attention. The effect is quite minimal
in significance level compared to subject
status, however, and can only be interpreted
simultaneously with the interaction effect.
Although among the clinical group older
subjects received higher scores, among the
controls the age trend was minor and reversed.
Cross-validation. Results from preliminary
work in cross-validation showed that
among 26 clinical subjects (3½-14 years old),
all four factor scores on the parent rating
scales were significantly different (p < .001)
from those of the original control subjects. In all
cases the direction and the size of the
difference were virtually identical
to those reported in the original study. A comparison
of the original clinical group means
to the means of the new clinical subjects revealed
a high degree of similarity. More
studies to substantiate this finding are indicated.
Classification. The task of differentiating 
clinical from control subjects can be accom- 
plished by using the total scores, with cutoff 
scores being determined to maximize correct 
classification. Children receiving scores of 0-8 
can be labeled as normal, whereas those scor- 
ing 9 or more can be labeled as hyperkinetic. 
Comparing the “true” status of the children, 
as determined by pediatrician judgments, 
with the labels obtained solely by the cutoff 
score yields the following picture: Ninety-six 
percent of the 90 young subjects were cor- 
rectly classified, and 90% of the 70 older 
subjects were correctly classified. A more de- 
tailed look at the younger subjects reveals 
that 89% of the clinical subjects and 98% 
of the controls were correctly classified. 
Among the older subjects, 90% of the clini- 
cals were correctly identified, as were 89% of 
the controls. These percentages are quite en- 
couraging, compared to the 70%-80% figures 
reported in similar studies (Conners, 1970). 
Among the cross-validational clinical sub- 
jects, the cutting score of 8/9 correctly iden- 
tified 81% of the children. This decrease in 
classification accuracy is a typical concomi- 
tant of cross-validation, but fortunately the 
results demonstrate that the instrument can 
be expected to be successfully applied in future
studies.
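
The cutoff rule described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names and the six scores below are hypothetical and are not data from the study; only the 8/9 cutting score comes from the text.

```python
def classify(total_score, cutoff=9):
    """Label a parent-form total score using the 8/9 cutting score.

    Scores of 0-8 are labeled "normal"; scores of 9 or more are
    labeled "hyperkinetic".
    """
    return "hyperkinetic" if total_score >= cutoff else "normal"


def percent_correct(scores, true_labels, cutoff=9):
    """Percentage of children whose cutoff label matches the criterion label."""
    hits = sum(classify(s, cutoff) == t for s, t in zip(scores, true_labels))
    return 100.0 * hits / len(scores)


# Hypothetical total scores and criterion (pediatrician) labels,
# made up purely to exercise the functions.
scores = [3, 8, 9, 15, 7, 12]
truth = ["normal", "normal", "hyperkinetic",
         "hyperkinetic", "hyperkinetic", "hyperkinetic"]

print(f"{percent_correct(scores, truth):.1f}% correctly classified")
```

Moving the `cutoff` argument up or down trades correct identification of clinical children against correct identification of controls, which is why the cutting score was chosen to maximize correct classification.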


Teacher Form 
Method 


Initial item pool. The rationale for [...]
form followed that of the parent form. [...]
necessary to limit the size of the item pool due to
time demands on teachers. A total of 15 items were
devised to represent a broad range of school activities:
tapping, motor skills, attention, and impulsive/explosive
behavior. To introduce sufficient variance
into the responses, teacher ratings were made [...]
Subjects. A new set of subjects, consisting of 75
controls and 36 clinical subjects, participated. Subject
selection is described under Parent Form. An
attempt was again made to sample a broad age
spectrum. Sample sizes at ages 2½-5½ were 22 and 11
for controls and clinicals, respectively. At ages 5½-11,
the corresponding sample sizes were 53 and 25.

Method of analysis. Steps analogous to those re- 
ported for the parent form were carried out for the 
teacher form. Since the data were adequately con- 
tinuous rather than binary, a traditional least 
squares factor analysis was undertaken. Initial com- 
munality estimates were squared multiple correla- 
tions. Two different factor analyses were under- 
taken, a three-factor solution and a two-factor solu- 
tion. The first three eigenvalues of the three-factor 
solution were 8.03, 1.15, and .82, suggesting that two 
factors or even one factor might be sufficient to ac- 
count for the item intercorrelations. Rotation of the 
three-factor solution yielded two interpretable fac- 
tors and a third unclear factor. The two-factor solu- 
tion, rotated by orthosim, in contrast, yielded the 
same two interpretable factors. 


Table 3
Teacher Form: Factor Loadings

                                                 Factor loading
Item                                             1            2

Factor 1: Attention/Excitability
Is attention span short?                         .83          .33
Child fidgets                                    [illegible]  [illegible]
Is the child a behavioral problem in class?      .78          .22
Unable to follow directions?                     .73          .40
Quick-tempered, explosive                        [illegible]  .20
Seems to touch everything and everyone
   around him                                    .66          .31
Finds it hard to play with his peers             .65          .37
There are no activities that the child can
   focus his attention on                        .62          .56
Has a low tolerance for failure and
   frustration                                   [illegible]  [illegible]
Reacts adversely to changes in routine           [illegible]  [illegible]

Factor 2: Motor Coordination
Exceptionally clumsy                             .20          [illegible]
Coordination is poor                             .32          [illegible]
Speech development is slow or not clear          [illegible]  [illegible]
Eyes and hands can't seem to function
   together                                      .37          [illegible]
Quiet and withdrawn - a loner                    .19          [illegible]

Note. Factor loadings that contribute to the total
score for each factor appear in italics.
Results


Factor computations. Factor loadings for 
the two-factor solution are presented in 
Table 3. As before, items have been grouped 
into two clusters on the basis of which factor 
they best assess. The content of the items
suggests that the first factor deals with at- 
tention and excitability, combining the two 
separate parent factors. It seems to repre- 
sent the classical aspects of hyperactivity 
noted for the hyperkinetic syndrome. Factor 
2, mirroring the content of the second parent 
factor, deals with motor coordination. 

Factor scores. A score was determined
for each child on each factor by summing the
rating scores given a child for the items in a
given factor. A third, total score was obtained
analogously to the parent form by
summing across the two factor scores and, in
addition, including the item dealing with inability
to focus continuously on any one activity;
that item had been omitted from the
two separate scores because it falls between
the two factors. An internal consistency coefficient
was obtained for each dimension, and
the intercorrelation between the two factor
scores was obtained in the combined total
sample.

The internal consistency coefficients were
again sufficiently high to reinforce the concept
that the items comprising each of the two factor
scores are measuring something in common.
The alpha coefficients of .93 and .81 were
compared to the intercorrelation of .63 between
the scores on the two factors, which is
again sufficiently low to consider the two factor
scores to be measuring distinct
aspects of behavior. On the other hand, the
correlation of .63 is sufficiently high to assure
that the scores measure the common
tendency toward hyperkinesis, as indexed by
coefficient alpha of .94 for the 15-item total
score.

Group comparisons. The teacher form 
discriminates among clinical and control 
status at statistically significant levels. Fac- 
tor scores are able to discriminate between 
clinical and control subjects. The subject 
status column of Table 4 indicates that the 
clinical means were significantly higher than 
Table 4
Teacher Form: Analyses of Variance

                               Source of variance
                         Subject
                         status       Age
Factor                   (A)^a        (B)^a        A x B^a      Error^b
1: Attention/Excitability
   MS                    1,675.1      110.9        195.3        44.7
   F ratio               37.5**       2.5          4.4*
2: Motor Coordination
   MS                    130.1        .6           7.6          10.6
   F ratio               12.3**       [illegible]  [illegible]
Total score
   MS                    3,172.1      138.6        311.2        90.3
   F ratio               [illegible]  [illegible]  [illegible]
a df = 1.
b df = 107.
* p < .05.
** p < .001.


the control means in all of the comparisons 
at a high level of significance. There appears
to be no age effect at all, and in only one case 
was the interaction of subject status and age 
statistically significant. As before, this inter- 
action suggests that scores for clinical chil- 
dren increase with age, whereas those of con- 
trol children do not. 
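
As a quick check on how the F ratios in Table 4 relate to the mean squares, each F is the effect mean square divided by the error mean square. The sketch below uses the Factor 1 (Attention/Excitability) mean squares as they read in the table; the function name and dictionary layout are illustrative assumptions, not part of the original analysis.

```python
def f_ratio(ms_effect, ms_error):
    """F ratio for one effect: effect mean square over error mean square."""
    return ms_effect / ms_error


# Mean squares for Factor 1 (Attention/Excitability) as read from Table 4.
ms_error = 44.7  # error term, df = 107
effects = {
    "subject status (A)": 1675.1,
    "age (B)": 110.9,
    "A x B": 195.3,
}

for name, ms in effects.items():
    # Each effect has df = 1, so the mean square equals the sum of squares.
    print(f"{name}: F(1, 107) = {f_ratio(ms, ms_error):.1f}")
```

The three ratios round to the values printed in the table for this factor, which is a useful sanity check when table entries are hard to read.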

Cross-validation. Results from the preliminary
cross-validation of the teacher form
paralleled the findings from the preliminary
cross-validation of the parent form. Significant
differences were found between the 26
new clinical subjects and the original control
subjects (p < .001).

Classification. Children with a score of 35
or less can be classified as “normal”; those
receiving a score of 36 or more can be classified
as “hyperkinetic.” Among the 33
younger subjects, 10% were correctly classified,
whereas 85% of the 78 older subjects
were appropriately grouped. The cutting score
of 35/36 correctly classified 85% of the new
clinical subjects.


Discussion 


The factors that emerged in the parent
and teacher forms indicated that both parents
and teachers differentiated the excitability/attention
versus motor coordination aspects
of hyperkinesis. However, the analysis
of the parent form revealed that parents differentiated
excitability and directed attention
as well. It is possible that this result is
an artifact of the items used in the two
studies. A careful analysis of the item pools
indicates that the content concerned with
daydreaming, not doing well in school, and
so on, was omitted from the school form. In
future versions of the instrument, items tapping
this content will be included.

Although the factor loadings make it ap- 
pear extremely likely that the factors from 
the two instruments are similar, there was 
no way in this study of verifying or disprov- 
ing this hypothesis. Strict similarity can be 
verified by the demonstration of a high cor- 
relation between scores obtained on the par- 
ent form and the comparable scores on the 
teacher form. Unfortunately, the design of
the current studies did not allow data to be 
obtained from both parent and teacher sources 
on the same subjects, so a correlation coeffi- 
cient could not be computed. Even though a 
significant correlation between the parent 
and teacher factors might be expected, parents, 
teachers, and others actually have different 
sources of observational data about a child. 
It is quite possible that there may be little 
agreement regarding, say, concentration prob- 
lems. The teacher may be in a better posi- 
tion to observe such characteristics in the 
classroom. A further study on this issue is
clearly called for.

Langhorne et al. (1976) had the data to
correlate composite teacher ratings, chart 
ratings, psychiatrist diagnosis, and parent 
ratings but, unfortunately, did not report on 
the extent of agreement that might be ob- 
served using such composite ratings. As 
pointed out before, since their analyses were 
based on an arbitrarily selected, very small 
subset of their data, it is not possible to rely 
on their conclusion that there is little or no 
agreement among sources of data. It is possible
that, among the many items discarded
[...] analysis, there may have existed
numerous items showing high agreement
across data sources.

The factors that did appear in this study
were similar to those found by previous workers.
The teacher form attention/excitability
factor was very similar to the hyperactivity
factor reported by Blunden et al. (1974), as
well as to the hyperactivity factor identified
by Conners (1969). Results from the present
parent form lend support to Conners’ (1969)
finding that an inattentiveness factor is distinct
from impulsivity/hyperactivity. Further,
our results show a clear separation between
directed attention, including distractibility
and concentration difficulties, and poor motor
coordination. Items from these two categories
were not well represented in previous research,
as indeed some were absent from the
current teacher form. Obviously, items dealing
with other content material, such as social
behavior, also deserve further study and
should be included in future research on this
topic.

Numerous researchers have emphasized the
disappointing results found in many studies
investigating the effect of medication
(Whalen & Henker, 1976) and/or psychological
(Gittelman-Klein et al., 1976; O'Leary
et al., 1976) or educational therapy
(Douglas, 1974; Keogh & Margolis, 1976) on
academic achievement. Some of these findings
may be attributed to the problem of obtaining
sufficient homogeneity among subjects, as
recently reviewed in Langhorne et al. (1976).
For example, the criteria for subject selection
in studies of hyperkinetic children are often
quite global (Keogh & Margolis, 1976; Yi
et al., 1975). Furthermore, the variation of
criteria for subject selection precludes generalization
of results from study to study.

A preliminary investigation on the effectiveness
of medication as a function of initial
scores on the instruments yielded mixed
and somewhat disappointing results. Of the
20 possible correlations, two were statistically
significant. Statistically significant correlations
of .68 and .75 (p < .02) were
found between clinical rankings of medication
effectiveness and Factor 1 of the
teacher form and the total score in a sample
of nine children under 5½ years of age.
The children most responsive to medication
were those initially considered by teachers to
be most disturbed in excitability or in combination
with poor coordination. Since there
have been few reliable predictors of responsiveness
to medication (Ross & Ross, 1976),
this encouraging result is being investigated
further.

The total factor score derived from the
parent and teacher rating scales permits the
accurate identification of subjects, although 
it must be admitted that the issue of differ- 
ential diagnosis also merits future research. 
Scores on the individual factors may provide 
a vehicle for investigating differential diag- 
nosis and examining potential relationships 
between particular treatment conditions and 
changes in disapproved behavior, academic 
performance, cognitive abilities, and social 
skills. There is a striking similarity between 
the factors that emerged in this analysis and 
aspects of attentional disorders observed and 
described by several research groups. Doug- 
las (1972, 1974) has characterized the facets 
of attention necessary for adequate academic 
performance as the ability to focus, to inhibit 
impulsive responding, and to sustain and or- 
ganize attention; that is, to “stop, look, and 
listen”; Keogh and Margolis (1976) as com- 
ing to attention, problem solving, and main- 
taining attention; and Whalen and Henker 
(1976) as differences in ability to modify 
motor behavior to meet situational demands, 
in socially adaptive behaviors including im- 
pulsivity, and in sustained attention. Al- 
though the components of attention proposed 
by these researchers are not isomorphic and 
probably not entirely independent, the individual
factor scores might perform the important
function in future research of providing
an objective basis for subject selection for
various treatment conditions and for assessing 
changes in behavior and performance. The 
relationship between the parent and teacher 
rating scales and the Margolis-Keogh chil- 
dren’s checking task is currently under in- 
vestigation to provide further validation of 
the scales.

In conjunction with the researcher’s or
clinician’s own observations, these instruments
fulfill the function for which they were
designed: to provide supplementary information
and to serve as a guide for the conscientious
clinician or researcher in the identification
and treatment of hyperkinesis
(Frankenburg, 1974).
Reference Note


1. Grossman, H. J. Current management of hyper- 
activity. Paper presented at the meeting of the 
American Academy of Pediatrics, Washington, 
D.C., October 1975. 
Assessment difficulties have impeded progress in evaluating the therapeutic role
of visual imagery. Four studies examined imagery questionnaires and addressed
the issues of (a) reliability; (b) agreement among different questionnaires; (c)
social desirability; and (d) construct validity. The Betts and Gordon scales, and
a newer inventory, the Paivio Individual Differences Questionnaire, were examined.
Reliability of the Paivio inventory was found to be satisfactory and equivalent
to that of other imagery questionnaires. In two studies, the Betts and
Paivio questionnaires were correlated at the .45-.50 level, but correlations involving
the Gordon scale were inconsistent from one study to the next. In general,
imagery measures were not influenced by social desirability. Factor analysis
indicated that subjective and objective measures of visualization are independent.
The final study revealed a relationship between imagery questionnaire
scores and reported values and interests. It is suggested that imagery is not a
unitary construct and that criteria other than visuospatial tests may be appropriate
for validating imagery questionnaires.


The emphasis on imagery in otherwise
dissimilar therapeutic techniques (e.g., Horowitz,
1970; Wolpe & Lazarus, 1966) suggests
that understanding imagery may be as useful
in clinical practice as in the learning laboratory
(cf. Paivio, 1971). Behavior therapy techniques,
in particular, frequently include instructions
directing the client to imagine specified
aversive or reinforcing stimuli. Substituting
imagery instructions for in vivo presentation
of stimuli is more than a matter of convenience,
for imagery instructions potentially enhance
the flexibility and power of various techniques
(Cautela, 1971). However, it is not
known to what extent imagery actually mediates
the obtained results; research has failed to
establish a relationship between therapy outcome
and self-reported individual differences
in imagery (Beere, 1972; Davis, McLemore, &
London, 1970; McLemore, 1972). Either therapeutic
success is independent of ability to form
images or the expected relationship is obscured
by inadequate measurement of individual
differences in imagery.

The assessment problem is difficult. It is not 
clear what imagery questionnaires really mea- 
sure or what criteria are appropriate for vali- 
dating them. Objective tests of spatial ability 
appear to share little or no variance with visual 
imagery questionnaires (Di Vesta, Ingersoll, 
& Sunshine, 1971; Forisha, 1975; Neisser, 
1970; Paivio, 1971, p. 496), and no consistent
relationship has been established between 
questionnaires and objective measures of mem- 
ory for objects (cf. Danaher & Thoresen, 1972; 
Marks, 1972; Rehm, 1973; Rimm & Bottrell, 
1969). In fact, it has been argued that the sub- 
jective experience of imagery bears little rela- 
tionship to any measurable capacity (Neisser, 
1970). Reliable physiological correlates of sub- 
jectively reported imagery are equally elusive 
(see Paivio, 1971, 1973; Richardson, 1969). 


The questionnaires themselves have received
relatively little critical attention despite their 
probable psychometric shortcomings. Rating 
scales such as Sheehan’s (1967b) version of the 
Betts Questionnaire upon Mental Imagery 
(reprinted in Richardson, 1969) and Marks’s 
(1973) Vividness of Visual Imagery Ques- 
tionnaire require subjects to judge their 
images using a numerical scale on which one 
pole always represents “good” imagery. This 
format is very susceptible to response sets (see 
Mischel, 1968). The same criticism can be 
leveled at the Gordon (1949) scale of imagery 
control (reprinted in Richardson, 1969) on 
which a yes response always indicates control- 
lable imagery. In fact, Di Vesta et al. (1971) 
found that the factor that loaded on the Betts 
and Gordon questionnaires also loaded on 
the Marlowe-Crowne Social Desirability Scale 
(Crowne & Marlowe, 1960). This finding is 
troublesome, since it suggests the possibility 
that any relationship between therapeutic out- 
come and self-reported imagery merely reflects 
an underlying response set that influences both 
imagery reports and response to therapy. 

The present research addresses four fundamental
issues regarding imagery questionnaires:
(a) internal consistency and retest reliability;
(b) agreement among different subjective
measures; (c) the influence of social
desirability; and (d) construct validity. Attention
is focused on an inventory, Paivio’s
(1971) Individual Differences Questionnaire
(IDQ), as a possible alternative to the traditional
rating scale.


Study 1 
Method 


[...] processes. The direc[...]
items is reversed to mini[...]
[...] acquiescence.


MERRILL HISCOCK 


The IDQ was completed by the first two groups of
subjects. Item-scale correlations were computed to identify
nonproductive items; 15 items whose correlation
with the respective scale score was less than .25 for both
groups were excluded from the revised questionnaire.
After 1 new item was added, the revised questionnaire
contained 38 verbal and 34 imagery items. The format
was changed to a Likert scale format with five available
responses per item.

The 72-item IDQ was administered to the third group
of subjects, and Cronbach’s alpha (Cronbach, 1951)
was computed to assess internal consistency. Fifty-eight
(73%) of the 79 subjects completed the questionnaire
a second time 2-6 weeks after initial testing. The
scores of these subjects provided a basis for estimating
retest reliability.


Results 


Scores for both IDQ scales were distributed 
in Gaussian manner. Cronbach’s alpha for the 
Imagery scale of Paivio’s original question- 
naire was .80 for the first group and .81 for the 
second group of subjects. The same statistic 
for the 72-item version was .87 for the group of 
79 subjects. The Verbal scale yielded similar 
values of alpha: .83 and .86 for the original 
questionnaire and .88 for the 72-item version. 
Retest reliability coefficients for the 72-item 
IDQ were .84 for the Imagery scale and .88 for 
the Verbal scale. 
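
For readers unfamiliar with the statistic, Cronbach's alpha can be computed directly from a subjects-by-items table of responses. The sketch below is a minimal illustration using the standard formula; the function name and the small response matrix are hypothetical and are not data from the study.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of subjects, each a list of item scores.

    alpha = k / (k - 1) * (1 - sum of item variances / variance of totals)
    """
    k = len(rows[0])  # number of items
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    total_var = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical five-point Likert responses: 5 subjects x 3 items.
responses = [
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 4],
    [1, 2, 1],
    [3, 3, 3],
]

alpha = cronbach_alpha(responses)
print(f"alpha = {alpha:.2f}")
```

Because the made-up items rise and fall together across subjects, alpha is high here; items that vary independently of one another would pull it toward zero.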


Study 2 
Method 


This study addressed two issues: (a) the degree of 
association among the Paivio IDQ, Sheehan’s version 
of the Betts Questionnaire upon Mental Imagery, and 
the Gordon scale of (visual) imagery control and (b) 
the correlation between each of these instruments and 
the Marlowe-Crowne Social Desirability Scale. A 
sample of 123 undergraduates (68 females, 55 males) 
completed the 72-item IDQ, the Visual and Auditory 
scales of the Betts questionnaire, and the Gordon scale.
All but 10 of these subjects also completed the Marlowe-
Crowne Social Desirability Scale. Correlations were
computed.


Results 


Table 1 shows the correlations among the 
five imagery scales and between each of these
scales and the Marlowe-Crowne scale. Sex of 
subject is also included in the intercorrelation 
matrix; negative correlations with sex reflect 
higher scores for females. The polarity of cor- 
Table 1
Correlations Among Imagery Questionnaires and the Marlowe-Crowne Social Desirability Scale

Variables: 1. Sex; 2. Gordon scale; 3. Betts Visual scale; 4. Betts Auditory scale;
5. Paivio Imagery scale; 6. Paivio Verbal scale; 7. Marlowe-Crowne scale.

[Correlation matrix entries are not recoverable from the scan.]

Note. n = 123 for all variables except the Marlowe-Crowne scale, for which n = 113.
* p < .05.
** p < .01.


relations involving the Betts questionnaire has
been reversed to compensate for the negative
scaling of that instrument. Positive associations
[...] imagery reported on the Betts
scales and other variables [...] positive
correlations.

The strongest association indicated in Table
1 is between the Paivio IDQ Imagery scale and
the Betts Visual scale (r = .49). Correlations
between the Gordon scale and the other two
visual imagery scales did not exceed .21. There
was a moderately strong association between
the Visual and Auditory scales of the Betts
questionnaire (r = .40) but virtually no association
between the Visual and Verbal scales
[...] IDQ. The Betts Auditory scale
and the IDQ Verbal scale were not related.

[...] showed a correlation greater
[...] the Marlowe-Crowne Social Desirability
Scale. The variable that did correlate
[...] with social desirability was sex;
females tended to score higher than males. Correlations
between the Marlowe-Crowne scale
and the [...] imagery scales did not exceed .11
in magnitude, and their average was exactly
[...]. Correlations of -.29 and -.33 between sex
and the Betts Visual scale and between sex and
the Paivio IDQ Imagery scale, respectively,
[...] that females report “better” imagery
than males. The means were 10.62 for females
and 12.87 for males on the Betts scale,
t(121) = [...], p < .01, and 134.99 for females
and [...] for males on the IDQ, t(121)
= 3.71, p < .01. Separate intercorrelation matrices
were computed for females and males,
but the only significant difference involved the
relationship between the Gordon scale and the
Marlowe-Crowne scale. The correlation for females
was -.28 versus .11 for males (z = 2.05,
p < .05).


Study 3 


Method 


The 43 females and 36 males who completed the 
72-item Paivio IDQ in Study 1 were given six other 
questionnaires and tests. Among these were the Gordon 
scale and the Visual and Auditory scales of the Betts 
questionnaire. An objective test of spatial ability, the 
Minnesota Paper Form Board Test, was included in the 
battery. This test, which has been used by Ernest and 
Paivio (1969, 1971) as part of an “imagery battery,” 
represents a possible exception to the usual finding that 
objective tests are unrelated to imagery questionnaires 
(cf. Paivio, 1971, p. 496). The Quick Word Test (Bor- 
gatta & Corsini, 1960) was selected as a vocabulary test. 
Because of time constraints, only the first 35 items were 
used. The final two measures were developed specifi- 
cally for use in the present study. One of these, called 
the Visual Memory Scale, requested subjects to imagine 
six common objects in the local area and then “read 
out” specific information from their images. For ex- 
ample, subjects were asked to visualize the university 
tower and count the columns of windows on one face. 
The other measure, called the Visual Manipulation 
Scale, required subjects to mentally manipulate various 
objects. A representative item from this scale, taken 
from Griffitts (1927), instructs subjects to picture a
3-inch (7.62 cm) cube, painted red, that is sawed into
1-inch (2.54 cm) cubes. Subjects were asked to determine
the number of little cubes having paint on three
faces.
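
The cube item has a definite answer that can be checked by enumeration. The sketch below counts, for an n x n x n painted cube cut into unit cubes, how many small cubes carry paint on each possible number of faces; the function name is an illustrative assumption, and the counting rule is valid for n of 2 or more.

```python
from itertools import product


def painted_face_counts(n):
    """Map number of painted faces -> number of unit cubes, for n >= 2.

    A unit cube at grid position (x, y, z) has one painted face for each
    coordinate that lies on the outer surface (index 0 or n - 1).
    """
    counts = {}
    for position in product(range(n), repeat=3):
        painted = sum(c in (0, n - 1) for c in position)
        counts[painted] = counts.get(painted, 0) + 1
    return counts


counts = painted_face_counts(3)
print(counts[3])  # cubes with paint on three faces (the corners)
```

For the 3-inch cube the corner cubes are the only ones with three painted faces, so the answer to the item is 8; the remaining 19 small cubes have two, one, or no painted faces.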

All together, 10 variables were represented in the intercorrelation
matrix. The three imagery questionnaires
contributed a total of five scales; there were four objective
tests, and subject’s sex was the 10th variable. The
intercorrelation matrix was factor analyzed, and the factors
extracted were rotated using a varimax procedure.
Table 2
Correlations Among Subjective and Objective Measures

Variable                          2      3      4      5      6      7      8      9      10
1. Sex                           -.10   -.36** -.02   -.19   -.16    .21    .10    [?]   -.02
2. Gordon scale                          .47    .37[?] .56**  .20   -.09    .06   -.05   -.06
3. Betts Visual scale                           .43[?] .46**  .05   -.16    .01    [?]    [?]
4. Betts Auditory scale                                .24[?] -.04   [?]    [?]    [?]    [?]
5. Paivio Imagery scale                                       .09    [?]    [?]    [?]    [?]
6. Paivio Verbal scale                                               [?]    [?]    [?]    [?]
7. Visual Memory Scale                                                      [?]    [?]    [?]
8. Visual Manipulation Scale                                                       [?]    [?]
9. Quick Word Test                                                                        [?]
10. Minnesota Paper Form Board

Note. n = 79. Entries marked [?] are illegible in the scan.
* p < .05.
** p < .01.
Results

The intercorrelation matrix is presented in
Table 2. Again, negative correlations with sex
reflect higher scores for females, and positive
correlations with the Betts scales reflect positive
associations.

Relationships among the five imagery scales
were similar to those found in Study 2, except
that correlations between the Gordon scale and
the two other visual scales were markedly
stronger than those found earlier. The correlation
between the Gordon scale and the Betts
Visual scale (r = .47) was significantly greater
than the .16 found in Study 2 (z = 2.38, p < .05), and
the correlation between the Gordon scale and
the Paivio IDQ Imagery scale (r = .56) was significantly
greater than the .21 found in Study 2 (z = 2.86,
p < .01). The IDQ Imagery scale and the
Betts Visual scale again were correlated at a
moderate level (r = .46).

Females again tended to report “better”
imagery than males on the IDQ Imagery scale
and the Betts Visual scale, but only the Betts
scale yielded a significant sex difference, t(77)
= 3.30, p < .01. The mean for females was
10.81 versus 14.28 for males. When separate
intercorrelation matrices were calculated for
females and males, only two significant differences
were found, and both involved the Gordon
scale. The correlation between the Gordon
scale and the Betts Visual scale was significantly
greater for males (r = .66) than for females
(r = .08, z = 2.99, p < .01). Similarly,
the correlation between the Gordon scale and
the Betts Auditory scale was greater for males
(r = .60) than for females (r = .00; z = 2.92,
p < .01).
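
The text does not name the test behind the z values used to compare correlations across studies and between sexes. One common choice, assumed here, is Fisher's r-to-z transformation for two independent samples; with the reported sample sizes (79 in Study 3, 123 in Study 2) it gives values very close to those reported. The function name is illustrative.

```python
import math


def compare_independent_correlations(r1, n1, r2, n2):
    """z statistic for the difference between two independent correlations.

    Each r is converted with Fisher's transformation (atanh); the standard
    error of the difference is sqrt(1 / (n1 - 3) + 1 / (n2 - 3)).
    """
    z1, z2 = math.atanh(r1), math.atanh(r2)
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    return (z1 - z2) / se


# Gordon vs. Betts Visual: r = .47 (Study 3, n = 79) vs. r = .16 (Study 2, n = 123).
z_betts = compare_independent_correlations(0.47, 79, 0.16, 123)

# Gordon vs. Paivio Imagery: r = .56 (Study 3) vs. r = .21 (Study 2).
z_paivio = compare_independent_correlations(0.56, 79, 0.21, 123)

print(f"z = {z_betts:.2f}, z = {z_paivio:.2f}")
```

A z beyond about 1.96 in absolute value corresponds to p < .05 (two-tailed), and beyond about 2.58 to p < .01, which matches the significance levels attached to the two comparisons.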

There was no appreciable association be- 
tween subjective and objective measures of 
imagery. The Paivio IDQ Verbal scale, how- 
ever, did show a modest correlation with the 
Quick Word Test (r = .30). 

Four factors with eigenvalues greater than 
1.0 emerged from factor analysis. Loadings for 
each of these factors are shown in Table 3. 
Factor 1 loaded on the three visual imagery 
scales and the Auditory scale of the Betts ques- 
tionnaire; Factor 2 loaded on the Minnesota 
Paper Form Board and the Visual Manipula- 
tion Scale; Factor 3 loaded on the subject’s sex 
and the Visual Memory Scale; and Factor 4
loaded on the Paivio IDQ Verbal scale and the
Quick Word Test. 
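
The retention rule stated above (eigenvalues greater than 1.0) and the share of variance each factor represents can be illustrated with the four eigenvalues reported in Table 3. The sketch assumes the analysis was run on a correlation matrix with unit diagonals, so that total variance equals the number of variables (10); that assumption, and the reading of the fourth eigenvalue as 1.16, are not confirmed by the text.

```python
# Eigenvalues of the four retained factors, as read from Table 3.
eigenvalues = [2.49, 1.69, 1.17, 1.16]
n_variables = 10  # five imagery scales, four objective tests, and sex

# Retention rule: keep factors whose eigenvalue exceeds 1.0.
retained = [ev for ev in eigenvalues if ev > 1.0]

# Under the unit-diagonal assumption, each eigenvalue divided by the
# number of variables is the proportion of total variance for that factor.
proportions = [ev / n_variables for ev in retained]
total_proportion = sum(proportions)

for i, p in enumerate(proportions, start=1):
    print(f"Factor {i}: {p:.1%} of total variance")
print(f"All retained factors: {total_proportion:.1%}")
```

On this reading the four factors together represent roughly two thirds of the variance in the 10 variables.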


Study 4 
Method 


This study further explored the pattern of correlations
involving the Paivio IDQ. Eighty-one males completed
the IDQ and three other tests. Space Relations,
from the Differential Aptitude Tests (Bennett, Seashore,
& Wesman, 1966), was used as an objective test
of spatial ability. Like the paper form board, Space Relations
is a component of the “imagery battery” of
Ernest and Paivio (1969, 1971). The Quick Word Test
again served as a vocabulary test. This time all 100
items were administered.
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Table 3 
Factor Loadings for Four Varimax Factors 
i Factor 
Variable I II Ill IV he 
Betts Visual scale 75 01 
` PEY —.3 
pi Gordon scale .82 .02 - i ae e 
Paivio Imagery scale .15 .04 aA ‘05 ue 
Betts Auditory scale 68 —.15 15 -22 > 
Minnesota Paper Form Board —.07 .19 —.09 -27 A 
Visual Manipulation Scale ‘04 16 ‘08 25 ‘65 
Sex of subject —:14 105 79 = 20 69 
Visual Memory Scale —.05 —.01 70 a7 ) 
Paivio Verbal Scale .09 —.05 —.09 “88 78 
Quick Word Test sahi ‘50 29 ‘56 67 
Eigenvalue 2.49 1.69 1.17 1.16

The fourth test in the present study was the Study
of Values (Allport, Vernon, & Lindzey, 1960). The im-
agery literature contains some evidence that character-
istics of an individual’s imagery are related to variables
such as interests and choice of vocation (Chowdhury &
Vernon, 1964; Roe, 1951). On this basis, one might ex-
pect to find a relationship between the Paivio IDQ
Imagery scale and the Study of Values, which is a 45-
item ipsative scale yielding a score for each of six cate-
gories of interests and values. The scale can be dichoto-
mized into two groups of values: extraceptive (theoreti-
cal, economic, and political) versus intraceptive
(aesthetic, social, and religious) (Dunn, Bliss, & Siipola,
1958). It was predicted on the basis of the Dunn et al.
findings that the latter categories would be associated
with imagery (see Paivio, 1971). Thus, ranks
for these categories were computed and summed,
so that subjects who gave high ratings to these three
values were assigned a high numerical score.




Results 


An intercorrelation matrix is displayed in
Table 4. There was no significant association
between the Paivio IDQ Imagery scale and
Space Relations (r = .05), but there was a cor-
relation of .35 between the IDQ Imagery
scale and the rank-order index derived from the
Study of Values. As predicted, subjects who
reported more frequent use of imagery tended
to have high aesthetic, social, and religious
values. Other correlations involving the IDQ
Imagery scale were trivial.
Two other coefficients in the matrix are note-
worthy. A correlation of .41 between the IDQ
Verbal scale and the Quick Word Test repli-
cates the finding of Study 3. Also, the Quick
Word Test and Space Relations were associated
(Table 4).


Discussion


The formal psychometric properties of Pai- 
vio’s IDQ that were evaluated were found to be 
satisfactory. Even though reverse scaling of 
several items should have counteracted re- 
sponse sets that might inflate a reliability esti- 
mate, retest reliability appears to be at least 
comparable to that of the shortened Betts 
questionnaire (Sheehan, 1967a) and Marks’s 
(1973) Vividness of Visual Imagery Question- 
naire. 

The instability of correlations involving the 
Gordon scale requires explanation, especially 
since McLemore (1976) recently reported very
similar findings. Also, the only differences be- 
tween the intercorrelation matrices for females 
and for males involved the Gordon scale. An 
estimate of retest reliability is not available; 
but split-half reliability of the Gordon scale is 
reasonably good (r = .77 for Study 2 and .84
for Study 3). A more probable source of diffi-
culty is the marked skewness of the frequency
distribution. A concentration of scores at the 
high end of the scale allows a relatively small 
number of low scores to exert disproportionate 
influence on correlations with other measures. 
Consequently, correlations vary between sam- 
ples. Since the modal score is the maximal score 
(12 yes responses), transformations are not of 
much help. Moreover, the departure from 
normality will affect the bivariate distribution 
on which the product-moment correlation is 
based, and it may bias the z statistic used to as-
sess the significance of differences between cor-
relations. Although the Gordon scale may be
of some use in selecting cases for experimental
or clinical purposes, it should either be modi-
fied (cf. Lane, 1975) or avoided in correlational
and factor-analytic studies.

Table 4

Correlations among Space Relations, Study
of Values, Quick Word Test, and the Paivio
Individual Differences Questionnaire


Variable 2 3 4 5
1. Space Relations O27 e298 0S 00
2. Study of Values 08 35* 09
3. Quick Word Test -.04 41*
4. Paivio Imagery scale Al
5. Paivio Verbal scale


Note. n = 81 males.
* p < .01.
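The skewness argument made for the Gordon scale can be illustrated with a small constructed example. The data below are entirely hypothetical and were chosen only to show the mechanism: most scores sit at or near the maximum of 12, and two low scorers determine almost all of the observed correlation with a second measure.

```python
import numpy as np

# Hypothetical scores (assumption): 20 respondents near the ceiling of 12
# whose criterion scores are unrelated to the scale, plus 2 low scorers.
x_high = np.array([12] * 12 + [11] * 5 + [10] * 3, dtype=float)
y_high = np.array([4, 5, 6] * 4 + [5, 4, 6, 5, 5] + [5, 4, 6], dtype=float)
x_low = np.array([2.0, 3.0])
y_low = np.array([0.0, 1.0])


def pearson_r(x, y):
    """Product-moment correlation between two equal-length arrays."""
    return float(np.corrcoef(x, y)[0, 1])


# Correlation among the 20 high scorers only
r_without_low = pearson_r(x_high, y_high)
# Correlation after adding the two low scorers
r_with_low = pearson_r(np.concatenate([x_high, x_low]),
                       np.concatenate([y_high, y_low]))
print(round(r_with_low, 2), round(r_without_low, 2))
```

In this constructed sample the correlation is essentially zero among the 20 high scorers and becomes large once the two low scorers are included, so a sample that happens to contain no such cases would yield a very different coefficient.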

A milder degree of skewness in the distribu- 
tion of scores on the Betts scales does not seem 
to affect this test’s stability. On the contrary, 
correlations between the Betts scales and other 
variables (except the Gordon scale) proved to 
be remarkably stable from Study 2 to Study 3. 
Although the probable contribution of method 
variance cannot be overlooked (Mischel, 1968), 
the correlation between the Betts Visual scale 
and the IDQ Imagery scale suggests the exis- 
tence of some subjective dimension that can be 
tapped using instruments that differ in format 
and content. 

However, Di Vesta et al. (1971) concluded 
that subjective measures of imagery lack con- 
struct validity. There was no substantial cor- 
relation between imagery questionnaires (i.e., 
Betts’s and Gordon’s) and spatial tests, and the 
factor that loaded on the imagery question- 
naires also loaded on the Marlowe-Crowne 
scale. Present data failed to support the Di 
Vesta et al. finding regarding social desirabil- 
ity, but perhaps the disparate evidence can be 
reconciled. First, Di Vesta et al. may have 
created a “method factor” by adding imagery
questionnaires and the Marlowe-Crowne scale
to a battery of objective tests. Second, even if
the Betts Visual and Auditory scales are not
contaminated by social desirability, some of the
other Betts scales might be (cf. McLemore,
1976). Third, the general instability of corre-
lations involving the Gordon scale could ac-
count for apparent differences in the influence
of social desirability.

The present data support the Di Vesta et al.
(1971) conclusion that imagery questionnaires
and spatial tests are not interchangeable.
Studies 3 and 4 demonstrate the kind of dissoci-
ation between self-reports and performance
that led Neisser (1970) to argue that the ac-
curacy or usefulness of imagery is unrelated to
the subjective experience of imagery. It is es-
pecially interesting that the Betts Visual scale
and the Visual Memory Scale were unrelated.
Even subjects who reported the clearest and
most vivid images were no more able than sub-
jects on the other extreme of the Betts distri-
bution to read out accurate information about
visual aspects of familiar objects (cf. Wood-
worth, 1921, p. 372).

If imagery questionnaires are not contam-
inated by response sets, and nevertheless are
unrelated to visuospatial performance, perhaps
it is because imagery is not a unitary construct.
That which imagery questionnaires measure
may be different from, but no less interesting
than, that which visuospatial tests measure.
However, questionnaire validity still needs to
be demonstrated. Two possibilities present
themselves as potential criteria for validating
imagery questionnaires. The first is perform-
ance on certain verbal memory tasks. Paivio
(1971) has suggested that “an acquired dis-
position of the individual to react to words
(especially concrete words) with nonverbal
images” (p. 509) can be measured by question-
naires. Consistent with Paivio’s conception, the
72-item IDQ proved to be useful in predicting
recall of high-imagery adjectives from prose
(Hiscock, 1976). Study 4 suggests a second
possible correlate of self-reported imagery.
The obtained correlation of .35 between the
IDQ Imagery scale and the Study of Values is
not strong, but the relationship is consistent
with previous findings (Chowdhury & Vernon,
1964; Dunn et al., 1958; Roe, 1951), and it
demonstrates the need for more attention to
values, interests, and choice of occupation as
potential correlates of subjective imagery.

Two additional implications of the present
research merit consideration. The first con-
cerns the orthogonality of the Imagery and
Verbal scales of the Paivio IDQ. Apparently,
being a “verbalizer” is not simply the reciprocal
of being a “visualizer.” For some purposes, es-
pecially studies of the relationship between
imagery and language, the IDQ may prove 
most useful when it is used to select subjects 
on the basis of high scores on one scale and low 
scores on the other. The second observation 
pertains to sex differences. Females tend to 
score higher than males on the Betts Visual 
scale and the Paivio IDQ Imagery scale, and, 
in at least two imagery studies (Ernest & Pai-
vio, 1971; Marks, 1973), females have outper-
formed males. Ernest and Paivio (1971) pro-
posed that “in some tasks, females ‘use’ imagi-
nal processes to facilitate recall whereas males
do not” (p. 71). 

No imagery questionnaire is likely to be the 
best choice for all applications. Paivio’s IDQ 
promises to be a useful instrument for investi- 
gating habitual styles of information processing 
and, in particular, for investigating the manner 
in which people internally represent words. 
Consequently, it should be useful for studying 
the relationship of imagery to attitudes, inter- 
ests, career choices, and so on. The IDQ thus 
offers a means of validating the concept of vis- 
ual imagery as a cognitive style. The therapist, 
however, may prefer to have a “state” mea- 
sure rather than a “trait” measure of imagery 
(McLemore, 1976). In this case, and also when 
nonvisual (e.g., tactile) imagery is of interest, 
a rating scale may be the only appropriate de- 
vice. 
highly differentiated classification within the three superordinate types.


In recent years there has been considerable
interest in the development of actuarial sys-
tems for test interpretation in clinical settings.
The Minnesota Multiphasic Personality In-
ventory (MMPI) systems of Gilberstadt and
Duker (1965) and of Marks, Seeman, and
Haller (1974) are two well-known examples.
Potential advantages of such systems include
economy of description, use of previous in-
formation and experience accumulated for
specific code types, and the facility for auto-
mating test scoring and interpretation to save
a clinician valuable time. Thus, given the
MMPI profile for a patient, the clinician can
determine if this profile matches a specific code
type. The implicit assumption is that informa-
tion garnered for this code type applies, more
or less, to this patient.

The Gilberstadt and Duker (1965) system
has 19 prototypes that were identified using
the “classic case” approach. Marks et al. (1974)
developed 16 adult code types by examining 
profile configurations in a sample of psychiatric 
patients. Although each system was developed 
using a different rationale, one immediate ques- 
tion is the degree of overlap between the two 
approaches. A second issue concerns the actual 
steps used in classifying a profile. For example, 
would it be profitable to examine the indepen- 
dent contribution of profile elevation, scatter, 
and shape (Cronbach & Gleser, 1953; Skinner, 
1977) as a means of improving classification 
hit rates? Payne and Wiggins (1968) found that
either system alone classifies only approximately
28% of a psychiatric hospital sample, whereas
joint application resulted in an overall hit rate
of 49%. This percentage apparently varies
as a function of the educational level of the
patient population.

This article investigates empirically the de-
gree to which MMPI profiles can be concep-
tualized in terms of three superordinate MMPI
types (neurotic, psychotic, and sociopathic).
A second aim is to evaluate the degree to which
the Gilberstadt and Duker (1965) and the
Marks et al. (1974) code type systems can be
integrated in terms of such superordinate types.
These three prototypes can be conceptualized
as “ideal type constructs” (Hempel, 1965,
chap. 7) that form a theoretical model of psy-
chopathology, as assessed by the MMPI. In
addition, this theoretical model offers certain
practical advantages for classification research 
in clinical settings. The end result for classified 
patients is a set of elevation, scatter, and shape 
parameters that can be computed readily with 
a small calculator. Furthermore, graphic plots 
of the classification data can be constructed to 
facilitate interpretation. 


Method 
Subjects 


The 35 MMPI code type profiles, 19 from Gilber- 
stadt and Duker (1965) and 16 (adult) from Marks et 
al. (1974), were keypunched for computer processing. 
Then, relationships among these 35 code types were 
examined using a multivariate classification strategy. 
Thus, the 35 MMPI code types formed the derivation 
sample for a new MMPI taxonomy. 

To provide an estimate of the generalizability of this 
new MMPI taxonomy, data taken from Lanyon (1968) 
were classified. Lanyon has compiled mean MMPI pro- 
files (T scores) for 233 diverse psychiatric and normal 
groups. 


Procedure 


Conceptually, one begins by hypothesizing that a set
of ideal personality types (Hempel, 1965, chap. 7) un-
derlies the two MMPI codebook systems. The basic con-
cept is a modal profile, which can be defined as a hy-
pothetical MMPI profile pattern that is characteristic
of a subset of patients in a psychiatric population. Our
approach is based on the assumption that patients can
be differentiated into relatively homogeneous classes by
grouping those individuals who substantially resemble
the same MMPI modal profile.

Figure 1. Neurotic
Inventory profile.

A least squares estimate of the modal profiles is de- 
rived through a generalized principal-components 
model. Jackson and Williams (1975) and Skinner (1977, 
in press; Skinner, Note 1) have provided discussions of 
this classification procedure from somewhat different 
perspectives. Briefly, consider a data matrix $X_i$ giving
the scores of N individuals from sample i on the 13
MMPI clinical scales. The classification model can be
depicted as


$$X_i = f[M_i, S_i, A_i P' + E_i]. \tag{1}$$


That is, relationships among MMPI profiles (row vec-
tors) of sample i can be expressed as a function of (a) a
column vector of profile elevation parameters $M_i$, (b) a
column vector of scatter parameters $S_i$, and (c) a matrix
of shape parameters $A_i$ describing the extent to which
each individual's MMPI profile resembles the shape of
the MMPI modal profiles. Note that shape describes the
actual pattern of “ups and downs” across the 13 MMPI
clinical scales and elevation depicts the degree to which
the MMPI profile as a whole has high (or low) T scores,
whereas scatter represents how dispersed the 13 scale
scores are about their average (Cronbach & Gleser,
1953). Finally, $E_i$ is a residual matrix that provides a
measure of how well the modal profiles “fit” the parti-
cular sample $X_i$.

The present study focuses first on estimating the un-
derlying “ideal types” or modal profiles defined by the
matrix $P$. The 35 Code Types × 13 MMPI Clinical
Scales data matrix $X$ was decomposed according to the
Eckart and Young (1936) theorem (i.e., $X = F A P'$).
A least squares estimate of the modal profiles is provided
by the right-hand eigenvectors $P$. Then, given the
modal profiles $P$, the second aim of this study is to
determine the degree to which the modal profiles are
representative of the 233 Lanyon (1968) MMPI groups.
Basically, one is fitting the model described in Equation
1 to the Lanyon data, in which each group MMPI profile
is considered as a single entity.

In a previous study using this classification model,
eight modal profiles based on a structured inventory of
psychopathology were identified and replicated across
three samples of alcoholic patients (Skinner, Jackson,
& Hoffmann, 1974). The generalizability of these modal
profiles to diverse psychiatric and normal populations
was explored by Skinner, Reed, and Jackson (1976).
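
The decomposition step described in this section can be sketched with a singular value decomposition, which is the computational form of the Eckart and Young theorem. This is a minimal illustration under stated assumptions, not the authors' program: the function name is hypothetical, each row is standardized first so that only profile shape enters the analysis, no rotation is applied, and the input data are random placeholders rather than the 35 code types.

```python
import numpy as np


def modal_profiles(X, k=3):
    """Estimate k modal profiles from a (profiles x scales) matrix.

    Returns P with one column per modal profile (each rescaled to mean 0,
    SD 1) and the proportion of shape variance the k profiles reproduce.
    """
    X = np.asarray(X, dtype=float)
    # Remove each profile's elevation (row mean) and scatter (row SD)
    Z = (X - X.mean(axis=1, keepdims=True)) / X.std(axis=1, keepdims=True)
    # Z = U diag(s) Vt; the rows of Vt are the right-hand singular vectors
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    P = Vt[:k].T
    # Rescale each modal profile to mean 0 and standard deviation 1
    P = (P - P.mean(axis=0)) / P.std(axis=0)
    explained = float((s[:k] ** 2).sum() / (s ** 2).sum())
    return P, explained


# Hypothetical data (assumption): 35 random profiles on 13 scales
rng = np.random.default_rng(1)
X = rng.normal(loc=60.0, scale=10.0, size=(35, 13))
P, explained = modal_profiles(X)
print(P.shape, round(explained, 3))
```

The sign of each singular vector is arbitrary, so a column of `P` may need to be reflected before it is interpreted as a modal profile.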


Results 


Essentially, the analysis considered similarity 
in profile shape among the 35 MMPI code 
types to identify a set of most representative 
MMPI profiles. Three modal profiles were de-
rived (Table 1) and are plotted in Figures 1-3.


[Figure 3: Minnesota Multiphasic Personality Inventory profile plot. Legend: sociopathic modal profile; male narcotic addicts. Scale axis includes Pa, Pt, Sc, Ma, Si.]


Table 1

Minnesota Multiphasic Personality
Inventory Diagnostic Prototype
Modal Profiles


              Modal profile

Scale       1         2         3

L         -.670    -1.509     -.820
F        -1.060      .596     -.171
K         -.256    -1.832      .093
Hs        1.716     -.580      .053
D         1.612      .655     -.563
Hy        1.665     -.561      .747
Pd         .059      .493     2.043
Mf        -.561     -.683     -.467
Pa        -.732      .705     -.068
Pt         .396     1.153     -.757
Sc        -.311     1.832     -.305
Ma       -1.255     -.125     1.886
Si        -.603     -.145    -1.671


Note. Each modal profile has a mean = .0 and
standard deviation = 1.0. Table 1 corresponds to
Matrix P in Equation 1.


In Table 1 the data have a mean of 0 and vari-
ance of 1, whereas each modal profile is scaled
in the figures to have a mean of 50 and a stan-
dard deviation of 10. Note that the modal profiles in
Table 1 correspond to the matrix $P$ of the gen-
eral model depicted in Equation 1. Table 2 pre-
sents the various code types with their eleva-
tion, scatter, and three shape parameters. The
first modal profile, with high points on Hs, D,
and Hy, classifies the neurotic code types from
each system. For example, the Gilberstadt and
Duker (1965) codes 1-3-2 and 1-2-3-7, and the
Marks et al. (1974) code 2-3-1 are the purest
exemplars of this neurotic ideal type. The sec-
ond modal profile has high points on Sc and
Pt. The best code type exemplar for this psy-
chotic ideal type is the code 8-6 from each sys-
tem. Finally, the third modal profile, labeled
sociopathic, is characterized by Pd and Ma. Not 
surprisingly, the code 4-9 from each system has 
the highest weight. 

The weightings in Table 2 describe the cor-
relation of each codebook type with the three
modal profiles of Table 1. Thus, these weight-
ings emphasize conformance in profile shape,
that is, the actual pattern of ups and downs
across MMPI scales (Matrix $A_i$ in Equation 1).
A second important aspect of profile resem-
blance is elevation, which is defined as the pa-
tient’s average across the 13 MMPI scales
(Vector $M_i$ in Equation 1). Note that it is pos-
sible for two individuals to have a profile shape
quite similar to a modal profile depicted in
Figure 1, even though the two individual pro-
files may be widely different in elevation. For
example, a psychiatric patient and a normal
subject may show a high degree of similarity to
the neurotic modal profile, indicating respec-
tive high points on Hs, D, and Hy. However,
these two individuals may be differentiated at
a second stage when considering elevation. One
would expect the psychiatric patient to show
more severe psychopathology and thus to have
high points close to or exceeding a T score of 70.

Also depicted in Figures 1, 2, and 3 is the
mean MMPI profile for a reference group taken
from Lanyon (1968). These reference group
profiles provide “real world” exemplars of the
three ideal types. The somatization reaction
profile plotted in Figure 1 represents a group
of 39 white veterans from the psychiatry ser-
vice of the Minnesota Veterans Administration
Hospital (Rosen, 1958). Only “pure” cases
were selected for this diagnostic category. The
MMPI profile for this group has a marked simi-
larity in shape (r = .93) to the neurotic modal
profile, although the somatization reaction pro-
file is higher in elevation. Similarly, Figure 2
presents the MMPI profile of 100 white male
veterans from the psychiatry service of the
Minnesota Veterans Administration Hospital
who had been diagnosed as paranoid schizo-
phrenics (Rosen, 1958). This profile substan-
tially resembles the shape (r = .92) of the psy-
chotic modal profile. However, the paranoid
schizophrenics’ profile is higher in elevation.
Finally, Figure 3 presents a male narcotic ad-
dicts’ profile based on adult addicts treated at
the Patton (California) State Hospital (Olson,
1964). This reference group profile closely re-
sembles the shape (r = .82) of the sociopathic
modal profile and is slightly higher in profile
elevation.

Classification results for selected group pro-
files from Lanyon’s (1968) compendium are
given in Table 3.¹ The three modal profiles ac-
counted for 68.79% of the variance due to pro-
file shape among the 233 Lanyon (1968) groups.
Furthermore, using an entrance criterion of a
highest loading above |.50|, 84.55% of the 233
groups can be classified as salient on a specific
modal profile (cf. Skinner et al., 1976). These
results provide encouraging support for the
generalizability of the three MMPI modal pro-
files to diverse deviant and normal sam-
ples. Note that classification data analogous to
those contained in Table 3 can be readily com-
puted (see computational summary below) for
any sample of patients with scores on the 13
MMPI clinical scales.

¹ Data for the complete set of 233 groups are available
from the authors.

Table 2

System Elevation Scatter Shape 1 Shape 2 Shape 3
Gilberstadt & Duker (1965)
Code 1-2-3
62.61 13.22
pei ea a eta
Code L32 626P ae pelean A
Code 1: 3-1 Bas wy oe -.16 07
eae 65.39 12.48 74 32 ‘07
Code 1-3-9 at P tf a aM
aa 58.31 8.71 54 00 66
Code 2-7-4 ga ee, a OG eee
one: 66.92 12.26 59 66 18
Code A 74.00 16.37 44 82 -.26
aaa 58.00 7.52 16 2 19
Code 4-9 60.31 8.97 61 09 63
Pee 55.54 9.04 -.16 06 95
ain 70.23 13.36 Al 83 -.07
See 81.15 19.83 67 AT -.03
RORYA 68.15 12.50 30 86 21
OFE 70.61 14.26 -.08 89 -.04
aie 63.31 11.28 -.34 2 42
e 53.92 9.79 -.54 21 68
Marks, Seeman, & Haller (1974)
Code 2-3-1
63.54 12.07 91 30 -.10
nore se i 66.15 12.53 78 ‘40 -.29
E 67.54 12.35 Al 13 -.03
SR 70.31 14.40 55 11 -.34
ncaa 72.39 13.65 43 87 -.10
Seer 63.23 10.12 85 aay 21
ea 67.31 13.25 86 40 07
ey 64.23 11.25 08 66 40
CERET 67.54 14.13 43 73 09
S 69.61 11.91 19 88 30
areas 61.08 8.37 -.04 34 90
mate 68.31 10.14 62 56 27
Ce 72.31 16.39 03 93 04
cate 65.61 11.76 Sk) 83 30
Nac -6 57.19 9.20 -.50 46 56
ormal 54.31 3.85 34 ain 60


Note. The elevation parameters $M_i$ are the mean of the 13 clinical scale T scores, scatter parameters $S_i$ are
the profile standard deviation, and shape parameters $A_i$ are the correlation between a code type and re-
spective modal profile of Table 1.


Given the classification results exemplified
by Table 3, graphic displays can be constructed
to facilitate interpretation of the data. That is,
the elevation ($M_i$) and shape ($A_i$) parameters
provide coordinates on orthogonal axes (Skin-
ner, 1977, in press; Skinner, Note 1). For ex-
ample, Figure 4 plots elevation by shape for
Table 3
Classification Results for Selected Groups from Lanyon


Lanyon (1968) group Elevation Scatter Shape 1 Shape 2 Shape 3

Neurotic modal profile 50.00 10.00 1.00 .00 00 
Somatization reaction (15b) 60.92 8.19 93 17 10 
Somatization reaction (14a) 58.51 6.79 +92 02 Lh 
Multiple sclerosis (39) 62.05 9.54 95 09 04 
Neurotic outpatients (13) 61.74 9.45 87 36 ll 
Neurotic inpatients (13) 59.43 7.75 .88 40 05 
Psychotic modal profile 50.00 10.00 00 1.00 .00 
Paranoid schizophrenics (2) 69.08 10.46 38 92 02 
Schizophrenics 15-29 (6a) 62.59 7.10 14 92 13 
Schizophrenics 30-39 (6a) 58.36 3.51 —.10 85 .29 

Acute schizophrenics (1) 64.15 6.26 wad .86 19 
Chronic schizophrenics (1) 65.39 8.36 —.08 78 06 
Sociopathic modal profile 50.00 10.00 .00 .00 1.00 
Male narcotic addicts (21) 56.82 6.80 —.11 03 82 

Female narcotic addicts (21) 56.05 6.64 =/21 .53 ay) 
Psychopathic personality (23) 58.20 6.38 00 56 2 
Prisoners (65a) 59.15 6.26 22 AT 16 

Model prisoners (65b) 59.15 5.39 49 46 61 


Note. The number in parentheses after the group name refers to the figure number in Lanyon’s (1968) book. 


the psychotic modal profile. The horizontal 
axis represents the degree to which each code 
type from Table 2 (or any new MMPI profile) 
matches the shape of Modal Profile II. The 
vertical axis depicts the degree of profile eleva-
tion. Along with selected code types, two schizo-
phrenic group profiles (Lanyon, 1968) are
plotted in Figure 4. Although the two profiles
substantially resemble the shape of Modal
Profile II, the schizophrenic 15- to 29-year-old
group was lower in profile elevation than the
paranoid schizophrenic group. Figure 5, on the
other hand, displays the degree to which se-
lected code type and Lanyon (1968) group pro-
files resemble the shape of Modal Profile I
(neurotic) versus Modal Profile II (psychotic).
Both figures provide a convenient summary of
the classification analysis.


Figure 4. Plot of shape versus elevation for the psychotic modal profile. (G denotes a Gilberstadt and
Duker, 1965, code type, and M designates a Marks, Seeman, and Haller, 1974, code type.)


Discussion 


This new method of identifying an MMPI
taxonomy can be readily used in applied set-
tings. That is, given a patient’s MMPI profile
in T-score format, the computational steps for
forming Equation 1 are quite straightforward:


1. Compute the patient’s average over the
13 MMPI clinical scales T scores to yield the
elevation parameter ($M_i$).

2. Compute the standard deviation of the
13 MMPI scales about this patient’s average
to give the scatter parameter ($S_i$).

3. Compute the correlation of the patient’s
profile with each of the three modal profiles
(Table 1) to yield the three shape parameters
($A_i$).
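
The three computational steps can be sketched in a few lines. This is an illustrative sketch under stated assumptions rather than the authors' procedure: the function names are hypothetical, the scatter is computed as a standard deviation with divisor n, and the modal profiles and the patient profile used in the usage lines are random placeholders, not the values of Table 1.

```python
import numpy as np


def profile_parameters(t_scores, modal_profiles):
    """Return (elevation, scatter, shape) for one profile of T scores.

    t_scores: the 13 scale scores of one patient.
    modal_profiles: array of shape (13, k), one column per modal profile.
    """
    t = np.asarray(t_scores, dtype=float)
    modal = np.asarray(modal_profiles, dtype=float)
    elevation = t.mean()   # Step 1: average over the scales
    scatter = t.std()      # Step 2: SD about that average (divisor n, an assumption)
    # Step 3: correlation of the profile with each modal profile
    shape = np.array([np.corrcoef(t, modal[:, j])[0, 1]
                      for j in range(modal.shape[1])])
    return elevation, scatter, shape


def salient_profile(shape, cutoff=0.50):
    """Index of the modal profile with the largest absolute weighting, or None."""
    j = int(np.argmax(np.abs(shape)))
    return j if abs(shape[j]) > cutoff else None


# Hypothetical modal profiles (assumption, NOT Table 1): standardized random columns
rng = np.random.default_rng(0)
modal = rng.normal(size=(13, 3))
modal = (modal - modal.mean(axis=0)) / modal.std(axis=0)

# Hypothetical patient whose profile has exactly the shape of the second column
patient = 60.0 + 8.0 * modal[:, 1]
elevation, scatter, shape = profile_parameters(patient, modal)
print(elevation, scatter, shape, salient_profile(shape))
```

The `cutoff` default mirrors the |.50| entrance criterion mentioned in the text; a profile whose largest absolute weighting falls below it is left unclassified.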


The investigator should scan the three shape
parameters to identify the highest weighting.
If this value exceeds a minimum standard (e.g.,
|.50|), then a figure for the appropriate modal
profile (cf. Figure 4) should be constructed.
The position of this patient can be located
using the elevation and shape coordinates. If,
for example, a patient’s location is most proxi-
mal to code 1-3-2 from Gilberstadt and Duker
(1965), then the clinician could interpret the
body of information regarding code 1-3-2 with
respect to this patient. Furthermore, a second-
ary (tertiary, etc.) classification is possible by
considering the location of this patient relative
to other code types in the ordination space.
Also, one can interpret the theoretical and
empirical data accumulated for the appropriate
superordinate modal profile.

A technical issue concerns the degree of reli-
able differentiation among code types proposed
by each system. Note that the three modal pro-
files of Table 1 explain (reproduce) 81.92% of
the covariation among the 35 code types. That
is, with three underlying “ideal type con-
structs,” one is able to capture most of the
common variance among the Gilberstadt and
Duker (1965) and Marks et al. (1974) systems.


INTEGRATION OF MMPI ACTUARIAL SYSTEMS 


Figure 5. Plot of the neurotic (I) versus the psychotic 
(II) modal profile. (G denotes a Gilberstadt and Duker, 
1965, code type, and M designates a Marks, Seeman, 
and Haller, 1974, code type.) 


Certainly, one could question whether two code 
types within one system that are close together 
in the ordination space (e.g., Marks et al.’s, 
1974, codes 2-8 and 8-6 of Figure 4) represent 
reliable differences. They should probably be 
collapsed into one code type. 

From a theoretical perspective, it is hypothe-
sized that the shape parameters yield indices
of the patient’s more enduring personality dis-
position (Lorr, 1966), whereas elevation and
scatter reflect more temporary or situational
factors influencing the degree of maladjust-
ment (Carlson, 1972; Morf & Krane, 1973;
Quertin, 1966). By examining the independent
contribution of elevation, scatter, and shape
components of profile similarity (Skinner, 1977,
in press), one can integrate various aspects of
the trait (core) and social-learning (situa-
tional) theory approaches to the study of per-
sonality (Bowers, 1973). For example, if pre-
treatment and posttreatment data are
available, one could fit the MMPI modal
profiles to each data set and then examine the
temporal stability of the elevation, scatter, and
shape parameters. One might hypothesize that
patients will be classified in the same modal
profile on each occasion when considering
shape. However, a decrease in the elevation
parameter could be predicted corresponding to
a reduction in symptom severity in response to
a successful treatment program.

Although relatively homogeneous subgroups 
or clusters can be identified in the ordination 
space (cf. Figures 4 and 5), the current ap- 
proach is essentially a dimensional model. The 
level of differentiation implicit to the three 
superordinate modal profiles is similar to Gold- 
berg’s (1972) hierarchical classification (i.e., 
normal vs. deviant; neurotic vs. psychotic vs. 
sociopathic). 

In conclusion, the three MMPI modal pro- 
files form a typological model of psychopath- 
ology that is open to empirical evaluation. 
Hopefully, this model will guide further theo- 
retical developments and empirical research in 
the study of psychopathology. Expanding the 
system to more than three modal profiles will 
require additional psychopathological groups 
using more differentiated and less highly cor- 
related personality scales than those comprising 
the MMPI clinical scales. Furthermore, by 
building on the actuarial systems of Gilber-
stadt and Duker (1965) and Marks et al.
(1974), the present model should provide a
convenient and useful diagnostic basis for
clinical decision making.


Reference Note 


1. Skinner, H. A. Modal profile analysis and classifica-
tion research. Paper presented at the annual meeting
of the Psychometric Society, Bell Laboratories,
Murray Hill, New Jersey, April 1-3, 1976.
for Social Isolation

Socially isolated college students, 28 men and 26 women, who volunteered for
a program to increase social comfort and activity in friendship interactions, were
randomly assigned to a treatment involving 12 real-life practice interactions
with other subjects, a treatment involving 12 practice interactions plus 9 hours
of social skills training, a minimal treatment control condition, or a delayed
treatment control condition. Outcome was evaluated by multiple criteria that
included self-report, self-monitoring, peer rating, and behavioral measures. Re-
sults indicated no significant differences between the two treatment groups or
between the two control groups. The two treatment groups showed substantial
and significant improvements in contrast to each control group on measures of
anxiety and social activity. These gains were maintained at follow-up
assessments 3 and 15 months posttreatment. It is argued that the practice inter-
action treatment may function as in vivo desensitization, thereby reducing social
anxiety and leading to increased social activity.


Recently, there has been a considerable
amount of research on the assessment and
treatment of minimal dating problems (Arko-
witz, 1977; Curran, 1977). Surveys have
indicated that social anxiety and minimal
dating are significant and pervasive concerns
for many individuals (e.g., Bryant & Trower,
1974). Several behavioral treatment pro-
cedures have been developed and evaluated
for these problems, such as social skills training
procedures involving behavior rehearsal, model-
ing, coaching, and feedback (e.g., Twentyman
& McFall, 1975); anxiety-reduction procedures
such as systematic desensitization (e.g., Curran
& Gilbert, 1975); and cognitive modification
procedures (e.g., Glass, Gottman, & Shmurak,
1976).

Despite the considerable interest in minimal 
dating and the apparent efficacy of these 
different procedures, there has been very little 
attention directed toward another significant 
social problem—anxiety and isolation in same- 
sex friendship interactions. Friends provide 
opportunities for social comparison and in- 
formation regarding social norms. In addition, 
they provide opportunities for modeling and 
feedback, which can have considerable in- 
fluence in shaping a variety of social behaviors. 
Friendship relationships can also provide
emotional support and comfort in times of
stress. Thus, the lack of friendship relationships
can severely limit an individual’s social and
emotional growth.

In the present study, a treatment approach 
that has been effective for minimal dating was 
applied to the problem of social anxiety and 
isolation in same-sex friendship interactions. 
The approach is based on real-life practice
and involves repeated exposures to moderately
anxiety-arousing social situations in the natural
environment (Arkowitz, Christensen, & Royce,
1975). The procedure for minimal dating has
involved providing subjects with a series of
six “practice dates,” each with a different
randomly selected partner who was also a 
volunteer for the program. Research has 
demonstrated that the procedure of practice 
dating leads to significant decreases in social 
anxiety and increases in posttreatment dating 
activity with new partners whom the subjects 
met on their own (Christensen & Arkowitz, 
1974; Christensen, Arkowitz, & Anderson, 
1975; Kramer, 1975). The effectiveness of the 
procedure appears to be due primarily to 
anxiety reduction (Christensen et al., 1975). 
An advantage of the practice dating procedure
is that it can be administered on a large-scale
basis by someone trained only in clerical skills.
In addition, since the treatment takes place 
in naturalistic situations, there is minimal 
concern about generalization from the therapy 
situation to the subjects’ natural environments. 
The procedure evaluated in the present 
study consisted of randomly pairing members 
of the same sex for a series of such “dates.” 
This practice interaction group was compared 
to a second treatment group that received an 
identical series of practice interactions plus 
social skills training. (To the extent that 
subjects are characterized by inadequate social 
skills, the anxiety reduction effected through 
the practice interactions may not be sufficient 
for durable change to occur.) A minimal 
treatment group was used to control for atten- 
tion and expectancy factors (Paul, 1969). A 
delayed treatment control group and a no- 
treatment group (to serve as a control for 
follow-up) completed the design. It was hy- 
pothesized that the two treatment groups 
would show significant improvements on 
multiple criteria of social skills, anxiety, and 
activity, relative to each of the control groups. 
It was also hypothesized that the practice 
interactions plus social skills training group 
would be superior to the group that received 
practice interactions only. 


Method 
Subjects 


All subjects were college students, 18-23 years old,
who had telephones at their residences. The
population for the study was the most socially distressed
15% of approximately 1,000 students who completed
a social survey questionnaire distributed in a number
of large classes. Since norms for social activity and
anxiety in friendship interactions were not available,
this selection criterion assured that subjects were
considerably distressed and inactive.¹ Twenty-eight
subjects were male, and 26 were female. Their mean age 
was 19.8 years, and 8% described themselves as 
married or cohabiting, 18% engaged or steadily 
dating one person, and 74% as single. 


Experimental Design 

Subjects were randomly assigned among the two 
treatment groups and two control groups. Initial group 
sizes were 13 subjects in the practice-only group, 14 
in the practice plus skill training group, 11 in the 
minimal treatment control group, and 16 in the delayed 
treatment control group. There was no mixing of sexes 
at any point in the study; males and females were
treated separately but identically. Assessment was
conducted pretreatment, posttreatment, 3 months
posttreatment, and 15 months posttreatment. Because
the delayed treatment control group was not available
for control purposes at follow-up (they had received
treatment by then), a follow-up control group was
separately recruited. 
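
The random assignment described above can be illustrated with a minimal sketch. It assumes, which the text does not state, that the four group sizes were fixed in advance and that assignment amounted to shuffling the list of subjects and dealing it out. The function name, group labels, and seed are hypothetical, and the sketch ignores the separate handling of men and women for brevity.

```python
import random

# Initial group sizes reported in the text (13 + 14 + 11 + 16 = 54 subjects).
GROUP_SIZES = {
    "practice_only": 13,
    "practice_plus_skill": 14,
    "minimal_treatment_control": 11,
    "delayed_treatment_control": 16,
}


def assign_to_groups(subject_ids, group_sizes, seed=None):
    """Shuffle the subjects and deal them into groups of the given sizes."""
    ids = list(subject_ids)
    if len(ids) != sum(group_sizes.values()):
        raise ValueError("number of subjects must equal the sum of group sizes")
    rng = random.Random(seed)  # seed is only for reproducibility of the sketch
    rng.shuffle(ids)
    groups = {}
    start = 0
    for name, size in group_sizes.items():
        groups[name] = ids[start:start + size]
        start += size
    return groups


if __name__ == "__main__":
    for name, members in assign_to_groups(range(1, 55), GROUP_SIZES, seed=0).items():
        print(name, len(members))
```

Every subject lands in exactly one group, and the group sizes match those reported.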


Assessment 


Self-report questionnaires. Two questionnaires were
administered to all groups pretreatment, posttreat-
ment, and at follow-up. The Social Avoidance and
Distress Scale (Watson & Friend, 1969) is a 28-item
social anxiety scale that has been demonstrated to
possess good reliability and validity (Arkowitz,
Lichtenstein, McGovern, & Hines, 1975; Watson &
Friend, 1969). The self-report questionnaire consisted
of free-response questions of retrospective frequency
and range of same-sex social interactions and of dates,
and 7-point rating scales of same-sex social anxiety
and social skills.

Self-monitoring measures. Subjects were asked to
record their social interactions over a period of a week
in a social interaction diary, which they carried with
them. These diaries yielded a measure of the frequency
and range (number of different people) of same-sex
and opposite-sex interactions in a 1-week period.
Interactions that did not progress beyond a mere
greeting were not recorded. This procedure had been
used by Christensen et al. (1975) in their practice
dating study. The diary was administered pretreatment
and posttreatment to all subjects except those in the
follow-up control group.

Peer ratings. A peer questionnaire was sent to two
male and two female acquaintances of each subject.
The questionnaire yielded same-sex and opposite-sex
social competence scores, based on ratings of social
anxiety, social skills, and social activity. Because there
seemed likely to be a substantial lag between changes
in subjects’ behavior and their peers’ perceptions of
these changes, this questionnaire was administered to
all groups pretreatment and at the 3-month follow-up
only.


¹ For a more detailed description of the selection
criteria used in this study, as well as a discussion of
some general issues regarding selection criteria in studies
of this type, see Royce and Arkowitz (1977).

Behavioral performance task. Subjects were brought
into the laboratory where they engaged in a 10-min
conversation with another subject. Four measures for
each subject were taken from the audio recordings of 
the interactions: total talk time, number of silences 
longer than 10 sec, and ratings of social skill and social 
anxiety. These ratings were made by a corps of 22 
untrained undergraduates. Five or six raters of the 
same sex as the subject independently rated each 
interaction. The means of these ratings yielded each 
subject’s scores. This procedure, rather than the more 
usual one of training a small number of raters to a high 
degree of interrater agreement, was used to avoid the 
problem of relying solely on interrater agreement as 
the criterion of rating accuracy (Johnson & Bolstad, 
1973; Lipinski & Nelson, 1974). The procedure yielded 
instead a representative peer opinion as to what 
constituted levels of social anxiety and skill. These 
measures were collected pretreatment and posttreat- 
ment from all subjects except those in the follow-up 
control group. 
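
The scoring rule for the rated measures (each subject's score is the mean of the five or six independent ratings of that subject's interaction) can be sketched as follows. The subject labels and rating values are hypothetical illustrations, not data from the study, and the function name is an assumption.

```python
from statistics import mean


def subject_scores(ratings_by_subject):
    """Average the independent rater judgments for each subject.

    ratings_by_subject maps a subject label to the list of ratings that
    subject's interaction received; subjects with no ratings are skipped.
    """
    return {
        subject: mean(ratings)
        for subject, ratings in ratings_by_subject.items()
        if ratings
    }


if __name__ == "__main__":
    # Hypothetical anxiety ratings from five and six raters.
    example = {"s01": [3, 4, 5, 4, 4], "s02": [2, 2, 3, 3, 2, 3]}
    print(subject_scores(example))
```

Averaging over several untrained raters is what makes the score a pooled peer opinion rather than the judgment of any single trained observer.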

Process ratings. After each practice interaction,
subjects completed ratings of themselves and their
partners on 7-point scales of social anxiety and social
skill.


Treatment 


Practice only. Subjects were matched for 12 practice
interactions, 2 per week, with a different same-sex
partner each week. Matching was completely random.
Each subject was sent the name and telephone number
of his or her partner at the beginning of each week.
Both interactions with that partner had to occur within
a week, but decisions about all other details (initiation,
time, place, length, etc.) were left up to the subjects.
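
One way to produce a different randomly chosen partner for each of the 6 weeks is sketched below. The text says only that matching was completely random, so the method here is an assumption: a round-robin rotation (the "circle method") applied to a shuffled roster, which guarantees that no pair repeats. The function name and the handling of an odd-sized roster (one subject sits out per week) are hypothetical.

```python
import random


def weekly_pairings(subjects, weeks, seed=None):
    """Return one list of (a, b) pairs per week, with no pair repeated."""
    rng = random.Random(seed)
    roster = list(subjects)
    rng.shuffle(roster)  # randomizes who meets whom and in which week
    if len(roster) % 2:
        roster.append(None)  # whoever is paired with None sits out that week
    n = len(roster)
    if weeks > n - 1:
        raise ValueError("too few subjects for that many distinct partners")
    schedule = []
    for _ in range(weeks):
        pairs = [(roster[i], roster[n - 1 - i]) for i in range(n // 2)]
        schedule.append([p for p in pairs if None not in p])
        # Circle method: keep the first entry fixed, rotate the rest by one.
        roster = [roster[0], roster[-1]] + roster[1:-1]
    return schedule


if __name__ == "__main__":
    for week, pairs in enumerate(weekly_pairings(range(1, 15), weeks=6, seed=1), 1):
        print(week, pairs)
```

Each weekly list would then be mailed out as names and telephone numbers, with both of that week's interactions taking place between the same two partners.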

Practice plus skill training. Subjects in this group
participated in a series of 12 practice interactions as the
practice-only subjects did. In addition, these subjects
attended six weekly group social skills training sessions.
Each group meeting lasted 1.5 hours. Except for the
initial meeting of each group, the first 30 minutes of
each session were devoted to feedback given to each
subject by her or his partner from the previous week’s
interactions. The remaining hour was devoted to social
skills training. Each week, subjects received a chapter
from a social interaction training manual, based on
similar manuals developed by McGovern (1972) and
by Watkins (1972). Each chapter provided information
about a particular topic in the realm of socially effective
behavior and ended with a number of paper-and-pencil
exercises to be completed by the subject during the
week. Each group session was devoted to skills training
in the area defined by the week’s reading and exercises.
Modeling and behavior rehearsal were the primary
techniques used.

Minimal treatment control. Subjects in this group
participated in six weekly group counseling sessions,
similar in format to the groups attended by the practice
plus skill training subjects. These control subjects 
received the same readings and exercises as the practice 
plus skill training subjects, but group meetings consisted 
only of discussion and verbal counseling. These subjects 
received no practice interactions. 

Delayed treatment control. Subjects in this group 
were telephoned and informed that the limitations of 
the program staff and facilities and the large number of 
volunteers for the program necessitated a delay in 
treatment for them. They were told that they would be 
able to participate in the program during the following 
academic quarter. 

Follow-up control. Since the delayed treatment 
control subjects received treatment prior to any 
follow-up assessment, a separate group of subjects was 
recruited to serve as a no-treatment control for follow-up.
This group consisted of 13 male and 12 female students 
who had scored among the most distressed 15% on the 
screening questionnaire passed out in classes and who 
met the other screening criteria but who had not been 
contacted about participating in the treatment study. 
Data from these subjects were obtained only on the 
Social Avoidance and Distress Scale, the self-report 
questionnaire, and the peer questionnaire. 


Procedure 


Prospective subjects came in for a screening interview 
and to receive written and verbal explanations of the 
program. Interested and eligible subjects then signed 
a contract and consent form and paid a $2 fee to cover 
expenses plus a refundable $10 security deposit. During 
the third week of the winter academic quarter, pre- 
treatment assessment was done. Treatment was given 
through the remainder of the winter quarter. Post- 
treatment assessment was administered during the 
second week of spring quarter, and subjects’ $10 
deposits were then refunded. Delayed treatment control 
subjects were treated with practice plus skill training 
during spring quarter. Three months after treatment all 
subjects except those in the delayed treatment control 
group were paid $1 to complete the follow-up question- 
naires. The peer questionnaires were mailed again at 
that time. Fifteen months after treatment, practice-only 
and practice plus skill training subjects were again 
contacted and were asked to complete the follow-up 
questionnaires. 


Results 


All subjects completed the program, in- 
cluding posttreatment assessment, except for 
one practice-only subject, one practice plus 
skill training subject, and three delayed 
treatment control subjects. At the 3-month 
follow-up, there was an additional loss of two 
practice-only subjects, two practice plus skill 
training subjects, one minimal treatment 
control subject, and three follow-up control 
subjects. These losses were due to subjects’
lack of interest in further participation in the
program.

One-way analyses of variance of pretreat- 
ment scores showed no significant differences 
among groups. In addition, the follow-up 
control group was compared to all other groups 
combined, and no significant differences were 
found. 

Analysis of all data (except the process 
ratings and the 15-month follow-up data) was 
by analysis of covariance, with the pretreat- 
ment scores for each measure serving as the 
single covariate for each analysis. Since two-
way (Sex of Subjects × Treatment) analyses
showed no significant sex or Sex × Treatment
interaction effects, sexes were combined and
one-way analyses were used. Consistent with
the hypotheses stated earlier, analyses were
by means of specific planned contrasts rather
than by omnibus tests across all groups.
Contrasts of practice only versus practice plus
skill training showed no significant differences
on any of the measures. Thus, there was no
support for the hypothesis that social skills
training would enhance the effects of the
practice interactions procedure. Contrasts of
minimal treatment control versus delayed
treatment control versus follow-up control
showed no consistent significant differences:
Only 1 of the 24 contrasts reached the .05
level of significance. Detailed results are there-
fore reported only for the planned contrasts
of the practice-only and the practice plus skill
training groups combined versus each control
group.


W. STEPHEN ROYCE AND HAL ARKOWITZ


Table 1
Means on Selected Outcome Measures for All Groups Pretreatment,
Posttreatment, and at 3-Month Follow-up

                                              Treatment group
Measure                                  P+     PO     MTC    DTC    FC

Social Avoidance and Distress Scale
  Pretreatment                          13.7   14.8   13.1   13.6   10.0
  Posttreatment                          8.6    8.6   10.8   11.3    9.1
  3-month follow-up                      7.9    8.3   10.9    ...    [?]
Social interaction diary
  No. same-sex interactions
    Pretreatment                        24.8   20.7   41.8   33.5    ...
    Posttreatment                       42.2   39.8   42.1   34.4    ...
  Range of same-sex interactions
    Pretreatment                        13.8   14.1   16.5   19.7    ...
    Posttreatment                       20.4   23.5   16.5   14.6    ...
Peer questionnaire same-sex ratings
  Pretreatment                           4.2    4.5    4.4    4.4    [?]
  3-month follow-up                      4.7    5.0    4.6    ...    4.5
Self-report questionnaire
  Frequency with friends, last month
    Pretreatment                        [row largely illegible; surviving values: .2, 12.2]
    Posttreatment                        [?]    [?]    [?]    [?]   15.4
    3-month follow-up                   13.5    9.4   12.6    ...   16.4
  Range with friends, last month
    Pretreatment                        [row largely illegible; surviving value: 13.]
    Posttreatment                        [?]    [?]    [?]    [?]   17.1
    3-month follow-up                    9.2    8.7   10.1    ...   13.2
  Self-rated same-sex social anxiety
    Pretreatment                         3.9    4.1    3.5    3.4    3.1
    Posttreatment                        2.4    3.3    3.5    3.4    3.4
    3-month follow-up                    2.1    2.9    3.9    ...    [?]
  Self-rated same-sex social skill
    Pretreatment                        [row largely illegible; surviving value: 4.6]
    Posttreatment                        [?]    [?]    [?]    [?]    4.5
    3-month follow-up                    4.9    4.3    4.4    ...    [?]

Note. P+ = practice plus skill training; PO = practice only; MTC = minimal
treatment control; DTC = delayed treatment control; FC = follow-up control.
... = no entry; [?] = value illegible.


Outcome Measures 


Table 1 presents the means on selected 
outcome measures for all groups pretreatment, 
posttreatment, and at the 3-month follow-up. 

Social Avoidance and Distress Scale. At
posttreatment, the combined practice groups
had improved significantly on this measure in
contrast to each of the three control groups:
t(35) = -2.01, p < .05, for the contrast with
the minimal treatment control group; t(37)
= -2.26, p < .05, for the delayed treatment
control group; and t(49) = -3.12, p < .01,
for the follow-up control group. At the 3-month
follow-up, the practice groups continued to
improve relative to the minimal treatment
control group, t(30) = -1.98, p < .05, and
the follow-up control group, t(42) = -2.49,
p < .01. Thus, the treatment groups showed
clear superiority on this measure over the
control groups both immediately posttreatment
and 3 months posttreatment.

Self-report questionnaire. Responses to the
questions about social interaction frequency
and range, and number of dates, showed no
significant improvements for the combined
practice groups when compared to the control
groups. On ratings of social anxiety, the com-
bined practice groups improved significantly
in contrast to each of the three control groups:
t(34) = -2.50, p < .01, for the contrast with
the minimal treatment control group; t(33)
= -2.19, p < .05, for the delayed treatment
control group; and t(48) = -3.16, p < .01,
for the follow-up control group. At the 3-month
follow-up, these contrasts were maintained for
the minimal treatment control group, t(31)
= -3.83, p < .01, and for the follow-up
control group, t(41) = -3.46, p < .01. On
ratings of social skill, the combined practice
groups improved significantly only in contrast
to the minimal treatment control group, t(34)
= 2.29, p < .05, and the follow-up control
group, t(48) = 2.35, p < .05. Only the latter
contrast was maintained at the 3-month
follow-up, t(41) = 2.59, p < .01. There was,
then, substantial and consistent improvement
[on self-rated social anxiety] for the treatment
[groups but only partial] improvement on self-rated
[social skill].

Social interaction diary. For both frequency
and range of same-sex social interactions, the
combined practice groups improved signifi-
cantly in comparison to the minimal treatment
control group, t(34) = 2.64, p < .01, for fre-
quency, and t(34) = 1.77, p < .05, for range,
and the delayed treatment control group,
t(33) = 2.49, p < .01, for frequency, and
t(33) = 2.64, p < .01, for range. There were
no significant contrasts for opposite-sex inter-
actions. It should be noted that for each of the
practice groups, less than 2% of subjects’
posttreatment interactions were with other
subjects. Thus, these substantial increases in
social activity were not due merely to subjects’
interactions with people they met in the
program.

Peer questionnaire. On the same-sex peer
rating scale, the combined practice groups
improved significantly in contrast to the
follow-up control group, t(45) = 3.56, p < .01.
The contrast with the minimal treatment
control group was in the predicted direction
and approached statistical significance, t(28)
= 1.34, p < .10. Analysis of the individual
items on this scale showed that these results
were due primarily to changes in ratings of
social activity, somewhat to changes in ratings
of social anxiety, and not at all to ratings of
social skill. There were no significant contrasts
for the opposite-sex scale.

Behavioral performance task. None of the 
four measures taken from the audio recordings 
of these interactions showed significant 
contrasts. 


Process Ratings 


Matched-sample t tests were used to compare
practice subjects’ process ratings from the
first 3 weeks of treatment with ratings made
during the last 3 weeks. Subjects’ ratings of
their own anxiety showed a significant decrease
over time, t(17) = 2.33, p < .0[?]. Subjects’
ratings of their own skill and of their partners’
anxiety and skill also changed in the predicted
directions but did not reach statistical
significance.
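
The matched-sample t statistic used for these comparisons is the mean of the within-subject differences divided by its standard error, with n - 1 degrees of freedom, so the reported 17 degrees of freedom correspond to 18 subjects with ratings in both periods. A minimal sketch follows; the function name and the example ratings are hypothetical and are not data from the study.

```python
import math
from statistics import mean, stdev


def paired_t(first, second):
    """Return (t, df) for a matched-sample t test on first minus second."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equally long samples of at least 2 pairs")
    diffs = [a - b for a, b in zip(first, second)]
    sd = stdev(diffs)  # sample standard deviation of the differences
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = mean(diffs) / (sd / math.sqrt(len(diffs)))
    return t, len(diffs) - 1


if __name__ == "__main__":
    # Hypothetical 7-point anxiety ratings, early weeks versus late weeks.
    early = [5, 4, 6, 5, 3]
    late = [4, 3, 4, 4, 3]
    print(paired_t(early, late))
```

A positive t here means the early ratings were higher than the late ones, that is, anxiety decreased over the course of the practice interactions.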


15-Month Follow-up


Questionnaires were returned by 19 of the
27 subjects in the two practice interaction
groups. Matched-sample t tests computed on
the pretreatment to 15-month follow-up data
revealed that improvements noted at post-
treatment and at the 3-month follow-up had
been maintained. The mean score on the
Social Avoidance and Distress Scale decreased
to 7.2, t(18) = -3.95, p < .01; self-reported
range of social interactions with friends in-
creased to a mean of 13.6, t(18) = 2.34,
p < .02; and self-rated same-sex social anxiety
decreased to 3.0, t(18) = 2.55, p < .01.
There were no significant effects for the other
self-report items.


Discussion 


The results indicate that the practice inter- 
action procedure was an effective treatment 
for socially isolated college students. On 
multiple outcome measures of social anxiety 
and social activity, the practice interaction 
groups improved significantly in contrast to 
the control groups, and these gains were 
maintained at 3-month and 15-month follow- 
ups. The practice interaction procedure worked 
equally well for men and women. Data from the
social interaction diary indicate that the vast
majority of posttreatment social interactions
were between subjects and persons whom they
had not met through the program. This
suggests that substantial learning occurred as
a result of the program, and that this learning
generalized to social interactions with other
persons.

[The lack of change in social skill for the]
practice-only group is consistent with findings
[from practice dating studies], which have also
found no changes in social skill (e.g., Christen-
sen et al., 1975). A surprising finding was that
the addition of social skills training did not 
lead to any improvements on measures of 
social skill. Studies with minimal daters have 
found significant increases in social skills 
ratings as a result of skill training procedures
very similar to the ones used in the present
study (Curran, 1977). Although unlikely, it
is possible that the social skills training pro-
cedure was not a good one. Or, it may be that
the measures of social skills were not sensitive
to changes that did occur. However, a third
possibility is that subjects were not deficient
in social skills to begin with. If this were the
case, then one would not expect any marked
improvements in subjects’ social skills, even
with a viable social skills training program.
It may be that the majority of subjects were
characterized by excessive anxiety but ade-
quate social skills. Several studies with minimal
daters have supported this view (e.g., Clark &
Arkowitz, 1975; Glasgow & Arkowitz, 1975).
It may be that the same is true for individuals
who are socially isolated with respect to friend-
ship interactions. If this hypothesis were
correct, it would also account for the finding
that social skills training did not enhance the
effects of the practice interaction procedure.

A most interesting question at this point
concerns the mechanism by which the practice
interaction procedure works. It may be that
“forcing” subjects to be socially active exposes
them to the natural reinforcing properties
of social activity, which maintain an
increased rate of social behavior. In a similar
manner, the subjects’ social activity in the
program may lead them to change their self-
perceptions and labels of being shy and socially
awkward. However, the explanation that seems
most tenable at this point is that the practice
interaction procedure functions as a means of
in vivo desensitization, in which repeated
exposure to moderately anxiety-arousing situa-
tions serves to reduce social anxiety and
avoidance. Results of the present study give
some support to this hypothesis: The process
ratings demonstrated that self-rated social
anxiety decreased from the first 3 weeks of
practice interactions to the second 3 weeks,
whereas skill ratings did not change. Christen-
sen et al. (1975) found similar results. There
is now a large and growing body of literature
that demonstrates the effectiveness of in vivo
exposure as a means of anxiety reduction.
Marks (1975) reviewed over 300 studies of
behavioral approaches to anxiety reduction
and concluded that “real-life exposure is the
[...] therapeutic factor so far [...]” (p. 93).

Questions also remain concerning this
procedure as treatment for social isolation.
Although the procedure has been found to be
an effective treatment for college students’
social inhibitions, these results may not
generalize to other populations. The subjects
in this study were student volunteers, not real
clients. Even though these subjects were
selected stringently to approximate a clinical
population and to minimize the analogue
nature of the study, a true clinical population
may differ in important ways from the sample
in this study. The procedure may not work with
more severely distressed clients or with those
less intelligent and less verbal than college
students. Such individuals may also need
social skills training.


The present study assessed the outcome of treatment of 121 mental health
center clients using therapist and independent global improvement ratings and 
independent ratings of notes in case records based on a client-specific goal- 
oriented outcome technique (Goal Attainment Scaling; GAS). Telephone follow- 
up of 50 clients provided a second GAS assessment, client global improvement 
ratings, and three consumer satisfaction ratings. The findings indicated that (a) 
independently determined GAS scores and therapist and independent global rat- 
ings converged significantly, (b) the GAS procedure provided some increase in 
accuracy as well as an increase in specificity of outcome, and (c) client global 
ratings may reflect satisfaction with treatment rather than outcome. In view of 
the intercorrelations among measures and the relationship between GAS scores 
determined from case records and telephone interviews, case records may pro-
vide for accurate assessment of client problems and treatment success.


The measurement of treatment effectiveness 
in the applied setting is subject to many 
constraints (Twain, 1975). Clinicians often 
resist the “invasion” of accountability into 
their professional activities because they fear 
that their individual efforts are not effective, 
resent the time spent in evaluation that could 
be used for therapy, or believe that the clinical 
process is not quantifiable (Ellsworth, 1975). 
A level of evaluation beyond simple program 
monitoring may also require shifting resources 
from clinical activities to research, a source of 
both clinical and administrative resistance. 
Furthermore, considerations such as objec- 
tivity about the program being evaluated, level 
of competency of the evaluators, and potential 
usefulness of the evaluation results often argue 
in favor of external rather than internal 
evaluation (Weiss, 1972). External evaluation,
however, may increase costs and further
increase resistance.

Methodological considerations also make 
treatment outcome research difficult (Bergin, 
1971). Though Bergin and Suinn (1975) and 
Malan (1973) have concluded that the evidence 
for effectiveness of psychotherapy is relatively 
strong, there is great concern with the adequacy 
of the measurement of that effectiveness 
(Fiske et al., 1970; Garfield, Prager, & Bergin,
1971; Luborsky, Chandler, Auerbach, Cohen,
& Bachrach, 1971; Strupp & Bloxom, 1975).
Indeed, the primary focus of discussion in
psychotherapy research has been the criterion
of outcome measurement. The most frequently
used measure has been the therapist rating of
client global improvement (Luborsky et al.,
1971), which seems to be the only criterion
that shows consistent correlations with other
measures of outcome (Cartwright, Kirtner, &
Fiske, 1963; Fiske, Cartwright, & Kirtner,
1964; Strupp & Bloxom, 1975). However, this
criterion tends to reflect a more positive view
of outcome than specific change measures
(Garfield et al., 1971), and it is certainly not
oriented toward differentiating areas in which
change occurred (Cartwright, 1975). Other
measures are subject to similar concerns for 
validity and utility and have prompted sugges- 
tions that multiple measures be used (Waskow 
& Parloff, 1975). 

The aforementioned constraints on evalua- 
tion in the applied setting and concerns for 
treatment outcome measurement influenced 
the design of the present study. Based on
suggestions by Fiske et al. (1970) and Luborsky
et al. (1971) that criterion measures should be
oriented toward the type of change or goals 
each patient needs or desires, the present study 
incorporated the client-specific, goal-oriented 
technique developed by Kiresuk and Sherman 
(1968). This technique (Goal Attainment 
Scaling; GAS) has been extensively used in 
various mental health, drug, criminal justice, 
rehabilitation, and education settings (cf. 
Garwick & Brintnall, 1974). In the present 
application, goals were established and attain- 
ment was measured retrospectively by in- 
dependent raters who reviewed case records 
of former clients. This retrospective approach 
avoided potential resistance by eliminating 
therapist involvement in the evaluation, pro- 
vided independent external evaluation, and 
represented a test of the utility of an innovative 
approach to studying treatment effectiveness 
at relatively low cost. Global improvement 
ratings by therapists and raters provided an 
indication of the correspondence between these
more typical criteria and this novel outcome
measurement approach (GAS).

In addition to outcome assessment at termi- 
nation, a telephone follow-up of a sample 
of clients provided a second GAS score and 
client global improvement ratings to explore 
their utility as convergent outcome criteria. 
Clients were also asked three questions to see 
if their satisfaction with services might reflect
improvement during treatment. Altogether,
the present study used eight outcome measures
in three content areas: GAS, global improve-
ment, and consumer satisfaction.


Method
Setting


The present investigation was conducted at Granite
Community Mental Health Center. The therapeutic
orientation of the center is eclectic with a primarily
dynamic focus, and all traditional therapist categories
are represented.


Observations 


Terminated client case records (N = 179) were 
drawn in a 10% random sample from about 1,700 files 
that met two criteria: (a) Treatment was received from 
January 1, 1970, to June 30, 1975; and (b) the clients 
had completed at least 10 outpatient sessions, 8 partial 
hospitalization days, or 5 inpatient days. The second 
criterion was intended to insure that case records 
contained enough information to permit assessment of 
problems and outcomes using GAS. Despite this 
precaution, 58 of the case records were not amenable 
to outcome assessment for at least one of three reasons: 
inadequate intake notes (53%), inadequate progress 
notes (72%), and/or inadequate termination notes 
(91%). 

Of the 121 measurable cases, 84 received outpatient 
treatment (mean of 26.6 sessions), 29 received partial 
treatment (24.5 days), and 8 received inpatient treat- 
ment (26.4 days). Fifty-three percent of the clients 
received primarily individua) therapy, and 47% 
received primarily group or family therapy. (Only the 
record of the sampled family member was evaluated. 
Four clients were treated by psychiatrists, 25 by 
psychologists, 59 by social workers or nurses, and 
by paraprofessional therapists. Forty-eight percent of 
the clients were male. The average was 33.4 years 
(range = 11-70 years), and average formal education 
was 11.7 years. Thirty-four percent of the sample were 
single, 42% were married, and 24% were separated or 
divorced. Average family income was $7,200 yearly, and 
average client family size was 3.96. Fifty percent of the
clients had received some prior mental health treatment. 
The 188 problem areas identified for the 121 cases with
measurable outcome were classified into categories:
parent-child interactions (23.9%), depression (16.5%),
marital problems (16.0%), other interpersonal problems
(10.1%), self-understanding (9.0%), school problems
(6.9%), anxiety (4.2%), agency referrals (3.7%), work
problems (3.2%), and other problems (6.5%).

A comparison between clients for whom outcome 
could be assessed (n = 121) and for whom outcome
could not be assessed (n = 58) revealed no significant 
or type of therapy (individual 
therapist, or client 
the 121 clients 


were measurable were reached for 


ed numbers), 50.7% had moved or 
potae en) TS that could not be traced, and 15.5% 


had no telephone listing. Again, there were no apparent
treatment or client differences between those interviewed
and those not interviewed.


Procedure 


Two advanced graduate students in clinical psychology
who had no affiliation with the Granite Community
Mental Health Center contracted to conduct
the retrospective outcome study and telephone 
follow-up. Both raters were trained in assessment and 
had some experience with GAS through designing 
evaluation programs and training staff in two applica- 
tions of that technique. 

Goal Attainment Scaling is a client-specific assess- 
ment and outcome technique developed by Kiresuk 
and Sherman (1968) to measure success in meeting 
treatment goals. It is used to (a) select one or more 
client problems during initial assessment, (b) establish 
possible outcome levels for each problem, and (c)
compare actual outcome with established levels. 
Selected problems are each translated into verbal 
descriptions of possible outcome levels based on a 
common 5-point scale, which ranges from the least 
favorable treatment outcome thought possible (scale 
value = —2) to the most favorable outcome thought 
possible (+2). The expected outcome of treatment 
(scale value = 0) is based on clinical judgment about 
the most likely outcome for the individual client's 
particular problem. A goal attainment score with an
expected value of 50 and an expected standard devia- 
tion of 10 can be computed for each client by combining 
scale outcome levels using a formula developed by 
Kiresuk and Sherman. A score of 50 indicates average 
expected treatment outcome, and a score of more than 
50 indicates more than expected outcome. Computation
of a single score for each client allows for comparisons
of outcome between clients or programs.
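The scoring step described above can be sketched in code. The text does not reproduce the Kiresuk and Sherman formula, so the form used here is an assumption: the commonly cited T-score, T = 50 + 10 * sum(w*x) / sqrt((1 - rho) * sum(w^2) + rho * (sum(w))^2), with an assumed average intercorrelation among scales of rho = .30. The function name and its parameters are illustrative only.

```python
import math


def gas_t_score(levels, weights=None, rho=0.3):
    """Goal attainment T-score from attained scale levels (-2 to +2).

    Assumed form of the Kiresuk-Sherman formula; rho = 0.3 is the
    conventional value and is an assumption, not taken from the text.
    """
    if weights is None:
        # Equal weighting of all goals, as done in the present study.
        weights = [1.0] * len(levels)
    if len(levels) == 0 or len(levels) != len(weights):
        raise ValueError("levels and weights must be non-empty and equal in length")
    weighted_sum = sum(w * x for w, x in zip(weights, levels))
    sum_w = sum(weights)
    sum_w_squared = sum(w * w for w in weights)
    denominator = math.sqrt((1.0 - rho) * sum_w_squared + rho * sum_w ** 2)
    return 50.0 + 10.0 * weighted_sum / denominator


# A client who reaches the expected level (0) on every scale scores 50.
print(gas_t_score([0, 0, 0]))
# One level better than expected on a single scale.
print(gas_t_score([1]))
```

Under these assumptions, scores above 50 correspond to better-than-expected outcomes, which matches the interpretation given in the text.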

As originally designed, the GAS technique requires 
therapist-client interviews for scale construction and 
outcome assessment, which can take up to 2 hours of 
client and therapist time. The retrospective nature of 
the present approach allowed scale construction and 
outcome measurement using only the closed case 
record. Scales were constructed from intake notes, and 
outcome was assessed from progress and termination 
notes. GAS scores were computed for each client’s status 
at intake and termination, thus allowing measurement 
of changes during treatment. All problems in the intake 
notes that could be made into goals were scaled. Since 
their relative importance could not readily be deter- 
mined, all goals were considered equally important for 
the purpose of computing GAS scores (cf. Kiresuk & 
Sherman, 1968).

Additional outcome measures included global ratings 
made by therapists, raters, and clients using 6-point 
scales (where 1 = considerably worse, 3 = unchanged, 
and 6 = considerably improved). Therapist global 
ratings were routinely made at termination and were 
available in the case records. Rater global ratings were 
made at the time termination GAS scales were assessed. 
Client global ratings were requested during telephone 
follow-up. 

Telephone follow-up was conducted at an average 
of 2 years after treatment termination. The raters who 
interviewed the clients were unaware of clients’ status 
at follow-up. Raters asked clients to indicate their 
status on each GAS by first reading them the termination
level. (Clients were not told that this was the
status recorded by their therapist at termination.) If 
clients described themselves as better or worse, the 
raters read the next scale level in the indicated direction.
They continued until the client said the appropriate
level was reached, or the end of the scale was reached,
whichever came first. Those levels for each client were
then used to compute the follow-up GAS score.
Clients were next asked to rate the extent to which
they felt they had improved since they first came to the
Granite Community Mental Health Center for treatment
using the global scale. Three additional questions
were asked of each client to assess relationships between
satisfaction with services and treatment outcome.
Clients were asked to rate whether they would return
to the center if they needed help in the future and
whether they would recommend the center to others
needing help on a 5-point scale (where 1 = definitely
no, 3 = maybe, and 5 = definitely yes). Actual
recommendation to others was scored dichotomously
as the third measure.
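The telephone procedure for locating a client's follow-up level on one scale can be sketched as a simple stepping loop. This is only an illustration under stated assumptions: the function name, the callback `respond`, and its three answer strings are hypothetical, and the scale is assumed to run from -2 to +2 as described for GAS.

```python
def follow_up_level(termination_level, respond, lowest=-2, highest=2):
    """Step from the termination level until the client accepts a level
    or the end of the scale is reached, whichever comes first.

    `respond(level)` is a hypothetical stand-in for the client's answer
    after hearing a level read aloud: "same", "better", or "worse".
    """
    level = termination_level
    answer = respond(level)
    while answer != "same":
        if answer not in ("better", "worse"):
            raise ValueError("answer must be 'same', 'better', or 'worse'")
        step = 1 if answer == "better" else -1
        if not lowest <= level + step <= highest:
            break  # already at the end of the scale
        level += step
        if level in (lowest, highest):
            break  # end of the scale reached
        answer = respond(level)
    return level


# Example: a client whose current status matches level +1.
def client(level):
    if level == 1:
        return "same"
    return "better" if level < 1 else "worse"


print(follow_up_level(0, client))
```

The resulting levels, one per scale, would then feed the same score computation used at termination.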


Reliability 


With any task involving rating, reliability of 
ratings is an important issue. Prior to conducting the
present study, the two raters spent approximately | 
hours each in training to apply the GAS technique to
the case records. They determined subjective criteria
(based on the amount and quality of information in the
record) for deciding whether outcome was measurable,
discussed goals and scale points for several cases, and
independently developed and then discussed goal
attainment scales for 12 records. Reliability assessment
was conducted independently by the two raters on 27
cases of the 179 reviewed (13 initially reviewed by one
rater and 14 by the other). There was complete agreement
that 7 of the case records were not amenable to
scale construction. For the 20 case records allowing
outcome assessment, correlations of .94 and .93 were
obtained between raters on termination GAS scores
and global improvement ratings, respectively. The
20 case records allowed identification of 36 problem
areas, for which there was 86% agreement between
raters on the scale headings used to describe problems.
The average termination outcome level for 121 clients
was 48.86, with a standard deviation of 9.77. The
similarity between the findings of the present study and
the expected values of 50 and 10 suggests that the raters
were correctly using the GAS technique.


Cost 


One intent of the present study was to provide a
large amount of outcome data at relatively low cost.
The retrospective outcome portion of the study cost
$1,017 in direct rater time and indirect staff support,
at a cost per case reviewed of $5.68. Excluding training
and support time, the average time spent reviewing
each of the 179 cases was 21 minutes. This compares
very favorably with past efforts at outcome assessment
at the Granite Community Mental Health Center. The
telephone follow-up cost $6.72 per case for the 50 clients
who could be reached. The center administration
concluded that the benefits of the present study
justify the cost in money and time that might otherwise
have been devoted to clinical efforts.
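The cost figures can be checked with a few lines of arithmetic. The sketch below uses only numbers reported in the text; the follow-up total and the total review hours are derived here for illustration and are not reported by the authors.

```python
def cost_per_case(total_cost, n_cases):
    """Average cost per case reviewed."""
    if n_cases <= 0:
        raise ValueError("n_cases must be positive")
    return total_cost / n_cases


# Retrospective portion: $1,017 spread over 179 reviewed case records.
retrospective_per_case = cost_per_case(1017.0, 179)

# Derived quantities (not stated in the text):
review_hours = 179 * 21 / 60   # total rater review time in hours
follow_up_total = 6.72 * 50    # implied total cost of the telephone follow-up

print(round(retrospective_per_case, 2))
print(round(review_hours, 2))
print(round(follow_up_total, 2))
```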



Table 1
Correlation Matrix of Outcome Measures

Measure                             1       2       3       4       5       6       7       8
1. Termination GAS score            9.77
2. Rater global improvement         .80***  1.05
3. Therapist global improvement     .52***  .45***  .90
4. Follow-up GAS score              .40**   .36**   .18     13.96
5. Client global improvement        -.08    -.03    .14     .49***  1.12
6. Would client return?             .12     [?]     [?]     [?]     [?]     [?]
7. Would client recommend?          .08     [?]     -.06    .26     [?]     .74***  1.02
8. Did client recommend others?     .00     .00     .08     .11     [?]     .28*    .27     .46

Note. Values on the diagonal are standard deviations. The measures are based on degrees of freedom of 119
for 1 and 2, 116 for 3, and 48 for 4-8. Significance tests are two-tailed. All correlations involving Measure 8
are point biserial. GAS = Goal Attainment Scaling. [?] = entry illegible in the source.
** p < .01.
*** p < .001.


Results 


Both intercorrelations among outcome measures
and the correspondence in magnitude
of similar measures (e.g., global improvement
ratings) obtained from different sources (e.g.,
therapists, raters, and clients) were of importance
to the present study. The intercorrelations
among measures are presented in Table
1. Notably, convergence among the major
outcome measures was significant. Rater-determined
termination GAS scores, r(116) = .52,
p < .001, and rater global improvement
ratings, r(116) = .45, p < .001, were both
significantly correlated with therapist global
improvement ratings. Likewise, termination
GAS scores, r(48) = .40, p < .01, and rater
global ratings, r(48) = .36, p < .01, were
significantly correlated with client follow-up
GAS scores. Moreover, only 1 of the 50 clients
contacted at follow-up reported that the
problem areas discussed were not those that
brought them to treatment. However, the
relationship between therapist global
ratings and follow-up GAS scores was not
significant. Thus, GAS scores derived independently
from the case record appear to
provide a somewhat better index of client
functioning than therapist global ratings,
at least when both termination and follow-up
are considered. It is also of interest to note
that termination GAS scores and rater global
ratings were highly correlated, r(119) = .80,
p < .001, which suggests that raters used
similar criteria with these measures.
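For readers who want to relate the reported correlations to their significance tests, a correlation r with df degrees of freedom can be converted to a t statistic with the standard identity t = r * sqrt(df / (1 - r^2)). The sketch below applies that identity to two of the values above. It is an illustration rather than the authors' own computation, and the function name is hypothetical.

```python
import math


def t_from_r(r, df):
    """t statistic for testing a Pearson correlation against zero."""
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    if df <= 0:
        raise ValueError("df must be positive")
    return r * math.sqrt(df / (1.0 - r * r))


# Termination GAS vs. follow-up GAS: r(48) = .40
print(round(t_from_r(0.40, 48), 2))
# Termination GAS vs. rater global rating: r(119) = .80
print(round(t_from_r(0.80, 119), 2))
```

The resulting t values would be compared with a two-tailed t distribution on the same degrees of freedom; because the published correlations are rounded to two decimals, values recomputed this way are approximate.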

Client global ratings correlated significantly 
with follow-up GAS scores, r(48) = .49,
p < .001, but not with therapist or rater 
global ratings or with termination GAS scores. 
Yet, client global ratings were substantially 
correlated with all three consumer satisfaction 
ratings (i.e., whether the client would return 
for treatment, whether the client would 
recommend others, and whether the client did 
recommend others). In fact, measures obtained 
during follow-up tended to be internally 
consistent but (with the exception of follow-up 
GAS scores) were independent of other 
measures. That is, client global ratings and 
the three consumer satisfaction ratings tended 
to correlate significantly with each other but 
not with other outcome measures. 

With respect to the magnitude of measure- 
ments, the change in average GAS scores from 
intake (38.84) to termination (48.86) was 
highly significant, correlated t(119) = 11.02,
p < .001. This finding confirms the sensitivity
of the GAS procedure for revealing changes 
and its accuracy in approximating the ex- 
pected level of 50. In addition, the 50 clients 
in the follow-up sample had a mean termina- 
tion score of 49.54 and a follow-up score of 
51.85, which were not significantly different. 
Clients seemed to improve during therapy and
to maintain that improvement after therapy.

The magnitude of global ratings provided
an additional perspective for evaluating the 
GAS procedure. Although the mean ratings 
of raters (4.06) and therapists (4.39) both 
indicated moderate improvement at the end 
of treatment (i.e., ratings of 3 = “unchanged” 
and 6 = “considerably improved”), therapists’
ratings were significantly higher, correlated
t(116) = 3.84, p < .005. Since there was
high agreement between rater global ratings 
and termination GAS scores, the difference in 
mean ratings between raters and therapists 
suggests that therapist global ratings over- 
estimate improvement (cf. Garfield et al., 
1971). 


Discussion 


The present study evaluated the application 
of GAS to client case records as a measure of 
treatment effectiveness and examined its 
correspondence to other measures of outcome. 
The most important findings were that GAS 
scores derived from case records by inde- 
pendent raters at the termination of treatment 
converged significantly with therapist ratings 
of global improvement and GAS scores ob- 
tained from client reports at follow-up. Also, 
a comparison of GAS levels at intake and 
termination, together with the scores derived 
at termination and follow-up, indicated that 
the GAS procedure is sensitive to changes 
produced by treatment and is accurate in 
predicting expected outcome. In fact, results 
suggested that the GAS procedure is more 
accurate in measuring outcome (cf. Garfield
et al., 1971; Luborsky et al., 1971). An im- 
portant question, however, is whether raters 
can assess client problems from the case record. 
The significant correlation between GAS scores 
determined from the case record and GAS 
scores determined from a client interview
seems to mitigate this concern. Additionally, 
only 1 client of the 50 interviewed reported 
that the problem areas being discussed were 
not those that brought the client in for treat- 
ment. Thus, it seems that independent raters 
can accurately assess problems and outcome 
from the case record.

Although client global improvement ratings 
appear to have face validity (i.e., they should
reflect the client’s experience with treatment),
and sometimes correlate with other outcome
measures (Cartwright, 1975), client global 
ratings in the present study were unrelated 
to rater or therapist global ratings or GAS 
scores determined from case records. On the
other hand, client global ratings were strongly 
related to whether the client would return 
for treatment, would recommend the center 
to others, and had recommended the center to 
others. Perhaps, when questioned about im- 
provements during treatment, client reports 
largely reflect satisfaction with services rather 
than gains made. 

Finally, it can be noted that the GAS 
procedure used in this study circumvented 
many problems frequently encountered in 
therapy outcome evaluation (cf. Ellsworth, 
1975; Twain, 1975; Weiss, 1972), and provided 
a large amount of data at relatively low cost 
(cf. Fiske et al., 1970). In addition, dissemination
of the generally positive results of this
study has reduced staff resistance to evaluation
and prompted center plans for further
investigation, thereby carrying the intent of 
evaluation to utilization within the program 
evaluated (cf. Davis & Salasin, 1975).


University of California, Los Angeles


This study compared the separate effects of three procedures for the reduction
of high blood pressure (BP) in three treatment groups of eight patients each
with medically verified borderline hypertension: (a) Biofeedback for simultaneous
reductions in systolic BP and heart rate was aimed directly at reductions
in BP. (b) Biofeedback for reductions in integrated forearm and frontalis
muscle electromyographic activity was aimed at general muscular relaxation.
(c) Meditation relaxation based on the “relaxation response” procedure developed
by Herbert Benson was aimed at total bodily and “mental” relaxation.
Each patient was studied in two baseline sessions, eight training sessions, and
a 6-week follow-up. Half of the sample returned for a 1-year follow-up. Analysis
of variance of the three treatment groups over eight training sessions, 20
trials per session, revealed significant effects for trials within sessions. However,
there were no significant main effects or interactions related to differences between
the treatment conditions or to changes in BP over the course of training
sessions. Although all groups showed moderate reductions in BP as compared
to initial values, no technique could be seen to produce a reduction in pressure
greater than that observed in the baseline sessions. BPs of patients reporting for
the 1-year follow-up were not different from pretreatment baseline levels.


Essential hypertension has long been of interest
to those concerned with behavioral factors
in disease. The disorder occurs in about
10% of the population and is known to be
affected by behavioral therapies (Shapiro,
Mainardi, & Surwit, 1977). It remains unclear,
however, as to which behavioral approaches
can add most substantially to treatment programs
for the disorder. The main purpose of
this study was to compare the efficacy of three
behavioral methods in the reduction of blood
pressure (BP) in patients with essential hypertension.

Of the many behavioral techniques that have
been used for treatment of hypertension, none
has attracted as much recent interest as biofeedback
training for the control of BP.
Shapiro, Tursky, Gershon, and Stern (1969)
developed a constant cuff method to detect
relative changes in BP occurring at each beat
of the heart and to provide information to subjects
about these changes. With this method,
small but reliable relative increases and decreases
in either systolic or diastolic BP
were obtained in normal subjects (Shapiro,
Schwartz, & Tursky, 1972; Shapiro et al., 1969).
Larger decreases in BP were found when feedback
was provided for concomitant reductions
in both heart rate (HR) and BP (Schwartz,
1972).

The constant cuff method of BP feedback 
was tested clinically in a series of case studies 
(Benson, Shapiro, Tursky, & Schwartz, 1971). 
Five patients with essential hypertension were 
trained to decrease systolic BP and showed re- 
ductions of 34, 29, 16, 16, and 17 mm of Hg 
after 33, 22, 34, 31, and 12 sessions of training, 
respectively. Using essentially the same proce- 
dure, Goldman, Kleinman, Snow, Bidus, and 
Korol (1975) reported decreases of 4% and 
13% in systolic and diastolic BP, respectively, 
in seven patients with average baseline values 
of 167/109 mm of Hg. Kristt and Engel (1975) 
also reported success using the constant cuff 
feedback method. Patients suffering from essential
hypertension and having a variety of
cardiovascular complications were taught both
to raise and to lower systolic BP and were able
to achieve reductions in pressure averaging
10%-15% of their pretreatment baseline
values.

Clinical data are less convincing for reductions
in diastolic BP. Summarizing unpublished
research, Miller (1975) noted that although
some patients appeared capable of lowering
diastolic BP, their pressure eventually drifted
up over time. Negative results were also reported
by Schwartz and Shapiro (1973). In
this study, the constant cuff method was used
to provide feedback for changes in diastolic
BP. Only one out of seven patients studied
showed a progressive reduction in diastolic
BP; the others showed no change. However,
Elder, Ruiz, Deabler, and Dillenkoffer (1973)
reported using feedback and verbal praise to
achieve a 20%-30% reduction in diastolic BP
in patients who were not on antihypertensive
medication. In addition, several studies previously
discussed reported significant reductions
in diastolic BP for patients who were actually
trained to lower systolic pressure. Since morbidity
has been shown to be related to elevations
in systolic as well as in diastolic BP (Kannel,
Gordon, & Schwartz, 1971), it may not be
of great clinical consequence that diastolic
pressure is more difficult to control with biofeedback
methods.

Other biofeedback methods have been reported
to be effective in the treatment of essential
hypertension. Moeller (1973) demonstrated
that training patients to reduce
frontalis electromyographic (EMG) activity 
through feedback training led to mean systolic 
and diastolic BP reductions of 13% in 36 essen- 
tial hypertensive patients following 16 weeks of 
training. Patel (1973, 1975) used a combination 
of yoga and electrodermal (galvanic skin re- 
sponse; GSR) feedback to lower BP in hyper- 
tensive patients. In a well-controlled investi- 
gation, Patel and North (1975) randomly as- 
signed 34 hypertensive patients to one of two 
treatment conditions. Patients in the first con- 
dition were initially taught relaxation with 
yoga and were then given GSR and EMG feed- 
back in an effort to lower levels of autonomic 
and skeletal muscle activity. Patients in the 
second condition attended the same number of 
sessions (12), but they were told simply to re- 
cline on a lounge chair with no other specific 
instructions. Both groups showed reductions
in BP, but the group that received yoga and 
biofeedback showed significantly greater re- 
ductions (from 168/100 to 141/88 mm of Hg) 
than the group that was simply instructed to 
rest (169/101 to 160/96 mm of Hg). Patients in 
the simple resting condition were then given 
12 weeks of yoga plus biofeedback training and 
managed to lower their pressure to the level 
achieved by the first treatment group. 
Another behavioral method reported to be 
effective in treating essential hypertension 
combines meditation and relaxation. Benson, 
Rosner, Marzetta, and Klemchuk (1974a, 
1974b), using procedures derived from transcendental
meditation (Maharishi Mahesh
Yogi, 1966), demonstrated BP reductions in
22 untreated borderline and in 14 pharmacolog- 
ically treated hypertensive patients. During 
the 6-week pretreatment baseline period, the 
borderline hypertension group showed an aver- 
age systolic pressure of 147 mm of Hg and a 
diastolic pressure of 95 mm of Hg. During 
the 25 weeks of regular meditation relaxation, 
systolic BP was reduced by 8 mm of Hg and 
diastolic BP by 4 mm of Hg. The group being 
treated pharmacologically displayed an average
systolic and diastolic BP of 146 mm of Hg
and 92 mm of Hg, respectively, during a 5- to
6-week baseline period. The reduction
for this group during 20 weeks of regular meditation
relaxation averaged 11 mm of Hg systolic
and 5 mm of Hg diastolic. Using similar
procedures, Stone and DeLeo (1974) recently 
compared meditation relaxation training to a 
no-treatment control in borderline hyperten- 
sive patients exhibiting pressures similar to 
those in the Benson et al. (1974a) study. Medi- 
tation relaxation was shown to produce small 
but significant pressure reductions (5-10 mm 
of Hg) as well as significant reductions in 
plasma dopamine-beta-hydroxylase activity 
and in furosemide-stimulated renin activity. 
On the basis of these data, Stone and DeLeo 
concluded that a reduction of peripheral ad- 
renergic activity may be associated with the 
reduction of BP observed in the practice of 
meditation relaxation. 

It seems clear that behavioral methods can 
produce reductions of BP. It is not clear, how- 
ever, which methods are more effective than 
others and whether treatment effects are main- 
tained consistently over time. Few follow-up 
data have been reported. Typically, various 
treatment approaches are combined, making it 
difficult to separate out critical variables and 
nonspecific or placebo effects. However, there 
is no easy solution to the problem of appropri- 
ate control procedures in evaluating behavioral 
therapies. The approach taken in this study 
was to compare several behavioral techniques. 
It was assumed that placebo effects would not 
differ across treatments, in that all treatments 
were “active” and could be expected to pro- 
duce some change. Such a design allows a con- 
trolled comparison of efficacy and subject 
compliance, if subjects are randomly assigned 
to groups and treated over equal periods of 
time. 

The present investigation compared three 
behavioral procedures in patients with es- 
sential hypertension: BP—biofeedback for 
reductions in BP and HR, EMG—biofeedback 
for simultaneous reductions in both frontalis 
and forearm muscle tension, and relax—a 
meditation relaxation practice. The rationale 
for the BP procedure was that BP biofeedback 
would facilitate direct control of reductions in 
BP and that simultaneous feedback for reduc- 
tion in HR in this condition would serve to 
maximize the degree of BP reduction possible
(Schwartz, 1972). The rationales for the other
two conditions were also derived empirically.
The EMG condition assumes that excessive
muscle activity is a critical factor in regulating
high BP levels and that reduction in muscle
activity facilitates pressure reductions.

By using the summed muscle activity of two 
muscle sites (frontalis and forearm) in the bio- 
feedback procedure, it was assumed that total 
muscular relaxation would be facilitated. Further
empirical support for muscle relaxation in
essential hypertension has been reported by
Shoemaker and Tasto (1975) with the use of a
modified form of progressive relaxation. Moreover,
it was assumed that proprioceptive
feedback of changes in muscle activity
would aid in the development of learned con- 
trol and maintenance of reduced muscle activ- 
ity outside of the training situation. The relax 
procedure was assumed to maximize complete 
mental and physical relaxation. As hypothesized
by Benson (1975), this method of relaxation
produces a “hypometabolic state” that is
presumed to represent a hypothalamic response 
that is antithetical to the “fight-flight” re- 
sponse and is consistent with a state of de- 
creased sympathetic nervous system activity. 
According to Benson, the basic components of 
this form of relaxation are a mental device or
mantra, a passive attitude, regular deep breathing,
decreased muscle activity, and regular
practice. 

Follow-up evaluations of the treatment effects
were made at 6 weeks and at 1 year after
completion of the program. 


Method 
Subjects 


Male and female volunteer subjects with hypertension
were solicited through newspaper advertisements
and physician referrals. After a preliminary
screening, potential subjects received (a) a letter of
acknowledgement and an explanation of the study, (b) a
confidential medical questionnaire focusing on history
relevant to hypertension, and (c) a form of
authorization for the volunteer’s physician to release
medical information. Subsequently, a description of the
intent of the study was mailed to the volunteer’s physician
along with the authorized request for medical
history including all available BP readings, summary of
a physician’s examination within the past year, and
clinical laboratory data. Volunteers had to have their
own physician agree to their participation in the study.
The investigators did not assume the role of primary
care physicians.

Half of the volunteers considered for the study were
taking antihypertensive or psychotropic medication,

Table 1
Characteristics of Patients

BP treatment group             EMG treatment group            Relax treatment group
Age  Sex     L  M  BP*         Age  Sex     L  M  BP*         Age  Sex     L    M    BP*
52   Male    +  -  170/78      49   Male    +  +  140/90      42   Male    -    [?]  [?]
48   Male    +  -  138/90      57   Male    +  -  160/92      54   Male    [?]  +    [?]/100
42   Male    -  +  150/102     42   Male    -  -  150/95      43   Male    -    -    145/90
49   Male    -  +  158/94      49   Male    +  -  147/96      35   Male    +    +    151/91
52   Male    -  +  187/106     50   Female  -  +  158/103     27   Male    +    +    160/89
37   Female  +  -  149/103     46   Male    -  +  157/93      53   Male    +    +    153/96
37   Male    +  -  148/93      34   Male    +  +  159/87      51   Male    -    -    187/88
52   Female  -  +  159/90      59   Female  -  -  168/96      54   Female  -    -    165/95

Note. BP = blood pressure; L = lability; M = medication; EMG = electromyogram.
[?] = entry illegible in the source.
* Average prestudy blood pressure was measured in millimeters of mercury.


and the other half were not. The patients had a history
of at least 1 year of labile or sustained systolic and/or
diastolic hypertension (average systolic BP > 140,
average diastolic BP > 90). They were all less than 60
years old, and none had evident organic etiology for
their hypertension, major complications related to the
disorder, or other serious illnesses (see Table 1).
Potential subjects whose histories met these criteria
were then seen by one of the investigators for a detailed
medical and psychosocial history, physical examination,
and bilateral sitting and recumbent BP measurements.
Subjects still meeting the criteria of the study were assigned
to one of the three treatment groups. The groups
were approximately matched for age, sex, current use or
nonuse of relevant medication, type of blood pressure
elevation (systolic and/or diastolic), and lability or
stability of hypertension. BP was considered to be labile
if the subject had a previous history of normal systolic
and diastolic readings bracketed by elevated readings,
with the lability being unrelated to medication effects.
Subjects were requested to continue their current dietary
and medication practices and to inform the investigators
of any changes in their regimen during the
study.
The BP and EMG groups were initially assigned nine
members each, and the relax group, eight members.
However, one subject in the BP group dropped out of
the study, and one subject in the EMG group was excluded
because of persistent tachycardia, leaving eight
subjects in each group (Table 1).


Procedure


Subjects were scheduled to attend 10 sessions twice
a week for 5 weeks plus follow-up sessions approximately
6 weeks and 1 year after the 10th session. The
first 2 sessions, 1 hour each, were used to obtain pretreatment
baseline measurements, and the remaining 8
H hours each, were experimental treatment sessions. 
To facilitate scheduling, approximately half of the subjects
participated in either of two successive 5-week
periods. In the baseline sessions, subjects were instructed
to sit quietly and relax. The 3rd session marked
the beginning of training. Subjects were read instructions
appropriate for the experimental condition to which
they were assigned. Instructions consisted of a brief 
reminder of the concept of psychophysiological BP re- 
duction and a description of the general principle speci- 
fic for the given treatment group. Instructions for the 
feedback or relaxation methods followed. The feedback 
groups were encouraged to use whatever mental strategy 
would result in the correct feedback. For the next 2 
sessions, the instructions were paraphrased; and for the
subsequent sessions, the subject was told, “I think you
know the procedure by now. Are there any questions?” 
Subjects were seated in a semirecumbent lounge chair 
in a wood-paneled sound-attenuated chamber. The following
devices were connected to the subjects: a Biofeedback
Systems, Inc., frontalis EMG headband consisting
of three stainless-steel electrodes mounted in a
rubber strap; two Beckman biopotential electrodes 
placed over the extensor for right forearm EMG; two 
silver-plate electrodes (right arm to left leg) and one 
silver-plate ground electrode on the right arm for the 
EKG; a respiration strain-gauge belt around the diaphragm;
a crystal Korotkoff (K) sound microphone
positioned over the left brachial artery and held in place
by a BP cuff; and a Meditron electronic stethoscope
affixed distal to the K sound microphone. During follow-up
sessions, only the BP cuff and electronic stethoscope
were attached. All recording was done in an adjacent
instrument room. Two-way communication with
the subject was maintained via an open intercom. 
At the beginning of each session, with the subject 
alone in the training room, three BP measurements us- 
ing the standard Riva-Rocci method were taken re- 
motely by the experimenter in the adjoining instrument 
room. These readings were averaged to approximate the 
median systolic pressure, which was used as the initial 
constant cuff pressure setting. BP was then tracked 
continuously using the constant cuff method (Tursky, 
Shapiro, & Schwartz, 1972). All subjects were given 20 
60-sec inflations (trials) separated by 30-sec rests. At 
the end of each trial, the cuff pressure was raised 2 mm 
of Hg if a subject showed K sounds following more than 
75% of the heartbeats. Conversely, the cuff pressure 
was lowered by 2 mm of Hg for the next trial if K sounds
followed less than 25% of the heartbeats. For each
group, sessions consisted of 20 60-sec trials, with the BP
cuff inflated to median systolic pressure alternating with 
30-sec intertrial intervals with the cuff deflated. In the 
BP and EMG training sessions, feedback was given only 
during the 20 60-sec trials. The constant cuff pressure 
on each trial constituted the basic BP data for all 
subjects. 
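
The trial-to-trial cuff adjustment described above can be written as a small rule. The following is a minimal sketch, not the authors' apparatus logic: the function names are hypothetical, and leaving the pressure unchanged when K sounds follow between 25% and 75% of heartbeats is an assumption, since the text states only the two adjustment cases.

```python
def next_cuff_pressure(cuff_mm_hg, k_sound_fraction, step_mm_hg=2.0):
    """Return the constant cuff pressure for the next trial.

    k_sound_fraction is the fraction (0.0 to 1.0) of heartbeats in the
    trial just ended that were followed by a Korotkoff (K) sound.
    """
    if k_sound_fraction > 0.75:
        # K sounds on most beats: systolic pressure is above the cuff
        # pressure, so the cuff is raised by one step.
        return cuff_mm_hg + step_mm_hg
    if k_sound_fraction < 0.25:
        # K sounds on few beats: systolic pressure is below the cuff
        # pressure, so the cuff is lowered by one step.
        return cuff_mm_hg - step_mm_hg
    # Assumption: otherwise the setting is kept for the next trial.
    return cuff_mm_hg


def track_session(initial_cuff_mm_hg, k_sound_fractions):
    """Return the cuff pressure used on each trial of one session."""
    cuff = initial_cuff_mm_hg
    per_trial = []
    for fraction in k_sound_fractions:
        per_trial.append(cuff)
        cuff = next_cuff_pressure(cuff, fraction)
    return per_trial
```

Under these assumptions, the list returned by `track_session` corresponds to the per-trial constant cuff pressures that the text treats as the basic BP data.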

For the BP group, BP and HR were measured, and 
changes in both were fed back using a pattern-feedback 
method (Schwartz, Shapiro, & Tursky, 1971). Briefly, 
BP was tracked as described previously. HR was 
tracked using a Lexington Instrument Company cardio-
tachometer and a Grason-Stadler level detector. Dur-
ing the initial calibration period, median HR was de-
termined. This is the HR at which half the interbeat
intervals are faster and half are slower than that rate for
a given series of heartbeats. The level detector was set at
the median HR for the first trial and reset from trial to 
trial, depending on the number of interbeat intervals 
above and below that rate in the trial. The level was 
raised by 2 beats per minute (bpm) if 75% or more of 
the interbeat intervals were faster and lowered by 2 bpm 
if 25% or less of the intervals were faster than the 
median HR for the previous trial. In the feedback pro- 
cedure, reductions of 2 bpm or more from the subject’s 
median HR triggered the detector. During training, 
subjects in the BP group received feedback whenever 
HR dropped below the median level accompanied by a 
simultaneous reduction of BP (absence of K sound 
within 400 msec following the R wave). Feedback con- 
sisted of an auditory signal and visual feedback (cumu- 
lative digital meter). Subjects were instructed as 
follows: 


During periods when the cuff is inflated, feedback in 
the form of a tone and increment on the meter will be 
provided whenever the equipment detects a decrease 
in your BP. If your BP goes up, or stays the same, no 
feedback will be given. Your job is to get as much
feedback as possible (a high meter count) on each
trial.
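
The pattern-feedback contingency for the BP group combines two conditions, which can be sketched as follows. This is an illustrative reading of the description above, not the original circuitry: the function and parameter names are hypothetical, and representing a missing K sound as `None` is an assumption made for the example.

```python
def next_hr_level(level_bpm, interbeat_intervals_sec, step_bpm=2.0):
    """Reset the HR level detector for the next trial.

    The level is raised if 75% or more of the interbeat intervals in the
    previous trial were faster than the current level, and lowered if
    25% or fewer were faster.
    """
    rates = [60.0 / ibi for ibi in interbeat_intervals_sec]
    faster = sum(1 for rate in rates if rate > level_bpm) / len(rates)
    if faster >= 0.75:
        return level_bpm + step_bpm
    if faster <= 0.25:
        return level_bpm - step_bpm
    return level_bpm


def feedback_on_beat(beat_rate_bpm, median_bpm, k_sound_latency_msec,
                     hr_drop_bpm=2.0, window_msec=400.0):
    """Return True if this heartbeat earns feedback (tone plus meter count).

    k_sound_latency_msec is the delay from the R wave to the K sound,
    or None if no K sound was detected (assumed encoding).
    """
    # HR criterion: a reduction of 2 bpm or more from the median HR.
    hr_lowered = beat_rate_bpm <= median_bpm - hr_drop_bpm
    # BP criterion: no K sound within 400 msec following the R wave.
    bp_lowered = (k_sound_latency_msec is None
                  or k_sound_latency_msec > window_msec)
    return hr_lowered and bp_lowered
```

Feedback is given only when both criteria hold on the same beat, which is what makes this a pattern rather than a single-response contingency.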


The EMG group heard a change in click signal fre- 
quency and observed the deflection of an analogue 
meter as average EMG activity varied during each 60- 
sec trial. The right forearm extensor and frontalis EMG 
readings were integrated by two Beckman EMG cou- 
plers. The combined average of these signals was used 
to drive a voltage-controlled oscillator, which produced 
auditory clicks directly proportional to integrated EMG 
activity. The integrated average of the two EMG sig- 
nals also drove a large analogue meter placed directly in 
front of the subject. The task of the EMG group was to 
lower EMG activity using both auditory and visual 
feedback. Briefly, instructions were as follows: 


Your task is to reduce your blood pressure by relax-
ing your muscles. During periods when the cuff is
[...] frequency. Conversely, if you tense your muscles, the
meter will move toward the right, and the clicks
will increase in frequency. Your job is to decrease
the click rate and keep the meter over to the left as
much as possible.


The relax group was asked to meditate following the 
relaxation response method of Benson (1975) during the 
20 trials of BP measurement. Briefly, instructions were 
as follows: 


During the session, the BP cuff will be inflated from
time to time for 1-minute periods. No other stimuli
will be provided. When you are told to start, close
your eyes and relax your muscles. Breathe through
your nose. As you breathe out, say the word one
silently to yourself. For example, breathe in...out,
one, in...out, one, etc. Breathe easily and naturally.
Continue until you are told to stop. Do not worry
whether you are successful in achieving a deep level
of relaxation. Maintain a passive attitude. If dis-
tracting thoughts occur, try to ignore them by not
dwelling upon them and return to repeating one.


At the end of each session, subjects in each group were
shown a cumulative graph of their total progress in
lowering BP.

After the first training session, subjects were in- 
structed to practice lowering BP as often as possible 
between sessions using whatever strategy they had ar- 
rived at during the laboratory sessions but without the 
aid of a sphygmomanometer or other instrumentation. 
Subjects were requested to record the nature and [...]
of this practice, and the record sheets were collected
before beginning each of the remaining seven training
sessions. Generally subjects reported no significant
changes during the course of training in their compliance
with home practice. Nor were there any reported sig-
nificant changes in medication during the course of
training.

After the eighth training session, subjects were asked
to return to the laboratory for a 6-week and a 1-year
follow-up session. All subjects returned for the 6-week
follow-up, but only 13 returned for the 1-year follow-up
(7 BP, 3 EMG, 3 relax). During both follow-up sessions,
a BP cuff and an electronic stethoscope were attached
to the left arm. The subjects were then left alone in the
laboratory, and they were asked to practice the method
for lowering pressure that they had learned during
training. No feedback was provided for any subject
during these sessions.

Two female experimenters ran equal numbers of sub-
jects in all three groups. Each subject was run by the
same experimenter throughout the experiment.


Results 
Pretreatment Baseline Data 


To determine if the three treatment groups
differed in average BP prior to the beginning of
training, an analysis was made of five pretreatment
pressures: (a) average BP obtained in the
prior medical history, (b) average BP obtained
in the physician’s examination, (c) average BP
at the beginning of Baseline Session 1, (d)
average BP at the beginning of Baseline Session 2,
and (e) average BP at the beginning of
Training Session 1. A two-way analysis of variance
was carried out for the three treatment
groups and five repeated measures of BP in
each group. BMD 08V and BMD P2V programs
were used in this analysis and in subsequent
statistical analyses (University of California,
Los Angeles, Health Sciences Computing
Facility). Results of the analysis of variance
showed that baseline differences between treatment
groups were not significant for either
systolic or diastolic pressure. The three groups
had comparable BP values prior to the beginning
of active treatment. However, pressure
varied considerably under the different conditions
of observation. Pressures were highest
for the physician examination and previous
medical history and lowest for the two baseline
sessions and the initial training session (Figure
1). A reduction of 5 mm of Hg in systolic pressure
and a slight decrease (1 mm of Hg) in diastolic
pressure were observed from Baseline
Session 1 to Training Session 1. Differences
among the five baseline measurements were
significant for both systolic and diastolic
pressure, F(4, 84) = 36.79, p < .01; F(4, 84)
= [...]35, p < .01. Systolic blood pressure
averaged over all 24 patients differed by 17 mm
of Hg between medical history value (156 mm
of Hg) and Training Session 1 (139 mm of Hg);
diastolic blood pressure was lower by 11 mm of
Hg (94 mm of Hg and 83 mm of Hg, respectively).
Reductions of still larger magnitudes
(26/15 mm of Hg) were found when the project
physician’s recordings were compared to the
first training session pressures. These reductions
were much larger than any observed during
the course of the actual treatment sessions.
These data indicate the potency of environmental
influences on BP. They also point out
the issue of proper baseline determinations and
habituation as factors to be accounted for in
evaluating the effects of treatment programs of
this kind.


Figure 1. Mean blood pressure (measured in millimeters of mercury; mm Hg) collapsed over treatment
groups over the course of the study. (For Follow-up II, n = 13. Med. = medical;
Exp. = experimental.)


Treatment Data


Average systolic pressures for each training
session are shown in Figure 2. Also shown are
the average systolic pressures for the baseline
days and Follow-up Day 1. The graph suggests
that the relax group decreased in pressure and
the other two groups increased in pressure,
comparing the baseline days and Training Day
1. Several preliminary analyses of variance
were computed to examine this apparent differential
trend. First, an analysis of variance
was carried out for the three treatment groups,
comparing Baseline Session 2 and Training
Session 1. Neither the interaction between
treatment and session nor the session difference
was significant. Second, the decrease in systolic
BP for the relax group by itself was not
statistically significant. Third, an analysis of
covariance for the values during the 8 training
days adjusting for differences in Baseline Day
2 did not yield a significant treatment effect or
a significant Treatment × Session interaction.
Therefore, the treatment data were analyzed
without adjusting for baseline values, which
were not significantly different between treatment
conditions.


Figure 2. Mean blood pressure (measured in millimeters of mercury; mm Hg) per session in the three
treatment groups. (EMG = electromyogram.)

The primary treatment data consisted of the 
systolic pressures obtained during eight ses- 
sions of training, 20 values per session, with 
each value being the median systolic pressure 
derived from the constant cuff method for each 
60-sec trial period. An analysis of variance was 
carried out for the three treatment groups with 
repeated measures for eight sessions and 20 
trials within each session. A significant main
effect was obtained for trials, F(19, 399) 
= 15.72, p < .01. On the average, pooling 
sessions and groups, the initial value in a ses- 
sion was 140.2 mm of Hg, and the 20th value 
was 136.8 mm of Hg. No other main effects or 
interactions were statistically significant. 


Although the overall analysis of variance of the
training data indicated no significant differences
between groups, inspection of the data suggested
that each treatment group showed a
somewhat different pattern of apparent learning
or habituation over training sessions. For purposes
of description only, an analysis of variance
for trends was used to examine changes
over the course of treatment in each group separately.
Only linear and quadratic trends were
considered. Of the three treatment groups,
only BP showed a significant effect for
sessions, F(7, 49) = 2.52, p < .03. A quadratic
effect for sessions was also significant in
this group, F(1, 7) = 24.87, p < .0[...]. In accord
with results presented earlier, each group
showed a highly significant main effect for
trials. Significant linear trends for trials occurred
in EMG and relax, considering all sessions
or the first six sessions only. The linear
trial trend was not significant for the BP group;
however, the quadratic trend for trials was significant
in this group for all sessions.

Table 2 gives the average first and last pressure
value over Baseline, Training, and Follow-up
1 days. As can be seen in the table, reductions
in pressure over the 20 trials of a
session were larger in baseline than in
training sessions. An analysis of variance comparing
Baseline Session 2 with Training Session
6 (a session showing a good effect) supports
this conclusion (p < .01). In Baseline 2, systolic
pressure started at 141 mm of Hg in
Trial 1 and ended at 134 mm of Hg in Trial 20.
Comparable values for Session 6 were 138 mm
of Hg and 133 mm of Hg.

Table 2 also reveals the other patterns discussed
above. The relax data appeared remarkably
uniform from session to session, with
Trial 1 values varying from 137 mm of Hg to
140 mm of Hg to 139 mm of Hg over sessions
and with reductions in pressure over trials
varying from 1 mm of Hg to 6 mm of Hg. The
BP group showed little decrease over trials,
with the exception of Session 6, and the previously
noted trend toward reduction over Sessions
1-6, from 143 mm of Hg to 134 mm of Hg
for Trial 1 and from 142 mm of Hg to 128 mm
of Hg for Trial 20. Findings in the EMG group
were less consistent. In this group, Trial 1
values varied up and down, and reductions
during sessions varied from 1 mm of Hg to 7 mm
of Hg.


Table 2
Systolic Blood Pressure: First and Last Trial Values

                          Treatment group
Session          BP         EMG        Relax      All groups
Baseline 1       141-136    142-134    142-142    142-138
Baseline 2       138-134    140-132    145-136    141-134
Training 1       143-142    142-140    137-136    141-139
Training 2       140-139    141-137    138-134    140-137
Training 3       138-137    145-140    138-133    140-137
Training 4       137-137    138-137    140-138    138-137
Training 5       137-136    141-134    139-134    139-135
Training 6       134-128    141-140    138-132    138-133
Training 7       143-142    143-140    139-133    142-138
Training 8       143-139    146-143    139-133    143-138
Follow-up 1      139-135    136-130    137-134    138-133

Note. BP = blood pressure, measured in millimeters of mercury;
EMG = electromyogram. n = 8 for each group.


Finally, the treatment session data were reanalyzed
for the effects of individual differences
in lability and medication. Two separate
analyses of variance were carried out, dividing
each treatment group into two equal subgroups
on the basis of lability and medication (Table
1). In the lability analysis of variance, a significant
three-way interaction occurred for
Session × Trials × Lability, F(133, 2394)
= [...], p < .02. Although this interaction is
difficult to interpret, it appears from the data
that labile patients showed more variable, in- 
cluding occasional larger, reductions in pres- 
sure over trials and over sessions than did 
nonlabile patients. These differences were not 
related to treatment condition. As for medica-
tion, three significant interactions were ob-
tained: (a) Trials × Medication, F(19, 342)
= 1.83, p < .02. On the average, for all treat-
ments, medicated patients showed a slightly
greater reduction (about 2 mm of Hg) in sys-
tolic pressure from Trial 1 to Trial 20 than did
nonmedicated patients. (b) Session × Treat-
ment × Medication, F(14, 126) = 1.92, p
< .03. Patterns of change over sessions differed
for the three groups according to medication,
but no clear interpretation can be arrived at.
(c) Session × Trials × Treatment × Medica-
tion, F(266, 2394) = 1.38, p < .01. A four-way
interaction such as this can only be speculated 
about. Examining the first and last trials for 
each treatment group broken down according 
to medication, we find that in general non- 
medicated patients in the BP group showed 
greater within-session reductions in pressure 
and increases in these reductions over sessions 
than medicated patients. In EMG, medicated 
patients seemed to do consistently better over 
sessions in terms of pressure reduction. In re- 
lax, the trends were about the same in both 
medicated and nonmedicated patients. 

Associated Changes During Treatment 


Data on HR in beats per minute were available
on each trial during the training sessions.
An analysis of variance comparable to the one
carried out on systolic pressure was done. No
significant effects related to treatment group
were found. As in the case of systolic blood
pressure, a significant main effect for trials was
observed, F(19, 399) = 115.33, p < .001. On
the average, there was a 3-bpm decrease in HR
during sessions from Trial 1 to Trial 20. Separate
analyses of each group did not reveal any
significant trends over sessions for any group.

Integrated EMG was recorded from both
frontalis and right forearm muscles during all
baseline and training sessions. An analysis of
the change in EMG from Trial 1 to Trial 20
was carried out on Training Session 6 data. Although
mean systolic BP of the three groups
varied on that day (Figure 2), the change in
systolic BP from Trial 1 to Trial 20 was not
significant.

Table 3 gives the average systolic BP change,
average forearm extensor EMG change, and
average frontalis EMG change from Trials 1
to 20 on Day 6. No significant differences between
the groups were found in forearm EMG
activity. A Link-Wallace shortcut analysis of
variance (Mosteller & Bush, 1954) revealed
a significant difference for frontalis EMG activity
between the BP and relax groups (p < .05).
Differences in frontalis EMG activity between
BP and EMG and EMG and relax were not
significant. These data suggest that comparable
reductions in BP can be achieved with differing
skeletal muscular responses.


Table 3
Physiological Changes on Training Day 6

                            Treatment group
Measure                 BP        EMG       Relax
BP                      -6.0      -1.0      -6.0
Forearm EMG (µV)        -4.4      -4.4      -10.6
Frontalis EMG (µV)      +4.4      -12.2     -23.1

Note. BP = blood pressure, measured in millimeters
of mercury; EMG = electromyogram.
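
The within-session reductions discussed in connection with Table 2 follow directly from the first and last trial values. The sketch below is only an illustration: the numbers are the training-session entries of Table 2 as they appear in this copy, and the variable and function names are hypothetical. It also shows that the Session 6 reductions agree with the BP row of Table 3.

```python
# (Trial 1, Trial 20) systolic values in mm Hg for Training Sessions 1-8,
# transcribed from Table 2.
TRAINING = {
    "BP":    [(143, 142), (140, 139), (138, 137), (137, 137),
              (137, 136), (134, 128), (143, 142), (143, 139)],
    "EMG":   [(142, 140), (141, 137), (145, 140), (138, 137),
              (141, 134), (141, 140), (143, 140), (146, 143)],
    "Relax": [(137, 136), (138, 134), (138, 133), (140, 138),
              (139, 134), (138, 132), (139, 133), (139, 133)],
}


def reductions(pairs):
    """Within-session reduction (first trial minus last trial) per session."""
    return [first - last for first, last in pairs]


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


# Reduction in Training Session 6 (index 5) for each group.
session_6 = {group: reductions(pairs)[5] for group, pairs in TRAINING.items()}

# Average within-session reduction over the eight training sessions.
average_reduction = {group: mean(reductions(pairs))
                     for group, pairs in TRAINING.items()}
```

With these values, `session_6` gives 6, 1, and 6 mm of Hg for the BP, EMG, and relax groups, matching the changes of -6.0, -1.0, and -6.0 in Table 3.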


Follow-up Data 


Turning to the 6-week follow-up data, no
significant differences were obtained between
groups. Average systolic and diastolic values
were quite comparable (Figure 2), and the first
and last trial values were also very similar
(Table 2) from group to group. Comparing
these values to the medical history or to initial
value obtained by the physician, the reductions
in pressure for all subjects were 18/8 mm of Hg 
and 27/17 mm of Hg, respectively. However, 
comparing these same follow-up values to the 
initial values obtained at the beginning of the 
first training session (actual third pretraining 
visit of patients to the laboratory), the reduc- 
tions were quite small—6 mm of Hg systolic 
and 3 mm of Hg diastolic. Whatever benefit 
there may have been during the training itself, 
it appears to be of a small magnitude. It is
clear that significant reductions in pressure had 
already occurred prior to the actual inception 
of treatment, probably as a result of habitua- 
tion to the laboratory situation (see Figure 1). 
The apparent reduction in pressure for the re- 
lax group at the time of the 6-week follow-up 
as compared with the baseline days (see Figure 
2) was not statistically significant. 

In the 1-year follow-up, only about half of 
the total sample of patients returned for evalu-
ation: 7 BP, 3 EMG, and 3 relax. At this time,
the initial in-laboratory average pressures for
the three groups were, respectively, 138/88 mm
of Hg, 148/92 mm of Hg, and 138/91 mm of
Hg. Given the small samples, it is not possible
to determine whether the groups differed sig-
nificantly at this time. The average pressure
for the 13 patients was 141/90 mm of Hg,
about 3/3 mm of Hg higher than the level ob- 
tained at the 6-week follow-up and at the 
beginning of Training Session 1 (Figure 1). 


Discussion 


The purpose of this study was to compare the
effectiveness of three different, purely behavioral,
means of lowering blood pressure of patients
with essential hypertension. It was expected
on the basis of previous studies that all
procedures would be effective to some degree.
This was not the case. There was little evidence,
either during the course of the treatment sessions
or in the follow-up evaluations, of significant
reductions in pressure. Considering the
initial pressure values obtained immediately
prior to the beginning of training as a baseline,
the average reduction for all 24 patients at the
time of the 6-week follow-up was 1 mm of Hg
systolic and 2 mm of Hg diastolic. On the basis
of data available from about half the sample at
the time of the 1-year follow-up, pressures were
back to pretreatment levels. No one treatment
procedure was obviously better than any other.
Within sessions, they all resulted in significant
reduction of pressure of about 4 mm of Hg systolic
on the average.

For each of the methods used in this study, 
several earlier studies have been cited that re- 
ported average reductions in pressure ranging 
from 5% to 15% of pretraining baseline values. 
The failure to obtain even small reductions in 
the present study is therefore quite puzzling. 
Some speculations can be offered as the reasons 
for this failure and the apparent inconsistency 
between previous and present findings. Con- 
sider the BP group findings. In this group, 
there was only a small reduction in systolic 
pressure over the 20 trials within any one ses- 
sion, typically 1 mm of Hg. Benson et al. (1971) 
reported average within-session reductions of 
4.8 mm of Hg in seven patients, and this value 
is in accord with the change observed in norm-
otensive subjects (Shapiro et al., 1969). Both
of the latter studies contained simple systolic
pressure feedback. Since we know that patients
can respond positively to systolic feedback by
itself, it may be that the additional task of
training to reduce HR at the same time made
it more difficult for these patients to reduce BP.
The average HR in these BP patients was not
elevated, possibly further adding to the diffi-
culty. Previous studies involving behavioral
treatment of hypertension through direct BP
biofeedback used patients with sustained hy-
pertension whose pressures were usually higher
than subjects studied here. Subjects in the
Benson et al. (1971) study showed average pre-
training systolic pressures of 165 mm of Hg,
and Goldman et al. (1975) reported a sample
average of 167 mm of Hg, whereas our subjects
showed average pretraining pressures of 141
mm of Hg during the first three laboratory ex-
posures prior to training. It is possible that the
effectiveness of behavioral treatment follows
the law of initial values (Wilder, 1957), and
that only patients with relatively elevated pres-
sures will show substantial benefit from such
treatment. Furthermore, although the average
pressures for our sample of patients were rela-
tively high as recorded in various medical ex-
aminations prior to treatment, the values were
considerably reduced in the quiet, relatively
nondemanding laboratory conditions. The use-
fulness of biofeedback training or of other
methods of behavioral control under conditions
in which the symptom is not shown in full
force may be questionable.
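
To make the size of the discrepancy concrete, the 5% to 15% reductions reported in earlier studies can be converted to millimeters of mercury at the pretraining systolic level of 141 mm of Hg reported above. This is a rough illustrative calculation only; the function name is hypothetical, and applying the percentages to this sample's baseline is an assumption made for the sake of comparison.

```python
def expected_reduction_range(baseline_mm_hg, low_fraction=0.05,
                             high_fraction=0.15):
    """Return the (low, high) expected reduction in mm Hg for a baseline."""
    return (baseline_mm_hg * low_fraction, baseline_mm_hg * high_fraction)


# Pretraining systolic average in the present sample (mm Hg).
low, high = expected_reduction_range(141.0)

# Systolic reduction observed at the 6-week follow-up (mm Hg),
# as stated in the first paragraph of the Discussion.
observed = 1.0
shortfall = low - observed
```

On this reading, earlier reports would lead one to expect a reduction of roughly 7 to 21 mm of Hg systolic, against the 1 mm of Hg observed at the 6-week follow-up.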

Another difference between this study and 
previous reports is that the present investiga- 
tion gave all subjects a rather limited amount 
of training and attention. Only 8 training ses- 
sions were used in the present study as com- 
pared to up to 34 sessions in the Benson et al. 
(1971) study or 42 sessions in the Kristt and 
Engel (1975) study. It is of interest, however,
that the one subject studied by Benson et al. 
(1971) who had both similar baseline pressures 
and number of training sessions showed a re- 
duction in pressure similar to that of subjects 
in our study. Also, in the Goldman et al. (1975) 
study in which only nine treatment sessions 
were used, systolic reductions of only 4% were 
observed although diastolic changes were much 
larger. 

As for the EMG group, we asked these pa- 
tients to lower muscle activity in two sites, 
forearm and forehead. This double requirement 
may have contributed to the apparent diffi- 
culty in achieving total muscular relaxation. 
In any case, the EMG group data are quite 
variable and inconsistent with earlier reports of 
success. We could speculate that some other 
more potent ingredient such as the suggestion 
to relax in the treatment process or an expect- 
ancy of success was present in the other studies 
and missing in ours. 

In the relax group, the pressure data were 
remarkably consistent from day-to-day treat- 
ment and relatively consistent within sessions. 
It seems that the relax group involves the least 
task or problem-solving orientation of the three 
procedures. This is reflected in the apparent 
immediate reduction in pressure at the very 
beginning of Training Session 1 and the con-
sistent 4-5 mm of Hg reduction in pressure
from beginning to end of each session. The lack
of pressure reactivity reflects the passive, re-
laxed stance involved in the procedure, and the
data attest to the ability of patients to comply
with the instructions that seem to enhance
their ability to relax and to show rapid habitu-
ation to the treatment situation. If there is
any lasting therapeutic benefit to such a pro-
cess, it was not revealed in any session-by-ses-
sion trend or in the follow-up data. Because of 
its inherent simplicity, however, the meditation 
relaxation procedure could be seen to have an 
advantage over other methods. 


These results, in light of the interpretations 
and possible explanations offered here, suggest 
possible refinements in future research. First, 
we need more systematically to segregate labile 
from fixed patients, to account more thoroughly 
for the effects of medication, and to make a 
more thorough assessment of cardiovascular 
status to determine the suitability of feedback 
of different pressure (diastolic, systolic) or 
cardiac (heart rate, cardiac output) indices, 
whether singly or in patterns. Second, the arbi- 
trary use of a fixed number of training sessions 
may not be adequate, and more than eight ses- 
sions may be required. Third, the design of 
treatment procedures must carefully consider 
the degree to which task involvement should 
play a part in the process. For some patients,
simple passive, nontask conditions, such as
meditation relaxation, may be appropriate. In
others for whom abnormal reactivity to stress- 
ful situations or other eliciting events is a fac- 
tor, the learning process could well incorporate 
a task, such as in biofeedback for pressure re- 
ductions, or a task involving alterations of 
physiological response in coping with stressful 
stimuli (Sirota, Schwartz, & Shapiro, 1974). 

Finally, rather large differences were ob- 
served between pressure levels obtained in the 
medical history and in our physician’s exami- 
nation as compared with the levels obtained in 
quiet laboratory conditions, particularly after 
a few baseline nontask recording sessions. 
These differences range up to 20 mm of Hg sys- 
tolic and 15 mm of Hg diastolic. They attest to 
the potency of real-life variables (e.g., doctor’s 
examination) versus the lack of potency of non- 
stimulating laboratory conditions. Such vari- 
ability has been noted repeatedly in the litera- 
ture (Julius & Schork, 1971). Engel (Note 1) 
has suggested that the physician effect or habi-
tuation to the physician and medical situation
may account in part for the placebo effect 
noted in hypertensive research (Miller, 1974). 
The commonly reported reductions in pressure 
in many behavioral treatment studies may also 
reflect this process of habituation. It is obvious
that we need to pay more attention to the
sources of variability in blood pressure, par-
ticularly in patients with essential hyperten-
sion, as these sources may tell us about the
nature of the disorder and give us clues about
the design of relevant behavioral treatment
procedures and appropriate methods of evaluat-
ing treatment effects.


Reference Note 


1. Engel, B. T. Personal communications, February 27 
to March 2, 1976.


Effects of Transcendental Meditation and Muscle Relaxation
on Trait Anxiety, Maladjustment, Locus of Control, and Drug Use


Sixty undergraduate volunteers were randomly assigned to receive training in
transcendental meditation (TM), training in a muscle relaxation technique, or 
no treatment. The training in muscle relaxation was designed to be maximally 
similar in structure and atmosphere to training in TM. Measures of trait anxiety, 
locus of control, maladjustment, and drug use were collected before and after 
the 9-week treatment period. On a behavioral measure of trait anxiety, the scores 
of all three groups decreased equally, but on a self-report measure the TM sub- 
jects reported steady decreases in anxiety, whereas the scores of the other two 
groups remained unchanged. There were no differences in maladjustment, locus 
of control, or drug use as a function of treatment. Although TM subjects held 
higher expectancies for benefits, and were slightly more regular in practicing 
their technique, individual differences in expectancy and frequency of practice 
were not correlated with degree of reported anxiety reduction. It is concluded 
that TM may reduce trait anxiety, but it has not been shown to be of value in
inducing general personality change.


Studies of the effectiveness of transcendental 
meditation (TM) have generally lacked the 
methodological rigor and sophistication that is 
now expected in studies of psychotherapy out- 
come (Smith, 1975). Three particularly per-
vasive flaws have been the following: the failure
to obtain initially equivalent treatment and 
control groups by random assignment of sub- 
jects, the tendency to rely exclusively on self- 
report measures, and the failure to use 
“placebo” treatments to control for nonspecific 
treatment effects. It is interesting to note that 
with the exception of Smith’s (1976) method- 
ologically sound study, previously published 
research has been uniformly favorable to TM.
In particular, changes in the “healthier” direc-
tion have been reported for each of the vari-
ables of interest in the present study: anxiety
(Ballou, 1973; Ferguson & Gowan, 1973;
Hjelle, 1974; Orme-Johnson, Arthur, Franklin,
O’Connell, & Zold, 1973); maladjustment (Fer-
guson & Gowan, 1973; Hjelle, 1974; Seeman,
Nidich, & Banta, 1972); locus of control 
(Hjelle, 1974); and marijuana use (Shafi, 
Lavely, & Jaffee, 1974). 

These results must be interpreted in light of 
the design flaws mentioned above; none of 
these studies used a placebo control group, and 
only Ballou (1973) achieved a true experi- 
mental design. Based on his review of these 
and other studies, Smith (1975) rightly con- 
cluded that methodological flaws in the studies 
reviewed were such that there “is not clear 
evidence that meditation is in and of itself 
therapeutic” (p. 562). 

Smith (1976) attempted to overcome the 
limitations of these studies by randomly as-
signing students who had volunteered to ob-
tain free treatment for reducing anxiety to
receive either TM, no treatment, or a treat-
ment designed to be equivalent to TM in
terms of both expectations for benefits and the
repeated quiet sitting involved in meditation. 
Using the Spielberger State-Trait Anxiety 
Inventory (Spielberger, 1972), the Tennessee 
Self-Concept Scale, and the Sixteen Personal- 
ity Factor Questionnaire, Smith found that on 
virtually every measure of psychopathology, 
the two treatment groups improved signifi- 
cantly more than the untreated controls. How- 
ever, the TM subjects showed no more im- 
provement than subjects exposed to the 
placebo treatment. This finding was replicated 
in a second experiment comparing a TM-like 
meditation technique with an “antimedita- 
tion” technique involving the active generation 
of positive thoughts. Smith (1976) concluded 
that the crucial therapeutic agent in medita- 
tion is not the focusing of attention on the 
mantra (a Sanskrit word personally prescribed 
for each meditator) but rather some combination of just sitting
and the expectation that the technique one is practicing is
therapeutic.
Although Smith’s study is weakened by the 
high dropout rate—59% for TM and 53% for 
the control treatment in the first study—and 
the reliance on self-report measures, it cer- 
tainly casts considerable doubt on earlier 
claims that TM has specific treatment effects 
on anxiety and, more globally, on psychologi- 
cal adjustment. 

The present study, like Smith’s, provided 
for random assignment of subjects to the three 
experimental groups and for a nonspecific 
effects control group. However, the muscle re- 
laxation technique was designed to control not 
only for the nonspecific effects of TM, includ- 
ing placebo effects and the effects of simply 
sitting regularly, but also for its demonstrated 
properties of autonomic arousal reduction 
(Wallace, 1970). 


Method 
Subjects 


Subjects were recruited from the introductory psy- 
chology course at the University of Connecticut for “an 
experiment concerning transcendental meditation and a 
muscle relaxation technique.” Students who had prior 
training in any kind of meditation were not eligible, so 
our sample may not be fully representative of those who 
independently seek instruction in TM. However, our 
subjects were interested in meditation, and in fact al- 
most all subjects in the final sample stated that they 
had volunteered hoping to be assigned to the TM group.

One hundred fifty students volunteered, and from these we
randomly selected 30 males and 31 females who were not receiving
therapy or counseling; 1 female dropped out of the TM group
shortly before training. Each subject was required to post a $10
deposit, forfeitable to the scholarship fund if he or she failed
to complete all the testing. Sixty-seven percent of the subjects
were 18 years old; 75% were in their first or second semester;
66% were firstborn or only children; and 52% were Catholic by
birth.


General Experimental Design 


Subjects were randomly assigned to receive either 
training in TM, training in muscle relaxation, or no 
treatment. Measurements were made before and im- 
mediately after the training, at the midpoint of the 
treatment period, and again at the end of the 9-week 
treatment period. 


Transcendental Meditation 


Instruction in TM, which consists of two group lec- 
tures of about 1 hour each, 1 hour of individual instruc- 
tion in the technique itself, and three additional 1-hour 
group meetings, was provided by experienced initiators 
from the Storrs branch of the Student International 
Meditation Society. Throughout the training period it 
is repeatedly stressed that TM is easy, natural, and 
spontaneous, and that regular practice will inevitably 
bring a wide range of benefits. The only departure from the
standardized procedure introduced by the experiment was the use
of grant funds to pay subjects’ $45 initiation fees.


Muscle Relaxation 

The training procedure in muscle relaxation was designed to
duplicate as nearly as possible the structure and atmosphere of
TM instruction. Considerable attention was devoted to making the
treatment equivalent to TM in credibility and attractiveness and
to heightening subjects’ expectations for benefits. The training
in muscle relaxation followed the same sequence and provided the
same amount of contact with instructors as the training in TM.

At the two initial lectures, the first author presented a
rationale for the technique, which suggested that “ten[...]
[...]ing refreshed and renewed. [...]tion by three advanced
doctoral students in [...] (two males, one female) and a female
doctoral student in Family Life Education who held [...] clinical
psychology. They were all experienced in teaching the technique
and were supervised by a licensed clinical psychologist. They
followed Paul’s (1966) instructions for inducing muscle
relaxation, spending between 20 and 40 minutes with each subject.
Paul’s (1966) procedure is an accelerated form of Jacobson’s
(1938) technique that involves alternately tensing and releasing
muscle groups until a state of deep muscle relaxation is
achieved. Subjects were instructed to practice at home for
approximately 20 minutes twice per day.

1 We would like to thank John Dufresne, Ron Lajoy, Margaret
Nichols, and Ruta Teisman for their work in teaching the muscle
relaxation subjects.

The group meetings, which were held during the 3 
evenings following the individual instruction, included 
discussion of problems encountered by the subjects, ad- 
ditional lecture material, and group practice of muscle 
relaxation. The lecture material consisted of scientific-sounding
explanations of how muscle relaxation reduces
muscle tension and promotes personality growth and 
stressed the simplicity, ease, and inevitable efficacy of 
the technique. 


No Treatment Control 


Subjects in the no-treatment control group were not 
given any training nor were they asked to alter their 
lives in any special way. They were exposed to virtually 
the same questionnaires and measurement procedures 
as subjects in the other groups. 


Measures 


Assessments were made in five areas: background 
and personality variables; expectancies; extent of 
arousal reduction; psychological maladjustment; and 
trait anxiety. Both behavioral and self-report measures 
of trait anxiety were used. 

Background and personality variables. All subjects 
were administered Rotter’s (1966) Locus of Control 
scale; the Social Desirability Scale (Crowne & Marlowe, 
1960); and questionnaires concerned with demographic
characteristics, previous and current drug and alcohol 
use, and experiences with TM and muscle relaxation. 

Expectancies. The expectancy questionnaire in- 
cluded 25 possible benefits that muscle relaxation and 
TM subjects rated on an 11-point scale (0%-100%, in 
10% intervals) for likelihood that regular practice would 
lead to that benefit. The ratings were summed to yield 
a total expectancy score. The items were drawn from 
the four general areas of improvements in subjects’ 
emotional lives, achievements, interpersonal relations, 
and health. Some sample items are “lessened feelings of
anxiety”; “deeper understanding of material in academic
courses”; “more satisfying relationships with those close to
you”; and “improved body posture.”

Arousal reduction. Autonomic arousal was conceptualized as an
abstract dimension, or factor, that reflects
the common variance of many physiological response 
systems (Berlyne, 1967). Ideally, such a factor would be 
assessed by measuring a variety of different physiologi- 
cal indices, since any one index will share only some of 
its variance with the hypothetical construct of arousal. 
However, since we were not equipped to monitor several 
physiological channels, pulse rate alone was used as an 
approximate index of arousal. This particular variable 
was selected because it has been shown to be affected
by both muscle relaxation (Paul, 1969) and TM (Wal- 
lace, 1970), and it is generally considered to be one of 
the indices related to arousal level (Berlyne, 1967). 

The extent of pulse rate reduction achieved by sub- 
jects while meditating, practicing muscle relaxation, or, 
for the control subjects, simply resting with eyes closed, 
was monitored during an 18-min testing period. Subjects’ pulses
were measured by a light-sensitive plethysmograph (Grass Model
76604) attached to the index finger of the nondominant hand. The
transduced signal was recorded by a Grass Model 76101 polygraph.
The pulses were recorded for 20 sec immediately before the
beginning of the testing period and for another 20 sec between
17 min 25 sec and 17 min 45 sec of the period.

Psychological maladjustment. Rotter’s Incomplete 
Sentences Test, the measure of overall psychological 
maladjustment, was scored by the first author using 
Rotter and Rafferty’s (1950) scoring manual; he was 
blind to both group and time of administration.

Trait anxiety (self-report). Each week of the experi- 
ment, subjects completed Zuckerman’s (1960) Adjective 
Check List (ACL) scale of anxiety with instructions to 
check the adjectives that described how they felt “this 
week.” In addition, the S-R Inventory of Anxiousness 
(Endler, Hunt, & Rosenstein, 1962) was administered 
on four occasions. This instrument requires the subject to
indicate on a 5-point scale how strongly he or she would react
with each of 14 common “anxiety” responses (sweaty palms, nausea,
etc.) in each of 11 different situations. The ratings were summed
to yield an
overall index of trait anxiety. 

Trait anxiety (behavioral). In response to the need 
for a behavioral measure to supplement the self-report 
measures of trait anxiety, the Behavioral Anxiety Mea- 
sure (BAM) was developed by modifying a procedure 
used by Rehm and Marston (1968). The BAM consists of 10
tape-recorded situations, each of which contains a short
description of a social situation followed by a line of dialogue
spoken by someone in the situation. The subject’s task is to
respond to the line of dialogue. Thirty seconds were allowed for
each situation and response unit. Subjects’ performances were
videotaped through a one-way mirror and rated using a checklist
of anxiety indicators derived from that of Paul (1966).

The details of the procedure were as follows: The experimenter
first presented written instructions, which emphasized that the
subject was to respond “as you would if the situation were
actually happening” and that his or her performance would be
rated for “psychological adjustment.” The experimenter, who
remained in the room with the subject throughout the testing,
then played a demonstration tape of three situations and gave a
prearranged sample reply for [...] parent-student, heterosexual,
and same-sex peer relationships. This pool was submitted to 46
male and [...] female introductory psychology students, who rated
each situation for how uncomfortable it would make them. The
final sets of situations were selected so that the four forms
(A-male, A-female, B-male, B-female) would be nearly equal in
mean discomfort ratings.

A sample situation (professor-student) is: “You have a 9:00
lecture class that you tend to be late for. Usually the professor
just gives you a dirty look, but this morning as you enter the
lecture hall she says, ‘[...], how come you’re late again
today?’”

Nine relatively specific behaviors were chosen as 
anxiety indicators: head held downcast, abrupt head 
movements, swaying, extraneous arm and hand move- 
ments, arms held rigidly, hands restrained, failure to 
reply, blocking of speech, and extraneous comments. 
The raters practiced scoring videotapes of volunteer 
undergraduates until interrater reliability exceeding 
70% was established for each of the nine items. 

Three or four raters, blind to the subjects’ group 
memberships, viewed each videotape and made two 
ratings for each of the 10 situation-response units. The 
first rating was based on the subject’s behavior during 
the first 15 sec of the interval, and the second was based 
on the behavior of the final 15 sec. The score for the test 
was simply the sum of the number of 15-sec blocks in 
which each behavior was observed.
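
The scoring rule just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data layout (one set of checked indicators per 15-sec block), the function names `bam_score` and `averaged_bam_score`, and the example ratings are hypothetical and do not come from the study. The sketch counts, for each of the nine indicators, the blocks in which it was checked, sums those counts, and then averages the totals across raters.

```python
from statistics import mean

# The nine anxiety indicators named in the text.
ANXIETY_INDICATORS = (
    "head held downcast", "abrupt head movements", "swaying",
    "extraneous arm and hand movements", "arms held rigidly",
    "hands restrained", "failure to reply", "blocking of speech",
    "extraneous comments",
)


def bam_score(blocks):
    """Sum over indicators of the number of 15-sec blocks in which
    each indicator was checked.

    `blocks` is a list of sets (assumed layout), one set per 15-sec
    block; a full test would have 20 blocks (10 units x 2 halves).
    """
    return sum(
        1
        for observed in blocks
        for indicator in ANXIETY_INDICATORS
        if indicator in observed
    )


def averaged_bam_score(ratings_by_rater):
    """Average the per-rater totals for one subject."""
    return mean(bam_score(blocks) for blocks in ratings_by_rater)


# Hypothetical ratings of three blocks by two raters (not study data).
rater_a = [{"swaying"}, {"swaying", "failure to reply"}, set()]
rater_b = [{"swaying"}, {"failure to reply"}, set()]
print(bam_score(rater_a))                      # 3
print(averaged_bam_score([rater_a, rater_b]))  # 2.5
```

Representing each block as a set keeps the rule "observed or not observed within a block" explicit, so repeated occurrences of a behavior inside one block are counted once.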


Procedure 


Subjects received groups of measurements at six testing sessions:
pretreatment, pretraining, posttraining, midtreatment,
Posttreatment 1, and Posttreatment 2. (Posttreatment 1 and
Posttreatment 2 were separated by approximately 1 week.) There
was a delay of approximately 2 weeks between the pretreatment
measures and the measures administered immediately before
training (pretraining). The Locus of Control scale, the Social
Desirability Scale, the first questionnaire, and the Incomplete
Sentences Test were group administered at the pretreatment
session; the Locus of Control scale and the Incomplete Sentences
Test were readministered at Posttreatment 2, along with the
second questionnaire. The S-R Inventory of Anxiousness was
administered at pretraining, posttraining, midtreatment, and
Posttreatment 1. The BAM was administered at pretraining and
Posttreatment 1, and at Posttreatment 1 it was followed by the
arousal-reduction measurement. All subjects completed a weekly
ACL anxiety scale. In addition, subjects in the two treatment
groups completed the expectancy questionnaire four times during
their training and again at Posttreatment 2, and they provided a
weekly report of the number of times they had meditated or
relaxed.

The experimenters for the BAM were three male and three female
undergraduates. A randomly selected half of the males and females
within each group received BAM Set A at pretraining and Set B at
Posttreatment 1, and the remaining subjects received the sets in
the opposite order.

In accordance with the Student International Meditation Society’s
requirement of 15 days of abstinence from drugs prior to
initiation, all subjects were asked to refrain from using drugs
for the 2 weeks separating the pretreatment testing and the first
day of training.


Results 


The results of the study are presented in the 
following order: behavioral anxiety measure, 
self-report anxiety measures, other outcome
variables, and possible origins of differential 
treatment effects. 


Behavioral Anxiety Measure 


Interrater reliabilities for the scores on the 
BAM were computed using Pearson product- 
moment correlations. The mean reliabilities at 
pretraining and Posttreatment 1 were, respec- 
tively, .80 and .92. Using the Spearman-Brown 
formula for three independent raters, the reli- 
abilities of the averaged scores were estimated 
to be .92 at pretraining and .97 at Posttreat- 
ment 1. 
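
The step from the mean interrater reliabilities to the reliabilities of the averaged scores can be checked with the Spearman-Brown formula, r_k = k r / (1 + (k - 1) r). The short Python sketch below is illustrative only; the function name `spearman_brown` is an assumption, and the inputs are the two mean reliabilities reported above with k = 3 raters.

```python
def spearman_brown(mean_r: float, k: int) -> float:
    """Estimated reliability of the average of k raters, given the
    mean reliability of a single rater (Spearman-Brown formula)."""
    return k * mean_r / (1 + (k - 1) * mean_r)


# Mean interrater reliabilities reported in the text, three raters.
pretraining = spearman_brown(0.80, 3)
posttreatment_1 = spearman_brown(0.92, 3)
print(round(pretraining, 2), round(posttreatment_1, 2))  # 0.92 0.97
```

Both values agree with the estimates of .92 and .97 given in the text.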

Test-retest reliabilities were computed using 
the control subjects’ scores, and the reliability 
of the BAM scores was found to be satisfactory 
(r = .58, p < .05). 

The BAM scores were analyzed in a repeated 
measurements analysis of variance design with 
one between (groups) and one within (time) 
factor. The time effect alone was significant, 
F(1, 57) = 29.08, p < .001. Inspection of the 
means revealed that there was an overall ten- 
dency to exhibit less anxiety at the second 
BAM testing but that there were no between- 
group differences in this regard. Thus, the 
behavioral anxiety measure provided no evi- 
dence of treatment effects of either muscle 
relaxation or TM on subjects’ trait anxiety. 


Self-report Anxiety Measures 


The weekly ACLs were averaged over inter- 
vals of 2 weeks to obtain six biweekly average 
ACL scores for each subject; the first 2-week 
interval extended through the first day after 
the personal instruction of treated subjects and 
essentially constitutes a pretraining baseline. 

A repeated measures analysis of variance disclosed a main effect
for time, F(5, 284) = 5.33, p < .001, and a trend toward a groups
effect that nearly reached the conventional significance level,
F(2, 57) = 3.06, p = .055. Although the control group
consistently reported greater anxiety than the other two groups,
there was a downward trend in anxiety across the groups. A trend
analysis (Meyers, 1972) indicated that there was a significant
linear component to this trend, F(1, 57) = 10.41, p < .01.

2 All analyses were carried out using an unweighted means
solution for unequal ns. Missing data were estimated by the
Data-Text program, and error degrees of freedom were adjusted
accordingly.

Table 1

Means of Total Scores on the S-R Inventory of Anxiousness by Group and Sex

                                            Administration
Group                         n   Pretraining  Posttraining  Midtreatment  Posttreatment 1
Muscle relaxation
  Males                      10      336.0        331.0         325.7          332.4
  Females                    10      371.0        374.5         355.6          351.9
Transcendental meditation
  Males                      10      353.6        328.3         304.6          294.2
  Females                     9      384.2        369.3         346.1          344.3
Control
  Males                      10      362.4        348.9         334.8          337.1
  Females                    11      434.4        435.9         438.4          436.6

The total scores for the S-R Inventory of Anxiousness were
analyzed in a repeated measures analysis of variance design with
groups and sex as between-subjects factors. Significant effects
were found for sex, F(1, 54) = 12.20, p < .001; groups,
F(2, 54) = 4.13, p < .05; time, F(3, 161) = 14.56, p < .001; and
the Groups × Time interaction, F(6, 161) = 3.53, p < .01. It can
be seen from the means (Table 1) that females consistently
reported greater anxiety than males and that the control group
consistently scored higher than the two treatment groups. The
most interesting finding was that although the muscle relaxation
and control groups’ scores remained essentially unchanged over
time, the TM group reported substantial decreases in anxiety. The
absence of a triple interaction involving sex indicates that the
differential treatment effect found in these data was present
equally for males and females. Therefore, the sex factor was
dropped from a trend analysis performed to investigate the nature
of the significant Groups × Time effect. The S-R inventory was
assumed to have been administered to all subjects on the 1st,
11th, 37th, and 71st days of the experiment, and orthogonal
linear (-29, -19, 7, 41) and quadratic (3.85, -1.00, -6.16, 3.31)
coefficients were derived (Meyers, 1972). The linear component of
the interaction was found to be significant, F(2, 57) = 4.72,
p < .02, but the quadratic component was not, which indicates
that the decreases in the TM group’s reported anxiety occurred
steadily over the 9-week treatment period.
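
The trend coefficients for unequally spaced administrations can be rederived from the four assumed testing days. The Python sketch below is one way to do this, assuming a Gram-Schmidt style construction (center the days for the linear contrast, then remove the constant and linear parts from the squared deviations for the quadratic contrast). The function name `orthogonal_contrasts` and the rescaling step are assumptions made for illustration; the source states only that the coefficients were derived following Meyers (1972).

```python
def orthogonal_contrasts(times):
    """Linear and quadratic contrasts for possibly unequal spacing."""
    n = len(times)
    mean_t = sum(times) / n
    # Linear contrast: deviations of each time point from the mean.
    linear = [t - mean_t for t in times]
    # Quadratic contrast: squared deviations with the constant and
    # linear components removed.
    squares = [c * c for c in linear]
    mean_sq = sum(squares) / n
    centered_sq = [s - mean_sq for s in squares]
    slope = (sum(c * s for c, s in zip(linear, centered_sq))
             / sum(c * c for c in linear))
    quadratic = [s - slope * c for c, s in zip(linear, centered_sq)]
    return linear, quadratic


days = [1, 11, 37, 71]  # assumed administration days from the text
linear, quadratic = orthogonal_contrasts(days)

# Rescale so the second quadratic coefficient is -1.00, matching the
# scaling of the coefficients printed in the text.
scale = -quadratic[1]
quadratic_scaled = [q / scale for q in quadratic]

print(linear)  # [-29.0, -19.0, 7.0, 41.0]
print([round(q, 2) for q in quadratic_scaled])
```

The linear contrast reproduces the reported values exactly, and the rescaled quadratic contrast agrees with the reported (3.85, -1.00, -6.16, 3.31) to within about .01. Both contrasts sum to zero and are orthogonal to each other.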

Thus, of the two self-report measures of trait 
anxiety, the S-R inventory provided evidence 
of a treatment effect of TM, but the ACL did 
not. 


Other Outcome Variables 


Maladjustment scores from the Incomplete Sentences Test
administered at pretreatment and Posttreatment 2 were analyzed in
a repeated measures analysis of variance with groups as a
between-subjects factor. The only significant effect was time,
F(1, 56) = 8.18, p = .006, which was associated with an overall
decrease in maladjustment scores.

Analysis of subjects’ locus of control scores disclosed no
significant effects, indicating that the groups neither differed
initially nor changed significantly over the course of the
experiment.

Subjects’ stated frequencies of drunkenness and marijuana use “in
the recent past” were collected at pretreatment and
Posttreatment 2 and were subjected to another repeated measures
analysis of variance. No significant effects were found for
either frequency of drunkenness or frequency of marijuana use.

At Posttreatment 2, subjects in the two treatment groups were
asked to rate on a 5-point scale how much benefit they felt they
had gained in each of eight areas: academic performance,
interpersonal relations, health and condition of body and nervous
system, decreased drug use, increased energy and vitality,
overall happiness, decreased anxiety, and self-esteem. The first,
third, and fifth points on the scale were labeled, respectively,
not at all, a moderate amount, and a great deal. The mean ratings
of the muscle relaxation and TM groups were compared by t tests,
and it was found that the only significant difference was for
academic performance, with TM subjects reporting greater
benefits, t(18) = 2.51, p < .05. It is noteworthy that the TM
group tended to report only “moderate” benefits, with mean
ratings ranging from 2.20 (decreased drug use) to 3.32 (increased
energy and vitality).


Possible Origins of Differential Treatment Effects 
Additional analyses were performed to de- 
termine if there were unintended differences 
between the treatment groups that might have 
been responsible for the observed differential 
treatment effect. 

Unfortunately, the posttraining arousal reduction data had to be
discarded because of equipment problems that remained undetected
for numerous subjects. The analysis of variance of the
Posttreatment 1 data revealed an overall decrease in pulse rate
(M change = 2.8 beats per minute) from the beginning to the end
of the 18-minute testing period, F(1, 57) = 13.78, p < .001, but
there was no evidence that meditation produced greater reduction
in arousal than muscle relaxation or simply sitting with eyes
closed.

The treated subjects’ weekly frequencies of practice were
averaged over 2-week intervals and analyzed in a repeated
measures analysis of variance. Main effects for groups,
F(1, 37) = 6.70, p < .05, and time, F(4, 147), p < .001, were
found, but their interaction was not significant. Both groups
decreased in frequency after the first 2 weeks, and although the
TM group maintained a consistently higher frequency, the
magnitude of the average difference was not great (for TM,
M = 11.5; for muscle relaxation, M = 10.2).

Subjects’ total expectancy scores after their training
(posttraining) and at the end of the experiment (Posttreatment 2)
were used as an index of the nonspecific effects of treatments.
Repeated measures analysis of variance disclosed a significant
effect for time, F(1, 37) = 13.52, p < .001, and a trend toward a
groups effect, F(1, 37) = 2.89, p = .10. Both groups’
expectancies decreased over the treatment period. The trend in
the groups factor reflected the TM group’s higher expectancies
and suggests that the treatments may not have been equal in their
nonspecific effects.

To determine if the variables on which differences were found
between the treatment groups were actually associated with the
observed treatment effect of TM, correlations were computed
between these variables and residual change scores on the S-R
inventory. The change scores were computed from the TM subjects’
pretraining and Posttreatment 1 S-R inventory scores, using the
linear regression coefficient derived from the entire sample. The
correlations between the treatment effect, as measured by the
residual change scores, and variables we thought might be related
to treatment effect were average frequency of practice, -.23;
total expectancy, -.26; social desirability, -.20; psychological
adjustment, .06; and locus of control, .53 (p < .05). It is
apparent that the two variables on which the treatments differed,
frequency of practice and expectancy, were not significantly
related to TM subjects’ decreases in trait anxiety.
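
A residual change score of the kind described above is the part of a subject's posttest score not predicted from the pretest score. The Python sketch below illustrates one plausible reading: fit post = intercept + slope × pre on the whole sample, compute residuals for the TM subjects only, and correlate them with a candidate predictor. The function names, the inclusion of an intercept, and all numbers are hypothetical; the source does not report the regression details or the raw data.

```python
def fit_line(x, y):
    """Least-squares slope and intercept for y regressed on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def residual_change(pre, post, slope, intercept):
    """Observed posttest minus posttest predicted from pretest."""
    return [b - (intercept + slope * a) for a, b in zip(pre, post)]


def pearson_r(x, y):
    """Pearson product-moment correlation."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


# Hypothetical scores for an entire sample (not study data).
all_pre = [300, 350, 400, 450, 320, 380]
all_post = [290, 330, 380, 440, 280, 330]
slope, intercept = fit_line(all_pre, all_post)

# Hypothetical TM subgroup: the last two subjects in the sample.
tm_residuals = residual_change(all_pre[4:], all_post[4:], slope, intercept)
print(tm_residuals)
```

With real data, `pearson_r(tm_residuals, predictor)` would give correlations of the kind listed in the paragraph above, where `predictor` holds, for example, each TM subject's average frequency of practice.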


Discussion 


Interpretation of these findings is made 
difficult by the inconsistency of the results ob- 
tained with the three measures of trait anxiety. 
In particular, it is unclear whether the S-R 
inventory measured some change in TM subjects to which the other
instruments were insensitive or whether that apparent change was
an artifact of measurement. 

One possibility is that the TM treatment 
effect was a result of subjects’ perceptions that 
this was “an experiment on TM” and their re- 
sponses to the implicit demands (Orne, 1969) 
to report decreases in anxiety. On the other 
hand, the fact that Social Desirability scores 
did not predict residual gain scores within the 
TM groups argues against this hypothesis, 
since it would be expected that subjects more 
highly motivated to win the experimenter’s approval would report
larger decrements.

[...] decline in anxiety [...] subjects’ self-images rather than
actual changes in trait anxiety. It is difficult to reconcile the
[...] interpretation of the S-R inventory results with the
failure to find a treatment effect on the anxiety 
ACL, however, since the ACL is a classic 
format for eliciting self-images and should have 
been sensitive to beliefs such as “I don’t get so 
tense anymore.” Furthermore, there is some 
evidence (Otis, 1974) that TM does not affect 
subjects’ self-images. 

Thus, although not conclusive, the available 
evidence favors the view that the S-R inven- 
tory measured a real decrease in meditators’ 
trait anxiety. How, then, can we reconcile the 
positive results obtained from this instrument 
with the negative results obtained with the 
ACL and the BAM? A clue is provided by the fact that analyses
performed using the response factor scores (Endler et al., 1962)
in place of the overall score on the S-R inventory found TM to be
more effective than the other conditions in reducing scores on
the “distress” factor but not on the “autonomic” factor.
Recalling Lang’s (1969) distinction among the self-report,
behavioral, and autonomic dimensions of anxiety, it seems
reasonable to suppose that the
distress factor corresponds to the self-report 
component of anxiety and that our S-R in- 
ventory results primarily reflect changes in 
subjective distress. This implies that although 
TM may have had no effect on the behavioral 
or autonomic components of anxiety, or on 
global self-images of anxiety proneness, it may 
have reduced subjectively experienced distress 
in the specific situations tapped by the S-R 
inventory. If this interpretation were valid, it 
would follow that TM first affects subjective 
distress, and, at least over short treatment pe- 
riods, does not alter overt behavior. 

Even if one accepts the conclusion that TM 
was effective in reducing some aspect of anxi- 
ety, it can still be argued that the greater re- 
duction in anxiety reported by TM subjects 
was due to the operation of a stronger placebo 
effect rather than to a specific effect of TM, since training in
TM produced higher expectancies for benefits. The plausibility of
this interpretation of the S-R inventory results is reduced by
three findings: First, though a significant relationship between
expectancy and outcome measures would be expected if the
treatment were functioning primarily as a placebo, TM subjects’
expectancies immediately after the training period did not
significantly predict residual change scores on the S-R
inventory. Second, subjects’ expectancies decreased over the
treatment period in which their reported trait anxiety decreased,
and it seems unlikely that a treatment could have a
strong placebo effect while subjects’ confidence 
in the treatment was waning. Finally, the 
decreases in anxiety were not abrupt, as would 
be expected from a placebo effect, but were 
gradual and continuous, as would be expected 
from a true treatment effect. 

This moderately supportive evidence for the 
existence of a specific treatment effect of TM 
is not consistent with Smith’s (1976) study, 
which found TM to be no more effective than a 
nonspecific effects control treatment, which
consisted of sitting passively in a chair. This 
inconsistency could reflect differences in the 
measures of anxiety used in the two studies, 
differences in responses to treatment as a func- 
tion of the populations studied, or differences 
between the nonspecific effects control treat- 
ments. Smith’s subjects volunteered specifi- 
cally to receive treatment for anxiety, whereas 
the majority of subjects in the present study 
were hoping to receive training in TM. It is 
possible that the groups who received what they were seeking
(Smith’s two treatment groups and our TM group) responded with
reports of decreased anxiety. Another possibility is that the
high dropout rate in Smith’s study was not random but instead
reflected a Treatment × Subjects interaction in which different
subpopulations responded specifically to each treatment, and the
nonresponders in each group tended to drop out. Finally, Smith’s
control treatment may have incorporated a therapeutic ingredient
that was absent in this study’s muscle relaxation treatment, for
example, the pairing of relaxation with continued cognitive
activity, the feeling that the technique was easy and enjoyable
to practice, or a placebo effect fully as strong as that of TM.

In contrast to previous studies, the present study found TM to
have no effect on subjects’ scores for locus of control,
psychological maladjustment, or frequencies of drunkenness and
marijuana use. However, with the exception of the studies by
Smith (1975) and Shafii et al. (1974), the earlier results are
compromised by weak designs and measures of questionable
validity. The results of the last two studies are not directly
comparable to those reported here,
because they were obtained with different 
populations, different measures, and over 
longer treatment periods. Until further research 
clarifies the situation, the most that can be said 
with assurance is that there is at present as 
much negative as positive evidence concerning 
the effects of TM on maladjustment and drug 
use, and there is no sound evidence that it 
affects locus of control. 

In summary, though the results of this study 
provide some support for the hypothesis that 
TM is specifically effective in reducing normal 
college students’ experiences of anxiety, it must 
be remembered that negative results were ob- 
tained with the behavioral measure of trait 
anxiety and with measures of locus of control, 
psychological maladjustment, and frequency 
of drunkenness and marijuana use, and that 
the technique tended to be rated as only
“moderately” helpful with general life prob- 
lems. It appears, therefore, that TM has been 
oversold by its proponents, and unless it is 
shown that long-term practice does lead to 
great benefits, it should be considered irrespon- 
sible to advertise TM as a panacea. 


Personality Correlates of Continuation and Outcome in
Meditation and Erect Sitting Control Treatments 


Jonathan C. Smith 


Roosevelt University 


In a 6-month double-blind study, 49 anxious college student volunteers were 
assigned to transcendental meditation (TM) and 51 to a control treatment, 
periodic somatic inactivity (PSI). The control treatment was carefully designed 
to match the form, complexity, and expectation-fostering aspects of TM, but it 
incorporated an exercise that involved sitting erect with eyes closed twice daily 
rather than sitting and meditating. For each treatment 30 demographic and pretest 
personality variables were correlated with continuation in treatment and out- 
come defined in terms of trait anxiety change scores. As predicted, the TM 
dropout was more disturbed and less self-critical than the person who continued 
meditating. For TM, outcome correlated significantly with anxiety, Sizothymia 
(16 Personality Factor Questionnaire, Factor A), and Autia (16 Personality 
Factor Questionnaire, Factor M). Contrary to what was predicted, there was 
virtually no overlap between the variables correlated with continuation and 
outcome for TM and for PSI. It is concluded that differing treatment rationales 
rendered the treatments appealing, credible, and effective for different types
of individuals.


Meditation research seems destined to 
repeat the sins of a generation of psycho- 
therapy outcome research. In both areas the 
tendency has been to ask “does it work?” 
As Bergin (1971) has chronicled, this amorph- 
ous question has littered the journals with 
controversial and ambiguous results. Instead, 
Paul (1967) suggested: 


The question towards which all outcome research 
should ultimately be directed is the following: What 
treatment, by whom, is most effective for this individual 
with that specific problem, and under which set of 
circumstances? (p. 111) 


Meditation researchers have tended to ignore 
such questions of specificity (Smith, Note 1). 

A surplus of studies show that individuals 
generally display reductions in trait anxiety 
after learning meditation (Smith, 1975b). 
However, not everyone benefits, and up to 
half discontinue (Smith, 1976; Otis, Note 2). 
To date no one has systematically explored 
the type of person who gains from meditation, 
and only one person, Otis (Note 2), has looked 
at the meditation dropout. On the basis of 
two questionnaire studies, Otis found that 
subjects who stop practicing transcendental 
meditation (TM) compared with those who 
continue feel “less positive about themselves,” 
have “more serious problems,” and are more 
“withdrawn, irritable, and anxiety ridden.” In 
addition, he suggested that subjects who 
continue with TM are somewhat less disturbed, 
although they may admit to more problems; 
that is, they may be more self-critical. 
The present study examined characteristics 
of subjects who drop out of TM and subjects 
who continue practicing for up to 6 months 
and who display significant reductions in trait 
anxiety. On the basis of Otis’s findings, it was 
predicted that subjects who continue practicing 
TM, compared with those who discontinue, 
are at pretest less disturbed, less anxious, and 
less withdrawn but more self-critical. Also, a 
previous study based on the same subject 
sample used in the present study (Smith, 
1976) concluded that TM and a yogalike 
treatment involving sitting erect are equally 
psychotherapeutic, and that the therapeutic 
processes operating in both are the same—some 
combination of expectation of relief and daily 
sitting. For this reason it was predicted that 
the correlates of continuation and outcome are 
the same for TM and the treatment involving 
sitting erect. 


Method 
Procedure 


The present study used data collected in a previous 
study (Smith, 1976) that compared the psychothera- 
peutic effects of TM and a control treatment, periodic 
somatic inactivity (PSI). Subjects were 100 (51 male, 
49 female) Michigan State University students who 
were suffering from a high level of trait anxiety. All 
were carefully screened for motivation, were not in- 
volved in psychotherapy, and had at no time practiced 
meditation or yoga. Mean age was 22. 

All subjects were pretested, and 49 were randomly 
assigned to TM and 51 to PSI. TM was taught by two 
official TM instructors from the Students’ International 
Meditation Society and was identical to ordinary TM 
in every respect except that it was offered free. The 
TM technique involves sitting erect with feet flat on 
the floor and eyes closed while passively and con- 
tinuously attending to a special thought called a 
“mantra.” This is done for 15-20 minutes twice daily. 
Complete TM instruction includes two introductory 
lectures that outline supporting theory and research, 
a 15-day drug fast, standardized individual instruction, 
3 days of follow-up instruction and discussion, and 
monthly follow-up checking. 

PSI was designed to control for the potentially 
therapeutic effects of daily sitting and expectation of 
relief. Specifically, the treatment matched TM in 
every respect with one exception—instead of sitting 
and meditating, the instructions were to simply sit. 
Subjects were told that while in this position they could 
think about anything (even worry) and the technique 
would still work. Like TM, PSI instruction began with 
two introductory lectures that outlined a rationale 
explaining why sitting twice daily should be an im- 
mensely effective cure for most forms of psycho- 
pathology. In addition, bogus research was presented 
to support the claims made. Between lectures, subjects 
participated in a 15-day fast from illegal drugs. After 
the fast and lectures, subjects were individually 
instructed and met for 3 days of follow-up checking. 
PSI was taught double blind; both the subjects and the 
instructor were deceived into believing that the treat- 
ment was legitimate and widely researched and not a 
bogus control treatment. 

One important feature of PSI was its rationale. Care 
was taken to construct a rationale that was credible 
and complex. To enhance credibility, actual psycho- 
logical concepts and research were woven together in 
an elegant manner. That not one component 
of the theory was false or deceptive (although support- 
ing “process” and “outcome” research was faked) makes 
it unique among bogus treatment rationales. A sum- 
mary of the rationale given to subjects is presented 
below (Smith, 1975a): 


Built into life are factors that disrupt inner calm and 
generate and maintain anxiety. Research has shown 
that one of these factors is the desynchronization of 
circadian rhythms, daily rhythmic changes in physio- 
logical functioning. PSI works to bring circadian 
rhythms into synchrony. 


The way PSI works is complex. All physical activity, 
no matter how small, generates a fatiguelike and 
stresslike nonspecific physiological by-product called 
reactive inhibition. Simple physical inactivity tends to 
trigger the automatic dissipation of reactive inhibi- 
tion. Such dissipation appears physiologically as a 
decrease in physiological activity and as a small dip 
or signature in the constellation of circadian rhythms. 
PSI involves remaining physically inactive for 15 to 
20 minutes at the same time each day. The result is 
that regular inactivity-induced signatures appear at 
and become classically conditioned to the same point 
in one’s circadian rhythms each day. As one con- 
tinues practicing PSI, conditioning continues, over- 
learning occurs, “dips become conditioned onto 
dips,” and gradually, and automatically, the associ- 
ated physiological changes become deeper and deeper. 


The regular appearance of inactivity-induced signa- 
tures in circadian rhythms serve as zeitgeber, stimuli 
that pull and keep circadian rhythms in synchrony. 
PSI thereby functions to pull and keep circadian 
rhythms in synchrony, and as a result reduces 
anxiety and increases psychological well-being. 


Periodic inactivity is the single commonality among 
a variety of highly effective growth and therapy 
techniques including progressive relaxation, biofeed- 
back training, autogenic therapy, self-hypnosis, 
meditation, and yoga. However, since PSI incor- 
porates only the essentials of these techniques, and 
does away with all the unnecessary and cumbersome 
extras associated with them, it is in fact more effective 
and efficient. 


Both treatments continued for 6 months, after which 
subjects were posttested and debriefed. Twenty TM 
and 24 PSI subjects reported for posttesting. To this 
sample were added 2 TM and 3 PSI subjects who were 
not available for 6-month posttesting but who did take 
an abbreviated posttest after 3½ months. These subjects 
were included to increase sample size, since there is 
strong evidence that 6 months of TM is no more 
therapeutic than 3½ months (Smith, 1975a). 


Outcome and Continuation Measures 


Trait anxiety was selected as the main outcome 
variable, specifically pretest-posttest difference scores 
on the Anxiety Trait (A-Trait) scale of the State-Trait 
Anxiety Inventory (STAI; Spielberger, Gorsuch, & 
Lushene, 1970). Anxiety was selected because it is the 
most widely studied trait in meditation research (Smith, 
1975b). Although proponents of TM claim that their 
technique has a desirable impact on other variables, 
notably self-esteem, psychosis, self-actualization, cre- 
ativity, and even intelligence (Glueck & Stroebel, 1975; 
Kanellakos & Ferguson, 1973; Orme-Johnson, Domash, 
& Farrow, 1974), the supporting evidence is scanty. 
Indeed, TM subjects in the present study displayed no 
change on any of these variables (Smith, 1975a). 

In addition, at posttest subjects were asked to 
estimate how often they had practiced each month 
throughout the project. Subjects were considered to 
have continued practicing if they practiced at least once 
during the last month. 


Demographic and Pretest Personality Measures 


Before the onset of the project, each subject was given 
the STAI A-Trait scale, the Epstein-Fenz Manifest 
Anxiety Scale (Fenz & Epstein, 1965), the 16 Person- 
ality Factor Questionnaire Forms A and B (16 PF; 
Cattell, Eber & Tatsuoka, 1970), the IPAT Neuroticism 
Scale Questionnaire (Scheier & Cattell, 1961), the 
Tennessee Self-Concept Scale (Fitts, 1965), and the 
Marlowe-Crowne Social Desirability Scale (Crowne & 
Marlowe, 1964). The Epstein-Fenz test was scored for 
symptoms of autonomic arousal and symptoms of 
striated muscle tension. The three Cattell tests were 
pooled to increase factor reliability and were scored for 
the 16 primary source traits. On the Tennessee scale, 
the Total Positive, Psychosis, Personality Disorder, 
Personality Integration, Defensive Positive, and Self- 
criticism scales were used. In addition subjects indi- 
cated their sex and age and stated if they had at any 
time prior to the project considered psychotherapy or 
meditation and yoga. In sum, outcome and continuation 
were correlated with 30 variables. 


Results 


Tables 1 and 2 show the correlations of 
demographic and pretest personality variables 
with outcome and continuation for TM and 
PSI subjects. For TM subjects, outcome 
correlated significantly (r > .396, p < .047) 
with (ranked in order of significance): not 
having considered psychotherapy prior to the 
onset of the project, Sizothymia (16 PF 
Factor A), Autia (Factor M), anxiety (STAI 
A-Trait), Weaker Superego Strength (Factor 
G), and lack of Personality Integration. 

Continuation with TM correlated signifi- 
cantly (r > .397, p < .034) with a low degree 
of Psychoticism and a high degree of Self- 
criticism, as well as with having considered 
psychotherapy prior to the onset of the project. 

For PSI the results were quite different. 
Outcome correlated significantly (r > .462, 
p < .027) with Shrewdness (Factor N). Con- 
tinuation correlated significantly (r > .338, 
p < .042) with Alaxia (Factor L), Shrewdness 
(Factor N), Desurgency (Factor F), Un- 
troubled Adequacy (Factor O), High Strength 
of Self-sentiment (Factor Q3), and Higher Ego 
Strength (Factor C). 
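
The significance levels in Tables 1 and 2 can be checked from r and n alone. The sketch below assumes (the text does not say so) that each p is a one-tailed probability from a t test of r with n - 2 degrees of freedom. Under that assumption it comes close to the tabled values for considered therapy in the TM group (r = -.546, n = 19, p = .008) and for Shrewdness in the PSI group (r = .462, n = 18, p = .027). The function names are illustrative, and the tail area is obtained with a plain Simpson's rule rather than a library routine.

```python
import math


def t_pdf(x: float, df: int) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def one_tailed_p(r: float, n: int, steps: int = 20000) -> float:
    """One-tailed p for a Pearson r from n pairs (assumed df = n - 2)."""
    df = n - 2
    t = abs(r) * math.sqrt(df / (1 - r * r))
    # Simpson's rule (steps must be even) for the area between 0 and t.
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    # The upper tail is what remains of the half of the distribution above 0.
    return 0.5 - area


if __name__ == "__main__":
    for label, r, n in [("TM, considered therapy", -0.546, 19),
                        ("PSI, Factor N", 0.462, 18)]:
        print(label, round(one_tailed_p(r, n), 3))
```

If the tabled probabilities were two-tailed instead, each value returned here would simply be doubled.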


Discussion 


One must be extremely cautious when 
drawing conclusions from research involving 
many variables and few subjects. For this 
reason I chose to give particular credence 
to those variables that correlated most highly 
with outcome and continuation and have in 
previous research displayed greatest validity, 
reliability, and immunity to the effects of 
motivational distortion. 
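
The caution can be given a rough size. As a sketch only, assume that the 30 correlations computed for one treatment and one criterion are independent and that each is tested at the .05 level. Neither assumption is stated in the text, and the personality scales are almost certainly intercorrelated, so the figures are a loose guide. The function names are illustrative.

```python
def expected_chance_hits(n_tests: int, alpha: float) -> float:
    """Expected number of significant results if every null hypothesis is true."""
    return n_tests * alpha


def prob_any_chance_hit(n_tests: int, alpha: float) -> float:
    """Probability of at least one significant result under independence."""
    return 1.0 - (1.0 - alpha) ** n_tests


if __name__ == "__main__":
    print(expected_chance_hits(30, 0.05))
    print(round(prob_any_chance_hit(30, 0.05), 3))
```

On these assumptions one or two of the 30 correlations in each column of Tables 1 and 2 would be expected to reach significance by chance, which is why the weight placed on the strongest and best-validated variables matters.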

An intriguing picture emerges of 
individuals who continue with TM and display 
the greatest reduction in trait anxiety. Not 
only are they anxious, but they score high 
on 16 PF Factors Sizothymia and Autia. 
Cattell and his colleagues (Cattell, [year illegible]; 
Cattell et al., 1970) describe Sizothymic in- 
dividuals as “reserved, detached, critical, cool, 
aloof,” and “stiff.” Emotionally, they are 
“flat” or “cautious.” They tend to be critical, 
precise, and skeptical, and like working alone 
with things or words rather than with people. 
In interpreting this factor, Cattell ([year illegible]) 
hypothesizes that it reflects a “steadiness of 
purpose and a high level of interest in symbolic 
and subjective activity...a secondary result 
of blocking of easy interaction with the 
[...]ing external world” (p. 180). In light of the 
apparent introversion of those who benefit 
from TM, it is not surprising that they tend not 
to have considered psychotherapy as a treat- 
ment option. 

Those who benefit from TM also score high 
on Autia. These individuals tend to be un- 
conventional and interested in “art, theory, 
basic beliefs” and “spiritual matters.” How- 
ever, their most important characteristic is 
what Cattell variously describes as a tendency 
to be “imaginatively enthralled by inner 
creations,” “charmed by works of the imagina- 
tion,” and “completely absorbed” in the 
momentum of their own thoughts, following 
them “wherever they lead, for their intrinsic 
attractiveness and with neglect of realistic 
considerations.” Cattell speculated that funda- 
mental to Autia may be a capacity to dis- 
sociate and engage in “autonomous, self- 
absorbed relaxation.” 

A quite different picture emerges of those 
who continue with and benefit from PSI. 
They tend to score high on 16 PF Factor N. 
Such individuals, according to Cattell, tend 


Table 1 
Demographic and Pretest Personality Correlates of A-Trait Change Scores for 
Transcendental Meditation (TM) and Periodic Somatic Inactivity (PSI) 

                                       TM                         PSI
Variable                          r        n        p        r        n        p
Sex(a)                         .297       19     .109     .021       18     .467
Age                           -.120       18 [illeg.]     .206       18     .206
Considered meditation(b)       .070       19     .388     .366       18     .067
Considered therapy(c)         -.546       19     .008     .387       18     .056
A-Trait                        .488       19     .017     .359       18     .072
SSMT                          -.136       19     .289     .086       18     .367
SAA                            .021       19     .466     .106       18     .338
Factor A(d)                   -.543       18     .010     .272       18     .138
Factor B                      -.064       18     .400     .132       18     .301
Factor C                      -.133       18     .300    -.013       18     .480
Factor E                      -.305       18     .110     .104       18 [illeg.]
Factor F                      -.395       18     .052    -.037       18     .443
Factor G                      -.485       18     .021     .005       18     .492
Factor H                      -.309       18     .106     .181       18     .236
Factor I                       .378       18     .061     .296       18 [illeg.]
Factor L                       .044       18 [illeg.]     .150       18     .276
Factor M                       .519       18     .014    -.001       18     .499
Factor N                       .286       18     .125     .462       18     .027
Factor O                       .183       18     .234    -.197       18     .216
Factor Q1                     -.107       18     .336    -.345       18 [illeg.]
Factor Q2                      .288       18     .123    -.066       18 [illeg.]
Factor Q3                     -.001       18     .498     .009       18 [illeg.]
Factor Q4                      .148       18     .279    -.220       18 [illeg.]
Marlowe-Crowne                 .208       19     .197     .262 [illeg.]     .167
Defensive positive(e)         -.002       19     .497     .157 [illeg.]     .062
Psychosis                      .077       19     .378 [illeg.] [illeg.]     .243
Personality disorder          -.044       19     .428 [illeg.]       18     .322
Personality integration       -.396       19     .047 [illeg.]       18 [illeg.]
Self-criticism                -.361       19     .064 [illeg.]       18     .229
Total positive                -.044       19     .428 [illeg.] [illeg.] [illeg.]

Note. A-Trait = Trait Anxiety scale of the State-Trait Anxiety Inventory; SSMT = symptoms of striated 
muscle tension; SAA = symptoms of autonomic arousal; Factor A = Sizothymia vs. Affectothymia; 
Factor B = Low Intelligence vs. High Intelligence; Factor C = Lower Ego Strength vs. Higher Ego 
Strength; Factor E = Submissiveness vs. Dominance; Factor F = Desurgency vs. Surgency; Factor 
G = Weaker Superego Strength vs. Stronger Superego Strength; Factor H = Threctia vs. Parmia; Factor I 
= Harria vs. Premsia; Factor L = Alaxia vs. Protension; Factor M = Praxernia vs. Autia; Factor N 
= Artlessness vs. Shrewdness; Factor O = Untroubled Adequacy vs. Guilt Proneness; Factor Q1 = Con- 
servatism of Temperament vs. Radicalism; Factor Q2 = Group Adherence vs. Self-sufficiency; Factor Q3 
= Low Self-sentiment Integration vs. High Strength of Self-sentiment; Factor Q4 = Low Ergic Tension 
vs. High Ergic Tension. 
(a) Keyed so that 1 = male, 2 = female. 
(b) Keyed so that 1 = did not consider meditation prior to the project, 2 = did consider meditation prior to 
the project. 
(c) Keyed so that 1 = did not consider therapy, 2 = did consider therapy. 
(d) 16 Personality Factor Questionnaire (16 PF) factors were obtained by pooling scores from Forms A and B 
of the 16 PF with scores from the Neuroticism Scale Questionnaire. 
(e) This and the following variables were taken from the Tennessee Self-Concept Scale. 



to have “exact calculating” minds and tend 
to be emotionally detached and disciplined, 
ambitious, and esthetically fastidious. They 
tend not to be gregarious or to get “warmly 
emotionally involved” with others. 

My hypothesis that the TM dropout is 
disturbed, anxious, withdrawn, and somewhat 
lacking in self-criticism appears to be sup- 
ported. Although only three dropouts reported 
for posttesting, they had extremely high 
Psychoticism scores. Fitts (1965) reports the 
“normal limits” on Psychoticism to be from 
34 to 54 and the average score of hospitalized 
psychotics to be 62. The average Psychoticism 
score of the TM dropout was 63 (SD = 4.58), 
whereas the average score of those who 
continued was 48.74 (SD = 6.46). In addition, 
the dropouts scored lower in Self-criticism. 


Table 2 
Demographic and Pretest Personality Correlates of Continuation(a) for 
Transcendental Meditation (TM) and Periodic Somatic Inactivity (PSI) 

                                       TM                         PSI
Variable                          r        n        p        r        n        p
Sex                            .025       22     .456    -.158       27     .215
Age                            .164       21     .239     .281       27     .078
Considered meditation          .208       22     .176    -.267       27     .090
Considered therapy             .397       22     .034     .158       27     .215
A-Trait                        .180       22     .212    -.032       27     .436
SSMT                           .021       22     .463    -.069       27     .367
SAA                            .014       22     .476    -.039       27     .423
Factor A                       .194       21     .200    -.310       27     .058
Factor B                      -.170       21     .230    -.159       27     .214
Factor C                       .012       21     .480     .338       27     .042
Factor E                      -.094       21     .343     .003       27     .495
Factor F                       .236       21     .152    -.440       27     .011
Factor G                      -.028       21     .451    -.052       27     .398
Factor H                       .056       21     .404    -.119       27     .278
Factor I                       .211       21     .179    -.109       27     .294
Factor L                      -.337       21     .067    -.548       27     .002
Factor M                       .161       21     .243     .130       27     .260
Factor N                       .038       21     .436     .455       27     .009
Factor O                       .123       21     .297    -.377       27     .026
Factor Q1                     -.232       21     .156     .156       27     .219
Factor Q2                     -.329       21     .073     .267       27     .089
Factor Q3                     -.108       21     .320     .342       27     .040
Factor Q4                      .280       21     .109    -.298       27     .066
Marlowe-Crowne                -.034       22     .441     .093       27     .322
Defensive positive            -.105       22     .320    -.012       26     .476
Psychosis                     -.586       22     .002     .039       26     .426
Personality disorder          -.054       22     .405     .135       26     .255
Personality integration       -.032       22     .443    -.044       26     .416
Self-criticism                 .409       22     .029    -.193       26     .173
Total positive                 .028       22     .450     .076       26     .357

Note. A-Trait = Trait Anxiety scale of the State-Trait Anxiety Inventory; SSMT = symptoms of striated 
muscle tension; SAA = symptoms of autonomic arousal; Factor A = Sizothymia vs. Affectothymia; 
Factor B = Low Intelligence vs. High Intelligence; Factor C = Lower Ego Strength vs. Higher Ego 
Strength; Factor E = Submissiveness vs. Dominance; Factor F = Desurgency vs. Surgency; Factor 
G = Weaker Superego Strength vs. Stronger Superego Strength; Factor H = Threctia vs. Parmia; Factor I 
= Harria vs. Premsia; Factor L = Alaxia vs. Protension; Factor M = Praxernia vs. Autia; Factor N 
= Artlessness vs. Shrewdness; Factor O = Untroubled Adequacy vs. Guilt Proneness; Factor Q1 = Con- 
servatism of Temperament vs. Radicalism; Factor Q2 = Group Adherence vs. Self-sufficiency; Factor Q3 
= Low Self-sentiment Integration vs. High Strength of Self-sentiment; Factor Q4 = Low Ergic Tension 
vs. High Ergic Tension. 
(a) Keyed so that 1 = did not practice at least once during the last month of the project, 2 = did practice at 
least once during the last month of the project. 

Again, a somewhat different picture emerges 
of those who stop PSI. Consistent with what 
was predicted, they score higher on 16 PF 
factors related to anxiety (Factors L, O, Q3, 
and C). However, they do not score higher on 
Psychoticism or lower on Self-criticism. Tenta- 
tively, they appear to be suspecting and are 
prone to dwell on frustrations (Factor L), as 
well as being a bit naive and gregarious 
(Factors N and F). 

Examining the overall pattern of these 
results, it appears that there is virtually no 
overlap between the variables correlated with 
continuation and outcome for TM and for 
PSI. Yet I had hypothesized that if the same 
therapeutic processes were operating in the 
two treatments, they should work for the same 
types of individuals. I propose that a key to 
interpreting this inconsistency can be found 
in the possible interaction between treatment 
rationale and treatment outcome. 

The rationale given for a treatment is fast 
emerging as a variable that can mediate 
attention placebo and possibly actual treat- 
ment effects (Borkovec & Nau, 1972; Mc- 
Reynolds, Barnes, Brooks, & Rehagen, 1973; 
Rosen, 1975, 1976). A treatment with a 
rationale that lacks credibility can be less 
effective than a treatment with a credible 
rationale. A rationale that imparts the expec- 
tation that a set of procedures is not thera- 
peutic, or does not constitute a treatment, can 
reduce the effectiveness of those procedures. 
In addition, different rationales might render 
treatments appealing, credible, and as a result, 
effective for different types of individuals. For 
example, Fish (1973) described a case in which 
a patient from the hippie counterculture 
responded to systematic desensitization when 
It was presented as a consciousness-raising 
technique, whereas an engineer responded to 
the same technique when it was explained in 
terms of reciprocal inhibition. 

Following what is common procedure in 
outcome research using attention placebo 
controls, highly credible but different rationales 
were given for TM and PSI. These rationales 
may have rendered the treatments credible, 
appealing, and effective for different types of 
individuals. The TM rationale, a religious-philosophical 
system from Hindu Vedantic 
tradition, can be summarized briefly:¹ 

The TM technique involves passively and continu- 
ously attending to a meaningless thought or word called 
a mantra. As this thought repeats in meditation, it is 
experienced at progressively earlier phases in its 
development as increasingly subtle, fine, and charming. 
This process is similar to following bubbles arising in an 
ocean to their source in the ocean bed. 


As one attends to the mantra, distracting thoughts, 
images, and feelings spontaneously emerge. These are 
treated with detached acceptance; one simply favors 
the mantra as soon as he or she recognizes that attention 
has wandered. Such distractions are normal and 
indicate that stress is being released. 


As a thought is experienced at developmentally earlier 
stages, relaxation increases, which, in turn, dissolves 
neurological knots of tension. One’s mind becomes 
increasingly still, like a rippleless pond or the depths of 
an ocean. One approaches the source of thought, the 
eternal and absolute field of transcendental being, 
creative intelligence. All aspects of life are thereby 
enriched and improved. 


It takes little imagination to see how the 
TM rationale may be highly appealing and 
credible to Sizothymic and Autic individuals. 
As described earlier, such people tend to be 
emotionally cool, steady, and detached, and 
might find the TM metaphors relating to 
stillness and detachment appealing. They tend 
to be “charmed” by “inner creations,” a 
characteristic that aptly summarizes the gist 
of the claimed TM process. And they tend to 
be unconventional and interested in art, theory, 
basic beliefs, and spiritual matters. The TM 
philosophy is blatantly spiritual and relies 
heavily on visual “artistic” metaphors. 


¹ The specific content of the rationale given in TM 
lectures and instruction is not available to the non-TM 
researcher. However, after interviewing two 
instructors and 15 practitioners, I concluded that much 
of the rationale is similar to accounts published in the 
popular press by TM instructors and advocates 
Maharishi Mahesh Yogi (1968) and Bloomfield, Cain, 
and Jaffe (1975). The rationale presented in this article 
is derived from the accounts of Maharishi and Bloom- 
field et al. 

It should be noted that the TM program is shrouded 
with considerable secrecy. Not only are the specific 
content of lectures and instruction unavailable to the 
outside researcher, but TM practitioners are urged 
not to describe their technique or training to others. 
If psychologists wish to learn TM, they are asked to 
promise not to divulge the technique. Secrecy tends to 
be reinforced by the requirement that TM research 
proposals must first be approved by the TM organiza- 
tion. In my opinion, these restrictions pose serious 
ethical problems for the TM researcher. As exemplified 
in the present study, such secrecy limits the extent to 
which procedural components of the TM program can 
be isolated and investigated. At least one study on TM 
has been prematurely terminated partly because of the 
ethical difficulties associated with secrecy (White, 
1976). 

In contrast to the TM rationale, the PSI 
rationale was highly intricate and mechanistic. 
The person who benefits from PSI was 
described as “shrewd, astute, exacting, calcu- 
lating, ambitious,” and having complex tastes. 
Such a person might well find the scientific- 
sounding precision and ambitious complexity 
of the PSI rationale highly credible and 
appealing. 

Another interpretation of these findings is 
that the treatment processes basic to TM and 
PSI are in fact different and are effective for 
different types of individuals. Indeed, the act 
of meditation has frequently been claimed to 
have therapeutic properties (Smith, 1975b), 
and Pratap (1972) argues that sitting erect 
with eyes closed may in itself be therapeutic. 
However, if sitting and meditation are equally 
therapeutic for different individuals, then one 
would expect more people to benefit from TM 
than from PSI, since TM is done in a seated 
position. This was clearly not the case. The 
most direct way out of this inconsistency is to 
make the somewhat awkward assumption that 
something in TM, perhaps some undisclosed 
aspect of the TM rationale, suppresses the 
therapeutic impact of erect sitting. 

However they may be interpreted, the find- 
ings of this experiment clearly show that 
personality characteristics are correlated with 
continuation and outcome in meditation. I 
propose that this finding underlines the 
importance of paying heed to questions of 
specificity. The trend in past research has been 
to speak of a global meditation response 
experienced to some degree by all meditators, 
with the same overall desirable effect. Medi- 
tation is quite likely a heterogeneous phe- 
nomenon, producing effects ranging from sleep 
to enlightenment, and incorporating such 
diverse processes as insight, desensitization, 
and suggestion. It is time that meditation re- 
searchers examine the question of who experi- 
ences what state and trait changes with which 
technique. 


Anxiety: States, Traits—Situations? 
anxiety in predicting state anxiety reactions. Ninety-six male subjects pre- 


The state-trait model of anxiety (Spiel- 
berger, 1972) is based on the conceptual 
framework of transitory anxiety states (A- 
State) and relatively stable predispositions or 
traits (A-Trait). The task of state-trait 
researchers, according to Spielberger (1972), 
is to “describe and specify the characteristics 
of stressor stimuli that evoke differential levels 
of A-State in persons who differ in A-Trait” 
(p. 39). 

The central notion of the state-trait model 
is that persons high in A-Trait have a greater 
tendency to perceive situations as dangerous or 
threatening than persons who are low in A- 
Trait, and thus they are expected to respond 
to threatening situations with state anxiety 
elevations of greater intensity. Essentially, 
Spielberger views trait anxiety as the measure 
of anxiety proneness from which predictions of 
state reactions can be made. 


S-R Inventory of Anxiousness (Endler, Hunt, 
& Rosenstein, 1962) and its revisions (Endler 
& Okada, 1975). The major point that Endler 
and his colleagues have made is that an 
appropriate assessment of anxiety must con- 
sider all sources of variability: individual 
differences, the responses that characterize 
anxiety, and the situations that are likely to 
arouse anxiety. 

Recently, Endler (1975) viewed the situa- 
tional component as vital for predicting state 
anxiety reactions. He made this position clear 
when he stated that 


if one wants to examine the interaction of physical 
threat and A-Trait on state anxiety, it is necessary to 
assess physical danger A-Trait independent of other 
facets of A-Trait. (p. 161) 


Within the interaction model the state-trait 
relationship is essentially similar to that of the 
state-trait model with the exception that trait 
anxiety is multidimensional and not uni- 
dimensional. The interaction model and the 
state-trait model agree that subjects high in 
trait anxiety will show greater state anxiety 
reactions under stress than will subjects low 
in trait anxiety, but the interaction model 
separates from state-trait theory in the 
specificity of the trait measure needed to make 
the differential state anxiety predictions. 

The results of research using Spielberger, 
Gorsuch, and Lushene’s (1970) State-Trait 
Anxiety Inventory (STAI) have indicated 
that individual differences in anxiety proneness 
(Trait Anxiety scale—A-Trait) are relatively 
stable and impervious to stress (Auerbach, 
1973a, 1973b; Spielberger, 1972; Spielberger, 
Auerbach, Wadsworth, Dunn, & Taulbee, 
1973), whereas the State Anxiety scale (A- 
State) has been found to be sensitive to various 
stresses (Hodges & Spielberger, 1969; Kendall, 
Finch, Auerbach, Hooke, & Mikulka, 1976). 
In addition, stressful situations of an ego- 
threatening nature have been found to evoke 
greater increases in state anxiety for high- 
trait-anxious than for low-trait-anxious sub- 
jects (Auerbach, 1973a; Hodges & Spielberger, 
1969; O'Neil, Spielberger, & Hansen, 1969; 
Rappaport & Katkin, 1972) except where 
subjects have equal requisite skills (Kendall 
et al., 1976). On the other hand, in situations 
involving physical danger, such as threat of 
electric shock (Hodges & Spielberger, 1966), 
imminent surgery (Auerbach, Kendall, Cuttler, 
& Levitt, 1976; Johnson, Dabbs, & Leventhal, 
1970; Spielberger et al., 1973), or films depict- 
ing physically painful accidents (Kendall et al., 
1976), state anxiety reactions have been found 
to be unrelated to level of trait anxiety. 

The S-R Inventory of General Trait 
Anxiousness (S-R GTA; Endler & Okada, 
1975) is a self-report inventory designed to 
measure trait anxiety and emphasizes the 
importance of measuring trait anxiety in 
specific situations. The S-R GTA has been 
found to be both a reliable measure of 
trait anxiety and one that is insensi- 
tive to stress (Endler & Okada, 
1975). In addition, the authors reported that 
the situational scales were found to be rela- 
tively independent measures. 

As yet, no known study has compared the 
accuracy of the S-R GTA situation-specific 
measures and the STAI for predicting differ- 
ential state anxiety reactions. The purpose of 
the present study was to conduct a comparison 
of the models of anxiety by investigating 
differential state anxiety reactions in two 
stress situations—a physical danger stress and 
an evaluation stress. It was hypothesized 
that high-physical-danger/trait-anxious sub- 
jects (S-R GTA Physical Danger trait 
measure) would show greater state anxiety 
elevations than low-physical-danger/trait-anx- 
ious subjects for the physical danger stress, 
and that high-evaluation-trait (S-R GTA 
Evaluation trait measure) and high A-Trait 
(STAI) subjects would show greater state 
anxiety elevations than low evaluation trait 
and A-Trait anxious subjects for the evaluation 
stress. The basic hypothesis was that the trait 
anxiety measure corresponding to the situation 
would be the best predictor of the state 
anxiety aroused in that situation. For the S-R 
GTA each situational trait was expected to 
be predictive of that stress situation, whereas 
for the STAI A-Trait previous research in 
evaluation situations (see above), the uni- 
dimensionality of the scale (Kendall et al., 
1976), and the anticipated relationship between 
the STAI A-Trait and S-R GTA Evaluation 
trait measure are indicative of accurate pre- 
diction only in evaluation stress situations. 
Confirmation of the hypothesis would provide 
support for the interaction model and would 
emphasize the importance of assessments of 
situational trait anxiety. 


Method 


Subjects 

The subjects in the present experiment were 101 
male college students selected from a pool of 173 male 
students (M age = 20.1) at an urban Virginia uni-
versity. Only male subjects and a male experimenter
participated to control for reactions characteristic of
the Subject Sex × Experimenter Sex interaction. For
example, it was felt that males might protect their
masculine image with a female experimenter and would
subsequently report less anxiety, whereas females might
seek to appear in a feminine stereotyped role with a
male experimenter.

Subjects were selected based on their scores on 
preexperimental measures of trait anxiety. The subject 
selection procedure was designed to gather subject 
groups of high and low trait anxiety on the three
preexperimental trait measures. The criterion for
Table 1
Means, Standard Deviations, and Range
of Scores of the Three Trait Measures


Trait measure               M       SD      Range
S-R GTA Physical Danger     54.09   12.19   18-75
S-R GTA Evaluation          44.34   12.32   18-73
STAI A-Trait                38.89    8.85   22-59


Note. S-R GTA = S-R Inventory of General Trait 
Anxiousness; STAI A-Trait = Trait Anxiety scale 
of the State-Trait Anxiety Inventory. 


inclusion in the subject groups required scores that were 
in the upper 40% for the high-trait-level group and the 
lower 40% for the low-trait-level group. Also, to control 
for the other trait measures, subjects selected as high 
(or low) on the physical danger trait measure (S-R 
GTA) were not in the upper (or lower) 40% of the 
evaluation trait and A-Trait (STAI) scores. Likewise,
subjects selected for high (or low) evaluation trait
(S-R GTA) or A-Trait (STAI) groups were not in the
upper (or lower) 40% of the physical danger trait 
scores. The means, standard deviations, and ranges of 
the trait scores are presented in Table 1. Three subjects 
were eliminated because they had previously seen the 
stress film, and 2 subjects were randomly eliminated to 
achieve an equal number of subjects in each cell. The 
remaining 96 subjects, 56 whites and 40 blacks, were 
distributed comparably across the six subject groups 
(one high and low group for each trait measure). 
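The selection rule described above can be sketched as follows. This is only an illustration: the function names, the dictionary layout of the subject pool, and the handling of ties at the cutoffs are assumptions, since the article does not say how the 40% cutoffs were computed.

```python
def upper_cut(scores, frac=0.40):
    """Smallest score that still falls in the upper `frac` of the pool."""
    ordered = sorted(scores)
    k = max(1, round(len(ordered) * frac))
    return ordered[-k]


def lower_cut(scores, frac=0.40):
    """Largest score that still falls in the lower `frac` of the pool."""
    ordered = sorted(scores)
    k = max(1, round(len(ordered) * frac))
    return ordered[k - 1]


def select(pool, primary, controls, level, frac=0.40):
    """Ids that are high (or low) on `primary` but not on any control measure.

    pool: {subject_id: {measure_name: score}} (hypothetical layout).
    level: "high" or "low".
    """
    if level == "high":
        cut = upper_cut
        extreme = lambda score, cutoff: score >= cutoff
    else:
        cut = lower_cut
        extreme = lambda score, cutoff: score <= cutoff
    measures = [primary, *controls]
    cuts = {m: cut([s[m] for s in pool.values()], frac) for m in measures}
    return [
        sid
        for sid, s in pool.items()
        if extreme(s[primary], cuts[primary])
        and not any(extreme(s[m], cuts[m]) for m in controls)
    ]
```

For example, `select(pool, "pd", ["ev", "stai"], "high")` would return the candidates for the high physical danger group under these assumptions.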


Measure of State Anxiety 


State anxiety was measured using the A-State portion 
of the STAI. This scale consists of 20 descriptive 
statements that require the subjects to individually 
endorse on a 4-point scale (not at all, somewhat, 
moderately so, very much so) the degree to which each 


statement characterized their feelings at a particular
moment in time.


Measures of Trait Anxiety 


thought runs 
and bothers me”) by endorsing 1, 2, 


ever,” “sometimes,” 


The S-R GTA consists of 15 items for each
of five situations in which trait anxiety is assessed.1 The
two situations used in this study were (a) “You are in
situations where you are about to or may encounter
physical danger” and (b) “You are in situations where
you are being evaluated by other people.” The 15 items
for each of these situations required responses from 1 to
5 indicating “very much” to “not at all.”
Types of Stress 


A physical danger and an evaluation stress were 
arranged to induce anxiety within an experimental 
laboratory setting. 

Physical danger stress. This stress was a 22-minute
Harvest (1970) film entitled In the Crash. The film
presents graphic scenes of automobile crash tests at
both high and low speeds. These crash tests were filmed
under experimental as well as real highway circum-
stances. In the Crash has been used in previous research
and has been found to be an effective stressor (Kendall
et al., 1976).

Evaluation stress. The evaluation stress was a 
decoding task in which a word problem had to be 
decoded into an arithmetic solution. The problem is as
follows:


 DONALD
+GERALD
-------
 ROBERT

Subjects were instructed that letters had been substi- 
tuted for numbers and that the task would be to decode 
the letters. One part of the solution, D = 5, was 
provided. Support for the stressful nature of the task 
is found in previous research using similar tasks (Finch, 
Kendall, Montgomery, & Morris, 1975). To maximize 
the stress, ego-involving instructions (Spence & Spence, 
1966) and a brief failure instruction (Finch et al., 1975) 
were included. The exact instructions and time limit 
are presented in the Procedure section. 
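To make the decoding task concrete, the sketch below searches for digit assignments that satisfy DONALD + GERALD = ROBERT with the given hint D = 5. It is a brute-force illustration written for this text, not something the experimenter used; the function names are hypothetical, and the rule that leading letters cannot be zero is an assumption.

```python
from itertools import permutations


def word_value(word, digit_of):
    """Read a word as a base-10 number under a letter-to-digit mapping."""
    value = 0
    for letter in word:
        value = value * 10 + digit_of[letter]
    return value


def solve(d_value=5):
    """Try every assignment of the remaining digits to the remaining letters."""
    free_letters = "ONALGERBT"  # the nine letters other than D
    free_digits = [d for d in range(10) if d != d_value]
    solutions = []
    for perm in permutations(free_digits):
        digit_of = dict(zip(free_letters, perm))
        digit_of["D"] = d_value
        if digit_of["G"] == 0 or digit_of["R"] == 0:
            continue  # assumed rule: no leading zeros
        total = word_value("DONALD", digit_of) + word_value("GERALD", digit_of)
        if total == word_value("ROBERT", digit_of):
            solutions.append(digit_of)
    return solutions
```

Under these assumptions the search finds a single assignment, 526485 + 197485 = 723970, which illustrates why few letters could be decoded by hand within a short time limit.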


Procedure 


Subjects participated in groups according to one of
two sequences. The sequences pertained to the order
of presentation of the stress situations (physical danger
then evaluation and vice versa). The two sequences
were included as a counterbalanced control for order
and carryover effects. The stress sequence for a given
evening was decided on a restricted random basis
before subjects arrived.

As subjects arrived at the laboratory room for their
7:00 p.m. appointment, they were instructed to take a
seat and get comfortable. The experiment began after
a 5-10 minute late grace period. Subjects were
informed by the experimenter that


I am interested in the relationship between feelings
and behavior, and in order to investigate this you
will be asked to fill out the “How I feel question-
naire” at certain times and later to describe some
behaviors.


1 The S-R Inventory of General Trait Anxiousness,
which was published in the Journal of Consulting and
Clinical Psychology, 1975, 43, 319-329, has only four
situations. The revised version has five situations with
15 items per situation. The revised version is available
from Norman S. Endler, York University, Toronto,
Canada.
Following a pause, the experimenter stated that


I will be instructing you throughout your participa-
tion, so listen and you will find your job easy. If you
are uncertain at any time about what you are to do, 
please feel free to ask. 


Once everyone appeared ready, the experimenter 
continued by saying “What I would like you to do first 
is to complete the How I feel questionnaire.” The 
STAI A-State scale was distributed with pencils, the 
standard instructions were read from the top of the 
form, and the forms were collected when all subjects 
had completed them. 

Subjects were then instructed that they would be 
watching a film. They were told to “pay attention and 
try to get into the film, but you should not try to 
remember facts or details about the movie because I 
will not be asking questions—just watch and get the 
feeling.” Initially, and when necessary, subjects were 
instructed not to talk to any of the other people during 
the course of the study.2

Following the film, subjects were again asked to 
complete the How I feel questionnaire by reporting 
how they had felt during the film. Subjects were next 
informed that the experimenter would have to rewind 
the film and that there were magazines available to 
read. The experimenter provided numerous sports and
entertainment magazines (prescreened to provide an
absence of stressful material, e.g., car crash pictures)
and suggested that the subjects stand, stretch, relax,
and look through the magazines. Subjects were re-
minded not to converse with each other.

When the film was rewound (approximately 10
minutes), the experimenter announced, “Now I want
each of you to do something else.” A pencil and a sheet
of blank white paper (8½ × 11 inches) were given to
each subject at his desk, and he was told to write his
name across the top of the page. The experimenter
then wrote the evaluation stress task on the blackboard
and provided the following instructions:


This is an addition problem, only letters have been
substituted for the numbers. For instance, D stands
for 5. In each place where there is a D there is really
a 5. Your task is to solve or decode the rest of the
letters. There is a possible solution, I guarantee it.
You must work on this individually, and you will be
allowed a sufficient amount of time to work out the
problem.


Questions were answered at this time, but no additional
information was given. Finally, the experimenter
provided an ego-involving comment. “This task will
give me information about your abilities, that I
can use to evaluate you.” After 3 minutes subjects
were told, “OK, everyone please stop and turn your
paper over. Do not talk to your neighbors.” The STAI
A-State form was again distributed to the subjects
while the experimenter commented:


Fill this sheet out according to how you felt during the
task. You all should have gotten the solution or at
least most of it, but if you didn’t, I will show you
the solution at the end of tonight’s projects.
No subject solved the task in the allotted time, and
all subjects should have experienced some failure and
thus maximized the evaluation stress. When all STAI
A-State forms were completed and collected, subjects 
were told that “the project is almost over, and you 
should sit and relax for a few minutes while I get these 
papers straight.” After a short break subjects were 
asked to fill out a questionnaire not related to the 
present study. 


Debriefing 


Although the present study was rather straight- 
forward, the debriefing explained to the subjects the 
actual interests of the experimenter and a brief back- 
ground for the study. Subjects were also told that the 
experimenter had never intended to ask them to describe 
behavior as he had stated originally. The answer to the 
decoding task was presented along with a clarification 
of its true difficulty and an explanation that most 
people should not have been able to solve the problem 
in the allotted time. Subjects were asked not to discuss 
the project with other students but were told that they 
could talk among themselves about it. When the 
questions and discussion were completed, subjects were 
dismissed.

Only one session was held on a given evening to 
prevent subjects just finishing the project from passing 
on information to subjects just arriving. This also 
prevented subjects from arriving early and overhearing 
the debriefing session. 


Results 


Preexperimental Measures 


Correlations of the trait measures using the
original pool of 173 subjects resulted in a .38
correlation between the S-R GTA Physical
Danger and Evaluation measures, a .19 correla-
tion between the S-R GTA Physical Danger
and STAI A-Trait measures, and a .52 correla-
tion between the S-R GTA Evaluation and
STAI A-Trait measures. t tests for differ-
ences between the correlations revealed
that the S-R GTA Evaluation/STAI A-Trait
correlation was significantly greater than the
S-R GTA Physical Danger/STAI A-Trait
correlation, t(167) = 4.33, p < .001, and that
the S-R GTA Evaluation/STAI A-Trait
correlation was also significantly greater than
the S-R GTA Evaluation/Physical Danger
correlation, t(167) = 1.71, p < .05. These
analyses indicated that the STAI A-Trait
and S-R GTA Evaluation measures correlated
more with each other than either did with
the S-R GTA Physical Danger measure.

The means and standard deviations of the
initial prestress A-State scores for the
subject groups are presented in Table 2. An
analysis of variance of the prestress scores
[...] significantly, F(5, 90) = [...].


Table 2
Means and Standard Deviations of the Initial
(Prestress) A-State Scores for the Subject
Groups


                               Initial prestress
                                   A-State
Trait measure and level          M        SD
S-R GTA Physical Danger
  High                         33.00     5.69
  Low                          35.62     6.48
S-R GTA Evaluation
  High                         34.37     8.41
  Low                          30.81     7.67
STAI A-Trait
  High                         37.06     7.47
  Low                          32.56     6.43


Note. S-R GTA = S-R Inventory of General Trait
Anxiousness; STAI A-Trait = Anxiety Trait scale
of the State-Trait Anxiety Inventory.


2 In my previous research, induced stress tended to
increase the conversation level. To prevent intersubject
comparisons, subjects were instructed not to talk to
[...] in which the evaluation stress was identical [...]
exception: [...]


Experimental Checks


Stress. The stressful nature of the film was
demonstrated by a significant t test for differ-
ences between related means of the initial and
[...] scores of A-State, t(95) = 6.08, [...].
Similarly, it was demonstrated, [...], p < .001,
that the evaluation stress produced elevations
in state anxiety. In addition, [...] showed that
all subjects performed at [...] level. Specifically,
no one solved more than one of the letters
beyond the information given.

Counterbalancing. To confirm the utility of
counterbalancing the stress presentations, an
analysis of variance of difference scores was
conducted for subjects receiving the physical
danger stress first and for those receiving it
second. There was not a significant difference
between the two, F(1, 94) = 2.13. Similarly,
an analysis for the sequence of the evalua-
tion stress revealed no significant difference,
F(1, 94) = 2.86.

Experimental sessions. Two additional
analyses were carried out to test whether there
were meaningful variations in the state anxiety
difference scores due to the experimental
sessions. To this end, a single factor analysis
of variance of A-State difference scores for the
five experimental sessions was conducted. The
results yielded no significant difference in
physical danger or evaluation stress reactions
for subjects participating in any of the sessions,
F(4, 91) = .75, p > .10; F(4, 91) = 2.45, p
> .10, respectively. This outcome indicated
that different experimental sessions did not
produce different results.


Major Hypothesis


The major hypothesis was that the change
in A-State from an initial prestress period to a
stress period would be greater for high-trait-
level subjects than for low-trait-level subjects
when the trait measure corresponded to the
stressor. Thus, the A-State difference score
(stress score minus initial prestress score) was
the dependent measure. This hypothesis was
examined via a 3 × 2 × 2 analysis of variance
in which type of stress was a within variable
and trait level and trait measure were between
variables (a three-factor mixed design with
repeated measures on one factor; Winer, 1962,
pp. 337-344).

The results of this analysis indicated that the
main effect for trait level and the Type of
Stress × Trait Level × Trait Measure triple
interaction were significant, F(1, 90) = 7.84,
p < .01, and F(2, 90) = 9.52, p < .001, re-
spectively. The main effects for trait measure
and type of stress and the Trait Level
× Trait Measure, Type of Stress × Trait
Level, and Type of Stress × Trait Measure
interactions were not significant. These results
demonstrate that the change in state anxiety
was greater for high- than for low-trait-level
subjects but was not significantly different for
groups based on the trait measure or for the
type of stress.
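The comparison of correlations reported under Preexperimental Measures (.52 versus .19, and .52 versus .38, in the pool of 173 subjects) can be illustrated with Hotelling's t for two dependent correlations that share one variable. The choice of this test is an assumption: the article does not name the procedure it used, and it reports 167 degrees of freedom, whereas this formula gives N − 3 = 170. The sketch therefore yields values of roughly 4.5 and 1.8, close to but not identical with the reported 4.33 and 1.71.

```python
import math


def hotelling_t(r_ab, r_ac, r_bc, n):
    """Hotelling's t for H0: rho_ab == rho_ac, where variable a is shared.

    r_bc is the correlation between the two non-shared variables.
    Returns (t, degrees of freedom).
    """
    # Determinant of the 3 x 3 correlation matrix.
    det = 1 - r_ab**2 - r_ac**2 - r_bc**2 + 2 * r_ab * r_ac * r_bc
    t = (r_ab - r_ac) * math.sqrt((n - 3) * (1 + r_bc) / (2 * det))
    return t, n - 3


# Shared variable: STAI A-Trait (Evaluation/A-Trait vs. Physical Danger/A-Trait).
t_first, df_first = hotelling_t(0.52, 0.19, 0.38, 173)
# Shared variable: Evaluation (Evaluation/A-Trait vs. Evaluation/Physical Danger).
t_second, df_second = hotelling_t(0.52, 0.38, 0.19, 173)
```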

The nature of the significant triple inter- 
action is presented in Figure 1. This illustration 
shows how the state anxiety difference scores 
varied for the trait measures, how they 
differed for the types of stress, and also how 
they varied for the high- and low-trait-level 
subjects. An analysis of the predicted simple 
effects (t tests) indicated that for physical
danger, high-trait-level subjects were signifi-
cantly greater than low-trait-level subjects
on the A-State difference score under the
physical danger stress, t(30) = 3.87, p < .001,
but not significantly different under the evalua-
tion task, t(30) = .90, p > .10. When high-
and low-evaluation-trait subjects were com-
pared, there was no significant difference under
the physical danger stress, t(30) = .64, p
> .10, but there was a significant difference in
the evaluation task situation, t(30) = 2.62,
p < .01, with the high-evaluation-trait sub-
jects showing the greater A-State difference
score. When high and low STAI A-Trait
subjects were compared, there were no signifi-
cant differences under either the physical
danger stress, t(30) = .47, p > .10, or the
evaluation stress, t(30) = .56, p > .10. These
results indicated that the difference scores
of the high-trait-level subjects were greater
than those of the low-trait-level subjects when
the trait measures were congruent
with the stress but not when subjects were
divided into high and low groups on the basis
of the STAI A-Trait.
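The simple-effects comparisons above can be sketched as an independent-samples t test on A-State difference scores (stress score minus initial prestress score). The pooled-variance form is an assumption that is consistent with the reported 30 degrees of freedom for two groups of 16; the function names and any data passed in are hypothetical.

```python
import math
from statistics import mean, variance


def difference_scores(stress, prestress):
    """A-State difference score: stress score minus initial prestress score."""
    return [s - p for s, p in zip(stress, prestress)]


def pooled_t(high, low):
    """Independent-samples t with pooled variance; returns (t, df)."""
    n_high, n_low = len(high), len(low)
    df = n_high + n_low - 2
    pooled_var = ((n_high - 1) * variance(high) + (n_low - 1) * variance(low)) / df
    standard_error = math.sqrt(pooled_var * (1 / n_high + 1 / n_low))
    return (mean(high) - mean(low)) / standard_error, df
```

With 16 high-trait and 16 low-trait subjects, `pooled_t` returns df = 30, matching the form of the t(30) values reported in the text.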


Discussion 


The present study demonstrated that the
STAI A-Trait and the S-R GTA Evaluation
measures correlated significantly higher with
each other than either did with the S-R GTA
Physical Danger measure. This outcome
suggests that the STAI A-Trait measure is
more closely related to the S-R GTA Evaluation
measure than to the S-R GTA Physical Danger
measure and appears to support previous
research which found the STAI A-Trait to be
related to state anxiety in evaluation stress
situations (Auerbach, 1973a; Hodges & Spiel-
berger, 1969). Also of interest from the corre-
lational analyses is the relatively low relation-
ship among the measures of trait anxiety
(other than the STAI A-Trait/S-R GTA
Evaluation correlation). These correlations
support the notion of relatively separate
anxiety traits.


Figure 1. Mean A-State (State Anxiety scale) difference
scores for the two types of stress and for subjects high
and low on each of the trait measures. (A-Trait = STAI
Trait Anxiety scale.)


The efficacy of using a situational anxiety 
trait measure that is congruent with the stress 
situation in the prediction of state anxiety 
reactions was demonstrated in the present 
study. The trait measures that were viable
in the present study were two situation-specific
measures of anxiety traits, physical danger
and evaluation. When the subjects were
grouped according to their physical danger
trait, the high-trait-level subjects showed
significant differential responsiveness to the
physical danger stress but not to the evaluation
stress. The opposite was true when subjects
were grouped on the basis of their evaluation
trait scores. On the other hand, the use of a
nonsituational, unidimensional trait measure
(STAI A-Trait) did not predict differential
state anxiety reactions. Thus, the results of
the present study support both the need for
including situations in the measurement of
trait anxiety and, correspondingly, the inter-
action model of anxiety.

The present results also support the utility 
of the state-trait distinction, a distinction 
that has already been supported in both 
manipulation studies and factor-analytic work 
(Kendall et al., 1976; Newmark, Fasching-
bauer, Finch, & Kendall, 1975). However, the
present investigation of “states” and “traits”
suggests a clarification of their relationship:
Anxiety traits are predictive of anxiety states 
when the trait measure is congruent with the 
evocative situation. It appears that there is 
use for the inclusion of situational trait 
measures. 

The lack of differential state anxiety re- 
sponsiveness for high and low STAI A-Trait 
subjects under both stresses was unexpected. 
Since the STAI A-Trait measure correlated 
more with the S-R GTA Evaluation trait 
measure than with the physical danger 
measure, since previous research has reported 
greater state reactions for high STAI A-Trait 

subjects in evaluation stresses, and since the 
trait measure is unidimensional, the STAI 
A-Trait measure was viewed as indicative of a 
predisposition to become anxious in evaluation 
situations. Based on these findings, differential 
state reactions for high STAI A-Trait subjects 
were hypothesized under the evaluation stress. 
However, this hypothesis was not supported 
in the present study. 

In speculating about the reasons for the
insensitivity of the STAI A-Trait measure
in the evaluation stress situation, the previous
findings of Kendall et al. (1976) are suggestive.
In that study, in which subjects were shown to
have performed equally under stress and thus
were considered to have had relatively equal
requisite skills, high and low A-Trait subjects
reacted similarly to an examination stress. In
the present experiment subjects indeed per-
formed equally; no one decoded more than one
letter beyond the given information. The
present inability of the STAI A-Trait to
predict differential responses in an evaluation
stress could be due to the similar performance
of all subjects. However, this possibility is
directly contradicted by the results found when
the S-R GTA Evaluation measure was used
(i.e., high- and low-evaluation-trait anxious
subjects did respond differentially).

A more parsimonious speculation about the 
unexpected results using the STAI A-Trait 
measure concerns the measure itself. That is, 
whereas the STAI A-Trait measure requires 
subjects to indicate how they generally feel, 
it does not specify a situation in its assessment
of the anxiety trait. The ignoring of situational
specificity by the STAI A-Trait measure could
account for the present findings.

More importantly, the present study pro- 
vides major construct validation for the 
Physical Danger and Evaluation portions of
the S-R GTA. Since only two situations were 
directly investigated, the validity of the other 
situation traits is yet to be examined. But both 
the present study and the work of Endler and 
Okada (1975) suggest that it would be beneficial
to further examine situational measures of 
trait anxiety. 

The outcome of the present study has 
implications for future research in personality 
in general and anxiety in particular. The 
generality of this finding suggests that person- 
ality trait measures should be reexamined in 
light of the situational dimensionality of each 
particular trait. This reexamination would 
entail investigating all the situations related 
to the trait, organizing these situations into 
classes, and developing an instrument for the
assessment of each situational class. Thus, 
instead of continuing to assess cross-situational 
traits or attempting to measure situation- 
specific responses, researchers would potentially 
profit from focusing on the assessment of
traits in relation to classes of situations. 
Comparison of Measures of Adaptive Behaviors
in Preschool Children


Linda I. Garrity
Mendota Mental Health Institute and Waisman
Center on Mental Retardation and Human
Development, Madison, Wisconsin

Department of Behavioral Disabilities,


Six measures of adaptive behavior were compared to determine which tests and
items best discriminate between behavior [...] problem children scored significantly
[...] measures. In addition, fine-motor
items discriminated better than gross-motor, language, social, and behavioral
[...]


This study is based on a master’s thesis completed at
the University of Wisconsin by the second author.

Requests for reprints should be sent to Linda I.
Garrity, Mendota Mental Health Institute, 301 Troy
Drive, Madison, Wisconsin 53704.


The purpose of this study was to determine
whether problem and nonproblem preschool
children perform differently on six commonly
used measures of adaptive behavior.

Although a large variety of screening instru-
ments can be found in the literature, few studies
exist that simultaneously compare a number of
commonly used instruments on the same sub-
ject population. An exception is a study by
Cowen, Dorr, and Orgel (1971) in which 367
5- and 6-year-olds were administered the AML
Behavior Rating Scale, the Ottawa School Be-
havior Survey, the Teacher’s Adjective Check-
list, and the Teacher’s Behavior Rating Scale.
All four measures were significantly correlated,
and the latter two discriminated between ad-
justed and maladjusted children.

There exists a paucity of data on preschool
children despite the increasing emphasis on
early screening and identification of problems.
In addition, most studies comparing problem
and nonproblem children do not match the
[...] same chronological age, the results of these
studies are biased in favor of the nonproblem
population, making it difficult to evaluate the
instruments’ effectiveness.

In the present study, the following six in-
struments were administered to determine
which, if any, differentiate previously identified
groups of problem and nonproblem
children matched on both chronological and
mental age.


The Minnesota Child Development Inven-
tory (MCDI; Ireton & Thwing, 1972) is a
checklist filled out by the parent. The Denver
Developmental Screening Test (DDST; Frank-
enburg & Dodds, 1967) is a performance test,
relatively easy to administer by a trained ex-
aminer, in which the child is required to answer
questions, draw pictures, and exhibit motor
coordination. To date, no studies have com-
pared problem and nonproblem children on
these two tests. The present study compared
the two groups by examining the entire tests
as well as each individual subtest from the
MCDI and the DDST.

It was hypothesized that the behavior prob-
lem children would evidence higher scores (sug-
gesting greater maladjustment) on the above
four teacher behavior rating scales, lower de-
velopmental age on the MCDI (total test and
the eight subtests), and more delays (failure to
perform a task passed by 90% of the children
at the same chronological age) on the DDST
(total test and four subtests).


Method 
Subjects 


Eleven nonproblem and 13 behavior problem children
with no medical or physical problems participated in the
study. Of the 11 nonproblem children, 7 were white and
4 were nonwhite (2 black and 2 Chicano). Seven of the
nonproblem children were from middle-income families
(5 white, 2 nonwhite), and 4 were from lower-income
families (2 white, 2 nonwhite). Of the 13 problem chil-
dren, 11 were white and 2 were nonwhite. Eleven (10
white and 1 black) of the problem children were from
middle-income and 2 (1 white and 1 Chicano) were from
lower-income families. The children were selected from
nursery schools and day-care centers in the Madison,
Wisconsin, area. Approximately 75 letters and consent
forms were distributed to nursery school and day-care
centers. It is not known how many of these were di-
rectly given to parents. Thirty parents volunteered
participation and signed consent forms. Of these, 27
completed the study. Of the 27 original participants, 3
children were dropped from the study. The remaining
24 children were similar in both age and Peabody
Picture Vocabulary Test IQ scores. The problem chil-
dren had a mean age of 4.08 years (SD = .60) and a
mean IQ of 110.53 (SD = 11.30). The nonproblem
children had a mean age of 4.13 (SD = .57) and a mean
IQ of [...] (SD = 14.31). The t for age = .245; the t
for IQ = [...].

Children were assigned to problem and nonproblem
groups on the basis of their performance in three struc-
tured situations: (a) while being administered the Pea-
body Picture Vocabulary Test in school by the ex-
aminer; (b) while being administered Borke’s (1975)
adaptation of Piaget and Inhelder’s (1956) three moun-
tains task, Flavell, Botkin, Fry, Wright, and Jarvis’
(1968) block test, and the Coloured Progressive Mat-
rices (Raven, 1956) in the clinic by the second author;1
and (c) while participating in a mother-child interaction
study (completing puzzles) in the clinic for another ex-
aminer. Specific precaution was taken to insure that the
test items differentiated the children’s cognitive capac-
ity rather than attention span. The experimenter made
certain through repeated demonstrations that the chil-
dren understood each task before proceeding with the
next. In each situation, the children were evaluated on
a 6-point scale on each of two dimensions: degree of
compliance and degree of on-task behavior. Children’s
actual scores ranged from 6 to 33 (M = 19.5, SD = 9.6),
out of a possible range of 6 to 36. Three children whose
scores were within ½ standard deviation of the mean
were excluded from the study; the sample thus con-
sisted of 24 children.
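The scoring and exclusion rule can be sketched as follows: three situations, each rated from 1 to 6 on two dimensions, give a composite between 6 and 36, and children within half a standard deviation of the mean (M = 19.5, SD = 9.6) are dropped. The function names are hypothetical, and treating "within" as a strict inequality is an assumption.

```python
def composite_score(ratings):
    """Sum of 1-6 ratings on (compliance, on-task) across three situations."""
    assert len(ratings) == 3, "three structured situations"
    assert all(1 <= r <= 6 for pair in ratings for r in pair), "6-point scale"
    return sum(sum(pair) for pair in ratings)


def is_excluded(score, mean=19.5, sd=9.6):
    """True if the score lies within half a standard deviation of the mean."""
    return abs(score - mean) < sd / 2
```

With the reported mean and standard deviation, the excluded band runs from 14.7 to 24.3, so a composite of 20 would be dropped and a composite of 14 or 25 retained.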

The problem children represented both ends of the 
clinical spectrum. Based on the experimenter’s observa- 
tions, 5 of the 13 problem children showed evidence of 
withdrawn behavior and 4 exhibited negativistic and 
acting out behavior. The problems of the remaining 4 
children were less clearly defined, but they could be 
labeled anxious and immature as evidenced by whining, 


crying, and so on. 


Procedure 


After receiving parental permission slips, the experi- 
menter went to the child’s school and administered the 
Peabody Picture Vocabulary Test. At that time the 
teacher was asked to fill out the four behavior rating 
scales: the CARS, the AML, the TRS, and the SBS, 
which required about a total of ½ hour per child.

Mothers were invited to our center, whereupon each 
was asked to fill out the MCDI, which required about 
20 minutes. Both mother and child were then taken to a 
testing room where the DDST, the Coloured Progressive 
Matrices, the three mountains task, and the blocks test 
were administered to the child. 


Results 


t tests were used to compare the scores of the 
problem and nonproblem children on the 18 
measures (the six entire tests, the four DDST 
subtests, and the eight MCDI subtests). Sig- 
nificant differences between problem and non- 
problem children were obtained on 16 of the 18 
measures (see Table 1). 

Problem children scored significantly lower 
than their nonproblem counterparts on seven 


1 These three tests, which were administered along 
with the Denver Developmental Screening Test in the 
clinic, are part of another study. 
Table 1
Comparison of Mean Test Scores


                                          Nonproblem         Problem
Test                                      M        SD        M        SD       t

Minnesota Child Development Inventory
  Entire test                           305.54   10.14    282.15   19.20    [...]
  General Development                   123.45    5.09    112.23    6.72    [...]
  Personal-Social                        32.64    1.37     31.15    1.61    [...]
  Gross Motor                            31.27    1.13     30.00    1.62    [...]
  Fine Motor                             42.18    1.70     38.00    3.19    [...]
  Expressive Language                    53.54     .05     52.15    1.66    [...]
  Comprehension-Conceptual               60.36    3.44     51.76    6.83    [...]
  Situation-Comprehension                39.72    2.37     35.92    3.36    [...]
  Self-help                              33.90    2.90     31.69    3.66    [...]

Denver Developmental Screening Test
  Entire test (number of failures)         .09     .30      1.58    1.16    [...]
  Gross Motor                              .09     .30       .58     .66    2.2[...]
  Fine Motor                                 0       0       .58     .79    [...]
  Language                                   0       0       .25     .45    [...]
  Personal-Social                            0       0       .16     .38    [...]

Teacher Rating Scale                     34.45   11.40     50.25    9.26    3.[...]

Ottawa School Behavior Survey            19.72     .64     16.30    4.15    [...]

Classroom Adjustment Rating Scale         5.18    5.22     17.76   15.50    2.5[...]

Behavior Rating Scale                     6.63    2.61     11.92    7.47    2.23


Note. The degrees of freedom for the Denver Developmental Screening Test are 21; for all other tests they
are 22.
[...] p < .01.


of the eight MCDI subtests. Similar results
were found on the DDST. Again, the problem
children had lower developmental ages, as
evidenced by a greater number of delays ex-
hibited. The problem group had a total of 15
delays as compared to the nonproblem group,
which had only 1.

A discriminant function analysis was per-
formed using the six different tests in their en-
tirety. The formula used to obtain a discrimi-
nant score for each subject was .058 (MCDI)
+ .074 (CARS) − .165 (SBS) − .091 (AML)
+ .088 (TRS) + 1.0 (DDST) + 15.094. Thus


Table 2
Effectiveness of the Discriminant
Function Analysis


                     Classified as
               Problem        Nonproblem
Actual         n     p        n     p
Problem        9    .69       4    .31
Nonproblem     1    .09      10    .91


to classify one subject, the scores on the six 
tests would be computed and multiplied by 
the corresponding coefficient and the constant 
then added. To validate the above formula, an 
observation was successively omitted and the 
data for the remaining 23 subjects were used 
to develop the formula to classify the 24th 
subject (Lachenbruch & Mickey, 1968). 

The Lachenbruch and Mickey procedure was 
used to estimate the probability of misclassification 
when using the discriminant function. 
There were 5 errors out of 24 attempted classifications: 
One nonproblem child was classified 
as a problem child, and four problem children 
were classified as nonproblem. The associated 
probabilities are presented in Table 2. 
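
The scoring and validation steps described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' procedure: the coefficients and constant are those reported in the text, but the function names, the cutoff-free scoring, and the nearest-group-mean rule used to demonstrate the leave-one-out idea are assumptions introduced here (the study refit the full discriminant function on each pass, and the text does not say which direction of the score indicates a problem child).

```python
# Coefficients and constant as reported in the text; abbreviations are the paper's.
COEFFICIENTS = {
    "MCDI": 0.058,
    "CARS": 0.074,
    "SBS": -0.165,
    "AML": -0.091,
    "TRS": 0.088,
    "DDST": 1.0,
}
CONSTANT = 15.094


def discriminant_score(scores):
    """Multiply each test score by its coefficient, sum, and add the constant."""
    return sum(COEFFICIENTS[test] * scores[test] for test in COEFFICIENTS) + CONSTANT


def fit_group_means(training):
    """Hypothetical stand-in classifier: the mean of a single score per group."""
    sums, counts = {}, {}
    for value, label in training:
        sums[label] = sums.get(label, 0.0) + value
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def predict_nearest_mean(means, value):
    """Assign the label whose group mean is closest to the value."""
    return min(means, key=lambda label: abs(value - means[label]))


def leave_one_out_errors(samples):
    """Omit each observation in turn, refit on the rest, classify the omitted one."""
    errors = 0
    for i, (value, label) in enumerate(samples):
        training = samples[:i] + samples[i + 1:]
        if predict_nearest_mean(fit_group_means(training), value) != label:
            errors += 1
    return errors


def percent_correct(correct, total):
    """Whole-number percentage, as reported in Table 2."""
    return round(100 * correct / total)
```

With the counts in Table 2, `percent_correct(9, 13)` and `percent_correct(10, 11)` reproduce the reported 69% and 91%.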

Those items in each test that best differentiated 
(showed the greatest mean difference 
between) problem and nonproblem groups are 
listed in Table 3. 


Table 3 
Items Within Each Test with the Greatest Mean Difference Between 
Problem and Nonproblem Groups 

                                                           Subjects
Test item                                            Nonproblem  Problem  Difference

Minnesota Child Development Inventory
  Reads 4 or more words                                  .727      .077      .650
  Prints 2 or more simple words from memory              .636      .000      .636
  Ties shoelaces                                         .636      .000      .636
  Draws recognizable pictures                           1.000      .385      .615
  Plays table games with cards                           .909      .308      .601
  Prints first name                                      .818      .231      .587
  Jumps rope                                             .727      .154      .573
  Colors within lines in coloring book                  1.000      .462      .538
  Cuts with scissors, following simple outline          1.000      .462      .538
  Draws picture of man/woman with at least 6 parts       .909      .385      .524
  Talks in past tense correctly                          .909      .385      .524
  Prints a few simple words from memory                  .818      .308      .510

Ottawa School Behavior Survey
  Is obstinate                                           [illegible]
  Has difficulty learning                                [illegible]
  Is restless                                            [illegible]
  [surviving values, row assignment unclear: .909, .462, 1.000]

Classroom Adjustment Rating Scale
  Poor concentration, attention span                     [illegible]
  Difficulty following directions                        .091      .308     -.217
  Defiant, obstinate, stubborn                           [illegible]

Teacher Rating Scale
  Cooperative/uncooperative with adults                  [illegible]
  Good/poor self-control                                 [illegible]
  Empathic/unempathic                                    [illegible]
  [surviving values, row assignment unclear: .136, .273, -.311]

Denver Developmental Screening Test
  Fine motor skills                                      [illegible]
  Gross motor skills                                     [illegible]
  Language skills                                        [illegible]
  [surviving values, row assignment unclear: .292; .292, -.246; .000, .125, -.125]


Discussion 


This study compares the performance of 
problem and nonproblem children on six screening 
devices. Thirteen problem and 11 nonproblem 
children were matched on both chronological 
and mental age, a control procedure not 
ordinarily used. Assignment to problem and 
nonproblem groups was on the basis of [...] 
performance (on-task/off-task behaviors) [...]. 
Of the 18 measures of adaptive behavior compared 
in the study, 16 resulted in significantly 
poorer performance by the problem children. 

The MCDI, which requires about 20 minutes 
to fill out, could easily be given to all 
parents whose children are entering nursery 
school. In fact, all but one of the individual 
MCDI subtests could be used as a [...] mass 
screening device to identify [...]. The DDST 
also requires about 20 minutes, but it must be 
administered individually by a trained examiner. The four 
teacher rating scales, which also differentiated 
problem and nonproblem children, require no 
more than 5 minutes to complete and can be 
done by a teacher or an aide. Of the above six 
tests, only the MCDI and the DDST span the 
6-month to 6-year range; thus they would pre- 
sumably be useful with children even younger 
than preschool age. 

The discriminant function analysis indicated 
that problem children would be identified cor- 
rectly 69% of the time and the nonproblem 
children 91% of the time. It should be pointed 
out that the problem children in this study had 
no medical, physical, or previously identified 
behavioral problems, and all were functioning 
in typical day-care and nursery centers. Fur- 
thermore, problem and nonproblem children 
were similar on IQ, and all were functioning 
within the average range of intelligence. It is 
therefore expected that the probability of cor- 
rectly identifying the more extreme problem 
children would be substantially higher if a 
random sampling procedure were used. 

With regard to individual items, a possible 
relationship between early motor development 
and later behavioral and emotional problems is 
suggested. Of the 12 MCDI items that showed 
the greatest mean difference between problem 
and nonproblem children, 10 involved motor 
skills. Results on the DDST support this find- 
ing: In the nonproblem group, only one child 
failed or was delayed on an item that 90% of 
children with the same chronological age 
passed. In contrast, there were 15 delays (7 
fine motor, 5 gross motor, 2 language, 1 per- 
sonal-social) in the problem group. Thus, on 
both tasks nonproblem children functioned at 
a higher developmental level on motor skills in 
spite of the fact that the groups were matched 
on both age and IQ. 

A strong relationship between motor skills 
and behavioral and emotional problems was 
also found by Rider (1973), who compared 
groups of “normal” and “emotionally disturbed” 
children 6½-12½ years old on the Purdue 
Perceptual-Motor Survey and the Southern 
California Sensory Integration Tests. The emotionally 
disturbed children exhibited significantly 
more abnormal reflex responses than the 
normal children. More specifically, the emotionally 
disturbed children scored significantly 
lower on 11 of the 18 Purdue Perceptual-Motor 
Survey subtests and on 12 of the 16 Southern 
California Sensory Integration subtests. This 
relationship between motor development and 
emotional problems needs further exploration. 
If motor skills were found to be a good indicant 
of emotional problems, this would be particularly 
useful in screening very young children. 
Early identification of problem children in- 
creases the opportunity for effective interven- 
tion and treatment. 

There is clearly a need for screening devices 
designed to detect behavioral and emotional 
problems in young children. Further validation 
of the findings of the present study is needed, 
using a randomly selected population as well as 
younger children. 


Reference Note 


1. Cowen, E. L. Personal communication, May 21, 
1975. 


References 


Borke, H. Piaget’s mountains revisited: Changes in the 
egocentric landscape. Developmental Psychology, 1975, 
11, 240-243. 

Brownbridge, R., & Van Vleet, P. (Eds.). Investments in 
prevention: The prevention of learning and behavior 
problems in young children. San Francisco: Pace ID 
Center, 1969. 

Cowen, E. L., Dorr, D. A., & Orgel, A. R. Interrelations 
among screening measures for early detection of 
school dysfunction. Psychology in the Schools, 1971, 
8, 135-139. 

Cowen, E. L., Zax, M., Izzo, L. D., & Trost, M. A. Prevention 
of emotional disorders in the school setting: 
A further investigation. Journal of Consulting Psychology, 
1966, 30, 381-387. 

Flavell, J. H., Botkin, P. T., Fry, C. L., Wright, J. W., 
& Jarvis, P. E. The development of role-taking and communication 
skills in children. New York: Wiley, 1968. 

Frankenburg, W. K., & Dodds, J. B. The Denver Developmental 
Screening Test. Journal of Pediatrics, 
1967, 71, 181-191. 

Grossman, B. D., & Levy, P. S. A factor analytic study 
of coping behavior in preschool children. Journal of 
Genetic Psychology, 1974, 124, 287-294. 

Ireton, H. R., & Thwing, E. J. Minnesota Child Development 
Inventory. Minneapolis, Minn.: Behavior 
Science Systems, 1972. 




Lachenbruch, P. A. & Mickey, M. R. Estimation of 
error rates in discriminant analyses. Technometrics, 
1968, 10, 1-11. 

Piaget, J., & Inhelder, B. The child's conception of space. 
London: Routledge & Kegan Paul, 1956. 

Pimm, J., & McClure, G. Behavior observations of grade 
one pupils. Ottawa, Canada: Ottawa City School 
District, 1966. 

Raven, J. C. Coloured Progressive Matrices. London: 
Lewis, 1956. 

Rider, B. A. Perceptual-motor dysfunction in emotionally 
disturbed children. American Journal of 
Occupational Therapy, 1973, 26, 316-320. 

Zax, M., & Cowen, E. L. Research on early detection 
and prevention of emotional dysfunction in young 
school children. In C. D. Spielberger (Ed.), Current 
topics in clinical and community psychology (Vol. 1). 
New York: Academic Press, 1969. 


Received May 24, 1977 


Journal of Consulting and Clinical Psychology 
1978, Vol. 46, No. 2, 294-297 


Effects of Controlled Background Interference on Test 


Performance by Right and Left Hemiplegics 


Richard E. Nemec 
Rehabilitation Institute of Chicago 


The general and lateralized effects of background interference on verbal and 
perceptual-motor functioning were studied as a function of presence and lat- 
eralization of brain damage. Thirty non-brain-damaged controls and 30 right 
and 30 left hemiplegics were given a word-naming task and the Bender-Gestalt 
Test under noninterference and background interference conditions. As hypoth- 
esized, (a) brain-damaged patients had significantly greater overall interference 
effects than controls; and (b) laterality effects were significant, that is, verbal 
interference was greatest in the left-hemisphere-damaged group; perceptual in- 
terference was greatest in the right-hemisphere-damaged group. Implications for 
treatment programs with such patients are discussed. 


The laterality hypothesis states that the left 
hemisphere of the brain tends to be dominant in 
terms of verbal functioning, whereas the right 
hemisphere tends to be dominant for visual, 
spatial, or “nonverbal functioning.” This hy- 
pothesis has been the subject of a variety of 
studies. Many of them have upheld the hy- 
pothesis, particularly in regard to verbal func- 
tioning (Hecaen, 1962; Milner, 1967; Penfield 
& Roberts, 1959; Reitan, 1955; Sperry, 1965). 
The hypothesized function of the right hemi- 
sphere with regard to perceptual, visual, or 
spatial functions has not been as clearly sup- 
ported and has been subject to numerous but 
often contradictory studies (Hanvik & Ander- 
son, 1950; Hirshenfang, 1960; Milner, 1967; 
Warrington, James, & Kinsbourne, 1966). 

This study was designed as a further test of 
the laterality hypothesis. Verbal and perceptual 
interference were the independent variables 
used to assess the effect on verbal and percep- 
tual performance in right-hemisphere (left 
hemiplegics) and left-hemisphere (right hemi- 
plegics) brain-damaged patients and non- 
brain-damaged controls. 

The symptom of increased distractibility as 
a result of brain damage, resulting in the need 


Requests for reprints should be sent to Richard E. 
Nemec, Rehabilitation Institute of Chicago, 345 East 
Superior Street, Chicago, Illinois 60611. 


for limited, structured, and repetitive stimuli is 
commonly reported (Strauss & Lehtinen, 
1947). Cantor (1966) used this symptom of 
distractibility to modify the Bender-Gestalt, 
with the dependent variable being decrement 
with perceptual interference. The decrement in 
performance was found to be highly sensitive 
to organic brain damage, though the variable 
of laterality of lesion was not controlled. Decrement 
as a result of interference within the 
verbal area has been studied (Smith, 1966) 
but not applied to organic-nonorganic or 
laterality of lesion variables per se. Thus generalized 
as well as lateralized effects of distractibility 
on performance by brain-damaged 
subjects would be expected. 

Specifically, this study investigates the relationship 
of distractibility in the verbal and 
perceptual areas to performance on verbal and 
perceptual-motor tasks as a function of brain 
damage and laterality of lesion. Using the findings 
of Cantor (1967) and Strauss and Lehtinen 
(1947) concerning the general effects of 
brain damage on distractibility and the laterality 
literature concerning differential functioning 
of the left and right hemispheres, the 
following predictions were made: 


1. Background verbal interference in 
the verbal mode will cause a greater decrement 
in the verbal performance of organics than 
non-brain-damaged controls regardless of site of 
lesion. 

2. Background perceptual interference in 
the perceptual mode will cause a greater decrement 
in the perceptual performance of 
organics than non-brain-damaged controls 
regardless of site of lesion. 

3. Patients with lesions in the right hemisphere 
will show more of a decrement in the 
perceptual sphere with perceptual interference 
than patients with lesions in the left hemisphere. 

4. Patients with lesions in the left hemisphere 
will show more of a decrement in the 
verbal sphere with verbal interference than 
patients with lesions in the right hemisphere. 

5. Perceptual interference in the verbal 
sphere will show less of a decrement in performance 
than verbal interference in the 
verbal sphere for all groups. 

6. Verbal interference in the perceptual 
sphere will cause less of a decrement than perceptual 
interference in the perceptual sphere 
for all groups. 


Method 
Subjects 


The subjects in this study consisted of 90 patients 
involved in an active rehabilitation program in two 
rehabilitation hospitals. Of the 90 subjects, 60 were 
diagnosed by a neurologist on admission as having suffered 
a cardiovascular accident resulting in varying degrees 
of paralysis to the side of the body opposite that of the 
involved hemisphere. Thirty of these patients displayed 
primary involvement of the right hemisphere and 30 
displayed primary involvement of the left hemisphere. 
The control group consisted of 30 patients displaying no 
[illegible] central nervous system dysfunction on 
[illegible] and psychological evaluation. The 
non-brain-damaged controls were all involved in an inpatient 
rehabilitation program, had suffered some degree of 
physical impairment, and were selected from an age 
range similar to the cardiovascular accident patients. 
None of the subjects were diagnosed as psychotic or 
severely aphasic, and those with severe perceptual problems 
(unable to produce a scorable Bender) were excluded 
from the study. Fifty-one males and 39 females 
participated. Mean age was 63.8 years (range = 53-72). 
Mean educational level was 11.2 years (range = 6-13). 
Age, educational level, and sexual composition were 
significantly different across the three groups. 


Procedure 


Each patient was administered the Bender-Gestalt, 
using plain paper, then using Cantor’s perceptual 
background interference procedure. The patients were 
asked to copy the designs on special paper, according to 
Cantor’s method, and the performance was transferred, 
via carbon paper, to a blank sheet to aid in scoring. 
During the verbal portion of the task, the subjects were 
asked to name all the animals they could within 30 sec 
and then repeat the task while a tape, consisting of the 
examiner naming tools, was presented at a constant 
volume (approximately 65 dB[A]). In addition, to 
check the possibility of cross-modal interference (i.e., 
verbal interference affecting perceptual performance), 
each patient was readministered the Bender-Gestalt 
under conditions of verbal background interference 
(tape consisting of the names of tools). To investigate 
the possibility of perceptual interference affecting verbal 
performance, each patient was asked to repeat the word- 
naming task while fixating a series of flashing lights. The 
presentations were randomized to control for any prac- 
tice effect. In accordance with Cantor’s technique, non- 
related tasks such as sections of the Wechsler Memory 
Scale were presented between the experimental tasks to 
break the patient’s “set.” The Bender-Gestalt tasks 
were coded by a colleague, so that the experimenter 
knew nothing of the presence or locus of brain damage. 
The records were scored by the examiner using Cantor’s 
modification of the Pascal-Suttell method (scoring 
reliability r = .82). The measure of change used was 
Cantor’s: a difference score that results from the sub- 
traction of the standard score from the interference 
score, with a positive score reflecting decrement in 
performance. 

For the verbal task, a difference score reflecting the 
difference between the initial verbalizations and those 
occurring with interference was recorded. A positive 
difference score reflects a decrement, whereas a negative 
difference score reflects an increment. 
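
The two change measures described above can be written out as a short Python sketch. It is an illustration under stated assumptions, not the study's scoring code: the function names are hypothetical, the Bender-Gestalt scores are assumed to be error-type scores (higher means worse, as in Pascal-Suttell scoring), and the verbal measure is assumed to be a count of animals named, so the subtraction runs in the opposite direction to keep a positive value meaning a decrement.

```python
def bender_difference(standard_score, interference_score):
    """Cantor's difference score: interference score minus standard score.

    Assumes error-type scoring, so a positive result reflects a decrement.
    """
    return interference_score - standard_score


def verbal_difference(initial_count, interference_count):
    """Words named without interference minus words named with interference.

    Assumed direction: a positive result reflects a decrement, a negative
    result an increment, matching the sign convention stated in the text.
    """
    return initial_count - interference_count


def describe(difference):
    """Translate the sign of a difference score into the paper's terms."""
    if difference > 0:
        return "decrement"
    if difference < 0:
        return "increment"
    return "no change"
```

For example, a hypothetical patient who names 12 animals in the baseline trial and 8 with the tape playing has a verbal difference score of 4, a decrement.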


Results 


A two-way analysis of variance of the effects 
of verbal and perceptual interference across the 
three groups on the verbal (word-naming) task 
was performed. The significant main effect 
among groups indicated that the neurological 
groups differed significantly on decrement in 
verbal performance with background interference, 
F(2, 87) = 13.66, p < .01. There was a 
significant main effect for type of interference, 
F(2, 87) = 28.91, p < .01. Verbal interference 
had a greater effect than perceptual interference, 
as expected. 

Table 1 presents the means for the difference 
scores of the three groups under the two conditions 
of interference, and Table 2 presents the 
results of the t tests between pairs of means. 
Table 2 indicates that neither perceptual nor 
verbal interference caused a significant differential 
decrement in the verbal performance of the 
non-brain-damaged controls; 
however, verbal interference within the verbal 
task significantly impaired the performance of 
both brain-damaged groups, with the left-sided 
group being significantly more impaired in 
their performance than the right-sided group. 
The presentation of perceptual interference 
during the verbal task did not produce differences 
in performance among the three groups. 

The results of a two-way analysis of variance 
on the effects of verbal and perceptual interference 
across the same three groups on the perceptual 
(Bender-Gestalt) task indicated that 
the neurological groups differed significantly 
on the variable of perceptual decrement with 
background interference, F(2, 87) = 26.56, 
p < .01. There was a significant main effect for 
type of interference, F(2, 87) = 22.74, p < .01. 
Perceptual interference had a greater effect 
than verbal interference. A significant interaction 
effect indicated that the magnitude and 
direction of the effects of background interference 
differed for different neurological groups, 
F(2, 87) = 22.74, p < .01. 


Table 1 
Verbal Difference Score Means Between 
Groups and Interference Mode 

                                 Interference mode
Neurological group               Verbal    Perceptual
Non-brain-damaged controls         .53        .63
Right                             1.80        .17
Left                              3.97        .80


Table 3 
Perceptual Difference Scores Between Groups 
and Interference Mode 

                                 Interference mode
Neurological group               Verbal    Perceptual
Non-brain-damaged controls         .57        .53
Right                             2.07      20.63
Left                              1.50       8.30

Table 3 presents the means of the difference 
scores for the three groups under the two conditions 
of interference. Table 4 presents the 
results of the t tests between pairs of means. 
Table 4 indicates that neither perceptual nor 
verbal interference caused a significant differential 
decrement in the perceptual performance 
of the non-brain-damaged controls. As predicted, 
perceptual interference during the perceptual 
performance had a significant negative 
effect on both brain-damaged groups, with the 
right-sided group doing significantly poorer 
than the left-sided group. The presentation of 
verbal interference during the perceptual task 
did not significantly differentiate among the 
performance for the three groups. 


Table 2 
Summary Table of t Tests for Differences 
Between Means on Verbal Task 

Groups compared                                  t
NBD (verbal) vs. NBD (perceptual)              1.62
Right (verbal) vs. right (perceptual)          3.41*
Left (verbal) vs. left (perceptual)            4.30*
NBD (verbal) vs. right (verbal)                2.74*
NBD (verbal) vs. left (verbal)                 3.97*
Right (verbal) vs. left (verbal)               2.82*
NBD (perceptual) vs. right (perceptual)         .67
NBD (perceptual) vs. left (perceptual)         1.21
Right (perceptual) vs. left (perceptual)        .85

Note. NBD = non-brain-damaged controls; parentheses 
indicate the type of interference. df = 58. 
*p < .01. 


Table 4 
Summary Table of t Tests for Differences 
Between Means on Perceptual Task 

Groups compared                                  t
NBD (verbal) vs. NBD (perceptual)              [illegible]
Right (verbal) vs. right (perceptual)          5.28*
Left (verbal) vs. left (perceptual)            4.39*
NBD (verbal) vs. right (verbal)                1.76
NBD (verbal) vs. left (verbal)                 1.43
Right (verbal) vs. left (verbal)               1.01
NBD (perceptual) vs. right (perceptual)        6.72*
NBD (perceptual) vs. left (perceptual)         4.13*
Right (perceptual) vs. left (perceptual)       3.83*

Note. NBD = non-brain-damaged controls; parentheses 
indicate the type of interference. df = 58. 
*p < .01. 


Discussion 


All predictions were upheld. It has been 
shown that the decrement in performance in the 
perceptual area with perceptual interference 
was significantly related to brain damage 
within either hemisphere, as was verbal decrement 
with verbal interference. Hence, interference 
within either mode significantly 
reduced the performance of brain-damaged as 


compared with non-brain-damaged subjects 
regardless of the site of the lesion. 

Interhemispheric differences within the 
verbal mode were in the predicted direction, 
with the decrement for the left-hemisphere 
group being significantly greater than for the 
right-hemisphere group. Similarly, in the 
perceptual mode the decrement for the right- 
hemisphere group was significantly greater than 
for the left-hemisphere group. No significant 
cross-modal effects were found, indicating that 
perceptual interference had little effect on 
verbal performance, nor did verbal interference 
significantly affect perceptual performance. 

Relating the results specifically to the lateral- 
ity hypothesis and the nonlocalization theory, 
the results clearly support both views. Strauss 
and Lehtinen (1947) hypothesized that the 
effects of brain damage tend to be general and 
diffuse and tend to be related more to the dis- 
inhibiting effect of brain damage on the lower 
centers of the brain, which are related specifi- 
cally to distractibility. This view was given 
support in both the verbal and perceptual areas 
tested. It was shown that for the sample tested, 
brain damage on either side caused a greater 
decrement in verbal functioning with verbal 
interference and in perceptual functioning with 
perceptual interference than for the non-brain- 
damaged controls. There were, however, find- 
ings that did argue for the laterality hypothe- 
sis; that is, the left-hemisphere-damaged group 
was significantly more distractible than the 
right-hemisphere-damaged group on a verbal 
task with verbal interference and vice versa 
for the effects of perceptual interference on a 
perceptual task. 

In summary, the findings generally support 
the laterality hypothesis, that is, left hemi- 
sphere dominant for verbal functioning and 
right hemisphere dominant for visual-spatial 
functioning. The results further indicate that 
brain damage has both a general and a specific 
effect. Distractibility increases with damage 
to the brain, but verbal and perceptual distractibility 
have a differentially greater effect 
on the damaged left and right hemispheres, 
respectively. In terms of treatment and re- 
habilitation, this study supports the rationale 
for a structured, controlled, and limited stimu- 
lation program for any patient having suffered 
a cardiovascular accident. An additional im- 
plication is that in setting up an ideal treatment 
environment, one must consider not only the 
level of potentially distracting environmental 
stimulation but also the type of stimulation 
with reference to the site of the lesion. 


The Enigma of Androgyny: Differential Implications 
for Males and Females? 


A program of studies (N = 1,404) tested the hypothesis that psychological 
androgyny (i.e., a balance of masculine and feminine characteristics) permits 
greater behavioral flexibility and consequently leads to better adjustment. A 
variety of methods were used to compare androgynous with sex-typed and 
opposite sex-typed individuals along several attitudinal, personality, and behavioral 
dimensions. Contrary to expectation a pattern of findings replicated across 
measures of attitudes toward women’s issues, gender identification, neurosis, 
introversion-extraversion, locus of control, self-esteem, problems with alcohol, 
creativity, political awareness, confidence in one’s own ability, helplessness, and 
sexual maturity indicated that flexibility and adjustment were generally associated 
with masculinity rather than androgyny for both males and females. A 
subsequent experiment further revealed that feminine subjects, independent of 
gender, would prefer to become more masculine were that possible. These results 
are interpreted as suggesting an alternative to Bem’s theory of androgyny. 
Additional analyses indicated few differences between the additive and the 
original definitions of androgyny. 


A common assertion in the feminist movement 
and in recent psychological theory is that 
traditional sex roles are confining and that new 
roles for women will result in more rewarding 
options for men as well. In this context, the 
present research tested a wide variety of 
adaptability-adjustment hypotheses, derived from 
androgyny theory (Bem, 1974). 

In an important series of articles, Sandra 
Bem (1974, 1975) challenged the assumption 
frequently found in the literature that persons 
who adopt a conventional masculine or feminine 
role are somehow “healthier.” Bem argued 
that internalizing a culturally imposed, 
“appropriate” sex role may inhibit the development 
of a full and satisfying behavioral repertoire. 
By contrast, the androgynous individual 
who identifies with both desirable masculine 
and desirable feminine characteristics is free 
from such stereotypic sex role limitations and is 
able to more comfortably and effectively engage 
in both “masculine” and “feminine” behaviors 
across a variety of social situations. 
Thus, the concept of androgyny denotes a person 
who is flexible, socially competent, able to 
respond to shifting situational demands, and 
more complete and actualized in the sense of 
developing and maximizing personal potential. 


Preliminary reports of this article were presented at 
the meeting of the Southwestern Psychological Association, 
Albuquerque, New Mexico, April-May, 1976. 

The authors gratefully acknowledge the assistance of 
Ken Slade, Shasta Mead, Cathy Ingle, and Shirlynn 
Nichol. 

Requests for reprints should be sent to Warren H. 

Initial validation studies using the Bem Sex 
Role Inventory (BSRI; Bem, 1974) generally 
revealed the behavior of androgynous persons 
to be less stereotyped or constrained by conventional 
sex role standards. Given a choice of 
activities in which they believed they would be 
photographed, such persons were less likely 
than sex-typed individuals to prefer gender-appropriate 
activities, and if required to publicly 
perform a gender-inappropriate behavior, 
they experienced less discomfort (Bem & Lenney, 
1976). Androgynous women were less dependent 
and conforming in a standard conformity 
experiment, and androgynous men 
were more likely than their sex-typed counterparts 
to display “feminine playfulness” with a 
small kitten (Bem, 1975). Masculine men were 
less likely to approach and play with a human 
baby and showed the least sympathy for the 
personal problems of another person. Although 
feminine women showed the most sympathy, 
they were expressively deficient with the kitten, 
whereas androgynous subjects were able to 
perform both functions adequately (Bem, 
Martyna, & Watson, 1976). 

Bem’s theory suggested a wealth of new hy- 
potheses concerning the greater flexibility, 
adaptability, social competence, and psychological 
health of androgynous individuals, and 
several pertinent empirical studies using the 
BSRI have now been reported. A review of 
these studies, however, raises a number of 
questions regarding (a) differential patterns 
of results concerning the effects of androgyny 
for males and females, (b) the generalizability 
of current demonstrations of the flexibility of 
androgynous individuals, and (c) the validity 
of the BSRI. 

The first issue concerns the idea that less 
traditional roles will be equally rewarding for 
both men and women. Although a few inconsistencies 
have been reported, the presently 
available evidence suggests that androgynous 
females are less conventional and less constrained 
by sex role identification than their 
[...] counterparts. For example, androgynous 
females as compared to feminine females 
have been found to be less traditional, inhibited, 
or restrained regarding occupational and 
educational objectives, marital and childbearing 
[...] behavior and attitudes, 
[...] to discuss menstrual problems, and 
[...] family orientations (Allgeier, 
[...]; Allgeier, Note 1; Kamens & Liss-Levinson, 
Note 2; Brooks & Birk, Note 3; Chernovetz, 
Hansson, & Jones, Note 4). The data for 
male subjects in these studies, however, are 
[...], with most studies reporting 
no differences, or in some instances, results 
contrary to expectation. 

A second major issue concerns the generalizability 
of currently available demonstrations 
of the adaptability of androgynous individuals. 
The early validation studies were creative, 
but they did not adequately test a range of 
[...] sufficient to justify the conclusion 
that androgynous persons are behaviorally 
and emotionally more adaptable. Indeed, 
much of the presently available data do not 
directly address the issues of mental health and 
social competence. Moreover, in one study, an- 
drogynous subjects yielded not only higher 
self-esteem and lower general maladjustment 
and psychosis scores but also higher scores on 
a neurosis scale (Nevill, Note 5). Also, the 
studies of investigators other than Bem con- 
tain considerable methodological variation, 
thereby rendering generalization difficult. For 
example, some studies have homogenized the 
subject sample, whereas others have used all 
available subjects. Some have failed to include 
opposite sex-typed subjects or males in the de- 
sign, whereas others have included them, and 
still others have combined males and females, 
or opposite sex-typed individuals with conventionally 
sex-typed individuals. 

Finally, several questions concerning the 
validity of the BSRI as a measure of psycho- 
logical androgyny have been raised. For ex- 
ample, low correlations between sex of subject 
and endorsement of specific items of the mascu- 
linity and femininity subscales (Rothman & 
Bryson, Note 6), possible misuse of the t statistic, 
and a possibly overlooked significant difference 
concerning the desirability of masculine 
traits for females between male and female 
judges in the initial scaling sample have been 
reported (Strahan, 1975). Furthermore, three 
factor analyses of BSRI responses have been 
reported that show contradictory results 
(Wakefield, Sasek, Friedman, & Bowden, 
1976; Rothman & Bryson, Note 6; Gaudreau, 
Note 7), and the one study that attempted to 
replicate Bem’s original scaling procedures with 
a new sample generally failed to do so (Ed- 
wards & Ashworth, Note 8). 

Of particular concern has been the criticism 
of Spence and her colleagues (Spence, Helm- 
reich, & Stapp, 1975) regarding the inability of 
Bem’s original subtractive scoring method to 
detect subjects scoring low on both masculin- 
ity and femininity. The major implication is 
that although such persons might be androgy- 
nous as a result of their equivalent masculinity 
and femininity subscores, they are probably 
less well adjusted due to a negative self-concept
reflected in such lower scores. After an empiri- 
cal examination of this criticism, Bem (1977) 
acknowledged that on some variables the performance
of undifferentiated subjects (i.e.,
those whose masculine and feminine scores both
were below the observed median) indicated a
poorer sense of self-esteem than did that of the
androgynous subjects (whose masculinity and 
femininity scores both were above the median). 
Such differences, she concluded, justified re- 
stricting the designation androgynous to per- 
sons whose masculine and feminine scores did 
not differ significantly, and for whom both 
scores were above the observed median. 

In response to these considerations, we con- 
ducted a series of investigations designed to as- 
sess the implications of psychological androg- 
yny as measured by the BSRI for individual 
adaptability, adjustment, and competence and 
to further explore the apparently inconsistent 
findings concerning the implications of an- 
drogyny for males and females. These objec- 
tives were pursued by testing 16 hypotheses 
across five areas of psychological functioning: 
(a) feminist ideology and gender identification, 
(b) personality and adjustment, (c) intellec- 
tual competence, (d) helplessness, and (e) 
sexual maturity and heterosexuality. In addi- 
tion, the question of undifferentiated subjects 
was examined by comparing, on selected vari- 
ables, the male and female androgynous sub- 
jects as defined by Bem’s method versus the 
Spence et al. method. 


Feminist Ideology and Gender Identification 


The first major focus concerned the relation- 
ship between sex role orientation as measured 
with the BSRI and identification with the fem- 
inist movement—an attempt to locate the 
construct of androgyny in the context of contemporary
and relevant social phenomena.
Contrary to expectation, BSRI scores have 
failed to correlate with two separate measures 
of attitudes toward women (Zeldow, 1976; 
Kamens & Liss-Levinson, Note 2). It is rea- 
sonable to assume, however, that individuals, 
particularly females, who conceive of them- 
selves in a less traditional manner would be 
more favorably disposed toward a social move- 
ment attempting redefinition of appropriate 
sex role behavior. Another, and perhaps less 
obtrusive, index of feminist orientation could 
be obtained from the frequency and type of
gender reference used in self-descriptions, since
two common themes of the feminist movement
have been consciousness raising and challenges
to manifestations of sexism in the language
used to describe the sexes. Thus, it was
hypothesized that subjects with less conventional
sex role orientations would be (a) more sympathetic
to the goals and objectives of women's
liberation; (b) more inclined to refer to gender
in self-descriptions, particularly females;
but less likely to use a diminutive or denigrating
form of gender reference such as girl versus
woman or boy versus man.


Personality and Adjustment 




A second area of inquiry involved the relationship
between androgyny and other personality
constructs concerned with adaptability,
coping strategies, adjustment, and
self-direction. The major implication of Bem's
concept of sex roles is the notion that androgynous
individuals are healthier and more adaptable
as a consequence of their greater behavioral
flexibility. Because only limited and somewhat
contradictory data have been produced
to date to support this assumption, five
personality-adjustment dimensions were selected to
assess Bem’s androgyny equals mental health
hypothesis including neurosis, introversion-extraversion,
locus of control, self-esteem, and
problem drinking.

Neurosis is generally viewed as a dispositional
index reflecting tendencies toward anxiety,
avoidance, moodiness, ineffective and
self-defeating coping strategies, poor social
skills, and a negative self-image. Within the
context of neurosis, Eysenck and Rachman
(1965) defined the construct introversion-extraversion
as a continuum ranging from vulnerability
to anxiety, apathy, irritability, depression,
contemplation, and aloofness to
hypochondriasis, sex problems, and optimism.
Locus of control refers to the expectation that
one’s behavior will lead to desirable goals and
reinforcement. Individuals might perceive the
outcome of a behavioral event as under


their environment, less effective at cognitive
processing and academic achievement, and less
independent and self-reliant (Phares, 1976).
Self-esteem denotes an evaluation of one’s own 
adequacy and worth. Low levels of self-esteem 
have been related to a variety of adjustment- 
related variables including anxiety, unhappi- 
ness, neuroticism, drug usage and alcoholism, 
lack of confidence, susceptibility to external 
influence, and so forth (e.g., Coopersmith,
1967; Wylie, 1965). Even though the implica- 
tions of problem drinking are straightforward, 
it is important also to note that some theories 
of problem drinking and alcoholism suggest the 
involvement of exaggerated sex role anxiety. 
Thus, male problem drinkers appear to derive 
a sense of power and masculinity from intoxi- 
cation, whereas female problem drinkers gain 
a greater sense of feminine adequacy (e.g., 
McClelland, Davis, Kalin, & Wanner, 1972;
Wilsnack, 1973a, 1973b). 
If, as we assume, androgyny implies adapt- 
ability along a variety of dimensions and in 
numerous settings, then it would be expected
that androgynous subjects would yield a pat- 
tern of responses on these variables reflecting a 
strong sense of self-worth, inner-direction, 
social competence, and freedom from path- 
ology. Specifically, it was predicted that an- 
drogynous individuals would be (a) less 
neurotic; (b) more extraverted; (c) more internal;
(d) higher in self-esteem; and (e) with
fewer alcohol-related problems.


Intellectual Competence 


Less conventional sex role orientations have
been associated with superior intellectual
functioning and competence, suggesting indirect
support for Bem’s adaptability hypotheses,
although these differences seem more important
for females than for males (Maccoby, 1966;
Maccoby & Jacklin, 1974). A more direct test
of the implications of androgyny for intellectual
functioning might involve relevant applications
of intellectual capacity, for example,
creative problem solving or a sophisticated
level of political awareness. Creativity is defined
as the production of original but appropriate
solutions to a problem and has been
found to correlate with a preference for cognitive
complexity, independence, intuitiveness,
and self-acceptance, although typically not
intelligence (Butcher, 1968). Political awareness
has been found to predict a wide variety
of important politically related behaviors and 
predispositions, for example, ideology, political 
participation and commitment, political 
apathy and disenchantment, and so forth (e.g., 
Campbell, Converse, Miller, & Stokes, 1964; 
Robinson, 1967). 

Recent research has suggested still another 
related influence on the probability of success- 
fully accomplishing one’s goals in our culture, 
that is, an individual’s self-perception of com- 
petence in endeavors requiring assertiveness, 
concentration, and skill preparation. One im- 
portant recent study in this area found women 
less willing to trust their own capacities when 
given an opportunity to engage in either a 
game of skill or a game of chance (luck) to 
acquire a valued prize (Deaux, White, & Far- 
ris, 1975). Although this finding seems gener- 
ally attributable to the dominant sex role 
orientation of most males and females, passiv- 
ity and lack of confidence in one’s own skill 
might be viewed as an underlying character- 
istic of individuals socialized as feminine, inde- 
pendent of gender. An analysis of such data 
that included the relative influence of sex and 
sex-type would more completely address the 
question of the capacity of feminine persons of 
either sex to acquire sufficient aspiration, as- 
sertiveness, self-preparedness, and confidence 
for entry into the competitive atmosphere of 
employment and professional endeavor. 

Specific hypotheses derived from Bem’s the- 
ory were that when compared to sex-typed per- 
sons, androgynous subjects would be (a) more 
creative, (b) more politically aware, and (c) 
less likely to prefer luck over skill in an effort
to acquire a goal.


Helplessness 


Learned helplessness is defined as a decre- 
ment or interference in instrumental respond- 
ing due to inescapable and uncontrollable 
aversive events and has been suggested as a 
model of human depression (Seligman, 1975). 
Since the consequent behaviors of helplessness 
bear resemblance to the vulnerability, con- 
formity, lack of assertiveness, and behavioral 
limitations attributed to sex-typed females, 
sex role orientation might be expected to mediate
susceptibility to the helplessness effect.
More specifically, it was hypothesized that
helplessness would vary as a function of the 
degree of femininity of the subject, with femi- 
nine females and opposite sex-typed males 
showing the greatest vulnerability, whereas 
androgynous subjects regardless of sex and 
sex-typed males would not become as helpless. 


Sexual Maturity and Heterosexuality


One consequence commonly attributed to 
unconventional sex role orientation is retarded 
heterosexual involvement and development. 
Recent studies by Allgeier (1975; Allgeier, 
Note 1) indicate instead that androgynous fe- 
males may be in fact heterosexually precocious, 
whereas contradictory results were reported 
for androgynous males. To further clarify the 
relationship between sex type and behaviors 
related to heterosexual adjustment, subjects 
completed an inventory requesting retrospec- 
tive data concerning their adolescent sexual 
and dating experiences, as well as interpersonal 
feelings. It was hypothesized that androgynous, 
as compared to sex-typed subjects, would report
(a) more intimate heterosexual involvement;
(b) fewer heterosexually inhibiting feelings,
for example, shyness; (c) greater knowledge
and awareness of sex and reproduction;
and (d) fewer parental restrictions in the area
of sexual behavior.


Method 
Subjects 


Eight separate samples (total N = 1,404) of general
psychology college students served as subjects in
the several studies to be reported, in exchange for
nominal course credit.


Procedure 


Students in introductory psychology classes routinely 
completed the BSRI. Subjects were then recruited from 
the resulting subject pool, and each sample was inde- 
pendent except as indicated below. 


Bem Sex-Role Inventory 


The BSRI is a self-report scale designed to measure 
the extent of an individual’s identification with desirable 
masculine and feminine traits (Bem, 1974). It contains 
60 personality characteristics, previously scaled as being 
desirable traits for males (masculine items), for females
(feminine items), or desirable for both males and females
(neutral items). Subjects are asked to indicate
on a 7-point scale the extent to which each characteristic
is “true of them.” The BSRI was originally scored
by calculating the t ratio of scores in response to masculine
items versus responses to feminine items. In this
manner, both males and females have been classified as
sex typed (feminine females and masculine males), androgynous,
or opposite sex typed (masculine females
and feminine males).
Bem (1974) initially identified five levels of sex typing
using the BSRI: feminine (t > 2.025, p < .05), near
feminine (1 < t < 2.025), androgynous (−1 < t < 1),
near masculine (−2.025 < t < −1), and masculine
(t < −2.025, p < .05). For the present series of studies,
three levels of classification were used: sex typed
(|t| > 1.0, in the direction of the subject’s sex), androgynous
(|t| < 1.0, reflecting a relatively equal endorsement
of masculine and feminine characteristics), and opposite
sex typed (|t| > 1.0, in the direction opposite
the subject’s sex). Subjects were thus classified as
masculine males (MM), androgynous males (AM),
feminine males (FM), feminine females (FF), androgynous
females (AF), and masculine females (MF). This
system of classification reflects a decision not to discard
the large number of subjects who fall in the “near” sex-typed
and “near” opposite-sex-typed categories. Moreover,
analyses that include the entire population distribution
of sex typing and not just comparisons of
more extreme cases provide a more rigorous test of the
construct and would be expected to enhance the generalizability
of findings.
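
The three-level rule described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' scoring procedure: the function names `t_ratio` and `classify` are hypothetical, the pooled-variance Student's t is assumed (it is at least consistent with the 2.025 cutoff for two 20-item subscales, df = 38), and the treatment of a t ratio exactly equal to the cutoff is arbitrary because the text does not specify it.

```python
from math import sqrt
from statistics import mean, variance


def t_ratio(masc_ratings, fem_ratings):
    """Pooled-variance t for femininity minus masculinity item ratings.

    Positive values lean feminine, negative values lean masculine,
    matching the sign convention of the cutoffs quoted in the text.
    """
    n_m, n_f = len(masc_ratings), len(fem_ratings)
    pooled = ((n_m - 1) * variance(masc_ratings)
              + (n_f - 1) * variance(fem_ratings)) / (n_m + n_f - 2)
    if pooled == 0:
        raise ValueError("t ratio is undefined when both subscales have zero variance")
    return (mean(fem_ratings) - mean(masc_ratings)) / sqrt(pooled * (1 / n_m + 1 / n_f))


def classify(t, sex, cutoff=1.0):
    """Three-level classification: sex typed, androgynous, opposite sex typed.

    Assumption: |t| exactly equal to the cutoff is treated as typed.
    """
    if abs(t) < cutoff:
        return "androgynous"
    leaning = "feminine" if t > 0 else "masculine"
    own = "feminine" if sex == "female" else "masculine"
    return "sex typed" if leaning == own else "opposite sex typed"
```

For example, a male respondent whose masculine item ratings are clearly higher than his feminine item ratings yields a large negative t and would be labeled sex typed (MM in the notation used here).
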
Using this method of classification (1 > t > −1), Bem
reported that about 50% of her samples were conventionally
sex typed, 35% were androgynous, and 15%
were opposite sex typed (Bem, 1975). Similarly, for all
samples reported below, 51.3% of the subjects were
conventionally sex typed, 34.1% were androgynous, and
14.6% were opposite sex typed, indicating the comparability
of the present subjects to Bem’s samples.
As previously indicated, a controversy has arisen in
the literature regarding the most appropriate procedure
to be used in operationalizing the construct of androgyny.
Bem’s subtractive technique has been criticized
for its inability to detect undifferentiated subjects, who,
contrary to the original formulation, appear to be deficient
in self-esteem. As an alternative, Spence et al.
(1975) have suggested an additive procedure in which
a fourfold classification is generated based on the observed
subscore medians. Thus subjects are classified as
androgynous (masculine and feminine scores above the
medians); sex typed or opposite sex typed (one score
above, the other below the median); and undifferentiated
(both scores below the median). Bem (1977) has
conceded the problem that is created by including undifferentiated
subjects in the androgyny category, but
she prefers the use of multiple linear regression techniques
as a solution.
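
The additive (median-split) procedure attributed to Spence et al. can be illustrated with the following sketch. The function name `median_split` and the label strings are hypothetical, and the handling of scores that fall exactly on a median (treated here as "not above") is an assumption, since the text does not say how ties were resolved.

```python
from statistics import median


def median_split(masc_scores, fem_scores):
    """Fourfold classification from the observed subscore medians.

    masc_scores and fem_scores are parallel lists, one entry per subject.
    Assumption: a score equal to the median counts as below it.
    """
    m_med = median(masc_scores)
    f_med = median(fem_scores)
    labels = []
    for m, f in zip(masc_scores, fem_scores):
        high_m, high_f = m > m_med, f > f_med
        if high_m and high_f:
            labels.append("androgynous")
        elif high_m:
            labels.append("masculine")
        elif high_f:
            labels.append("feminine")
        else:
            labels.append("undifferentiated")
    return labels
```

Whether a "masculine" or "feminine" label counts as sex typed or opposite sex typed then depends on the subject's sex, as in the subtractive scheme.
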
Much of the initial appeal of the construct of androgyny
resided in the simplicity of the central notion
that a balance of desirable sex-typed characteristics
would have liberating effects, reducing the constraints
of convention, thereby permitting interpersonal flexibility.
Since the additive model appears to add a
qualifying dimension of self-esteem to the construct of
androgyny and since the use of regression techniques
abandons the concept of balance altogether, the original
subtractive method was retained in the present series
of studies as the method of classification. However, to
assess the theoretical implications of the additive
model, comparisons were also made between androgynous
and undifferentiated subjects for feminist ideology,
adjustment and personality, and intellectual competence
variables.

Feminist ideology and gender identification. To test
the feminism and identity hypotheses, 155 subjects
completed the Women’s Liberation Ideology Scale
(WLIS; Goldschmidt, Gergen, Quigley, & Gergen,
19), a 12-item scale that measures endorsement of the
“feminist” orientation on several contemporary issues,
for example, abortion, equal pay for equal work, and so
forth. A second sample of 163 subjects completed the
“I Am” test (Kuhn & McPartland, 1954), consisting of
20 sentence completion responses to the statement I
am, from which the frequency and type of sex referents
can be determined. Inclusion of a particular reference,
for example, ethnic group, is considered to indicate the
salience of that reference dimension to the individual
respondent (Robinson & Shaver, 1969).

Personality and adjustment. Subjects (n = 176)
completed two widely used instruments designed to
assess variables theoretically related to the adjustment
and social competence implications of androgyny including
neurosis, extraversion-introversion (Eysenck
& Eysenck, 1963), and locus of control (Rotter, 1966).
Two other samples (ns = 168 and 147, respectively)
completed a problem with alcohol inventory (Manson,

7 ) and a measure of self-esteem (Coopersmith, 1967). 
Intellectual competence. The first sample listed in the
preceding section (n = 176) also was instructed to list as
many current United States senators as they could
in a 10-minute time limit. Questions dealing with
knowledge of political figures have previously been
found to discriminate on more general dimensions of
political knowledge and awareness (Robinson, 1967).

Another sample of subjects (n = 91) completed the
Uses Test, a measure of divergent creativity
(Christensen, Guilford, & Wilson, 1957). Specifically,
subjects were presented with the names of six objects
(e.g., an old tire, a button, etc.) and were instructed to
list uses that the object or a part of the object
might have beyond its typical function. Responses were
then scored by calculating the frequency of each suggestion
for each object and by assigning a value to
each totally unique response generated by each
subject (i.e., not listed by any other subject) and summed
across the six stimulus objects. For the luck versus
skill portion, 191 general psychology students
(92 male and 99 female) previously categorized as sex
typed, androgynous, and opposite sex typed using the
BSRI were given an opportunity to volunteer to participate
in one of two experiments for extra course
credit. They were told that both experiments involved
playing an electronic game and that although all subjects
would earn minimal credit simply for participating,
the total amount of credit earned would be a function
of how well one performed the game. Success in
one game, they were told, was a function of luck; the
other required skill. Subjects then chose the game they
preferred to try. Their choice, to trust their skill versus 
luck, was the dependent variable, and the games were 
not actually conducted. 
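
The uniqueness scoring described for the Uses Test can be sketched as follows. This is only an illustration: the function name `uniqueness_scores`, the nested-dictionary input format, the lowercase normalization of responses, and the award of one point per unique response are all assumptions, since the point value is not stated in the text.

```python
from collections import Counter


def _normalized(uses):
    """Deduplicate one subject's responses for one object (assumed normalization)."""
    return {use.strip().lower() for use in uses}


def uniqueness_scores(responses):
    """Score each subject by the number of responses no other subject gave.

    responses: {subject: {object_name: [suggested uses, ...]}}
    Returns {subject: score}, summed across all stimulus objects.
    Assumption: each totally unique response is worth one point.
    """
    # Count how many subjects suggested each (object, use) pair.
    counts = Counter()
    for by_object in responses.values():
        for obj, uses in by_object.items():
            for use in _normalized(uses):
                counts[(obj, use)] += 1

    # Credit a subject for every pair that appears exactly once overall.
    scores = {}
    for subject, by_object in responses.items():
        scores[subject] = sum(
            1
            for obj, uses in by_object.items()
            for use in _normalized(uses)
            if counts[(obj, use)] == 1
        )
    return scores
```

In practice, deciding whether two differently worded suggestions count as the same use requires human judgment; exact string matching is a simplification.
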

Learned helplessness. Subjects (n = 156) were 
tested individually, following exactly the human learned 
helplessness procedure of Hiroto and Seligman (1975). 
Masculine, androgynous, and feminine males and fe- 
males were assigned randomly to three pretreatment 
conditions in which they were (a) given an insoluble 
concept learning task (helplessness group), (b) given a 
similar but soluble concept learning task (soluble 
group), or (c) were simply shown the stimulus cards 
without completing the task (control group). Helpless- 
ness on a subsequent problem-solving task (a series of 
five-letter anagrams) was then measured by latency to 
criterion, number correct to criterion, and trials to 
criterion. 

Sexual maturity and heterosexuality. Subjects (n 
= 136) completed an inventory requesting retrospec- 
tive self-report data concerning their adolescent sexual 
and dating experiences, as well as feelings and experi- 
ences that might inhibit or facilitate heterosexual activ- 
ity. Item content dealt with such factors as having a 
steady boyfriend-girlfriend prior to age 16; kissing be- 
fore age 16; sexual intercourse before age 18; learning 
to dance prior to age 16; popularity with the opposite 
sex; sensitivity to criticism; shyness; awkwardness; 
embarrassment; sexual curiosity; sexual worry; knowledge
of sex; frequency of dating; relative importance of
dating versus academic performance; parental prohibitions
against dating; and parental admonitions against
pregnancy.


Results 


A summary of means and F tests for feminist
attitudes, personality and adjustment correlates,
intellectual competence variables, and
items from the sexual maturity and heterosexuality
scales is presented in Table 1.
Table 2 contains percentages and chi-square
tests for those variables elicited in a frequency
data format.


Feminist Ideology and Gender Identification


Table 1 shows that as expected, sex type was
related to feminist ideology. Although there
was no main effect of sex type on WLIS scores,
there was a significant interaction between sex
and sex type. Analysis of simple effects (t tests
between sex types within sex groups) revealed
differences among female subjects only, with
MF indicating significantly more favorable
attitudes toward women’s liberation issues than
either AF or FF, and a trend for AF to show
more favorable attitudes than FF (p < .10).




Table 1
Feminist Ideology, Personality and Adjustment, Intellectual Competence, and Sexual
Maturity Variables as a Function of Sex and Sex Type

                                                                                  F ratio
                                                                             Sex Sex type
Variable                   n      MM      AM      FM      FF      AF      MF     (A)     (B)   A × B
WLIS                     155   59.91   51.73   51.44   55.64   59.69   68.11     [?]     [?]     [?]
Locus of control         175    7.92   10.92   12.12   10.32   11.05   10.22     .08   3.31*     [?]
Neurosis                 176    9.80   10.96   14.22   11.17   11.00   11.78     [?]    3.62     [?]
Extraversion             176   13.92   13.92 12.1[?]   11.56   12.85   16.22   4.13*    1.85     [?]
Self-esteem              147   19.52   19.14   16.00   17.32   18.46   19.89     [?]     [?]     [?]
Problem drinking         168    7.56   10.52   11.78    7.07    6.12   10.14     [?]     [?]     [?]
Political awareness      176    6.39    3.41    5.67    2.42    3.68    4.00  6.78**     [?]     [?]
Creativity                91     [?]     [?]    7.00    3.69    6.14    4.00     [?]     [?]   5.79*

Sexual maturity (n = 136)
  Curiosity                     3.65    4.12    4.22    3.80    3.73    3.93     .68     .90     .56
  Popularity                    3.00    2.76    2.78    2.46    2.93    3.80    1.18    2.87     [?]
  Worry                         2.31    2.47    2.56    2.13    2.17    2.40     .81     [?]     [?]
  Shyness                       2.81    2.59    3.22    3.33    2.40    3.13    1.33    2.45     [?]
  Embarrassment                 2.81    3.35    3.44    3.69    2.70    2.06    3.33    1.86     [?]
  Knowledge                     4.15    4.35    4.11    4.00    3.87    3.93    1.83     .06     [?]
  Dating                        2.04    2.31     [?]     [?]    2.38    2.73     [?]    1.47     [?]
  Grades vs. dates              3.19    2.65    2.78    3.05    3.23    3.00     .92     .37     [?]
  Sensitive to criticism        3.35    3.59    4.56    4.22    3.53    3.67     [?]    2.86     [?]
  Awkwardness                   2.32    2.24    2.67     [?]    1.77    2.53     .17    3.26     [?]

Note. Within sex groups, means with different subscripts are significantly different at the .05 level. WLIS
= Women’s Liberation Ideology Scale; MM = masculine males; AM = androgynous males; FM = feminine
males; FF = feminine females; AF = androgynous females; and MF = masculine females.
* p < .05.
** p < .01.


Not surprisingly, all females regardless of sex
type scored significantly higher than males.
Thus being female, and particularly, being a
less traditionally sex-typed female was related
to greater endorsement of contemporary women’s
issues, providing construct validation
for the BSRI.

Somewhat contradictory results were obtained
on the I Am Test, as indicated in Table
2. As a group, females listed sex more frequently
than did males, χ²(1) = 4.19, p < .04,
suggesting greater salience of gender identification
for females. Among females, however,
there were no significant effects of sex type on
how frequently sex was included. For males
there were sex type effects, with both FM (z
= 6.01, p < .01) and AM (z = 3.44, p < .01)
listing gender more often than MM.

Among those subjects listing their sex, females
tended to refer to themselves as a girl
more frequently than males referred to themselves
as a boy, χ²(1) = 3.24, p < .07. This
pattern was unexpected and may suggest an
internalization of at least some types of stereotypic
social attitudes differentiating the sexes
for females.

Despite the contradictions, these data suggest
that gender is currently a more salient
aspect of self-concept for females, which may
reflect the recent politization of female consciousness
by the feminist movement.


Personality and Adjustment 


The data were analyzed in a series of 2 × 3
(Sex × Sex Type) analyses of variance. Table
1 reveals significant sex differences for extraversion
and problem drinking, with males scoring
significantly higher on both measures;
significant sex type differences for locus of control,
neurosis, and problem drinking; and significant
interactions on extraversion and self-esteem.
Analysis of simple effects indicated
that with the exception of extraversion, all of
the personality differences associated with sex
type occurred among male subjects.

Contrary to expectation, AM showed greater
externality of control, more problems with
drinking, and a trend (p < .06) toward greater
introversion than MM. Similarly, FM scored
more external, more neurotic, lower in self-esteem,
and had more alcohol problems when
compared with MM. FM were also lower in
self-esteem and more neurotic (p < .10) than
were the AM. Contrary to prediction, these
results suggest that less sex-typed males experience
more rather than fewer adjustment
problems. For example, the relatively dysfunctional
scores for problem drinking, neurosis,
and self-esteem suggest existing behavioral
difficulties and a poor image of self-worth,
whereas the locus of control and extraversion
scores imply ongoing or future inadequacy in
terms of social competence and self-direction.
It should be noted that the AM scored significantly
in the “less adaptable” direction on only
two of these scales relative to the MM. On the
other hand, in no instance did the AM score
significantly “more adaptable” than the MM.

Among females, analyses of simple effects
indicated that MF were more extraverted than
either AF or FF. No other significant differences
were observed for females, although a
trend suggested that MF had more problems
with alcohol than did AF (p < .07).


Intellectual Competence 


Table 1 shows that although no main effects 
of sex or sex type were observed, significant 
interaction effects were obtained for both poli- 
tical awareness and creativity. AM scored 
lower on political awareness than did MM, and 
there was a similar trend (p < .09) for creativ- 
ity. Likewise, AM scored significantly lower on 
creativity than the FM, whereas FM and MM 
did not differ in these analyses. Thus, AM 
clearly performed more poorly than did either 
sex-typed or opposite-sex-typed males on these 
indices of intellectual competence. 

By contrast, analysis of simple effects among 
female subjects supported the predictions, in 
that both AF and MF scored higher on politi- 
cal awareness than did FF while not differing 
from one another. For creativity, the expecta- 
tion was more directly confirmed, with AF 
yielding higher scores than FF with a similar 
trend (p < .10) in relation to MF. 

Thus, the intellectual variables included in
the present series of studies support Bem’s theory
with respect to females while directly contradicting
it for males. Less traditional sex role
orientations appear to be associated with
greater intellectual preparedness among females,
whereas the pattern among males is
more complex, with both sex-typed and
opposite-sex-typed males manifesting greater
intellectual competence than AM.


Table 2
Frequency Data of Gender Identification, Sexual Maturity, and Luck Versus
Skill as a Function of Sex and Sex Type

                                                                                   χ²
Variable                          MM      AM      FM      FF      AF      MF   Males Females
Gender identification ([?])
  [?]                           15.2    43.5    33.3    44.4     [?]     [?]     [?]     [?]
  Boy-girl                       [?]    10.0    14.3    31.2     [?]     [?]     [?]     [?]
Sexual maturity (136)
  Boyfriend-girlfriend          33.3     [?]    22.2    30.8    43.3     [?]     [?]     [?]
  Kissing                       74.1     [?]    55.6     [?]     [?]     [?]     [?]     [?]
  Intercourse                    [?]     [?]     [?]     [?]     [?]     [?]     [?]     [?]
  Dancing                        [?]    58.8     [?]     [?]     [?]     [?]     [?]     [?]
  Pregnancy warning             70.4    41.2    55.6    43.6    60.0    60.0     [?]     [?]
Luck vs. skill (191)            26.8    22.2    15.4    51.8    35.6    20.0     [?]     [?]

Note. Numbers in parentheses are ns. Percentages for the luck versus skill data represent the
proportion of each group choosing luck rather than skill. MM = masculine males; AM = androgynous
males; FM = feminine males; FF = feminine females; AF = androgynous females; MF = masculine females.
* p < .05.
** p < .01.


Table 3
Means by Sex and Sex Type for the Helplessness Experiment

                                               Males                        Females
Variable and sex type            Helpless  Soluble  Control    Helpless  Soluble  Control
M latency to criterion
  Masculine                         30.65    19.95    30.43       28.80    31.60    22.60
  Androgynous                       43.88    42.86    37.73       40.88    26.08    46.81
  Feminine                          42.33    27.41    27.61       33.35    26.29    35.94
Trials to criterion
  Masculine                          9.20     6.75     8.70       10.00    11.00      [?]
  Androgynous                       12.40    10.29    12.22       12.46     8.67    14.89
  Feminine                          13.33    10.00    10.14       12.23     8.54    11.18
M errors to criterion
  Masculine                          2.70     1.33     2.30        2.20     3.00      [?]
  Androgynous                        6.00     3.86     4.11        5.00     1.92      [?]
  Feminine                           4.67     2.00     1.71        3.23     1.54      [?]

Note. Data for mean latency to criterion are in seconds.

Analysis of the luck versus skill manipulation
replicated the earlier finding of Deaux et al.
(1975), with males more frequently than females
selecting a game of skill rather than a
game of chance to earn extra course credit in
general psychology, χ²(1) = 7.35, p < .01.
However, as indicated in Table 2, further
analysis revealed no sex role interaction among
males and only a trend (p < .08) among females.
Nevertheless, the pattern of percentages
was in the predicted direction for females, with
slightly more than half of the FF selecting the
game of chance rather than the skill task. On
the other hand, only one third and one fifth of
the AF and MF, respectively, chose to rely on
chance rather than their own abilities. These
data then support Deaux et al.’s earlier findings
and suggest that they may be partially
accounted for by the lack of confidence of the
FF group. Furthermore, these findings are
consistent with the literature suggesting timidity,
lack of confidence, self-effacement, and lack
of a sense of self-worth among sex-typed
females.
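
The chi-square tests with 1 degree of freedom reported for the frequency data compare two groups on a two-way choice (for example, sex by luck versus skill). A minimal sketch of the Pearson statistic for such a 2 × 2 table is given below. The function name `chi_square_2x2` is hypothetical, no continuity correction is applied (whether the authors used one is not stated), and any counts passed in are illustrative only, because the underlying cell counts are not reported in the text.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (1 df, no continuity correction) for the table

                 choice 1   choice 2
        group 1      a          b
        group 2      c          d
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be nonzero")
    # Shortcut formula, equivalent to summing (observed - expected)^2 / expected.
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
```

The resulting value would be compared with the critical value of chi-square for 1 degree of freedom (3.84 at the .05 level).
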


Helplessness 


The three dependent variables were analyzed
with a series of 2 × 3 × 3 (Sex × Sex Type
× Helplessness Condition) analyses of variance.
Table 3 presents cell means for each of
these variables. The analyses revealed significant
sex type effects on mean latency of trials
to criterion, F(2, 138) = 3.35, p < .05, and on
mean errors to criterion, F(2, 138) = 4.0,
p < .02. Also, a nearly significant trend
emerged for the helplessness condition on mean
errors to criterion, F(2, 138) = 2.51, p < .10.
No other significant main effects or interaction
terms were observed. Further analyses (i.e.,
planned-comparison t tests between sex types
within sex groups, and collapsing across
helplessness conditions) revealed that the only
significant differences that occurred were between
AM and MM for both mean latency to
criterion, t(50) = 2.29, p < .03, and mean errors
to criterion, t(50) = 2.25, p < .03.

AM were slower than MM in terms of the solving
of anagrams, and they made more errors
before reaching criterion. Thus there were no
sex type differences in susceptibility to the
helplessness manipulation. However, these
data parallel that of political awareness and
creativity, indicating apparent deficits for
AM in the areas of problem solving and
acquired knowledge.


Sexual Maturity and Heterosexuality 


The sexual maturity and heterosexuality 
data were analyzed with 2 X 3 (Sex X Sex 
Type) analyses of variance and 2 X 3 (True-
False X Sex Type) chi-square tests depending 
on the nature of the data. As indicated in 
Table 1, several Sex X Sex Type interactions 
were observed while one main effect for sex 
typing emerged (for awkwardness). Generally, 
the results indicated support for the predictions 
that androgynous as compared to feminine fe- 
males would report more intimate heterosexual 
involvement and fewer feelings of inhibition, 
but they lack confirmation for the expectation 
that androgynous females would indicate 
greater knowledge and awareness of sex and 
fewer parental restrictions. Planned-comparison 
analyses revealed that AF reported being 
less sensitive to criticism, less shy, less awkward, 
and less easily embarrassed than FF. 
Also, MF reported being more popular with the 
opposite sex than AF, with a similar trend toward 
being less easily embarrassed (p < .07), 
but MF also reported being more awkward 
than AF. Moreover, as indicated in Table 2, 
analogous patterns were observed among females 
in the frequency data, with more active 
adolescent sexual and related behavior being 
associated with less traditional sex role orientations. 
More MF than AF reported having a 
boyfriend before age 16 (z = 2.30, p < .02), 
kissing before age 16 (z = 2.31, p < .02), and 
sexual intercourse before age 18 (z = 2.28, 
p < .02). Similarly, more MF than FF reported 
having boyfriends before 16 (z = 4.00, 
p < .01), kissing before 16 (z = 4.31, p < .01), 
intercourse before 18 (z = 4.20, p < .01), and 
a trend in the direction of more frequently 
learning how to dance before 16 (z = 1.70, 
p < .09). AF also were more likely than FF to 
report dancing before 16 (z = 3.20, p < .01), 
kissing before age 16 (z = 1.86, p < .06), and 
sexual intercourse before 18 (z = 1.77, 
p < .07). 
Sex type was a less powerful predictor of the 
sexual maturity and heterosexuality items for 
males, however. Beyond the tendency of FM 
for greater sensitivity to criticism than 
either MM or AM, no significant differences 
were observed on any of these items except for 
a trend of AM to report being more easily 
embarrassed than MM (p < .10). 
Taken together, the male data suggest differences 
only in terms of what might be called 
personal feelings, with traditionally sex-typed 
males reporting fewer personal liabilities 
and limitations that might be expected to inhibit 
or constrain heterosexual involvement. 

Table 4 
Means for Ideal Sex Role Identification 

                            Sex type 
Item           Masculine   Androgynous   Feminine 
Males 
  Masculine      4.63         5.13         5.63 
  Feminine       4.25         4.11         3.98 
Females 
  Masculine      4.46         4.69         5.09 
  Feminine       4.30         4.40         4.47 

Note. Higher means indicate greater desired change. 


By contrast, less traditional females not only 
less frequently reported restraining personal 
characteristics (e.g., shyness), but they also 
reported greater heterosexual involvement. 
Thus, sex role identification appears to be a less 
powerful influence on the heterosexual be- 
havior of males, as was found by Allgeier 
(1975). However, the present analyses also 
suggest that greater heterosexual involvement 
is probably a masculine characteristic inde- 
pendent of gender. 


Ideal Sex Role Identification 


The main purpose of the present series of 
studies was to examine attitudinal, personality, 
and adjustment implications of psychological 
androgyny. Analysis of data from the foregoing 
studies, however, failed to provide support for 
the hypotheses concerning androgynous males. 
One explanation is that Bem’s theory is inac- 
curate, at least for males. A test of this possi- 
bility would be to measure the extent to which 
subjects would change their personality as re- 
flected by the BSRI if that were possible. It 
might be the case that certain sex type groups 
are dissatisfied with their present behavior and 
inclinations. For example, if sex-typed subjects 
yielded significant desired change scores in the 
direction opposite to their gender, this would 
indicate the recognition of the limitations of 
conventional sex role expectations and the pre- 
ference to be more androgynous, thereby in- 
directly supporting Bem’s theory. It is not im- 
plied that subjects would conceptualize such 
change in the language of sex typing, but rather 
that feminine females might prefer to be more 

Table 5 
Analysis of Variance of Ideal Sex Role Identification 

Source of variation      df       SS        MS         F 
Between subjects        134     50.496 
  Sex of subject (A)      1       .185      .185       .535 
  Sex type (B)            2       .429      .214       .618 
  A X B                   2      5.233     2.616      7.561** 
  Error                 129     44.649      .346 
Within subjects         135     56.301 
  Type of item (C)        1     21.112    21.112     89.838**** 
  A X C                   1      2.449     2.449     10.421*** 
  B X C                   2       .382      .191       .813 
  A X B X C               2      2.092     1.046      4.451* 
  Error                 129     30.266      .235 
Total                   269    106.797 

* p = .0135. 
** p = .0008. 
*** p = .0016. 
**** p = .0001. 


assertive and masculine males might desire 
greater tenderness. 

Thus, the heterosexuality sample (n = 136) 
also completed the BSRI for a second time with 
a new set of instructions. For the second ad-
ministration, subjects were asked to indicate 
the extent to which they would prefer to have 
more, less, or remain the same on each of the 
traits and qualities of the BSRI. Subjects in- 
dicated their responses on a 7-point scale, with 
1 = less, 7 = more, and 4 = no change de- 
sired. 
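
The desired-change scale just described can be sketched in code. This is a minimal illustration only: the item labels, the function name, and the choice to average rather than sum the ratings are assumptions made for the example, not details reported for the study.

```python
from statistics import mean

NO_CHANGE = 4  # scale midpoint: "no change desired"


def desired_change_scores(responses, masculine_items, feminine_items):
    """Return mean desired-change ratings for masculine and feminine items.

    responses maps an item label to a rating on the 1-7 scale
    (1 = less, 4 = no change desired, 7 = more).
    """
    for item, rating in responses.items():
        if not 1 <= rating <= 7:
            raise ValueError(f"rating for {item!r} is outside the 1-7 scale")
    return {
        "masculine": mean(responses[i] for i in masculine_items),
        "feminine": mean(responses[i] for i in feminine_items),
    }


# Hypothetical respondent with an abbreviated, illustrative item list.
responses = {"assertive": 6, "decisive": 5, "tender": 4, "gentle": 5}
scores = desired_change_scores(
    responses, ["assertive", "decisive"], ["tender", "gentle"]
)
# A score above NO_CHANGE indicates a desired increase on that item set.
wants_more_masculine = scores["masculine"] > NO_CHANGE
```

Scores above the midpoint of 4 indicate a desired increase and scores below it a desired decrease, which is how the means in Table 4 are read.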

Responses to the desired change instructions 
for the BSRI were summed for masculine and 
feminine items separately for each subject, and 
means (by sex type) are presented in Table 4. 
These scores were then analyzed in a 2 X 3 
X 2 (Sex X Sex Type X Masculine vs. Femi-
nine Items) analysis of variance with repeated 
measures on one factor. As indicated in Table 
5, the strongest effect occurred for the type of 
item, with subjects indicating greater desired 
increases on masculine than feminine items. 
The Sex X Sex Type interaction suggested that 
greater change was desired by opposite-sex-
typed males and by sex-typed females. The Sex 
X Items interaction revealed a tendency for 
males to desire relatively greater increases in 
masculinity and relatively lesser increases in 
femininity than did the female subjects. 
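
As an arithmetic check on Table 5, each F ratio should equal the effect mean square divided by the error mean square of its stratum. The sketch below assumes the mean squares are read from the table as 2.616 and .346 (between subjects) and 21.112, 2.449, 1.046, and .235 (within subjects); the function and variable names are illustrative.

```python
def f_ratio(ms_effect, ms_error):
    """F statistic as the ratio of an effect mean square to its error mean square."""
    return ms_effect / ms_error


# Error mean squares as read from Table 5 (assumed decimal placement).
MS_ERROR_BETWEEN = 0.346
MS_ERROR_WITHIN = 0.235

f_sex_by_sex_type = f_ratio(2.616, MS_ERROR_BETWEEN)   # A X B
f_type_of_item = f_ratio(21.112, MS_ERROR_WITHIN)      # C
f_sex_by_item = f_ratio(2.449, MS_ERROR_WITHIN)        # A X C
f_three_way = f_ratio(1.046, MS_ERROR_WITHIN)          # A X B X C

for label, value in [
    ("A X B", f_sex_by_sex_type),
    ("C", f_type_of_item),
    ("A X C", f_sex_by_item),
    ("A X B X C", f_three_way),
]:
    print(f"{label}: F = {value:.3f}")
```

Under these readings the ratios agree with the tabled F values to three decimals, which supports the decimal placement of the sums of squares and mean squares.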

Analyses of simple effects indicated that AM 
desired greater increases on masculine items 
than did MM, t(40) = 3.25, p < .01; a similar 
difference between FM and MM, t(33) = 4.41, 
p < .01; and that FM desired greater increases 
on masculine traits than did AM, t(23) = 2.20, 
p < .05. For feminine items there were no dif-
ferences among males except for a tendency of 
FM to desire less increase than did the MM, 
t(33) = 1.83, p < .10. Among female sub-
jects, FF desired significantly greater increases 
on the masculine items than did AF, t(67) 
= 3.53, p < .01; FF showed a similar prefer-
ence for increases on the masculine traits 
compared to MF, t(52) = 4.78, p < .01, but 
the difference on masculine items between AF 
and MF was not significant. Also there were no 
significant differences or trends among females 
on desired changes in femininity. 
Thus, two important conclusions can be 
drawn from the ideal sex role identification 
data. First, both males and females indicated a 
strong preference for changes in the direction of 
masculinity as defined by the BSRI. Again, it 
should be noted that such desired change need 
not be, and probably is not, conceptualized by the 
subjects as a desire to become more masculine. 
Rather, subjects indicated a strong desire to 
increase their capacity to behave in an instrumental 
fashion, that is, more assertively, more 
decisively, and so forth. It is interesting that 
the desired change in the direction of masculinity 
does not occur at the expense of the items 
measuring femininity, which might be inte 

indicated, only one comparison produced sig-
nificant differences (male undifferentiated sub-
jects were more external) and two nonsigni-
ficant trends. 


Discussion 


The initial purpose of the present series of 
studies was to test hypotheses derived from 
Bem’s theory of androgyny with respect to 
conventionality, adaptability, social compe- 
tence, and adjustment. Only partial support 
was obtained, however, and in two important 
respects, data patterns systematically con-
tradicted predictions from Bem’s theory. 
First, the androgyny equals adaptability hy- 
pothesis seems not to hold for males. In most 
instances androgynous males scored in the less 
adaptive direction than masculine males, and 
frequently these differences were significant. 
In no case were androgynous males found to be 
significantly more adaptive, flexible, or compe- 
tent than masculine males. Moreover, the 
failure of androgynous males to yield scores 
suggesting more adaptive or effective coping 
ability occurred across a wide variety of per- 
sonality, adjustment, and intellectual vari- 
ables, for example, locus of control, anagram 
solution, alcohol problems, creativity, political 
awareness, and so forth. It seems reasonable to 
conclude that these differences derive from 
basic psychological processes rather than the 
specific tasks and instruments selected for in- 
clusion in these studies. Also, with only two 
exceptions, the dispositional tendencies of 
feminine males appeared to be even less adapt- 
ive. Again, as compared to masculine males, 
feminine male subjects were less secure and 
flexible in numerous areas such as self-esteem, 
problems with alcohol, sensitivity to criticism, 
neurotic conflict, locus of control, and so on. 
Only on the intellectual dimensions of creativ- 
ity and political awareness did feminine males 
manifest socially desirable characteristics. 

Thus, a pattern emerged in which masculine 
males can be described as more competent and 
confident on numerous dimensions, whereas 
less traditionally sex-typed males are generally 
more limited and restricted, less effective and 
more vulnerable to influence, less sure of them- 
selves, and perhaps even less well adjusted. 
Not surprisingly then, when asked to indicate 
their preference for change on BSRI items, 
feminine and androgynous males preferred to 
become more masculine, whereas masculine 
males indicated relatively little desire to 
change. It is also important to note that with 
the exception of intellectual functioning mea- 
sures, the relationship between sex typing and 
adaptability was generally linear among males. 
Second, although greater support for Bem’s 
formulations was obtained with female sub- 
jects, in one important aspect, there was a con- 
tradiction here as well. In support of the theory 
are the findings that androgynous females were 
less conventional, more outgoing, politically 
aware, creative, heterosexually active, and less 
awkward, shy, sensitive to criticism, and so on, 
than were feminine females. However, the in-
clusion of opposite-sex-typed females revealed 
that masculine females are even more feminist 
in their attitudes, more politically aware, more 
extraverted, more popular with the opposite 
sex, more heterosexually involved, and so on. 
Stated more succinctly, the more masculine in 
orientation, the more adaptive, competent, and 
secure the female subject was. Minor exceptions 
to this pattern were detected, such as more 
drinking problems and a greater sense of awk- 
wardness for the masculine females. However, 
the weight of the evidence favors the above 
conclusion. Moreover, when given the oppor- 
tunity to indicate desired change in relation to 
BSRI items, it was the feminine females who 
expressed the greatest desire to change in the 
direction of masculinity, with less change in-
dicated by androgynous females and least by 
masculine females. As was the case with males, 
the less masculine the female, the more desir-
able increased masculinity became. 
It should be noted that the precise manner in 
which the present findings fail to support Bem’s 
theory is somewhat subtle. Bem initially sug-
gested that greater adaptability of the androgy-
nous person can best be detected cross-situa-
tionally. That is, although she did not claim 
that androgynous individuals would always be 
the most adaptable in a single situation (e.g., 
least conforming, most responsive to See 
needs, etc.), she did argue that when several 
situations are taken into consideration, they 
will yield the most adaptive average pattern of 
behavior. As regards the first part of this argu-
ment, the present data suggest that Bem is 


correct, in that only for creativity and 
political awareness among male subjects was an 
androgynous group found to be significantly 
more adaptable or competent than their sex-typed 
or opposite-sex-typed counterparts. 
However, the notion that androgynous subjects 
would yield the most desirable pattern of responses 
across several situations is directly contradicted 
by the present data, in that sex-typed 
males and opposite-sex-typed females, 
with very few exceptions, showed the most 
adaptable and competent pattern of responses. 
The most succinct description of the present 
findings is that the more adaptive, flexible, unconventional, 
and competent patterns of responding 
occurred among more masculine subjects, 
independent of their gender. It is possible, 
therefore, that the various consequences 
generally attributed to sex typing might be 
better conceptualized as a function of masculinity, 
the constellation of traits that the 
BSRI and other inventories define as masculinity. 
In other words, it appears that general 
adaptability varies as a direct linear function of 
a relative mix of traits dominated by such factors 
as assertiveness, decisiveness, and intellectuality, 
as opposed to nurturance, responsivity, 
and emotionality. In one regard this is not 
surprising, since the items that comprise the 
masculinity subscale have the underlying 
commonality of being instrumental in nature, 
that is, the ability to effectively and efficiently 
accomplish objectives. Similarly, most of the 
variables examined involve stereotypically 
masculine endeavors. The social appropriateness 
of such tendencies for males has long 
been recognized in the literature of sex typing 
(e.g., Broverman, Broverman, Clarkson, 
Rosenkrantz, & Vogel, 1970; Maccoby & 
Jacklin, 1974). What was unanticipated was 
that females who completely violated societal 
sex role expectations appear to be happier, 
more competent, and more adaptive than 
either androgynous or sex-typed females. 
Although beyond the scope of the present 
study, the reason for this effect probably derives 
from a contingent relationship between 
the manifestation of instrumental behaviors 
and the application of various social rewards 
such as acceptance, approval, esteem, deference, 
and the like. If this is so, it would explain 
the tendency of subjects to judge the cross-sex 
behavior of males (i.e., feminine behavior) 
more harshly than the cross-sex behavior of 
females (i.e., masculine behavior; e.g., Fein-
man, 1974). It may also have implications for 
many of the sex stereotyping studies (e.g., 
Broverman et al., 1970), in that what is being 
devalued in society is perhaps not female 
gender but, rather, feminine behaviors. 

Thus the important issue becomes not 
whether one has internalized the traits and be-
haviors appropriate to one’s gender but the 
extent to which one has assimilated the tend-
encies most highly valued by society. Some 
theorists (e.g., Bakan, 1966) have proposed a 
continuum ranging from the agentic role (i.e., 
a dominance of instrumentality, rationality, 
strength, assertiveness, masculinity, etc.) to 
what is termed communality (i.e., femininity, 
nurturance, emotionality, expressiveness, etc.). 
In a society that prefers the former to the 
latter, it becomes reasonable to conclude that 
individuals high in agentic tendencies will not 
only be more successful within the context of 
such a society’s values, but such persons will 
feel more confident due to a history of differen-
tial application of social rewards. 

Conceptualizing sex role phenomena in this 
manner raises several intriguing questions concerning 
the role of both men and women who 
are feminine in orientation in a society that prefers 
and rewards masculinity; the appropriateness 
of currently developing clinical techniques 
that attempt the “androgynization” of 
sex-typed males and females (e.g., Kap...); 
and the long-term implications 
... rewards agency, perhaps to 
... detriment of communality. 

Regarding ... of defining androgyny 
according to the additive model, the present 
results are inconclusive. As Bem has noted, not 
all comparisons need to be significant to substantiate 
the problem of undifferentiated subjects. 
However, the weight of evidence in the 
present series of studies suggests that the 
... androgyny 
... changed by inclusion of the undifferentiated 
subjects. 

Moreover, the Spence et al. (1975) method 
raises several conceptual issues that have not 
as yet been fully addressed in the literature. 
For example, the additive model tends to ob-
scure relative differences between masculinity 
and femininity that in the present studies were 
consistently related to important behaviors 
and dispositions for both males and females. 
Similarly, in some instances the additive 
method classifies subjects in a manner that 
appears to be inconsistent with the original 
concept of androgyny (e.g., subjects whose 
masculinity and femininity subscores are not 
significantly different but one is above and the 
other below the medians are classified as sex 
typed or opposite sex typed). Also, the additive 
method defines androgyny in such a way that it 
may be self-esteem and not androgyny that is 
being measured. Thus further research is 
needed to determine both the validity and ef- 
fectiveness of the additive definition of androg- 
yny. In particular, studies are needed that 
compare the subtractive and additive methods 
to variables that on a priori grounds would be 
expected to indicate sex typing or its absence. 


During Training and Posttraining Effects of Live and Taped 
Extended Progressive Relaxation, Self-relaxation, 
and Electromyogram Biofeedback 


Irving Beiman, Eileen Israel, and Stephen A. Johnson 
University of Georgia 


This study compared live and taped progressive relaxation (LR, TR), self-
relaxation (SR), and electromyogram biofeedback (BF) on measures of auto-
nomic and somatic arousal and subjective tension. Male and female respondents 
(N = 40) to an ad for therapy were evaluated in five training sessions and a 
posttraining assessment of self-control. During training, LR was superior to TR 
on reductions in physiological arousal; SR and BF were equivalent except for 
the superiority of SR on reductions in autonomic arousal. After training, LR 
was superior to the other procedures on self-control of autonomic arousal. It was 
concluded that LR is the treatment of choice for a variety of clinical objectives. 


Although progressive relaxation training 
appears to be widely used in the clinical 
setting, research is equivocal regarding the 
physiological effects of the various forms of 
the procedure (Mathews, 1971). Using un- 
selected, nonvolunteer female psychology 
students as subjects, Paul and Trimble (1970) 
found abbreviated live training superior to 
taped training on all physiological systems 
measured. Russell, Sipich, and Knipe (1976) 
also found live training superior to taped 
training for undergraduate females. Consider- 
ing the economy and efficiency potentially 
afforded by taped training, it is important 
to determine the generalizability of these 
results to the clinical setting. Thus, one 
purpose of the present investigation was to 
compare the during training effects of extended 
live progressive relaxation to taped relaxation 
in a clinical population. 

Other forms of relaxation training have also 
been recommended to therapists, including 
electromyogram (EMG) biofeedback (Cox, 
Freundlich, & Meyer, 1975; Reinking & 
Kohl, 1975) and self-relaxation training (Ben-
son, 1976; Benson, Greenwood, & Klemchuk, 
1975). Reinking and Kohl found EMG bio-
feedback superior to taped relaxation instruc-
tions for the facial muscles on reductions in 
forehead muscle tension in unselected psy-
chology students. Using selected clinical 
subjects complaining of tension headaches, 
Cox et al. (1975) found live progressive 
relaxation training and EMG biofeedback 
equally superior to a medication placebo 
group on posttreatment resting frontalis 
EMG. Beary, Benson, and Klemchuk (1974), 
using male and female normal subjects in a 
one group/own control design, found a self-
relaxation procedure to be effective in reducing 
oxygen consumption, carbon dioxide produc-
tion, and respiratory rate relative to control 
periods. 

Since a self-relaxation procedure with a 
therapeutic rationale has not been compared 
to any other of the more commonly used 
relaxation procedures, a second purpose of 
this study was to compare such a procedure 
to EMG biofeedback on during training 
reductions in physiological arousal and subjec-
tive tension. Third, since these four relaxation 
procedures have not previously been compared 
on posttraining control of tension level, a 
third purpose was to evaluate their relative 
effectiveness in teaching clients self-control 
in the reduction of physiological arousal 
and subjective tension. 


Participants were 19 males and 21 females chosen 
from respondents to local newspaper ads that solicited 
people to participate in a psychological study to 
... their tension. On the basis of an initial clinical 
interview, respondents were selected who were free 
of acute illness, acute situational stress that could 
... during the course of therapy, and usage 
of ... medication; had not previously received 
training in any of the treatment procedures to be used; 
... in therapy to deal with tension; and indicated 
that tension was a serious problem for them. Partici-
pants were then ranked according to age (M = 27.1; 
range = 20-54) and were randomly assigned to one of 
the following conditions (n = 10/group): live progres-
sive relaxation (LR); taped progressive relaxation 
(TR); self-relaxation (SR); and EMG biofeedback 
(BF). Female-to-male sex ratios were 5:5 for all groups 
with the exception of BF, which was 6:4. Each subject 
was scheduled for six sessions at the same time, each 
... days apart, with treatment conditions random 
across time of day. As a further control for physiological 
..., each female received her first session at least 
... not more than 2 weeks after the onset of her last 
menstrual cycle. 


Treatment conditions were conducted in two adjoin-
ing soundproof, airconditioned, and electrically shielded 
experimental chambers. All physiological recordings 
were made on a Grass Model 7 polygraph with five 
driver amplifiers. Standard preamplifiers and a 
... electrolyte were used except as described 
below. Heart rate was recorded by gold-plated elec-
trodes attached to the left wrist and leg; respiration 
was recorded by a thermistor placed slightly inside 
the nostril permitting the most air flow; integrated 
... system. A Consol BSR/GSR preamplifier 
with a constant current of 16 µA was used to 
record skin resistance level (SRL) and skin resistance 
response (galvanic skin response; GSR). Beckman 
... chloride electrodes 2 cm² in diameter were 
... volar surface of the left palm and left 
... electrolyte was a saline solution in 
... by Lykken and Venables 
... A GSR was defined as any decrease in resistance 
...% of basal skin response, as recommended 
... (1967). Each subject was seated in a 
... recliner chair. 

For feedback, the output from the EMG driver 
amplifier (integrator time constant = .2 sec) was 
connected to a Narco Limit Indicator (LI-300) per- 
mitting the establishment of an adjustable threshold. 
When the biological signal was below threshold, the 
LI-300 turned on a Mallory Sonalert, which provided 
a 1,000-cps tone. The tone volume was adjusted at the 
beginning of each feedback session by a variable 
resistance potentiometer according to the subject's 
preference. 


Procedure 


Training. For all groups, the first five sessions 
involved training. Upon the subject’s arrival at the 
first session, the polygraph operator introduced himself 
and the therapist (in the LR condition only) and 
presented a general introduction to the study. The 
subject’s particular problem with tension was then 
discussed, and a therapeutic rationale for the treatment 
was presented. The polygraph operator then attached 
the transducers and administered the Anxiety Differ-
ential (Husek & Alexander, 1963). This was followed 
by a 10-min adaptation period, during which the subject 
sat quietly in a semirecumbent position. The last 3 min 
of this period in each session served as the pretreatment 
basal level for all physiological measures. Training 
procedures were then begun ... 
eyes closed for all subjects. The final 3 min of training 
served as the posttreatment assessment for all physio-
logical measures. The posttraining Anxiety Differential 
was then administered, electrodes were detached, and 
the subject was instructed to practice his/her relaxation 
skills once per day. Home practice was discussed at the 
beginning of the second through the fifth sessions, and 
solutions were offered for any difficulties encountered. 

Therapists in the LR and TR conditions were two 
male graduate students in clinical psychology. They 
were trained by the first author in the administration 
of a shortened form of the progressive relaxation 
training described by Bernstein and Borkovec (1973) 
and were experienced with the procedure. Each thera-
pist treated one half the participants in each of the 
relaxation training conditions. Separate tapes were 
recorded by each therapist for each of the five training 
sessions. Taped presentation differed from live presen-
tation only in that progression to the next muscle group 
was not contingent on the client’s report of complete 
relaxation in the current muscle group. The first two 
sessions involved training in 16 muscle groups with 
... sessions involved 4 muscle 
groups with tension release ... 
involved training in relaxation by recall ... 
treatment package in this study progressed over five 
sessions rather than the nine sessions, as described by 
Bernstein and Borkovec. Training was preceded in 
Sessions 1, 3, and 5 by a description of tensing instruc-
tions. The actual administration of training lasted 
approximately 35 minutes in the first two sessions, 
... in the next two, and 5 minutes in the fifth 
session. 
BF training involved binary auditory feedback for 
successively lower amounts of integrated frontalis 
muscle action potential. The threshold was set at the 
beginning of each session so that the tone was on 
approximately 50% of the time. When the tone had 
been continuously on (EMG level below threshold) for 
15 sec, the threshold was lowered approximately 2.5% 
(1-3 µV). If the subject did not meet the 15-sec criterion 
within a 2-min period, the threshold was gradually 
raised until the criterion was met. Thus the procedure 
involved gradually shaping the desired response. 
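
The shaping rule described above can be sketched as a small simulation. This is an illustration under stated assumptions rather than the apparatus logic used in the study: one EMG reading per second, a median-based starting threshold to approximate the 50% tone-on criterion, and a raise step of 0.5% per second after the 2-min limit are all hypothetical choices, as are the function names.

```python
from statistics import median


def initial_threshold(baseline_levels):
    """Assumed rule: the median of baseline readings keeps the tone on about half the time."""
    return median(baseline_levels)


def shape_threshold(emg_levels, threshold, hold_sec=15, timeout_sec=120,
                    lower_frac=0.025, raise_frac=0.005):
    """Apply binary feedback and threshold shaping to one reading per second.

    Returns a list of (tone_on, threshold_in_effect) pairs and the final threshold.
    """
    below_run = 0        # consecutive seconds with EMG below threshold
    since_criterion = 0  # seconds since the 15-sec criterion was last met
    history = []
    for level in emg_levels:
        tone_on = level < threshold          # tone sounds while below threshold
        history.append((tone_on, threshold))
        below_run = below_run + 1 if tone_on else 0
        since_criterion += 1
        if below_run >= hold_sec:
            threshold *= 1 - lower_frac      # criterion met: lower by about 2.5%
            below_run = 0
            since_criterion = 0
        elif since_criterion >= timeout_sec:
            threshold *= 1 + raise_frac      # criterion overdue: raise gradually
    return history, threshold


# Hypothetical readings in microvolts.
start = initial_threshold([2.0, 4.0, 6.0, 8.0])
history, lowered = shape_threshold([4.0] * 15, threshold=5.0)
_, raised = shape_threshold([6.0] * 120, threshold=5.0)
```

In the first call the tone stays on for 15 consecutive seconds, so the threshold drops once; in the second the criterion is never met within 2 min, so the threshold begins to rise.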

In SR training the rationale emphasized the subject’s 
potential to develop control over her/his tension level 
by regular practice of the relaxation response (cf. 
Beary et al., 1974). The client was instructed to relax 
as much as possible but not to go to sleep. As a control 
for time from pretraining to posttraining for BF, SR 
training also lasted 30 minutes for each training 
session. 

Assessment of self-control. After the standard pre- 
training physiological assessment in the sixth session, 
the subject was instructed to relax as much as possible 
using the relaxation skills he/she had developed in the 
previous five training sessions. This self-control period 
lasted 10 minutes, with the final 3 minutes serving as 
the postphysiological assessment. Another Anxiety 
Differential and additional questionnaires (described 
below) were administered, electrodes were detached, 
and the subject was debriefed. 

Assessment of trait anxiety. To assess potential 
changes in trait anxiety, all subjects were administered 
the Trait scale of the State-Trait Anxiety Inventory 
(Spielberger, Gorsuch, & Lushene, 1968) and the 
Multiple Affect Adjective Checklist (MACL; Zucker- 
man & Lubin, 1965) in the initial screening session and 
at the end of the sixth session. 


Results 


After confirming that there were no pre- 
treatment differences between groups on any 
measure in Session 1 (all ps > .20), the data 
for the first five training sessions were analyzed 
apart from the sixth session. The sixth session 
was conceptualized as an assessment of the 
subjects’ ability to reduce their own arousal 
level when asked to do so without the con- 
current aid of any therapeutic procedure. The 
first five sessions afforded an assessment of 
each procedure’s specific potential for accom- 
plishing, during training, in-session reductions 
in physiological and subjective tension. 

The results for the five training sessions 
were analyzed separately: LR was compared 
to TR; and SR was compared to BF. This was 
because the duration of LR and TR was 
reduced systematically across sessions by 
virtue of the reduction from 16 muscle groups 
in the first session to relaxation by recall in 
the fifth session. The duration of training 
for SR and BF was held constant at 30 minutes 
for each of the five sessions. This difference 
between the two pairs of treatments was 
implemented to enhance the external validity 
of results. Progressive relaxation training in 
the clinical setting is reduced in duration as 
training progresses (Bernstein & Borkovec, 
1973), whereas the duration of BF training 
is typically the same for each training session. 

Physiological variables analyzed were electro-
dermal response (GSR frequency/min), heart 
rate (beats/min), respiration rate (cycles/min), 
and muscle tension (mean µV/min). The data 
were quantified by trained raters, with inter-
scorer reliability exceeding .99. After the 
reduced data were keypunched, all statistical 
analyses were performed using SOUPAC 
programs on the IBM 360 computer of the 
University of Georgia. A significance level of 
.05 was adopted for all statistical tests. 


Assessment of Training Effects 


Live versus taped progressive relaxation train-
ing. In-session changes for the five training 
sessions were evaluated by three-way analyses 
of variance (Treatment X Session X Pre-
Post) on each of the five dependent variables. 
These analyses revealed significant pre-post 
main effects for respiration rate, heart rate, 
muscle tension, and the Anxiety Differential, 
Fs(1, 18) = 10.31, 4.82, 13.38, and 44.31, 
respectively. Each of these main effects 
indicated that there was a significant reduction 
from pretraining to posttraining, although this 
was qualified by higher order interactions for 
all measures with the exception of respiration 
rate. 

There were significant Treatment X Pre-
Post interactions for the measures of au-
tonomic arousal (frequency of electrodermal 
response and heart rate), Fs(1, 18) = 4.81 and 
9.93, respectively. Pre-post change averaged 
across sessions for GSR frequency/min was 
-1.45 for LR and +1.94 for TR; and for 
heartbeats/min, -2.54 for LR and ... 
for TR. Because LR led to decreases in both 
GSR frequency and heart rate while TR led to 
increases, the significant interactions for these 
two measures indicate that live relaxation 
training was superior to taped training in 
accomplishing in-session autonomic relaxation. 
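
A Treatment X Pre-Post interaction of this kind amounts to a difference between the two groups' mean pre-to-post changes. The sketch below illustrates that contrast only; the heart-rate values are hypothetical and are not data from the study, and the function names are assumptions made for the example.

```python
from statistics import mean


def mean_change(pre, post):
    """Mean pre-to-post change; negative values indicate a reduction."""
    return mean(after - before for before, after in zip(pre, post))


def interaction_contrast(group_a, group_b):
    """Difference between two groups' mean changes (each group is a (pre, post) pair)."""
    return mean_change(*group_a) - mean_change(*group_b)


# Hypothetical heart-rate readings in beats/min for three subjects per group.
live = ([72, 75, 70], [69, 73, 68])    # (pre, post)
taped = ([71, 74, 73], [72, 75, 75])   # (pre, post)

live_change = mean_change(*live)
taped_change = mean_change(*taped)
contrast = interaction_contrast(live, taped)
```

A negative contrast here means the first group showed the larger reduction; whether such a difference is reliable is what the interaction F test evaluates.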
The analyses revealed a significant Treat-
ment X Session X Pre-Post interaction for 
muscle tension, F(4, 72) = 3.21. The same 
three-way interaction approached significance 
for the measure of subjective tension, the 
Anxiety Differential, F(4, 72) = 2.35, p < .08. 
Figure 1 depicts the mean change in muscle 
tension from pretraining to posttraining for 
each of the five training sessions. Analysis of 
the simple main effects for each session (Kirk, 
1968, p. 222) indicated that in the first session 
TR was superior to LR and that in the second 
and third sessions the treatments were 
equivalent, whereas in the fourth and fifth 
sessions, live training led to greater reductions 
in muscle tension than taped training, Fs(1, 90) 
= 8.07, .11, .20, 4.65, and 4.60, respectively. 
For taped training there was a clear trend 
toward attenuated treatment effects as training 
progressed, whereas reductions in muscle 
tension generally increased across sessions with 
live training. Figure 2 presents the pre-post 
means in each session for the Anxiety Differ-
ential. Post hoc analyses were not performed 
on these data, because the interaction was not 
significant. Descriptively, LR led to greater 
mean reductions in subjective tension than 
TR in all but the fifth session, and LR subjects 
consistently attained a deeper mean level of 
subjective relaxation than TR subjects. 
To summarize, live progressive relaxation
training was superior to taped training in
accomplishing in-session relaxation on three
of the four physiological variables measured.
Live training led to significant decreases in
autonomic arousal (GSR frequency and heart
rate) across sessions, whereas taped training
resulted in increases in these two autonomic
systems. Furthermore, live training was
superior to taped training in reducing muscle
tension, and a similar nonsignificant trend was
noted for subjective tension.

[Figure 1. Mean pre to post change in muscle tension for
each training session. (PRT = progressive relaxation
training.) Vertical axis: mean microvolts/min; horizontal
axis: training sessions; lines: live PRT and taped PRT.]

[Figure 2. Pre/post Anxiety Differential means for each
training session. (PRT = progressive relaxation training.)
Horizontal axis: training sessions; markers: pre and post;
lines: live PRT and taped PRT.]

EMG biofeedback versus self-relaxation training.
In-session changes in the five training
sessions for these two groups were evaluated
by similar analyses of variance (Treatment
X Session X Pre-Post). These analyses indicated
significant pre-post reductions for heart
rate, muscle tension, and the Anxiety Differential,
Fs(1, 18) = 25.26, 7.90, and 14.03,
while the main effect for GSR approached
significance, F(1, 18) = 3.99, p < .06.

Main effects for the two autonomic measures
were qualified by significant interactions. The
Treatment X Pre-Post interaction for heart
rate, F(1, 18) = 4.29, indicated that SR was
superior to BF: Pre-post change averaged
across training sessions in beats/min was
−4.34 for SR and −1.91 for BF. The Treatment
X Sessions X Pre-Post interaction for
frequency of electrodermal response, F(4, 72)
= 4.69, is presented in Figure 3. Analysis of
simple main effects for the change scores in
each training session indicated no differences
(p > .10) between the two treatments in the
first four sessions, Fs(1, 90) = 2.35, .74, 2.58,
and 2.56, respectively. In the fifth training
session, BF was apparently superior to SR,
F(1, 90) = 8.37. Closer examination, however,
revealed that the atypically large increase from
pretraining to posttraining for SR could have
been a function of a low pretraining baseline
(2.56 responses/min). This, in conjunction
with a high pretraining baseline for BF (7.0
responses/min), could account for the significant
interaction. Analysis of simple main
effects confirmed, in Session 5, that BF had a
higher pretraining baseline than SR, F(1, 90)
= 7.52. Interpretation of the significant
interaction is therefore seriously qualified by
differential pretraining baselines in Session 5.

To summarize, both SR and BF led to 
significant reductions in somatic and subjective 
tension. Additionally, self-relaxation was differ- 
entially more effective than EMG biofeedback 
in reducing heart rate. 


Assessment of Self-control


The sixth session data were analyzed by
two-way analyses of variance (Treatment
X Pre-Post) to evaluate the participants'
self-control over tonic physiological arousal
and subjective tension. There were pre-post
main effects for all voluntary response systems
(cf. Paul, 1969): respiration rate, muscle
tension, and the Anxiety Differential, Fs(1, 36)
= 19.99, 10.59, and 17.78, respectively. The
analyses additionally indicated significant
Treatment X Pre-Post interactions for frequency
of electrodermal response and respiration
rate, Fs(3, 36) = 3.26 and 3.40. Duncan's
multiple-range test (Duncan, 1955), presented
in Table 1, was applied to the mean change
scores for each group to analyze the differential
change from pretraining to posttraining. The
ordered means for GSR frequency were
LR = −2.98; TR = +1.56; BF = +1.70;
SR = +2.70. The statistical comparison (df
= 36) among all possible pairs of means
indicated that live relaxation was superior to
each of the other treatments, which were not
different from each other. The ordered means
for respiration rate were SR = −2.86; LR
= −2.47; BF = −.50; TR = −.36. The
multiple-range test indicated that self-relaxation
and live relaxation were not different
from each other, and both were superior to
biofeedback and taped relaxation.


Table 1
Duncan's Multiple-Range Comparisons on Physiological Change Scores
During Assessment of Self-control

GSR frequency
                 TR                  BF                  SR
Group     Obtained   SSR      Obtained   SSR      Obtained   SSR
LR        4.54   >   4.04     4.68   >   4.25     5.68   >   4.36
TR                             .14       4.04     1.14       4.25
BF                                                1.00       4.04

Respiration rate
                 LR                  BF                  TR
Group     Obtained   SSR      Obtained   SSR      Obtained   SSR
SR         .39       1.84     2.36   >   1.94     2.50   >   1.90
LR                            1.97   >   1.84     2.11   >   1.94
BF                                                 .14       1.84

Note. SSR = shortest significant range; LR = live relaxation; TR = taped relaxation; BF = biofeedback;
SR = self-relaxation; GSR = galvanic skin response. All comparisons in which the obtained value is greater
than the SSR value are significant at the .05 level.
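
The pairwise logic behind Table 1 can be sketched in a few lines of Python. This is only an illustration, not the authors' SOUPAC procedure: the means and SSR values are those reported above for GSR frequency, the function name `duncan_comparisons` is hypothetical, and the sketch omits the step-down rule of the full Duncan procedure (means inside a nonsignificant range are not tested further), which does not change the outcome for these data.

```python
from itertools import combinations

# Mean pre-post change scores for GSR frequency, as reported in the text.
gsr_means = {"LR": -2.98, "TR": 1.56, "BF": 1.70, "SR": 2.70}

# Shortest significant ranges (SSR) from Table 1, keyed by the number of
# ordered means that a comparison spans (2 = adjacent means).
gsr_ssr = {2: 4.04, 3: 4.25, 4: 4.36}


def duncan_comparisons(means, ssr):
    """Compare each pair of ordered means with the SSR for its span."""
    ordered = sorted(means, key=means.get)  # smallest mean first
    results = {}
    for i, j in combinations(range(len(ordered)), 2):
        low, high = ordered[i], ordered[j]
        diff = round(means[high] - means[low], 2)
        span = j - i + 1  # how many ordered means the pair covers
        results[(low, high)] = (diff, ssr[span], diff > ssr[span])
    return results


for pair, (diff, crit, sig) in duncan_comparisons(gsr_means, gsr_ssr).items():
    print(pair, diff, crit, "significant" if sig else "ns")
```

Only the three comparisons involving LR exceed their SSR, which matches the conclusion that live relaxation differed from each of the other treatments.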


Assessment of Trait Anxiety


Treatment X Pre-Post analyses of variance
were performed on the ACL and Trait form of
the STAI, which were administered prior to
Session 1 and at the end of Session 6. There
were significant pre-post main effects for
the Trait scale and the ACL, Fs(1, 36)
= […] and 19.00. This indicated that all
groups reported reductions in trait anxiety
(Trait scale: pre = 49.66, post = 42.69; ACL:
pre = 11.87, post = 8.58), with no differential
change across groups.


Discussion
Training Effects


The present results are consistent with the
results of Paul and Trimble (1970) and
[…]ll et al. (1976): Live relaxation training
is superior to non-response-contingent taped
training in producing decreases in physiological
arousal during training. Paul and
Trimble's data were obtained from unselected
volunteer female psychology students, and
abbreviated relaxation training was used (two
sessions of training with 16 muscle groups).
The present study extends those findings to
older self-referred male and female clients.
The fact that LR was superior to TR across
training sessions on both measures of
autonomic arousal is significant because the
progression, in the present study, from 16
muscle groups to relaxation by recall is similar
to actual clinical practice. The clinical context
of this investigation and the emphasis on
daily practice of relaxation skills
enhance the study's external validity.
Thus, the present results are readily generalizable
to the clinical setting and contraindicate
the use of taped training when reductions in
physiological arousal are the clinical objective.

A preliminary report of the during-training
data involving a comparison of LR, TR, and
SR in the same analysis (Israel & Beiman,
1977) yielded the conclusion that LR was
superior to TR and SR on reductions in
subjective tension, with no differential training
effects for reductions in physiological arousal.
Those analyses involved only three training
sessions with fewer subjects than the present
study. Including the SR group in the
analyses apparently introduced a time confound
because of the differential durations of
the progressive and self-relaxation procedures.
These factors, combined with the reduced
power of the earlier analyses (Israel & Beiman,
1977), account for the differential results of
the two studies. The present, more precise
analyses are more appropriate for the evaluation
of differential effects of live and taped
modes of extended training.

Paul and Trimble (1970) speculated that the 
inferiority of taped training is a function of the 
loss of response-contingent progression through 
the training procedure. This hypothesis has 
yet to be adequately tested, although Riddick 
and Meyer (1973) found that taped training 
with response-contingent (gross motor move- 
ment) feedback was as effective as live training. 
Generalization of that finding to the clinical 
setting is limited because of the study’s non- 
clinical situational context and subject sample, 
as well as the administration of only one
training session. Further research, therefore,
might examine the basis for the superiority
of live over taped modes of training. Hypotheses
of interest might include presence/absence
of a therapist as well as response-contingent
versus non-response-contingent progression
through the procedure.

The comparison of self-relaxation and EMG
biofeedback indicated that both types of
training led to significant reductions in physiological
arousal and subjective tension. Interestingly,
SR was differentially more effective than
BF in accomplishing heart rate reductions
during training. This may reflect the participants'
general ability to decrease heart rate
in the absence of […] (Ray & Lamb, […]).
[…] that a rather elaborate […]
involving complex and expensive equipment
is no more, and possibly less, effective in
producing relaxation during training than a
[…] relaxation procedure.
[…] for each client […]
procedure used. It […] feedback/[…]
dependent variable in the present study. […]
research on normal subjects has indicated
that EMG biofeedback is superior to simple 
instructions to relax when EMG is the primary 
dependent variable (Coursey, 1975; Haynes, 
Moseley, & McGowan, 1975). The present 
results suggest that the addition of a thera- 
peutic rationale and daily practice of relaxation 
skills to self-relaxation instructions yields a 
procedure that is at least as effective as EMG 
biofeedback for producing in-session relaxation 
in chronically anxious subjects. In the present 
study the superiority of self-relaxation for 
reducing heart rate, when considered with 
earlier cautions about the effectiveness of EMG 
biofeedback as a general relaxation training 
technique (Alexander, 1975; Shedivy & Klein- 
man, 1977), suggests that such feedback should 
not be used clinically to attempt to provide 
a general state of relaxation. For future
research on how to best accomplish in-session
relaxation, rather than pursue costly and
questionably effective feedback as an alternative
to extended progressive relaxation
training in the clinical setting, it would seem
preferable to investigate the potential benefits
of self-relaxation training.


Posttraining Effects 


This is the first investigation to report data
regarding client control of tonic physiological
arousal and subjective tension after relaxation
training procedures have been completed. The
comparison of all four training procedures on
posttraining control indicated that all groups
were able to significantly reduce muscle and
subjective tension when asked to do so, with
no differential change across groups. With
respiration rate, the system under direct
voluntary control (Paul, 1969), live progressive
relaxation and self-relaxation training were
both superior to taped progressive relaxation
and biofeedback training. Only live extended
progressive relaxation training led to client
control over frequency of electrodermal response,
a system innervated solely by the
sympathetic nervous system (Sternbach, 1966).
Differential patterning of control (reductions
in somatic and subjective response, increases
in sympathetic response) was true for all
groups but live progressive relaxation. This
could indicate that the other groups may have
been “actively” trying to relax. They were
successful for the somatic (muscle tension,
respiration rate) and subjective systems but
not for the electrodermal measure of sympathetic
arousal.

Considering the generally accepted operational
definition of anxiety as subjective
distress accompanied by sympathetic arousal
(Paul, 1969), the fact that live progressive
relaxation training alone led to reductions in
sympathetic arousal seems particularly important.
Since these clients were selected for
their reported difficulties with anxiety and
tension in their everyday lives, live extended
progressive relaxation training would appear
to be the clinical treatment of choice when
client control over tonic sympathetic arousal
is one of the clinical objectives.

In a separate investigation in the same
laboratory, live abbreviated relaxation training
was superior to self-relaxation training in
analogue clinical subjects on subjective response
to phobic stimuli (Beiman, Green,
Webster, Rosmarin, Holliday, & Graham,
Note 1). Generalization of the present investigation's
results to other relaxation procedures,
for example, those involving direct induction,
should not be assumed and awaits empirical
verification. In the interim, considering the
present data as well as those of Beiman et al.
(Note 1), it seems appropriate to recommend
live extended progressive relaxation training
as used here for the treatment of chronic or
pervasive anxiety/tension.

The use of this treatment in behavioral
medicine appears to have considerable promise.
Since stress (involving prolonged adreno-sympathetic
arousal) has been shown to
produce tissue damage and contribute to an
extensive and varied list of medical disorders
(Selye, 1976), the use of extended relaxation
training in the psychological treatment of such
stress-related disorders seems particularly
appropriate and has been recommended previously
(Schwartz & Shapiro, 1973). The
present investigation provides empirical support
for such a recommendation, with the
qualifications regarding specific type of training
discussed above kept in mind. Extended live
progressive relaxation training presented as a
self-control coping skill for maladaptive tension
and anxiety has also proven successful in the
nonpharmacological treatment of essential
hypertension (Graham, Beiman, & Ciminero,
1977; Beiman, Graham, & Ciminero, Note 2).
This and other stress-related medical disorders
are potentially good targets for future outcome
research on the effects of live extended progressive
relaxation training.

Correlational and Factor Analysis of the Peabody Individual
Achievement Test and the WISC-R


Richard L. Wikoff 
University of Nebraska at Omaha 


Subtest scores from the Peabody Individual Achievement Test (PIAT) and the 
Wechsler Intelligence Scale for Children—Revised (WISC-R) for 180 children, 
ages 6-17, were factor analyzed to determine the number and kinds of factors 
measured by the PIAT. Two factors were found when the PIAT was factored
alone. Reading Recognition, Reading Comprehension, and Spelling loaded highly
on a Word Recognition factor, whereas Mathematics and General Information 
had moderate to high loadings on a School-related Knowledge factor. When the 
PIAT was factored with the WISC-R subtests, a Word Recognition factor was 
found in addition to the three factors usually reported for the WISC-R. The 
School-related Knowledge factor of the PIAT was subsumed by the other fac- 
tors. General Information loaded highly on the Verbal Comprehension factor, 
and Mathematics loaded highly on the Freedom from Distractibility factor. 
Implications for the interpretation of the PIAT are discussed. 


The Peabody Individual Achievement Test 
(PIAT; Dunn & Markwardt, 1970) has 
gained wide acceptance and use during the 
past several years. It contains five subtests 
that are separately scored and have separate 
norms for both age and grade. The materials
for the subtests are presented either verbally
or as a page of material for the
subject to read. The subject responds by
choosing from among four choices, which are
visually presented. Raw scores are determined
by the number of correct responses and can be
converted to either percentile ranks or standard
scores with a mean of 100 and a standard
deviation of 15. In addition, there are norma- 
tive tables for the total test score. 
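
As a rough illustration of the standard-score scale just described (mean 100, standard deviation 15), the following Python sketch converts a standard score to a z score and to an approximate percentile. It assumes normally distributed scores, which is an assumption made here for illustration only; the published PIAT percentile ranks come from the test's own normative tables. The function names are hypothetical.

```python
from statistics import NormalDist

# Standard-score scale described in the text: mean 100, SD 15.
# Treating it as exactly normal is an illustrative assumption.
PIAT_SCALE = NormalDist(mu=100, sigma=15)


def to_z(standard_score):
    """Distance of a standard score from the mean, in SD units."""
    return (standard_score - PIAT_SCALE.mean) / PIAT_SCALE.stdev


def approx_percentile(standard_score):
    """Approximate percentile rank under the normality assumption."""
    return 100 * PIAT_SCALE.cdf(standard_score)


for score in (85, 100, 115):
    print(score, round(to_z(score), 2), round(approx_percentile(score), 1))
```

Under this assumption a standard score of 115 lies one standard deviation above the mean, at roughly the 84th percentile.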

The Mathematics subtest was designed as a 
measure of the ability to apply mathematical 
concepts to the solution of problems, as 
opposed to a measure of computational skills 
or quantitative aspects of concept formation. 
The objective of the Reading Recognition
subtest “is to measure skills in translating
sequences of printed alphabetic symbols which
form words into speech sounds that can be
understood by others as words” (Dunn &
Markwardt, 1970, p. 19). The Reading Comprehension
subtest is presented by its authors
as a test of the ability to derive meaning from
printed words. The Spelling subtest attempts
to measure the ability to recognize correct
spelling as opposed to the ability to recall the
correct spelling of words. The General Information
subtest is presented as a measure of
general encyclopedic knowledge.

The interpretation of these five subtest
scores depends on the extent to which they
are measuring different factors. It is quite
possible that only one factor, for example,
general intelligence, is present. In this case,
the best interpretations would be made from
the total test score, since this score would be
the most reliable. On the other hand, if the
subtests are measuring five different factors,
interpretations based on each separate subtest
would be appropriate.

One of the most widely used intelligence
tests is the Wechsler Intelligence Scale for
Children-Revised (WISC-R; Wechsler, 1974).
The factor structure for this test was reported
by Kaufman (1975), who found three meaningful
factors: (a) Verbal Comprehension, (b)
Perceptual Organization, and (c) Freedom
from Distractibility.
The purpose of this study was to determine
the number of factors being measured by the
PIAT subtests and the extent to which each
subtest measured the factors found. A secondary
purpose was to investigate the relationship
of the PIAT to the WISC-R to determine if
it measures some aspect of intelligence or
measures factors not found in the WISC-R.


Method
Subjects


The subjects of this study were children who were
referred to me because of learning problems. Each
subject's classroom achievement was below either his
or her parents' or the school's expectations. Nearly all
were described as hyperactive, easily distracted, having
a short attention span, not completing assignments,
and so on. The children ranged in age from 6 years to
17 years, with a median age of 9.87 and a mean age of
[…]. There were 123 males and 57 females. No minority
children were included. All socioeconomic groups were
represented, but most were from the middle class.


Procedure


Each child was administered the WISC-R and
the PIAT. Only the protocols for those subjects who
completed 10 WISC-R subtests (excluding Digit
Span and Mazes) and all five PIAT subtests were used.
There were 180 subjects who met this criterion.
A principal factors solution was used to factor analyze
the scores for the PIAT subtests. Communalities
were obtained by iteration, and the resulting
factors were rotated using the varimax method.
Rotations were performed for one, two, and three
factors to determine the most appropriate structure
([…], 197[…]). A second analysis was performed that included
the PIAT subtests and the 10 subtests of the WISC-R. This
was done to determine if the PIAT was actually
measuring achievement or if it was measuring the
same factors of intelligence as found in the WISC-R
(Kaufman, 1975). Varimax rotations were performed
for three, four, and five factors. The four-factor
solution was the most meaningful.


Results


Two factors were retained for the PIAT.
Subtests Reading Recognition, Reading Comprehension,
and Spelling loaded highly on
Factor 1. Since all three subtests require
recognition of words, this factor was named
Word Recognition. General Information and
Mathematics had the highest loadings on
Factor 2. Both of these subtests require a
knowledge of facts acquired in school;
therefore, this factor was named School-related
Knowledge. The factor loadings are
presented in Table 1.


Table 1
Factor Structure of the Peabody Individual
Achievement Test Subtests for Learning-Problem
Children

                          Word          School-related
Subtest                   Recognition   Knowledge        h²
Mathematics                 .33           […]            […]
Reading Recognition         […]           .37            .97
Reading Comprehension       .73           .50            .78
Spelling                    .70           .40            .65
General Information         .36           […]            .63
Eigenvalue                 2.09          1.49           3.58
% of h²                   58.38         41.62         100.00
% of total variance       41.80         29.80          71.60

Note. N = 180.

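
The two percentage rows of Table 1 follow directly from the eigenvalues, and a short Python sketch makes the arithmetic explicit. It assumes standardized variables, so that the total variance equals the number of subtests (five); the function and key names are hypothetical, and only the eigenvalues are taken from the table.

```python
# Eigenvalues of the two rotated PIAT factors, as printed in Table 1.
eigenvalues = {"Word Recognition": 2.09, "School-related Knowledge": 1.49}

# Five PIAT subtests; each standardized variable contributes unit variance.
N_SUBTESTS = 5


def variance_shares(eigs, n_variables):
    """Express each eigenvalue as a share of common and of total variance."""
    common = sum(eigs.values())  # sum of communalities (3.58 in Table 1)
    return {
        name: {
            "pct_common": round(100 * value / common, 2),
            "pct_total": round(100 * value / n_variables, 2),
        }
        for name, value in eigs.items()
    }


shares = variance_shares(eigenvalues, N_SUBTESTS)
for name, pct in shares.items():
    print(name, pct["pct_common"], pct["pct_total"])
```

The results reproduce the printed rows: 58.38 and 41.62 for percentage of common variance, and 41.80 and 29.80 for percentage of total variance.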
When the PIAT and WISC-R subtests were
factor analyzed together, four factors were
retained. The structure is presented in Table
2 along with means and standard deviations
for each of the variables. The first factor had
high to very high loadings for Reading Recognition,
Reading Comprehension, and Spelling.
This factor was clearly the Word Recognition
factor from the PIAT and was obviously
measuring a factor that is only barely tapped
by the WISC-R subtests. The second factor
had the highest loadings for Vocabulary,
Comprehension, Similarities, and Information.
It was labeled Verbal Comprehension, since it
corresponds to the first factor of the WISC-R
(Kaufman, 1975). The PIAT subtests had
only low loadings on this factor, with the
exception of General Information, which
loaded highly. The third factor corresponds to
the Perceptual Organization factor of the
WISC-R (Kaufman, 1975). Its highest loadings
were for Object Assembly, Block Design,
Picture Completion, and Picture Arrangement.
Mathematics and Arithmetic had the highest
loadings on Factor 4. This factor appears to
be similar to the Freedom from Distractibility
factor of the WISC-R (Kaufman, 1975). The
loadings were smaller than those reported by
Kaufman because some of the variance was
rotated to other factors.


Table 2
Means, Standard Deviations, and Factor Structure for the WISC-R and
PIAT Subtests for Learning-Problem Children

Factors: 1 = Word Recognition; 2 = Verbal Comprehension;
3 = Perceptual Organization; 4 = Freedom from Distractibility.

Subtest                     M       SD      1      2      3      4      h²
Information                9.44    2.87    .34    .66    .15    .22    .62
Similarities              10.17    2.80    .34    .64    .33    .07    .64
Arithmetic                 9.03    2.78    .37    […]    .28    […]    […]
Vocabulary                10.28    3.21    .29    .81    .22    .10    .80
Comprehension             10.42    2.89    .08    .66    .34    .30    .65
Picture Completion        10.71    2.72    .09    .34    .60    .00    .48
Picture Arrangement       10.43    2.96    .08    .35    .50    .16    .40
Block Design              10.09    3.18    […]    […]    .77    .08    .65
Object Assembly           10.74    3.09    .03    .09    .78    .13    .63
Coding                     8.43    3.23    .18    .09    .35    […]    .18
Mathematics               96.69   12.63    .38    .34    […]    .58    .64
Reading Recognition       96.83   12.87    .91    .29    .19    .09    .96
Reading Comprehension     96.54   13.45    .75    .35    .23    .19    […]
Spelling                  93.89   14.00    .77    .21    .07    .21    .69
General Information       99.94   12.45    .39    .72    .20    .22    .76

Eigenvalue                                2.82   3.27   2.49   1.01    9.59
% of h²                                  29.41  34.10  25.96  10.53  100.00
% of total variance                      18.80  21.80  16.60   6.73   63.93

Note. WISC-R = Wechsler Intelligence Scale for Children-Revised; PIAT = Peabody Individual Achievement
Test; N = 180.
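
The h² column of Table 2 is the communality of each subtest, that is, the sum of its squared loadings on the four orthogonal (varimax-rotated) factors. The Python sketch below checks this for three subtests whose rows are fully legible in the scanned table; the loadings are as read from that table, and the variable and function names are hypothetical.

```python
# Loadings on the four rotated factors (Word Recognition, Verbal
# Comprehension, Perceptual Organization, Freedom from Distractibility),
# followed by the printed communality, as read from Table 2.
rows = {
    "Vocabulary": ([0.29, 0.81, 0.22, 0.10], 0.80),
    "Object Assembly": ([0.03, 0.09, 0.78, 0.13], 0.63),
    "Reading Recognition": ([0.91, 0.29, 0.19, 0.09], 0.96),
}


def communality(loadings):
    """Sum of squared loadings across orthogonal factors."""
    return sum(loading ** 2 for loading in loadings)


for subtest, (loadings, printed_h2) in rows.items():
    computed = round(communality(loadings), 2)
    print(subtest, computed, printed_h2)
```

For these rows the computed values agree with the printed communalities to two decimal places.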

The Reading Recognition, Reading Compre- 
hension, and Spelling subtests were all highly 
correlated with the Word Recognition factor. 
These subtests had only low correlations with 
the other factors, except that Reading Com- 
prehension correlated .35 with the Verbal 
Comprehension factor. 

The General Information subtest correlated 
.72 with Verbal Comprehension and .39 with 
Word Recognition. Correlations with the other 
factors were low. 

The Mathematics subtest correlated the 
highest with the Freedom from Distractibility 
factor but had loadings of .38 and .34, respec- 
tively, for Word Recognition and Verbal 
Comprehension. 

Another way of looking at the relationship 
of the PIAT to the WISC-R is by the corre- 
lation of the PIAT subtests with the Verbal, 
Performance, and Full Scale IQs of the 
WISC-R. These correlations are presented in 
Table 3. 


Discussion 


The structure of the PIAT subtests has not
been reported previously. The results of this
study indicate that there are actually only two
factors rather than the five implied by the
subtest organization. Factor 1 was a Word
Recognition factor that included the two
reading subtests and Spelling. Factor 2 was a
School-related Knowledge factor that included
Mathematics and General Information.

When the PIAT subtests were analyzed
with the WISC-R subtests, the Word Recognition
factor remained as a factor distinct from
those previously reported for the WISC-R
(Kaufman, 1975). The second factor was
subsumed by the WISC-R factors. General
Information had its highest loading on the
Verbal Comprehension factor, and Mathematics
loaded along with the WISC-R Arithmetic
subtest on the Freedom from Distractibility
factor.


Table 3
Correlations of PIAT Subtests with
WISC-R Verbal, Performance, and
Full Scale IQs

Subtest                    Verbal   Performance   Full Scale
Mathematics                  .57        .39           .57
Reading Recognition          .58        .39           .56
Reading Comprehension        .64        .43           .60
Spelling                     .50        .28           .43
General Information          .78        .44           .71

Note. PIAT = Peabody Individual Achievement
Test; WISC-R = Wechsler Intelligence Scale for
Children-Revised. N = 180.


The results of this analysis are not conclusive,
but they suggest the possibility that the
Freedom from Distractibility factor may
be a numerical factor. This follows from
two lines of reasoning. First, in Kaufman's
(1975) analysis of the WISC-R, Arithmetic,
Digit Span, and Coding (all tasks involving
numbers) had moderate loadings on the
Freedom from Distractibility factor. Second,
the PIAT Mathematics subtest and the
WISC-R Arithmetic subtest have similar
loadings on this factor. However, the PIAT
Mathematics subtest appears to minimize the
need for concentration by presenting material
visually, which the subject can refer to while
solving the problem. The WISC-R Arithmetic
subtest requires the subject to retain information
presented orally. This problem might be
investigated further by factor analyzing these
tests with numerical tests whose structure is
known.

Since the two reading subtests and the
Spelling subtest of the PIAT are measuring
one factor, they should be interpreted
[…]. Remedial treatment designed to
increase the scores on any one of these subtests
should improve the scores on the other two as 
well. To increase reliability, it is suggested 
that these three subtests be combined and 
new norms presented for the new variable, 
Word Recognition. The other two subtests, 
Mathematics and General Information, can 
be appropriately interpreted as separate 
subtests. The strong relationship to Verbal 
Comprehension should be kept in mind when 
interpreting scores from the latter test. Further 
study is recommended to determine if persons 
taking this subtest might know the information 
sought but are unable to understand the 
language. 

Finally, these results support the use of the 
PIAT as a separate test in a battery containing 
the WISC-R. There is at least one factor that 
is different from those measured by the 
WISC-R that provides additional information 
of value in developing a treatment program 
for helping children with learning problems. 
The Mathematics subtest appears to provide 
supplementary information, also. However, 
the General Information subtest correlates so 
highly with Verbal Comprehension that it 
would not be necessary to include this subtest. 
Comments


Social Psychological Concepts Applied to Clinical Processes: 
On the Need for Precision 


John H. Harvey 
Vanderbilt University 


Ben Harris 
Radford College 


This comment analyzes and critiques an attempted application of cognitive dis- 
sonance and reactance concepts to a therapy analogue. An experiment by R. M. 
Gordon is discussed. It is shown that key prerequisites for dissonance arousal 
(choice and responsibility) and for reactance induction (initial free choice) 
were not present. Also, self-selection of volunteers for treatment is shown to be 
a likely, unreported factor. Original results are reinterpreted to suggest that 
they were due to the frustration of some subjects’ expectations rather than to 
arousal of reactance. A need is cited for well-informed application of
experimental social psychological theories.


Recently, clinical psychologists (e.g., Brehm, 
1976; Goldstein, Heller, & Sechrest, 1966) have 
offered theoretical analyses concerning the role 
of two social psychological processes (cognitive 
dissonance and psychological reactance) in medi- 
ating patient attitudes toward clinical treatment. 
Although this work has not yet had a profound 
impact on the practice of clinical psychology, it 
reflects clinicians’ increased attention to social 
and intrapersonal processes that may affect psy- 
chotherapy theory and practice. Consistent with 
this emphasis, Gordon (1976) has attempted to 
apply cognitive dissonance and reactance theory 
to the important clinical variables of incentive to 
engage in therapy and choice of treatment. Un- 
fortunately, he has primarily succeeded in il- 
lustrating the difficulties involved in such an en- 
deavor. This comment criticizes the internal and 
external validity of Gordon's study, offers an 
alternative explanation for his findings, and at- 
tempts to clarify some of the conceptual issues 


that are raised, 
Subjects 


In Gordon’s study, volunteers and nonvolun- 
teers were recruited for a session of relaxation 


The authors would like to thank Robert Falk for 
his assistance in the preparation of this comment. 
Thanks are also due to Robert Gordon for his 
gracious cooperation in answering inquiries by the 
first author. 
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training and were given a choice between two 
types of training (choice conditions) or were 
assigned to a treatment type chosen by another 
subject (no-choice conditions). A major chal- 
lenge to the validity of Gordon’s study is posed 
by his manner of operationalizing the concept of 
volunteering for treatment. In this experiment, 
volunteer and nonvolunteer subjects were stu- 
dents from a single undergraduate psychology 
class. On a scale of self-reported relaxation, the 
mean pretreatment score for all subjects was 
6.8 (out of a maximum 10), suggesting that 
these subjects were not comparable to a clinical 
population. More importantly, both “volunteer 
and “nonvolunteer” groups received the same 
amount of course credit for their participation, 
both groups were apparently recruited during 
class hours in the same week, and many volun- 
teers and nonvolunteers received the experi 
mental treatment in pairs (Gordon, Note 1). 
Thus, the act of volunteering for treatment was 
one with seemingly no subsequent discriminative 
effect on any of the subjects: The subjects made 
no differential sacrifice after volunteering, they 
received no differential incentive, and they were 
given no preferential choice of treatment. Thus, 
any experimental effect found for the factor of 
volunteering/nonvolunteering would be most
likely due to the self-selection of subjects into
volunteer and nonvolunteer groups rather than
due to an actual treatment effect (i.e., disso-
nance arousal). In regard to this self-selection
hypothesis, the male-to-female ratio for volun- 
teer subjects was approximately 1:1, whereas it 
was 3:1 for nonvolunteer subjects. Also, there
was a 10% difference in the self-reported anxiety
of volunteers and nonvolunteers. These differ-
ences between subjects in the volunteer and non-
volunteer conditions are consistent with Rosenthal
and Rosnow’s (1975, pp. 13-25, 56-59) review
of selection effects in experimental and clini-
cal research.

Subsequent to the selection of volunteers and
nonvolunteers for relaxation training, all subjects
in the Gordon study (Gordon, 1976) were exposed
(in randomly constructed pairs) to the
principal experimental manipulation—the
“responsibility for choice” of treatment modality.
This manipulation was accomplished by having
one member of each pair of subjects choose
between two types of relaxation training; then,
both subjects were exposed to the chosen
treatment. The major theoretical reason for this
manipulation was that feelings of personal
responsibility represent one of many necessary
conditions for producing cognitive dissonance
(Wicklund & Brehm, 1976, pp. 51-71).
Unfortunately, the study examined here does not
directly manipulate responsibility; it simply gives
a choice between two therapy treatments to half
of the subjects and no choice to the others. Even
though perceived choice may have been varied
(Harvey & Harris, 1975) and may have been
related to responsibility (Harvey, Harris, &
Barnes, 1975), Gordon (Note 2) did not use a
manipulation check to assess his subjects’ actual
feelings of either responsibility or choice. As a
result, at least one important prerequisite for
dissonance arousal may have been absent.

In evaluating Gordon’s application of cognitive
dissonance theory, an even more important
methodological question concerns the nature of
the treatment offered to subjects. In introducing
the choice manipulation, Gordon argued that
making an irrevocable choice between two evenly
matched alternatives will produce cognitive
dissonance since the chooser’s subsequent behavior
is more partial to the chosen alternative than its
relative attractiveness would allow. Thus, in an
experimental situation similar to the one in
question, Gordon would predict subjects’ engaging in
dissonance-reducing behavior such as selective
perception of choice-related information (Ehrlich,
Guttman, Schonbach, & Mills, 1957) or
heightened evaluation of the selected treatment
modality. Unfortunately, Gordon’s methodology
does not allow a test of this hypothesis, since his
subjects did not choose between evenly matched
alternatives (it was found that one treatment was
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tant prerequisite for dissonance arousal was ab- 
sent. 

In addition to cognitive dissonance theory, 
Gordon invoked J. W. Brehm’s (1966) theory of 
psychological reactance to explain his experi-
mental results. To do this, he hypothesized that
low-responsibility (no-choice) subjects will ex- 
perience reactance because of their limited con- 
trol over the treatment they receive. As with the 
dissonance-based explanation, Gordon’s assump- 
tion about the involvement of a reactance process 
cannot be tested using his experimental method. 
Theoretically, reactance is induced by eliminating 
one or more of a person’s existing free (possible) 
behaviors (J. W. Brehm, 1966, p. 4). In Gor-
don’s study, the “low-responsibility” subjects
were not first given choices and then restricted 
in their options; they were simply given a treat- 
ment selected by another subject. According to 
reactance theory, one would not expect this type 
of variation to result in either attempts to restore 
choice or in devaluation of the imposed treat- 
ment. Also, as with dissonance, Gordon provided 
no data to demonstrate directly the operation of 
a reactance process in his study. 

If cognitive dissonance were aroused by Gor- 
don’s manipulation of subjects’ choice, one would 
expect a significant main effect of this factor 
(high choice vs. no choice) for subjects’ attitudes 
toward treatment. Gordon’s failure to find a 
main effect of choice/no choice for the measure 
of subjects’ attitudes would seem to be consis- 
tent with the above critique of the study’s dis- 
sonance-related methodology. 

Gordon invoked reactance theory to explain his 
findings for the measure of subjects’ self-rated 
change in relaxation following treatment. He 
found that only volunteers with no choice of
treatment failed to show a significant (post-
treatment) increase in relaxation. As the above
examination of the method suggests, this result
is most likely not due to psychological reactance
being aroused in the no-choice volunteers. A 
more likely explanation would center on the 
frustration experienced by the no-choice volun- 
teers. A review of Gordon’s procedure shows 
that volunteer subjects both expressed interest in 
the treatment earlier than nonvolunteers and 
were somewhat less relaxed (M = 6.3) than non-
volunteers (M = 7.4) before the treatment. Ac-
cordingly, they may have both desired and ex-
pected preferential treatment. For those volun- 
teers in the no-choice condition, however, their 
treatment was picked by another subject who 
then received his or her preferred treatment
simultaneously and in the same room as the no-
choice volunteer. Thus, the effect of this manipu-
lation (less self-reported change in relaxation)
can be more parsimoniously explained by attend- 
ing to the relative powerlessness and frustrated 
expectations of the no-choice volunteer subjects. 

We are not arguing here that cognitive disso- 
nance and psychological reactance are irrelevant 
to theorizing about clinical variables such as atti- 
tude toward treatment and self-perceived thera- 
peutic change. However, the elaborate previous 
work on these variables and theories in the area 
of social psychology necessitates considerable 
caution when making applications to clinical con- 
cerns. It is hoped that the continuing interest of 
clinicians in experimental social psychology (e.g., 
S. Brehm, 1976) will promote well-informed at- 
tempts to apply social psychological ideas to 
clinical problems and processes. 


Reference Notes 


1. Gordon, R. M. Personal communication, Febru-
ary 1977.


2. Gordon, R. M. Personal communication, Decem-
ber 1976.


Imprecision or Dissonance? A Reply to Harris and Harvey


Robert M. Gordon


Allentown, Pennsylvania


Harris and Harvey’s criticism of Gordon’s research as imprecise is based on their
misunderstanding of my procedures and on making much ado about initial nonsig-
nificant differences between the volunteers and the nonvolunteers. Using this to
explain away the volunteer factor, they claim that the prerequisites of disson-
ance and reactance were not present. I suggest that the real issue is the efficacy
of volunteering as an independent variable, not imprecision.


Harris and Harvey (1978) expressed a fear
that clinical researchers will apply social psycho-
logical theories without due regard or under-
standing. As a social clinician whose predoctoral
specialty was in social psychology, I don’t think
the problem is with sloppy cross-fertilization.
Harvey and Harris (1975; Harvey, Harris, &
Barnes, 1975), who have done interesting work
on the choice and responsibility variables, must
[...] that my study (Gordon, 1976), which deals
with the same variables, cannot also be [...].
Unfortunately, they appear to reconcile our
supposed differences by making much ado about
nonsignificant differences and misinterpreting my
procedures. The only real disagreement, as I see
it, is the efficacy of the volunteering variable—a
theoretical issue that they avoid while focusing
on perceived impreciseness.

Harris and Harvey infer that my study (Gor-
don, 1976) has no external validity, since the
subjects used were college students rather than a
clinical population—perhaps believing that disso-
nance and reactance are limited to college stu-
dents. As evidence, they stated that the initial
pretreatment mean for relaxation was 6.8. As a
demonstration, I asked 14 of my private
patients with anxiety problems the same question
used in the Gordon study: “How relaxed
are you right now?” (1 = “not at all” and 10 =
“completely”). The results were very similar
(M = [...]9, SD = 1.0), not surprising to anyone
who makes a distinction between state and trait
anxiety.

Harris and Harvey do not feel that volunteer-
ing is an indication of motivation involving
choice, responsibility, or interest. And since they
don’t, they claim that the prerequisites of disso- 
nance and reactance were not present. They claim 
instead that the volunteer effects are just a 
methodological fluke, due to initial group differ- 
ences in sex and anxiety. However, sex was not 
found to be significantly correlated with any of 
the dependent measures; and the 10% difference 
in initial anxiety between the volunteers and 
nonvolunteers was also nonsignificant, F(1, 26) 
= 1.84. They also did not attempt to explain why 
the volunteers continued to volunteer after the 
relaxation experiment, when again there were no 
volunteer main-effect differences. 

Harris and Harvey (1978) misinterpret the 
procedure as it “simply gives a choice between 
two therapy treatments to half of the subjects 
and no choice to the others” (p. 327). They con- 
clude that both dissonance and reactance could 
not have existed. In reducing the actual pro-
cedure to a simple choice-no choice situation,
they have eliminated the entire relationship 
process, which was the reason for running the 
subjects in pairs—to accentuate the differential 
feelings of responsibility and choice. Both sub-
jects were told about two very different forms
of relaxation training that were available to them
and that they could have only one of them. The
experimenter asked one of the subjects which
form he/she would prefer and then played that
tape without considering the other subject. The
subject who had chosen the tape chose it for the
other subject as well. The yoked other was not
only unfairly denied the right to choose, but he/
she also had to hear a tape chosen by the other
subject. As predicted, the inferred feelings of
dissonance in the former and reactance in the
latter occurred for only those students who vol-
unteered for treatment. The dissonance (high-
responsibility-and-choice) group valued the treat-
ment highly and claimed that it was very suc-
cessful, whereas the reactance group devalued
the treatment and was the only group of four 
who found the treatment to be unsuccessful. The 
nonvolunteers, who were not particularly in- 
terested in the treatment to begin with, were 
unaffected by the subsequent experimental ma- 
nipulation. 

Harris and Harvey claim that the subjects did 
not choose between evenly matched alternatives. 
They must have misunderstood my comment: 
“Surprisingly, many of the subjects had a definite 
preference” (Gordon, 1976, p. 800), taking it to 
mean that there was a consistently clear pref- 
erence for one alternative over the other. This 
was not the case. I was surprised because there 
are no such forms of relaxation as “neuroglandu- 
lar” and “cardiovascular.” Yet people had defi- 
nite preferences. Harris and Harvey contend that 
the processes of dissonance and reactance were 
not directly measured. This is true, but as far as 
I know, no theoretical construct is ever directly 
measured—it is only inferred. 

Harris and Harvey claim that since there was 
no main effect of choice, they are correct in their 
argument. This circular reasoning assumes that 
the volunteering effects are not real. Harris and 
Harvey’s alternative hypothesis erroneously as- 
sumes (a) that the volunteers expressed an 
earlier interest than the nonvolunteers; however, 
the volunteers and nonvolunteers expressed their 
differential attitudes at the same time. Harris 
and Harvey may be confusing the later manipula- 
tion to get the nonvolunteers into treatment with 
their initial attitudes. Also, the subjects’ initial 
attitudes remained unchanged even after the 
treatment. They never became “volunteers.” (b) 
The volunteers were less relaxed. Again, there 
were no significant volunteer pretreatment or 
posttreatment main effects in anxiety. (c) The
volunteers expected preferential treatment. This, 
I suppose, is based on the above two assump-
tions. Treatment was not based on need or on a
first-come, first-served basis. 


Conclusions 


Theoretically I would agree with Harris and 
Harvey’s criticisms if their interpretations were
indeed true. The problem is certainly not in the
improper understanding of social psychological
theories. In attempting to reconcile perceived
mutually exclusive findings, they have rein-
terpreted my study (Gordon, 1976) to seem
imprecise. This is not an unusual form of disso-
nance reduction for researchers; witness the
plethora of straw-man controversies.

The only theoretical difference we may have is
whether volunteering represents a methodological
problem or an independent variable in its own
right (Rosenthal & Rosnow, 1975). The volun-
teer issue has been a controversial one for some
time. Perhaps part of the reason it has been so
unwelcome is that it is so humbling. It infers
that our procedures are only effective if people
are willing to let them be effective. It infers that
their initial motivations, feelings of personal
responsibility, and resistances are more powerful
than the treatment per se. The clinician, who has
long observed this in therapy, cannot explain
this away as self-selection or randomize or factor
it away. The clinician must deal with these dy-
namics if the treatment is going to work. Volun-
teering is not a mistake—it is choice and respon-
sibility.


Selecting a Short Form of the MMPI:
Addendum to Faschingbauer 


Norman G. Poythress, Jr. 
Center for Forensic Psychiatry, Ann Arbor, Michigan 


Faschingbauer offered guidelines to clinicians in the selection of Minnesota 
Multiphasic Personality Inventory (MMPI) short forms as substitutes for the 
full MMPI. This comment offers an addendum to Faschingbauer in the form 
of a review of empirical studies of the clinical validity of MMPI short forms 
and a discussion of the MMPI-168, which was not considered in Faschingbauer’s 
earlier article. For diagnostic and interpretive accuracy, the empirical evidence 
to date seems to favor two short forms—the Faschingbauer Abbreviated MMPI 
and the MMPI-168—over the other available short forms.


In a recent review Faschingbauer (1976) dis- 
cussed numerous clinical considerations one 
should make in selecting a short-form Minnesota 
Multiphasic Personality Inventory (MMPI) for 
clinical use in place of the full MMPI. In his 
article, Faschingbauer offered a critical analysis 
of four recently developed short forms—the 
Mini-Mult (Kincannon, 1968), the Hugo (Hugo,
1971), the Midi-Mult (Dean, 1972), and the
Faschingbauer Abbreviated MMPI (FAM;
Faschingbauer, 1974)—and addressed a variety 
of issues including (a) correlations of short-form 
scales with the full MMPI, (b) differences in 
item content, and (c) concordance rates between 
short forms and the full MMPI for Welsh code 
types. 

Faschingbauer’s review offered excellent guide-
lines for the clinician who may consider using an
MMPI short form. If shortcomings are to be
noted in his review, they would have to be that
(a) he failed to include the MMPI-168 (Overall
& Gomez-Mont, 1974) in his discussion, and (b)
he excluded from discussion studies on the clini-
cal (as opposed to statistical) validity of the
various short forms. This comment offers an
addendum to Faschingbauer (1976) and includes
a discussion of the MMPI-168 and a review of
the clinical validity studies of the MMPI short
forms.


MMPI-168 


The development and description of the Mini-
Mult, the Hugo, the Midi-Mult, and the FAM
were covered in Faschingbauer (1976) and will
not be repeated here. The MMPI-168, developed 
by Overall and Gomez-Mont (1974), is the newest 
of the MMPI short forms. It consists of the first 
168 items of the regular MMPI (the first 7 pages 
of the MMPI, Form R), which are scored by 
the regular scoring keys. These reduced MMPI 
raw scale scores are then converted to estimates
of the full MMPI scale scores via either regres- 
sion equations (Overall & Gomez-Mont, 1974, 
Table 1) or a conversion table (Overall, Higgins, 
& Schweinitz, 1976, Table 5). Early reports of 
the statistical concordance between the MMPI- 
168 and the full MMPI have been encouraging. 
Overall and Gomez-Mont reported correlations 
with the full MMPI scales ranging from .79 to 
.96, with a median correlation of .89 in a sample 
of 339 psychiatric patients. Newmark, Newmark, 
and Cook (1975) found that for male psychi- 
atric patients, scale correlations between the 
MMPI-168 and the full MMPI ranged from .78 
to .96, with a median of .88; for female psychi- 
atric patients the range was from .77 to .93, with 
a median of .90. Further, in about 72% of cases 
the MMPI-168 predicted either the first or first 
and second peak scales found on the full MMPI. 
These initial findings for the MMPI-168 compare 
quite favorably with the correlations and con- 
cordance rates established for the other available 
short forms (Faschingbauer, 1976, Tables 1 and
2).


Statistical Concordance and Clinical Validity 


For the practicing clinician who is looking for 
an MMPI short form appropriate for use in indi- 
vidual assessment, the relevance of statistical 
concordance data between various short forms
and the full MMPI must be questioned. Clearly,
short-form to long-form scale correlations offer 
little of value concerning the utility of short- 
form profiles for individual profile interpretation 
and clinical decision making. The concordance 
for elevated peaks and code types is somewhat 
more relevant, and several investigators of the 
utility of MMPI short forms have offered nega- 
tive evaluations of short forms based on con- 
cordance rates, which they judged to be unac- 
ceptably low. Hoffman and Butcher (1975), for 
example, found low hit rates for the Mini-Mult, 
the FAM, and the MMPI-168, and concluded 
that 


there is insufficient evidence to advocate clinical use 
of the MMPI short forms. It seems that with such 
low classification accuracy in the short forms they 
would not simply plug into the existing interpreta- 
tions and uses of the standard form without some 
modification and cautions. (p. 38) 


Similarly, Griffin, Finch, and Edwards (1976) 
concluded their study of the Midi-Mult with the 
following warning: “The Midi-Mult cannot be 
used as an MMPI short form . . . because its 
scales are not sufficiently accurate in predicting 
the MMPI” (p. 56). 

Other investigators have found these judg- 
ments and warnings to be premature or inappro- 
priate. Poythress and Blaney (in press) noted: 


This conclusion . . . suffers from three shortcomings. 
First of all, discordance of code type does not neces- 
sarily mean discordance of interpretation. One popu- 
lar MMPI handbook (Gilberstadt & Duker, 1965) 
lists anxiety neurosis as an appropriate diagnosis for 
code types 1-3-9, 2-7, or 4, no two of which share 
even a single peak. Secondly, most investigators of 
code type concordance have concluded that the 
values they obtained were unacceptably low. How- 
ever, if the code type concordance rate for test- 
retest studies of the full MMPI is utilized as the 
criterion, values such as those obtained by Hoffman 
and Butcher appear to be much more acceptable.
For example, Faschingbauer (1974) found that for 
61 subjects who took the full MMPI twice with 
only a one day interval, the same two-point code 
type was obtained in only 41% of cases. This sug-
gests that the degree of slippage between long and
short forms may not be markedly greater than the 
slippage between two long forms administered in 
close succession. Finally, the conclusion of Hoffman 
and Butcher and others is based on only an inter- 
mediate step in clinical interpretation. The final 
step, the clinician’s profile interpretation, can be ob- 
served and measured directly; thus these investi- 
gators have made inferences where empirical in- 
vestigation was called for. (p. 3) 
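
The comparison described in the passage above can be made concrete with a short sketch. The Python below derives a two-point code type (the two most elevated scales) from a profile of T scores and computes the proportion of paired profiles that yield the same code type. The scale labels, the order-insensitive matching rule, the tie-breaking rule, and the T scores are all illustrative assumptions; none of them is taken from the studies cited here.

```python
from typing import Dict, FrozenSet, List

Profile = Dict[str, float]  # scale label -> T score (labels are illustrative)


def two_point_code(profile: Profile) -> FrozenSet[str]:
    """Return the two most elevated scales, ignoring their order.

    Ties are broken by scale label so the result is deterministic
    (an assumption made only for this sketch).
    """
    ranked = sorted(profile, key=lambda scale: (-profile[scale], scale))
    return frozenset(ranked[:2])


def concordance_rate(first: List[Profile], second: List[Profile]) -> float:
    """Proportion of paired profiles that share the same two-point code."""
    if len(first) != len(second) or not first:
        raise ValueError("need two non-empty lists of equal length")
    hits = sum(two_point_code(a) == two_point_code(b)
               for a, b in zip(first, second))
    return hits / len(first)


# Hypothetical T scores for two patients: long form vs. short-form estimate.
long_form = [
    {"1": 62, "2": 81, "3": 58, "4": 66, "7": 77, "8": 70, "9": 49},
    {"1": 55, "2": 60, "3": 52, "4": 79, "7": 58, "8": 63, "9": 74},
]
short_form = [
    {"1": 60, "2": 78, "3": 57, "4": 69, "7": 80, "8": 71, "9": 50},
    {"1": 57, "2": 61, "3": 50, "4": 75, "7": 59, "8": 76, "9": 72},
]

rate = concordance_rate(long_form, short_form)
```

Applying the same function to two administrations of the full form would give the kind of test-retest baseline against which the quoted passage suggests short-form concordance should be judged.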


The remainder of this comment is devoted to 
a brief review of empirical studies in which some
clinical judgment, decision, or interpretation was
used to evaluate the utility of one or more of the 
MMPI short forms relative to the MMPI. It is 
asserted that this is perhaps the most relevant 
body of research literature for the practitioner 
to consider when contemplating the use of an 
MMPI short form. 


Clinical Validity Studies: Diagnostics 


Table 1 summarizes the findings of seven re- 
cent studies comparing one or more of the MMPI 
short forms to the full MMPI in diagnostic 
decision-making situations. In four of the studies 
shown, the consensual diagnoses of two or three 
PhD psychologists in their blind analysis of 
short- or long-form profiles are compared. These 
studies suggest that the FAM and MMPI-168 
are about equal (near 80% agreement with the 
full MMPI) and are superior to two other short 
forms, the Midi-Mult and the Hugo. In three 
validity studies using various external diagnostic 
criteria, the MMPI-168 has been found slightly 
superior to the full MMPI in diagnostic validity. 
These studies suggest that for diagnostic pur- 
poses, the clinician should consider using the 
FAM or the MMPI-168 over the other available 
short forms, with a slight edge going to the 
MMPI-168. 


Clinical Validity Studies: Profile Interpretation 


To date there are but three studies that di- 
rectly address the issue of accuracy of individual 
profile interpretation—Newmark, Conger, and 
Faschingbauer (1976); Newmark, Falk, and 
Finch (1976); and Poythress and Blaney (in 
press).

In the studies by Newmark and his associates, 
Newmark served as a blind interpreter of long- 
form and short-form profiles, generating 200- to 
300-word interpretive statements about the pa-
tients whose profiles were provided. Subse-
quently, psychiatric residents in charge of patient
care rated his blind interpretive statements
on a scale from 1 (totally inaccurate) to 5 (to-
tally accurate), using their own knowledge of the
patient as the criterion.

Newmark, Conger, and Faschingbauer (1976) 
found that the FAM compared favorably with 
the full MMPI, with 84% of the FAM interpre-
tations and 92% of the full MMPI interpreta-
tions receiving a rating of either 4 (80% accu-
rate) or 5 (totally accurate). In paired compari-
sons of long-form interpretations and short-form
interpretations on the same patient, the full
MMPI interpretations were more often rated
above the FAM interpretations than was the
reverse, but further analyses estimating validity
coefficients (corrected for chance) for these two
instruments yielded a coefficient of .94 for the
full MMPI and .89 for the FAM. A second
study, by Newmark, Falk, and Finch (1976), ex-
tended this same methodology to the FAM, the
Hugo, and the MMPI-168. These investigators
found that about 90% of the full MMPI inter-
pretations received ratings of 4 or 5; the FAM,
the Hugo, and the MMPI-168 received 82%,
[...], and [...], respectively. The mean rating
for the full MMPI interpretations was signifi-
cantly higher than that for either the FAM or the
Hugo but not for that of the MMPI-168. In
paired comparisons, the MMPI interpretations
were more often rated above the short-form
interpretations, but this dominance was pro-
nounced only for the Hugo. The MMPI-168
fared slightly better than the FAM and was
judged to be, within the limits of the methodol-
ogy of the study, interpretively equal to the
full MMPI.


Table 1
Diagnostic Validity of MMPI Short Forms

Columns: Study; Diagnostic outcome variable; Agreement between

Study: Newmark, Cook, Clark, & Faschingbauer (1973)
  Diagnostic outcome variable: Consensual diagnosis of 2 or 3
    PhD psychologists—general categories of psychotic, neurotic,
    or personality disorder
  Agreement between: MMPI and FAM: 76%

Study: Newmark, Newmark, & Faschingbauer (1974)
  Diagnostic outcome variable: same as above
  Agreement between: MMPI and Midi-Mult: 43%
                     MMPI and Hugo: 57%
                     MMPI and FAM: 78%

Study: Newmark, Owen, & Newmark (1975)
  Diagnostic outcome variable: same as above
  Agreement between: MMPI and Midi-Mult: 52%
                     MMPI and Hugo: [...]
                     MMPI and FAM: 18%

Study: Newmark, Newmark, & Cook (1975)
  Diagnostic outcome variable: same as above
  Agreement between: MMPI and MMPI-168: 83%

Study: Newmark & Finch (1976)
  Diagnostic outcome variable: Same diagnostic decision as above
    for profiles, but hospital staff diagnosis utilized as criterion
  Agreement between: MMPI & criterion: 84%
                     MMPI-168 & criterion: 88%

Study: Overall, Butcher, & Hunter (1975)
  Diagnostic outcome variable: Using linear discriminant function to
    discriminate normals from psychiatric patients based on MMPI or
    MMPI-168 scores
  Agreement between: MMPI-168 slightly superior to full MMPI
    discriminating normals from psychiatric patients

Study: Overall, Higgins, & Schweinitz (1976)
  Diagnostic outcome variable: Using multiple discriminant function for
    differential diagnosis across 10 major psychiatric diagnostic
    categories based on MMPI or MMPI-168 scores
  Agreement between: MMPI-168 slightly superior to full MMPI in this
    differential diagnostic task

Note. MMPI = Minnesota Multiphasic Personality Inventory;
FAM = Faschingbauer Abbreviated MMPI.


Poythress and Blaney (in press) noted the
weaknesses in the Newmark, Conger, and Fasch-
ingbauer (1976) and Newmark, Falk, and Finch
(1976) methodology—the use of only one clini-
cal interpreter and the gross measure of interpre-
tive accuracy—and approached the assessment
of short-form interpretive validity in a different
manner. First, a large pool of clinical psycholo-
gists (N = 29) from various clinical and aca-
demic settings were recruited as blind interpreters
of MMPI profiles. Second, profile interpretation
was standardized and quantified by the use of a
30-item Q sort. For a given patient, four different
raters were asked to rate profiles via the comple-
tion of a Q sort; two raters were mailed a pro-
file based on full MMPI scoring, one rater
received a Mini-Mult profile, and one rater
received a FAM profile. The distribution of long-
form/long-form Q-sort correlations was com-
pared with each of the long-form/short-form
correlation distributions to see if independent in-
terpretations of the same long-form MMPI pro-
file were in fact more similar (i.e., correlated
more highly) than were independent interpreta-
tions of a long form and a short form. The in-
vestigators concluded that the FAM was interpre-
tively similar enough to the full MMPI to
recommend it for clinical use; in 50% of com-
parisons, the FAM/MMPI Q-sort correlation was
equal to or greater than the MMPI/MMPI Q-
sort correlation for the same patient. However,
the Mini-Mult did not generate interpretations
sufficiently similar to those generated by the full
MMPI to warrant a recommendation for its
clinical use.
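
The Q-sort comparison reported by Poythress and Blaney can be illustrated with a minimal sketch. It assumes that each rater's Q sort is a list of numeric item placements and that similarity between two sorts is measured with a Pearson correlation; both points are assumptions for illustration, as are the function names and all of the numbers. The second function computes the share of patients for whom the long-form/short-form correlation is at least as high as the long-form/long-form correlation.

```python
from math import sqrt
from typing import List, Sequence, Tuple


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two raters' Q-sort item placements."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant Q sort has no defined correlation")
    return cov / sqrt(var_x * var_y)


def share_short_at_least_as_similar(
        pairs: List[Tuple[float, float]]) -> float:
    """pairs holds (long/long r, long/short r) for each patient."""
    if not pairs:
        raise ValueError("need at least one patient")
    return sum(short >= long for long, short in pairs) / len(pairs)


# Hypothetical six-item sorts (the study itself used a 30-item Q sort).
rater_long_1 = [1, 2, 3, 4, 5, 6]
rater_long_2 = [1, 2, 3, 4, 5, 6]
rater_short = [6, 5, 4, 3, 2, 1]

r_same = pearson_r(rater_long_1, rater_long_2)
r_reversed = pearson_r(rater_long_1, rater_short)

# Hypothetical per-patient correlations: (long/long, long/short).
share = share_short_at_least_as_similar(
    [(0.6, 0.7), (0.5, 0.4), (0.3, 0.3), (0.8, 0.2)]
)
```

A share near one half, as in this invented example, is the pattern the text describes for the FAM: the short-form interpretation matched the long-form interpretation about as well as a second long-form interpretation did.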


Discussion 


Faschingbauer (1976) provided an extensive 
review of the literature on the ability of various 
short forms of the MMPI to predict various 
parameters of the full MMPI—such as elevated 
peaks or code types—and a discussion of dif- 
ferences in content among the various short 
forms. This article extends the review of Fasching- 
bauer by providing a brief review of the empiri- 
cal studies on the MMPI short forms in which 
an actual clinical decision or judgment was used 
as a dependent variable. It further includes a 
discussion of the recent MMPI-168, a short form 
that Faschingbauer’s earlier article did not con- 
sider. 

The needs and working constraints of the indi- 
vidual clinician will determine the selection of 
the MMPI short form. Unusual time constraints 
or illiterate subjects may dictate the use of one 
of the more abbreviated of the MMPI short 
forms, such as the Midi-Mult (86 items) or the 
Mini-Mult (71 items), which can easily be ad- 
ministered orally. The present review of validity 
studies, however, suggests that the FAM or the 
MMPI-168 should be selected over the other 
available short forms, since these two seem to 
provide the most accurate diagnostic and interpre- 
tive information. A slight edge in favor of the 
MMPI-168 is suggested, and it is the easiest of 
the short forms to administer. Form R of the
regular MMPI can be used, with the patient being 
instructed to complete only the first 7 pages. Ad- 
ministration time is about 40 minutes, and con- 
version to full scale score estimates is easily ac-
complished via a conversion table.


A Note on the MMPI as a Suicide Predictor


James R. Clopton 
Texas Tech University 


Comments are offered on a recent article by Leonard on the Minnesota Multi- 
phasic Personality Inventory (MMPI) as a suicide predictor. Her appraisal of 
former studies and her consideration of the differences between individuals with 
different suicidal behaviors are critically evaluated. Future research should em- 
phasize the development of useful MMPI indices of suicidal risk and should 
recognize the need to cross-validate these indices. 


A recent article by Leonard (1977) regarding
the Minnesota Multiphasic Personality Inventory
(MMPI) as a suicide predictor deserves com-
ment. MMPI data obtained from psychiatric in-
patients who had committed suicide were com-
pared with two matched control groups, one of
highly suicidal patients and the other of non-
suicidal patients. Discriminant analyses revealed
that the committed-suicide MMPI profiles were
distinguishable from nonsuicidal profiles. Leonard
discussed the personality differences between sui-
cidal and nonsuicidal patients and emphasized
the lack of homogeneity among suicidal patients.

Despite mentioning some of the limitations of
previous research on suicide, Leonard (1977) de-
clared that this research provides “encouraging
[...].” However, to date there has been no in-
dication that standard MMPI scales, MMPI pro-
file analysis, or specially developed MMPI sui-
cide scales can reliably predict suicide at useful
levels. Two review articles (Clopton, 1974;
Lester, 1970) show that the results of previous
research are inconsistent and reveal that previ-
ous studies often have had major methodological
flaws, such as combining individuals with differ-
ent types of suicidal behavior into the same
group (e.g., Broida, 1954). Leonard also appears
to have overlooked research studies that have
provided negative results. For example, Ravens-
borg and Foss (1969), using a multivariate
analysis, found no differences in the MMPI pro-
files of patients who had committed suicide in a
state hospital, patients who died of natural
causes in the same hospital, and nonsuicidal pa-

Leonard’s (1977) study and two earlier stud-
ies (Clopton & Jones, 1975; Devries & Farberow,
1967) have found some evidence that multi-
variate analyses can differentiate suicidal from
nonsuicidal patients. Correct classification was
obtained for 41% of the patients in the Devries 
and Farberow (1967) study and for 66% of the 
patients in the Clopton and Jones (1975) study. 
Leonard reported that for the female patients in 
her study, a discriminant function using seven 
MMPI scales correctly classified all 16 women 
who committed suicide and 14 of 16 female pa- 
tients in a comparison group. Close examination 
reveals that these results may be less impressive 
than they appear. As Leonard explained, the 
clinician wants a single index or other straight- 
forward means of determining suicidal risk from 
MMPI data, and the variables in a discriminant 
analysis form a complex cluster. None of the 
three studies using multivariate procedures have 
examined the usefulness of these procedures to 
clinicians faced with the task of predicting sui- 
cidal behavior. Another consideration not men- 
tioned by Leonard is that before multivariate 
procedures can be considered reliable in clas- 
sifying patients as suicidal or nonsuicidal, they 
need to be cross-validated. That is, the combina- 
tion of differentiating variables needs to be de- 
rived in one comparison of suicidal and non- 
suicidal individuals and then extended to new 
samples. The percentage of correct classifications 
could be considerably lower in the validation
samples. None of the three studies using
multivariate procedures have cross-validated their
procedures.
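
The cross-validation step described above can be sketched in a few lines. This is only an illustration under stated assumptions: a nearest-centroid rule on two scale scores stands in for a discriminant function, and the scores, group sizes, and function names are invented for the example rather than taken from any of the studies cited.

```python
from statistics import mean


def fit_centroids(scores, labels):
    """Derive one centroid (mean score vector) per group from the derivation sample."""
    centroids = {}
    for group in set(labels):
        members = [s for s, g in zip(scores, labels) if g == group]
        centroids[group] = tuple(mean(col) for col in zip(*members))
    return centroids


def classify(centroids, score):
    """Assign a score vector to the group whose centroid is nearest."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(score, centroid))
    return min(centroids, key=lambda g: squared_distance(centroids[g]))


def accuracy(centroids, scores, labels):
    """Proportion of cases classified into their true group."""
    hits = sum(classify(centroids, s) == g for s, g in zip(scores, labels))
    return hits / len(labels)


# Hypothetical scores on two scales (invented for illustration only).
derivation_scores = [(70, 60), (75, 65), (50, 55), (55, 50)]
derivation_labels = ["suicidal", "suicidal", "nonsuicidal", "nonsuicidal"]
validation_scores = [(68, 58), (54, 52), (51, 50), (71, 64)]
validation_labels = ["suicidal", "suicidal", "nonsuicidal", "nonsuicidal"]

# The rule is derived once, on the derivation sample only ...
rule = fit_centroids(derivation_scores, derivation_labels)
derivation_accuracy = accuracy(rule, derivation_scores, derivation_labels)
# ... and then applied unchanged to a new sample.
validation_accuracy = accuracy(rule, validation_scores, validation_labels)

print(derivation_accuracy, validation_accuracy)
```

In this toy example the rule classifies every derivation case correctly but only half of the validation cases, which is the kind of shrinkage that an uncross-validated hit rate can conceal.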

Leonard (1977) stressed the diversity existing 
among different suicidal groups, and at one point 
she stated that increasing the sample size in 
suicide studies by including suicide threateners 
or attempters with those who commit suicide is 
appealing but not justified. She stated that the 
populations (actual suicides, suicide attempters,
and suicide threateners) are obviously different.
One is puzzled then to find that the high-suicide 
comparison group in her study contained pa- 
tients with histories of suicide attempts and/or 
prolonged threats and preoccupation. Combin- 
ing patients who threaten suicide with those who 
make suicide attempts is a questionable pro- 
cedure. Previous MMPI research reveals that 
patients threatening suicide have the most devi- 
ant MMPI profiles of any suicidal group and that 
there are differences in the MMPI profiles of
patients who threaten suicide and those who
attempt suicide (Farberow, 1956; Farberow &
Devries, 1967; Rosen, Hales, & Simon, 1954; Simon
& Gilberstadt, 1958).

Leonard (1977) has perpetuated the notion 
that the study of suicide attempters is not a valid 
way to gain information about individuals who 
commit suicide. She stated that the two popula- 
tions are different, and in support she noted that 
approximately three times more females than 
males attempt suicide, but approximately three 
times more males commit suicide. Despite these 
sex differences, the prevailing view is that suicide 
attempters and those who commit suicide can be 
regarded as members of the same population 
(Stengel, 1972). Chance often plays a critical 
role in influencing the outcome of a suicide at- 
tempt, regardless of the lethality of the attempt. 
Also, those who commit suicide frequently have 
prior records of unsuccessful attempts. In Leon- 
ard’s committed suicide group, 80% of the males 
and 75% of the females had a history of one 
or more suicide attempts prior to committing 
suicide. Finally, observed sex and age differences
between those who commit suicide and those 
who attempt suicide may be largely due to in- 
cidental cultural factors. The higher rate of com- 
mitted suicide among males is probably related 
to their use of more violent methods, such as 
gunshot, which provide less opportunity for 
rescue than self-poisoning, the preferred method 
of females. Because men have more access to 
instruments of violence and more training in
their use, it is reasonable to expect more of
their suicide attempts to be successful. 

Leonard’s (1977) study is a valuable contribu- 
tion to the research literature, but neither her 
results nor the findings of previous studies have 
indicated that the MMPI can reliably predict 
suicide at useful levels. Multivariate procedures, 
such as those used by Leonard, have not been 
cross-validated, nor have they been demonstrated 
to be of practical value to clinicians attempting
to predict suicidal behavior.

Leonard (1977) combined patients who threatened
suicide and patients who attempted suicide
into one group. However, patients who threaten
suicide are the most distinct suicide group. It is
possible that the patients who attempted suicide 
were more similar to the patients who committed 
suicide than to the patients who threatened sui- 
cide. Despite Leonard’s comments to the con- 
trary, the current view is that patients who at- 
tempt suicide are probably all members of the 
same population regardless of the success of 
their attempts.

Response to “A Note on the MMPI as a Suicide Predictor”


Calista V. Leonard 
University of California, Los Angeles 


Response is made to comments by Clopton on an article by Leonard on the 
Minnesota Multiphasic Personality Inventory (MMPI) as a suicide predictor. 
Clopton’s viewpoint that patients who attempt suicide are probably all members 
of the same population regardless of the success of their attempts is discussed. 
His concern that future research should emphasize the development of useful 
MMPI indices of suicidal risk and should recognize the need to cross-validate
these indices is endorsed.


Clopton’s (1978) main concern in his comments
on Leonard (1977) seems to center about his
belief that “patients who attempt suicide are
probably all members of the same population
regardless of the success of their attempts” (p.
336). This is an individual viewpoint rather than
a “current view.” It does not take a statistician
to deduce that if attempted suicides are all from
the same population, then one would expect male
suicides, for example, to be equaled by female
suicides instead of being approximately three
times more numerous. Unquestionably, there is
overlapping between the groups, but many suicides
have never attempted or overtly threatened
suicide (Robins, Gassner, Kayes, Wilkinson, &
Murphy, 1959), and many attempted suicides
who have seemed lethally suicidal do not go on
to commit suicide (Pitts & Winokur, 1964).


Another concern expressed by Clopton is that
negative results were not reported, and I agree in
[principle]. However, negative results must be
[...] for comparability, and the Ravensborg
and Foss (1969) study that he cites (Clopton,
1978) contained control groups that were not
matched with the committed suicide group on
[...] variables. The control groups were
significantly older, less educated, and had lower
[...] scores (p < .001). One must wonder
why a multivariate analysis found no significant
differences in Minnesota Multiphasic Personality
Inventory (MMPI) profiles in spite of
these important group differences.

[...] criticism with which I agree wholeheartedly
is that cross-validation and helpful
hints for clinicians are sorely needed in studying
the MMPI as a predictor for suicide. Even
though Clopton (1978) did not report a
cross-validation for his study (Clopton & Jones, 1975),
his discriminant analysis of the data lent some
support to the use of MMPI profiles to identify
suicidal psychiatric patients. His caution in this
excellent study concerning the need for identifying
the obtained differences in profiles is well
taken.

I would like to conclude this response with a 
note on armchair theorizing, which I enjoy and 
define as “making hypotheses without providing 
evidence for them” and which seems to me to be 
the forerunner of attempts at finding testable 
answers. Clopton (1978) theorizes that men com- 
mit suicide more frequently because they have 
more access to instruments of violence and more 
training in their use, whereas women are less fre- 
quently suicides because they use self-poisoning 
and are more readily rescued. The immediate 
response, of course, is “but any ambulatory and 
determined would-be suicide could select jump- 
ing as a method, and most women drive and 
could use a car lethally even if they didn’t have 
training or access to guns.” Which brings the 
issue directly back to the cultural and person- 
ality variables that lead one person to choose a 
gun, another to make a half-hearted pill-taking 
attempt, and another to choose a nonsuicidal solution
to environmental stresses. This is what research
on the prediction of suicide is all about.
What environmental and personality differences 
make one person more self-lethal than another?

Brief Reports

Characteristics of the Emotional Responsiveness of Sensitizers
and Repressors to Social Stimuli

Glenn A. Miller
Arizona State University

William Nuessle
University of Kentucky

TAT-like (Thematic Apperception Test) stimuli of positive, negative, and neutral
affect were presented to 30 sensitizers, 30 intermediates, and 30 repressors to test
Lefcourt’s formulation that sensitizers positively value emotional expression and
repressors devalue such expression. Contrary to the expectation based on an extension
of Lefcourt’s model, sensitizers did not significantly differ from repressors on the
number of positive, negative, or [total affective words in] their stories. His model appears
consistent only with the responses of subjects to negative affective stimuli and not
with their responses to positive affective stimuli.

The traditional interpretation of repression
and sensitization as referring to defensive
dispositions to anxiety cues was challenged by
Lefcourt (1966). He suggested that sensitizers and
repressors differ in their valuation of emotional
expression, with sensitizers positively and
repressors negatively valuing such expression. For
example, both repressors and sensitizers describe
themselves in positive ways. The self-descriptions
of sensitizers, which have been typically
[regarded] as negative, are actually positive
from their viewpoint because they perceive
“such admissions [of emotionality] as revealing
honesty with one’s self, and a lack of fear of
[...]” (Lefcourt, 1966, p. 445).

[This] reformulation minimizes the importance
of anxiety cues and increases the relevance of
the expressiveness of repressors and
sensitizers to positive and neutral stimuli. The
present study attempted to determine whether
Lefcourt’s theory would predict the emotional
[responsiveness] of repressors and sensitizers to
positive and neutral as well as to negative
[stimuli]. [...] 90 male introductory psychology
[students] were divided on the basis of
the Repression-Sensitization scale (Byrne,
1961) into three equal groups of sensitizers,
intermediates, and repressors and were presented
with eight TAT-like (Thematic Apperception
Test) slides to which they wrote stories. One
third of each group received stimuli of positive, 
neutral, or negative affective tone. The depen- 
dent measures were the number of positive, of 
negative, and of total affective words contained 
in the stories. 

The clearest test for the extension of Lef- 
court’s interpretation to both positive and nega- 
tive emotional stimuli is the comparison of the 
three subject groups on the total number of 
emotional words to all three types of stimuli 
combined, with differences in length of story 
covaried out. Contrary to Lefcourt’s theory, the
sensitizers did not exceed repressors and inter- 
mediates on negative, positive, or total affective 
words. An interaction between the affective tone 
of the stimuli and the three subject groups occurred
in the analysis of both the total emotional
words and total negative words, F(4, 71)
= 2.82, p < .05; F(4, 71) = 2.50, p < .05. The
interaction for total emotional words is a con- 
sequence of the fact that sensitizers exceeded 
repressors on the negative slides but repressors 
gave more emotional words to the positive and
neutral slides.

In the analysis of total positive words, there
were no significant differences between groups
and there was no interaction between subject 
groups and types of stimulus slides. In fact, 
sensitizers gave fewer positive words than re- 
pressors. This difference in means is opposite in 
direction, although not significantly, to that 
which would be predicted from Lefcourt’s interpretation.

Thus the greater emotional expressiveness of
sensitizers, as compared with repressors, appears 
to be limited to responses to negative stimuli. 
This would seem to mean that Lefcourt’s inter- 
pretation of the repression-sensitization dimen- 
sion as a reflection of differential valuation of 
emotionality requires additional constructs to 
account for the interaction between the affective 
tone of the stimuli and the status of subjects as 
repressors and sensitizers.

Factors Influencing Lower-Class Black Patients Remaining in Treatment

Anthony Vail
San Diego County Mental Health, San Diego, California

Early termination from individual therapy in a community mental health clinic was
studied with lower-class black patients assigned to therapists who were black or white.
The only significant correlate was the interaction between sex of therapist and sex of
patient. Patients remained longer with therapists of the opposite sex. No significant
correlations were found between remaining in treatment and black patients’ attitudes
toward whites, patients’ perceptions of their therapists’ understanding and acceptingness, or
patient-therapist discrepancies in their perception of therapy.

Requests for reprints and for an extended report
of this study, and for a copy of the scales used,
should be sent to Anthony Vail, 12732 Gibraltar
Drive, San Diego, California 92128.

This study explored early termination from
psychotherapy in an inner-city community mental
health clinic in Philadelphia. The 32 black
male and 55 black female lower-class patients for
[this] study were obtained after they had had
[clinical] evaluations. They were randomly assigned
to 1 of 10 staff therapists who were black
or white, male or female. After the first therapy
session, the therapists filled out the discrepancy
scale described below. The patients were orally
presented with this and other scales by an
experimental interviewer.

Attitudes toward whites were assessed by an
attitude behavior scale of blacks toward whites.
[Two] levels of this scale were used that examined
[the] patients’ personal behavior and their
attitudes toward societal stereotypes.

Patient-therapist discrepancies in perception of
therapy were measured by a scale constructed
[for] this study. It included an inquiry as to what
[the] patients considered their most important
problem; questions as to whether the therapy
matched the patients’ expectations; a sampling of
[the] patients’ attitudes toward seeking professional
psychological help; estimates of the therapists’
understanding and acceptingness as perceived
by the patients; and questions about the
patients’ preference for certain types and degrees
of therapist activity.

Finally, patients’ perceptions of their therapists’
understanding and acceptingness were measured
in detail.

The 43 patients who came for three sessions
or less were considered the dropouts; the 44 who
came for four or more sessions were considered
remainers. To analyze for effects of
patient-therapist similarity in race and sex on
continuation in treatment, a three-dimensional
contingency table was used.
The only result significant at the .05 level was 
the Sex of the Therapist X the Sex of the Pa- 
tient interaction. The race of the therapist did 
not matter statistically. Most of the interaction 
originated with male therapists being less effective
with black male patients than with black
female patients. Female therapists were also less 
effective with patients of their own sex, although 
this was only about half as pronounced as the 
finding for male therapists. The interaction effect 
was opposite to the direction predicted. At least 
29% of the variance among patients’ remaining 
in treatment was accounted for by this interac- 
tion effect. 
No significant correlations were found between
remaining in treatment and the black patients’
attitudes toward whites on either of two levels;
patients’ perception of their therapists’ under- 
standing and acceptingness; or the similarity of 
their view of therapy to that of their therapists.
In all of these analyses, it did not matter sta- 
tistically whether the therapists were black or 
white. 

It seems worth looking further into the dynam- 
ics of the cross-sex phenomenon between thera- 
pists and lower-class patients to see if it is based 
on the attractiveness of a heterosexual encounter, 
inhibitions associated with being painfully re- 
vealing of oneself before a therapist of the same 
sex, or some other reason.


Journal of Consulting and Clinical Psychology 
1978, Vol. 46, No. 2, 342-343 


Effects of Explanation and Information Feedback on the
Illusory Correlation Phenomenon


Ronald W. Waller and Stuart M. Keeley 
Bowling Green State University 


An attempt was made to attenuate the illusory correlation phenomenon existent in 
one set of diagnostic stimulus materials, Draw A Person, through training on another 
set, Rorschach. Three training methods were tried: information feedback plus expla- 
nation, information feedback only, and explanation. None of the training conditions 
attenuated the illusory correlation effect when no relationship existed in the stimulus 
set. Significant attenuation occurred only when a negative correlation existed.


Even though it seems clear that clinical psy- 
chologists have considerable room for improve- 
ment in making judgments, only recently have 
researchers begun to isolate factors contributing 
to nonoptimality of judgment and to the fre- 
quently misplaced faith in such judgments. One 
such factor is the presence of “illusory correla- 
tions”; that is, subjects with strong a priori 
expectations often respond to expected relation- 
ships rather than to the relations actually in the 
stimulus materials (e.g., Chapman & Chapman, 
1969). This tendency to maintain a belief in 
illusory correlates when the evidence does not 
support such a belief presents a major obstacle 
to making valid clinical judgments. This study 
asks the question, “Can this obstacle be overcome 
by sensitizing judges to the phenomenon?” Such 
training of judges would have maximum utility 
if it were to generalize to a variety of judgment 
tasks. The present study examines whether special
training on illusory correlation using Rorschach
test stimulus materials will reduce the
illusory correlation effect for a second set of 
materials, the Draw-A-Person (DAP) task. 

Subjects were 120 volunteer undergraduate 
introductory psychology students. Thirty sub- 
jects were randomly assigned to each of four 
different training conditions: (a) explanation, 
(b) information feedback, (c) explanation and 
information feedback, and (d) control. Subjects 
assigned to each of the four training conditions 
were divided into three subgroups (n= 10), 
each receiving a different set of generalization 
stimulus materials. Thus, there were 12 groups in 
all. Each group participated in three phases: (a)
pretraining judgments, (b) training, and (c)
posttraining generalization judgments.

In the pretraining phase, subjects made judgments
about how they believed specified Rorschach
and DAP cues were related to various
symptoms. In the training phase, subjects in
each training group were exposed to the same
three sets of 24 Rorschach stimuli each. Each
set included one cue-symptom pair that had
been found to be an illusory correlate in previous
research or else had been rated as being
related by more than 50% of a group of 25
introductory psychology pilot subjects. Set 1
contained no relationship among cues and symptoms;
Set 2 contained a strong negative relationship
between the illusory correlate pair; and Set 3
contained a strong positive relationship.

Subjects in the information feedback and
explanation training condition (FE) were given
an explanation of the nature of the illusory
correlation phenomenon and were informed of
research findings related to this phenomenon
prior to viewing the three training sets. They
were warned that illusory relationships might
bias their own judgments of the stimulus materials
and were cautioned to try to avoid this
kind of bias. After viewing each stimulus set and
judging the cue-symptom relationships, subjects
were given the actual cue-symptom relationships
contained in that set to compare with their
predictions. Subjects in the explanation (E) condition
received a training procedure identical to
that of the FE group, except they received no
information concerning the actual relationships
contained in the stimulus materials. Subjects in the
information feedback (F) condition were only
given feedback information about the true
cue-symptom relationships. Control subjects (C)
received no information concerning the actual
cue-symptom relationships and no explanation
of the illusory correlation effect.

In the generalization judgment phase, each
subgroup was exposed to one of three sets of
DAP stimulus materials, each set varying in
actual cue-symptom relationships. Stimuli were
similar to those used by Chapman and Chapman
(1967). Each set contained the illusory correlate
of broad shoulders and muscular drawing
characteristics and the symptom “he is worried
about how manly he is.” The three cue-symptom
relationships were (a) no relationship among
DAP drawing characteristics and the symptom
statements (DS 1), (b) a strong negative relationship
between the illusory correlate pair (DS
2), and (c) a strong positive relationship between
the illusory correlate pair (DS 3).

Subjects made two kinds of responses during
each phase. First, they used a 7-point scale indicating
the likelihood that the patient had each
symptom given a specified cue. Second, they
rated the confidence they had in the judgments
they made about the relationships in each stimulus
set by marking a 15-cm line.

As expected, a strong a priori bias existed in
the pretraining estimates of the relationship between
the illusory cue-symptom pairs. The majority
of subjects in all groups reported positive
relationships for both Rorschach and DAP illusory
pairs.

Likelihood ratings of the illusory DAP cue-symptom
combination were analyzed by a three-way
fixed effects analysis of variance (4 Training
Conditions × 3 Stimulus Sets × 2 Judgment
Times), with the subject factor nested under
both training conditions and stimulus sets. Main
effects were found for both stimulus set (p <
[...]
identical fashion on pretraining and posttraining
judgments to DS 1 and DS 3 sets, reporting a
positive relationship. However, subjects in all
training conditions responded very differently
to the DS 2 stimulus set following training than
prior to training. Their judgments shifted from
positive to either zero or negative. This differential
shift is supported by the findings of a simple
main effect of stimulus set for the second set of
judgments (p < .01) and none for the first set.

A three-way analysis of variance on confidence
judgments showed only one significant effect— 
time of judgment (p<.01). Subjects became 
more confident in their judgments from pre- 
training to posttraining, regardless of which 
training condition they were in and regardless of 
which stimulus set they were exposed to. 

Sensitizing subjects to the illusory correlation 
phenomenon on Rorschach materials through
feedback about actual relationships, explanation
of the phenomenon, or both did not attenuate
the tendency to see positive cue-symptom relationships
when no relationship actually existed.
However, negative relationships between illusory 
correlates markedly attenuated the effect. Such 
findings suggest that subjects do attend to the 
evidence. They also suggest that the number of 
confirmations of the original biased expectancy 
relative to the number of disconfirmations may 
be a crucial determinant of whether the original 
belief is maintained. 

It is important to note that in the typical 
illusory correlation demonstration, subjects are 
provided with only affirmations of the symptoms; 
negations must be implied. Thus, they typically 
see some instances of cue present, symptom 
present for the illusory pair but never see the 
cue with an explicit statement of the absence of 
the symptom. Only a few affirmatives may be 
necessary to maintain the illusory belief. 

The present study provides further suggestive 
evidence that human judges have a difficult time 
using evidence appropriately in the face of uncertainty.
Short-term explanation and/or feedback
procedures do not seem to be a sufficient
remedy for this particular kind of error in human
judgment. Whether long-term training procedures
would attenuate the effect is presently
an empirical question.

The 4-3 MMPI Profile Type: A Failure to Replicate


Jeffrey A. Buck and John R. Graham 
Kent State University 


Using a sample of 65 prison inmates, the incidence of violent crimes for persons with 
the 4-3 Minnesota Multiphasic Personality Inventory profile type and with other 
two-point code types was compared. The results failed to replicate the findings of 
some earlier investigators, who reported that violent behaviors are more common for 
persons with the 4-3 profile type. The failure to identify a significant relationship
between the 4-3 profile type and violent behavior suggests that caution should be
exercised in generalizing to populations that differ from those in which the relationship
between violence and the 4-3 profile type is established.


Previous investigations of the Minnesota 
Multiphasic Personality Inventory 4-3 profile 
type have yielded inconsistent results. Some 
studies (e.g., Davis & Sines, 1971; Persons &
Marks, 1971) have indicated that subjects pro- 
ducing the 4-3 profile are prone to act in more 
aggressive and violent ways than subjects not 
producing the 4-3 profile. Other studies (e.g.,
Gynther, Altman, & Warbin, 1973) have not 
found violence or hostility to be characteristic of 
persons with the 4-3 profile. Two explanations 
offered for this disparity have been differences in 
the ages of the subjects studied and the ways in 
which the 4-3 profile type has been defined. The
present study attempted to gain information 
about both of these possible sources of variance. 

Subjects were male inmates of a medium- 
security penitentiary for adult male felons lo- 
cated at Marion, Ohio. The incidence of violent 
crime was examined in a sample of 65 prisoners 
producing a 4-3 profile type and 64 not of this 
type. Violent crime was defined as murder, rape, 
aggravated assault, or robbery. The 4-3 and non- 
4-3 groups were separated into “old” subjects (30 
or older) and “young” subjects (under 30). 
Within each of the resultant subgroups, the num- 
ber of subjects committing violent and nonvio- 
lent crimes was determined and a chi-square 
test was made to determine if the subgroups 
differed on this variable. In addition to this overall
comparison, different combinations of the
subgroups were examined for possible differences 
on the violence measure. Analyses of variance 
also were performed to determine if the abso- 
lute and relative elevations of Scales 3 and 4 
influenced the relationship between the 4-3 pro- 
file type and violent or aggressive behavior. All 
of the above tests were performed separately for 
black and white subjects. None of the relevant 
comparisons was significant. 

Comparisons of the methodology of the pres- 
ent study with those of previous ones indicate 
that the way in which the 4-3 profile has been 
defined and the criterion measures of aggressive 
and violent behavior cannot completely account 
for the differences in findings between this in- 
vestigation and others. A possible explanation 
for these differences is that the subjects in this 
study came from a facility in which the most 
aggressive and violent prisoners already had 
been selected out and sent elsewhere. The failure 
to identify a relationship between the 4-3 profile 
type and aggressive and violent behavior in this 
study suggests that caution should be exercised 
in generalizing to populations that differ from 
those in which the relationship between violence 
and the 4-3 profile type is established.

Congruence of Parental Perception, Marital Satisfaction,
and Child Adjustment

Lucy Rau Ferguson and Deborah R. Allen
Michigan State University

Fathers and mothers of 95 children 5-7 years old completed the Locke-Wallace Scale,
the Interpersonal Checklist, and the Children’s Behavior Checklist to assess marital
satisfaction, congruence of self- and mate-perceptions, and agreement in parents’
perceptions of their child and child adjustment, respectively. All variables were
significantly, positively intercorrelated. The strongest association was between congruence in
parents’ perceptions of the child and child adjustment. Similarity in partners’
self-concepts and psychological empathy were significantly associated with marital
satisfaction and child adjustment. A general dimension of family harmony (vs. conflict)
is seen as contributing to children’s social adjustment.

Children who exhibit deviant social behavior
[frequently] come from families characterized by
[marital] dissatisfaction and discrepant parental
[...]. In a recent study based on small samples
(Ferguson, Partyka, & Lester, 1974), parents
of well-adjusted children showed closer
agreement in their perceptions of their child than
parents of clinic-referred children. It was predicted
that in the general population, parents who
described their children as more favorably adjusted
would show closer agreement in their
perceptions of their child, would report greater
marital satisfaction, and would show greater
agreement between descriptions of themselves
and descriptions of them by their spouses.

Letters requesting cooperation in a research
study were distributed to all parents of kindergarten,
first-, and second-grade children in the
[...], Michigan, Public School System (approximately
1,000 children). Two hundred four families
returned postcards indicating their willingness
to participate in the study. Parents of 95
children (51 males and 44 females between 5
and 7 years old) completed the following
questionnaires: Marital adjustment and satisfaction
were assessed for each parent by his/her responses
to the Locke-Wallace Scale (Locke &
Wallace, 1959). The Interpersonal Checklist (LaForge
& Suczek, 1955) was used to assess parents’
perceptions of self and spouse. The Children’s
Behavior Checklist (CBCL) was the
instrument used to assess parents’ perceptions of
their child. It consists of 154 interpersonal and 
symptomatic items referring to the behavior of 
children; the parent checks whether each item is 
applicable to the child and/or characteristic of 
him/her. The CBCL was also used as a measure 
of child adjustment. Sixty-six items have been 
found to discriminate significantly between 
clinic-referred and nonclinic (adjusted) children, 
with 32 being more characteristic of clinic-re- 
ferred children and 34 being more characteristic 
of nonclinic children (Ferguson et al., 1974).
Adjustment is calculated by subtracting each 
parent’s score on the 32 clinic items from his/ 
her score on the 34 nonclinic items; the mean of 
the mother’s and father’s scores yields a com- 
bined adjustment score for the child. 
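
The scoring rule just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the (nonclinic, clinic) tuple layout are hypothetical, and each parent's "score" is assumed to be a simple count of checked items on the 34 nonclinic and 32 clinic items.

```python
from statistics import mean
from typing import Tuple


def parent_adjustment(nonclinic_checked: int, clinic_checked: int) -> int:
    """One parent's adjustment score: nonclinic items checked minus clinic items checked.

    Assumes simple counts out of the 34 nonclinic and 32 clinic items.
    """
    if not 0 <= nonclinic_checked <= 34:
        raise ValueError("nonclinic count must be between 0 and 34")
    if not 0 <= clinic_checked <= 32:
        raise ValueError("clinic count must be between 0 and 32")
    return nonclinic_checked - clinic_checked


def combined_adjustment(mother: Tuple[int, int], father: Tuple[int, int]) -> float:
    """Combined child score: mean of the mother's and father's adjustment scores.

    Each argument is a hypothetical (nonclinic_checked, clinic_checked) pair.
    """
    return mean([parent_adjustment(*mother), parent_adjustment(*father)])
```

For instance, a mother who checks 25 nonclinic and 5 clinic items and a father who checks 20 and 9 would yield individual scores of 20 and 11 and a combined score of 15.5.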

Interitem phi coefficients expressed the agree- 
ment between the husband’s perception of him- 
self and his wife’s perceptions of him and between 
the wife’s perceptions of herself and her hus- 
band’s perceptions of her. Agreement in parents’ 
perceptions of their child was determined by cal- 
culating the correlations between the mother’s 
ratings of her child on the CBCL and those of
the father.
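
As a rough illustration of the agreement index, the phi coefficient for two checklists can be computed from the 2 x 2 table of item endorsements. The sketch below assumes each checklist is coded as a sequence of 0/1 values, one per item; the function name and the coding are assumptions, not details given in the report.

```python
import math
from typing import Sequence


def phi_coefficient(a: Sequence[int], b: Sequence[int]) -> float:
    """Phi coefficient between two binary (0/1) item vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("both raters must answer the same items")
    # Cell counts of the 2 x 2 endorsement table.
    n11 = sum(1 for x, y in zip(a, b) if x and y)
    n10 = sum(1 for x, y in zip(a, b) if x and not y)
    n01 = sum(1 for x, y in zip(a, b) if not x and y)
    n00 = sum(1 for x, y in zip(a, b) if not x and not y)
    denom = math.sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    if denom == 0:
        raise ValueError("phi is undefined when a rater checks all or no items")
    return (n11 * n00 - n10 * n01) / denom
```

Here `a` might hold a husband's self-description and `b` his wife's description of him; identical checklists give a phi of 1.0, and unrelated ones give a value near 0.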

Distributions of Locke-Wallace scores and of
CBCL child adjustment scores were all nega- 
tively skewed; thus this is a sample of families 
who preponderantly reported themselves as 
happy and well-adjusted. Associations between 
the various measures of congruence of parental 
perception (of spouse and child), marital satis- 
faction, and child adjustment were examined by 
means of product-moment correlation coefficients. 


For the sexes combined, all correlations were sig-
nificant at the .05 level in the predicted direc-
tion; 9 out of 13 were significant at the .001
level.

Figure 1 shows the interrelations among the
variables.

Figure 1. Intercorrelations of variables based on combined sample.

Parents’ agreement in viewing their child was 
closely associated with congruence in their per- 
ceptions of each other, and each of these varia- 
bles in turn was significantly related to marital 
satisfaction. Thus we seem to be tapping a more 
general dimension of family harmony that under- 
lies the child’s social adjustment. When parents 
see their child as possessing the characteristics 
of well-adjusted children, they also tend to agree 
closely in their perceptions of all aspects of their 
child’s behavior, to express satisfaction with their 
marriage, and to see their spouses the way their 
spouses see themselves. Further research is
underway to confirm these relationships using an
independent measure of child adjustment.


A Factor Analysis of Assertive Behaviors


Joseph S. Pachman, David W. Foy, and Frank Massey
Veterans Administration Center, Jackson, Mississippi


Richard M. Eisler
Virginia Polytechnic Institute and State University
Blacksburg, Virginia


Five objective measures of interpersonal behavior presumed to be tapping “assertiveness”
were factor analyzed. In addition, each behavioral measure was correlated with
a subjective global rating of assertiveness. Four of the behavioral measures loaded
highly on a general factor of assertiveness. The fifth behavioral measure loaded highly
on a separate factor, Response Latency. All behavioral measures with the exception
of response latency evidenced significant correlations with the subjective rating of
global assertiveness.


Systematic progress in the area of assertiveness
training has been hampered by a general
lack of consensus as to what specific responses
are involved in behaving “assertively.” Some
workers have chosen to define assertive behaviors
intuitively and then modify these components
(Wolpe & Lazarus, 1966). More recently, Eisler,
Miller, and Hersen (1973) have identified specific
verbal and nonverbal behaviors that distinguished
psychiatric patients judged to be relatively
assertive from those who were unassertive.
The present study was designed to statistically
investigate the relative contributions of these
[...] behavioral components of assertiveness
[...] factor-analytic procedure (principal axis
[...] varimax rotation). A second purpose was
[...] subjective global ratings of assertiveness
[...] behavioral measures.

The subjects were 55 males hospitalized for
[...] alcoholism. We constructed a series
[...] vocationally related interpersonal
[...] assertive responses. A male
[...] employed to prompt subjects’
[...] these situations. Responses were
videotaped [...] on the following components
[...]: duration of eye contact, duration
[...] statements indicating
[...] reasonable requests [...] requests for the
[...] change [...] behavior, and a global
measure of [...] behavior.

[...] (.60), [...] duration of reply
[...] contact (.70), noncompliance
[...] (.74), speech content requesting
new behavior (.78), and overall assertion
(.75) were all loaded highly on Factor 1
(Assertiveness). However, latency of reply
loaded nearly entirely (.93) on a separate factor
(Factor 2, Response Latency). In addition,
correlation coefficients obtained between the
subjective global measure of assertiveness and the
other behavioral measures, with the exception of
latency of reply, were statistically significant.
The results corroborate the importance of the
previously identified component behaviors (Eisler
et al., 1973), with the exception of response
latency, in reflecting assertiveness (see Table 1).
The utility of a subjective global rating of
assertiveness by trained observers was also
demonstrated in view of the significant correlation
coefficients obtained between this measure and
the objective behavioral measures of assertion.


Table 1
Correlations Between Global Assertiveness and
Behavioral Components of Assertiveness

Behavior component      r
Response duration       [illegible]
Response latency        -.117
Request for change      [illegible]
Compliance              -.50**
Eye contact             [illegible]**

* p < .05.
** p < .001.


Requests for reprints should be sent to Richard M.
Eisler, Department of Psychology, Virginia Polytechnic
Institute and State University, Blacksburg, Virginia.


Life Change, the Sensation Seeking Motive, and Psychological Distress


Ronald E. Smith, James H. Johnson, and Irwin G. Sarason
University of Washington 


The relationship between life change and psychological distress as a function of the 
sensation seeking motive was investigated. Scores on Lanyon’s Discomfort scale were 
unrelated to positive and total life change scores but were significantly related to 
amount of negative life change occurring over the previous year. However, as pre- 
dicted, this relationship was restricted to subjects low in sensation seeking; high 
sensation seekers are apparently more tolerant of negative life change. 


The role of life change in physical and psycho- 
logical well-being has received considerable em- 
pirical attention in recent years, and degree of 
life change has been shown to be significantly 
related to both physical illness and measures of 
psychopathology. Most of the life stress research 
has been based on the possibly erroneous as- 
sumption that change per se is stressful, regard- 
less of whether it is perceived by the individual 
to be positive or negative. In addition, the role 
of personality variables as mediators of responses 
to life change has been virtually ignored.

The present study examined the relationships 
between both positive and negative life change 
and a measure of psychological distress as a func- 
tion of subjects’ scores on the Sensation-Seeking 
Scale (SSS; Zuckerman, Kolin, Price, & Zoob, 
1964). It was hypothesized that low sensation 
seekers, who presumably have a low optimal level 
of stimulation, would be more negatively af- 
fected by stressful life changes than would high 
sensation seekers. 

Forty-two male and 33 female college under- 
graduates were administered the 22-item SSS and 
the Life Experiences Survey (Sarason & Johnson, 
1976). On the latter measure, subjects indicate
which of a series of life changes they have ex-
perienced during the past 12 months, and they
rate the positiveness or negativeness of each
experienced change. Positive, negative, and total
life change scores are obtained by summing these
ratings. (The positive and negative change scores
have been found in this and other studies to be 
essentially uncorrelated with one another.) 


Requests for reprints should be sent to Ronald E. 
Smith, Department of Psychology, NI-25, Univer-
sity of Washington, Seattle, Washington 98195.


In the present study, subjects scoring above 
and below the median of the positive, negative, 
and total life change distributions and above and 
below the median of the SSS were assigned to 
cells of three separate 2 × 2 factorial designs.
The subjects also completed the Discomfort 
scale of the Psychological Screening Inventory 
(Lanyon, 1970), which served as the dependent 
variable measure of psychological distress. 
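
The median-split assignment described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function names are hypothetical, and the handling of scores falling exactly at the median (placed in the "low" group here) is an assumption, since the report does not say how ties were treated.

```python
from statistics import median
from typing import List, Sequence, Tuple


def median_split(scores: Sequence[float]) -> List[str]:
    """Label each score 'high' if above the sample median, otherwise 'low'.

    Scores exactly at the median are labeled 'low' (an assumption).
    """
    m = median(scores)
    return ["high" if s > m else "low" for s in scores]


def assign_cells(change: Sequence[float], sss: Sequence[float]) -> List[Tuple[str, str]]:
    """Pair each subject's life-change group with his or her sensation-seeking group.

    The result places every subject in one cell of a 2 x 2 design.
    """
    if len(change) != len(sss):
        raise ValueError("one life-change score and one SSS score per subject")
    return list(zip(median_split(change), median_split(sss)))
```

Running `assign_cells` once each for the positive, negative, and total life change scores would give the three separate 2 × 2 layouts, with Discomfort scores then compared across cells.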

No significant main or interaction effects were
found in analyses of variance involving either
the positive or total life change scores. However,
a significant main effect for negative life change
was found, F(1, 71) = 4.75, p < .05, with
high-change subjects (M = 10.97) having higher
scores on the Discomfort scale than low scorers
(M = 8.71). The interaction effect closely
approached but did not attain significance.
Subsequent Duncan multiple-range tests of cell means
showed no significant Discomfort scale differences
between high sensation seekers who differed
in negative life change. However, among
low sensation seekers, high-negative-change
subjects (M = 12.44) had significantly higher
distress scores than did those who had experienced
low levels of negative change (M = 9.00). The
role of sensation seeking as a moderator variable
was also indicated in a correlational analysis that
disclosed a significant relationship (r = [...])
between negative life change scores and Discomfort
scores in low sensation seekers but no significant
relationship (r = .15) in high sensation
seekers.

The results of this study suggest that life
changes are related to psychological distress only
if the individual perceives them to be negative
and that the sensation seeking motive influences [...]


Relation of Patient Attributes to Perceptions of the
Treatment Environment


Rudolf H. Moos
Social Ecology Laboratory
[...]
Department of Psychiatry
University of Pittsburgh


The social climate of four residential alcoholism programs was assessed using the
Community-Oriented Programs Environment Scale (COPES). COPES perceptions
were essentially independent of 18 patient background, psychosocial functioning, and
alcohol-related characteristics. Considered in conjunction with previous research in
other psychiatric and nonpsychiatric settings, the results indicate that perceptions of
the social climate of treatment programs are not simply measures of the personal
characteristics of the perceivers.


A number of instruments have recently been
developed to measure the social-environmental
characteristics of psychiatric treatment programs.
Several of these techniques assess the treatment
environment by asking patients and/or staff
about the relevant characteristics of their milieu.
In using these “perceived environment” scales,
the usual procedure is to average responses from
a set of patients and/or staff and to assume that
these mean values describe the particular
environment being studied.

One problem with this procedure is that there
may be substantial differences among individuals
in the way in which the “same” treatment
environment is perceived. These individual
differences have raised the important issue that scales
measuring environmental perceptions may reflect
background and/or personality characteristics
of the perceivers rather than independent
attributes of the environment.

The limited empirical evidence that is available
indicates that personal characteristics are only
minimally related to environmental perceptions
(Moos, 1974). The purpose of the present study
was to provide more extensive evidence on this
issue by relating a wide range of background and
intake functioning characteristics to patients’
perceptions of the social climate of residential
alcoholism programs.

The research population was composed of 326
alcoholic inpatients treated at one of four
residential alcoholism programs. Two of the
programs were in an urban area (Salvation Army
and a public hospital unit) and were populated
by patients who tended to be single, separated, or
divorced; residentially mobile; and average or
below average in occupational status, income,
and education. The other two programs were
located in suburban areas and admitted patients
who were more often from middle- to upper-middle
class backgrounds, who were residentially
stable, and who were living with their families.
Therefore, we combined patients in the first two
programs (Study Group 1; n = 171) and those
in the latter two programs (Study Group 2;
n = 155).


Requests for reprints and for an extended report
of this study should be sent to Rudolf H. Moos,
Department of Psychiatry, Stanford University,
Stanford, California 94305.

Shortly after admission to the programs, pa- 
tients were administered a detailed background 
information form, which obtained information 
about sociodemographic characteristics (nine 
items), alcohol consumption, behavioral impair- 
ment, physical impairment, subjective rating of 
drinking problem, drinking pattern for the month 
before entering the program, previous hospitaliza- 
tion for alcoholism, occupational functioning, 
social functioning, and psychological well-being. 
There was substantial variability in these
sociodemographic and premorbid functioning
characteristics due to within-program differences among
patients.

Approximately 2-3 weeks after admission,
patients were administered the Community-Oriented
Programs Environment Scale (COPES) to
evaluate the social climate of their program. The
COPES is composed of 10 subscales: Involvement,
Support, Spontaneity, Autonomy, Practical
Orientation, Personal Problem Orientation, Anger
and Aggression, Order and Organization, Program
Clarity, and Staff Control (Moos, 1974). The
proportion of subscale variance accounted for by
differences among the four programs averaged
20% and ranged from 29.5% for Order and
Organization to 9.2% for Program Clarity, indicating
that there was substantial subscale variability
within each of the programs.

Correlation coefficients between the 18 patient
attributes and the 10 COPES subscales
were computed for the two study groups. A series
of multiple regression analyses were then
performed to assess the overall contributions of the
patient attributes to perceptions of the treatment
environment. Ten multiple regression analyses
(one for each of the 10 COPES subscales, which
were the dependent variables) were run for each
of the two study groups.

[...] 7 of the 180 correlation coefficients were
.20 or greater in Study Group 1, whereas only 1
of the 180 correlations was .20 or greater in
Study Group 2.

Two of the 10 multiple regressions (for Support
and Practical Orientation) were statistically
significant (p < .05) in both study groups; however,
none of the 180 possible relationships between
the 18 patient background and functioning
variables and the 10 COPES subscales were
replicated in the two groups. For example, in Group
1, patients who were better educated and more
residentially stable perceived less emphasis on
Practical Orientation, whereas patients who were
functioning better socially perceived more emphasis
on this dimension. These relationships were
not replicated in Study Group 2, however, in
which Catholic patients and patients who were
drinking less and who showed less physical
impairment perceived more emphasis on Practical
Orientation.

Overall, there was a slight tendency for better
educated patients to be somewhat more negative,
and for patients with fewer symptoms and less
[...] drinking severity to be somewhat more
positive in their evaluations of the social climate
of their program. However, the correlations were
[...] and represented less than 5% of the
correlation matrix. The most parsimonious
conclusion to be drawn from these data is that
perceptions of treatment environments are only
minimally related to patients’ background and
personality characteristics.

Even though environmental perceptions are
independent of a broad array of background
characteristics, they are related to an individual’s
[...] position in an environment. In general,
people responsible for an environment view it
more favorably than those not responsible. In
addition, staff members of different roles may
perceive the same programs differently; for example,
[...]es and day personnel tend to feel that
the professional staff is more supportive than do
aides and evening or night personnel. These findings
probably reflect differences in the subenvironments
experienced by different individuals.

Environmental perceptions are also related to 
how well people actually function in an environ- 
ment. People who see their environments more 
positively tend to be more satisfied with and 
perform better in those environments. For ex- 
ample, patients who perceive their programs 
more positively are more likely to participate in 
aftercare services (Pratt, Linn, Carmichael, & 
Webb, 1977), whereas residents who perceive 
their programs more negatively are more likely 
to abscond (Chase, 1975, chap. 8). 

Although there is little or no relationship be- 
tween individual patient characteristics and their 
perceptions of the social climates of their 
treatment programs, the aggregate characteristics 
of patient populations may be related to differ- 
ences in social climate among treatment settings. 
However, relationships between aggregate per- 
sonal and behavioral characteristics and the 
aggregate social climate cannot be generalized to 
individuals in any particular setting. Thus, for 
example, the fact that programs with more dis- 
turbed patients have less emphasis on autonomy 
(Moos, 1974) does not necessarily mean that 
within one program, the more disturbed patients 
will perceive less emphasis on autonomy than 
the less disturbed patients. 

The present findings demonstrate that within 
programs, personal attributes of patients are 
only minimally associated with their environ- 
mental perceptions. Considered in conjunction 
with previous research, the results indicate that 
indices of perceived climate are not simply 
measures of background or psychosocial char- 
acteristics of the perceivers. However, there may 
be substantial differences in how different indi- 
viduals perceive the same environment, and it is 
important to clarify the reasons for these differ-
ences. Characteristics of people’s functioning or
performance in an environment might be fruitful 
to investigate in this regard. 


References


Chase, M. The impact of correctional programs:
Absconding. In R. Moos (Ed.), Evaluating cor-
rectional and community settings. New York:
Wiley, 1975.

Moos, R. Evaluating treatment environments: A
social ecological approach. New York: Wiley, 1974.

Pratt, R., Linn, M., Carmichael, J., & Webb, N.
The alcoholic’s perception of the ward as a pre-
dictor of aftercare attendance. Journal of Clinical
Psychology, 1977, 33, 915-918.


Received March 2, 1977


Journal of Consulting and Clinical Psychology 
1978, Vol. 46, No. 2, 352-353 


Effect of Psychotomimetics (LSD and Dextroamphetamine) on the
Use of Primary- and Secondary-Process Language


Michael Natale
Department of Psychiatry
Columbia University


Charles C. Dahlberg
William Alanson White Institute
New York City


Joseph Jaffe
Department of Psychiatry, Columbia University and New York State
Psychiatric Institute, New York City


The present study sought to determine the effect of psychotomimetics (LSD and
dextroamphetamine) on primary-process and secondary-process language.
Five-minute monologues were recorded for four psychoanalytic patients in LSD,
dextroamphetamine, and placebo conditions. An intensive research design was adopted, and
each subject was treated as an individual experiment: (a) One patient manifested
LSD-induced attenuation of secondary-process language; (b) one patient showed
LSD-induced increase of primary-process language and an increase of
secondary-process language when dextroamphetamine was ingested. The present findings suggest
that psychotomimetics do affect ego functioning as expressed in language but that the
nature of the effect (inhibition or disinhibition) is not determined by the drug alone.


It has been postulated that psychotomimetics
(LSD, Ditran, psilocybin) have certain consistent
effects on verbal behavior. Gottschalk and
Gleser (1964) found that LSD, psilocybin, and
Ditran promote disorientation and the use of
denial language (subscale of Schizophrenia
scale). Fink (1974) claims that psychotomimetics
such as dextroamphetamine, Ditran, and LSD
cause a decrease in “defensive language” (denial
and disorientation). The contradictory nature of
these findings may be the result of the fact that
Gottschalk and Gleser’s results were obtained
from normals, and Fink’s data were obtained
from psychotic depressed patients who had
received electric shock.

In light of the above-described “state of
knowledge,” the purpose of the present study
was to verify the effects of psychotomimetics on
ego functioning as expressed in language use.
“Predrug” and “postdrug” 5-minute monologues
were recorded for four psychoanalytic patients
(Jaffe, Dahlberg, Luria, & Chorosh, 1973) in
LSD (50-100 µg), dextroamphetamine (15 mg),
and placebo conditions. Each patient participated
in 6-9 sessions for each drug condition, and an
“intensive research design” was adopted, with
each patient being treated as a separate
experiment. Each monologue was transcribed,
keypunched, and analyzed by Martindale’s (1973)
“count” program, which tabulates the percent
frequency of words that belong to primary-process
or secondary-process language categories.
The results were as follows: (a) One patient
manifested a significant attenuation of
secondary-process language when under the influence of
LSD, t(7) = 2.48, p < .05, two-tailed, with no
significant placebo effect. (b) One patient showed
a significant increase of secondary-process
vocabulary when under the influence of
dextroamphetamine, t(8) = 3.07, p < .05, two-tailed,
and a significant increase in primary-process
language after having taken LSD, t(5) = 2.58,
p < .05, two-tailed, with no significant placebo
effect. (c) Two patients manifested no drug or
placebo effects. The present and previous findings
suggest that psychotomimetics do affect ego
functioning, as expressed in language, but that
the nature of the effect (inhibitory or
disinhibitory) is not determined by the drug alone.


Short Forms of the MMPI with Back Pain Patients


Judith Turner
University of California, Los Angeles


Charles McCreary
Department of Psychiatry
University of California, Los Angeles
School of Medicine


This study compared the Faschingbauer abbreviated Minnesota Multiphasic Personality
Inventory (FAM) and the Midi-Mult on a sample of 176 back pain outpatients.
Correlations with the standard MMPI scales ranged from .67 to .93 (M = .84)
on the FAM and from .71 to .89 (M = .80) on the Midi-Mult. The FAM showed
higher agreement with MMPI code types than did the Midi-Mult. Finally, the three
versions were compared to independent physician ratings of amount of functional
component to patients’ pain, and all three forms discriminated between “functional”
and “organic” patient groups. The results provide tentative evidence that abbreviated
MMPIs are useful measures of personality in this population.


A number of short forms of the Minnesota 
Multiphasic Personality Inventory (MMPI) have 
been constructed in attempts to produce an instru- 
ment that gives the same information but with 
much less time required of the respondent. The 
present investigation addressed the issue of the 
utility of two of these short forms, the Midi-Mult
(Dean, 1972) and the Faschingbauer abbreviated
MMPI (FAM; Faschingbauer, 1974), in a
university hospital outpatient orthopedic clinic.

The standard form of the MMPI was administered
to 186 male and female outpatients (M
age = 44 years) who were seen at the Back Clinic.
Each answer sheet was scored for the standard
MMPI, Midi-Mult, and FAM scales, except that
10 MMPIs were excluded because more than 30
[...]. Ratings by the [...]
MMPI profiles. None of the Midi-Mult profiles
[...] error rate in identifying
invalid standard MMPI profiles. The four
standard MMPI profiles found to be invalid were
excluded from further analysis of data.


Requests for reprints and for an extended report
of this study should be sent to Charles McCreary,
Department of Psychiatry, University of California,
[...]

Midi-Mult scale scores correlated with
corresponding standard MMPI scale scores from .69
to .89 (M = .80). Correlations between FAM and
full MMPI scales ranged from .67 to .93
(M = .85). The FAM was more highly correlated with
the full MMPI than was the Midi-Mult on all
clinical scales except Hypochondriasis, whereas the
Midi-Mult was more highly correlated with the
standard form on F and K.

Next, code-type agreement between the standard 
MMPI and the two short forms was examined. 
On a two-point code type comparison, the FAM 
was found to have 34% agreement with the long 
version in the same order and 43% agreement re- 
gardless of order. The Midi-Mult showed 22% 
agreement (same order) and 32% (any order). 
Less agreement resulted when three-point codes 
were compared (FAM: 14% same order, 31% 
any order; Midi-Mult: 9% same, 20% any). 
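
The code-type comparison can be illustrated with a short sketch. It assumes that a profile is a mapping from scale name to score and that an n-point code type is simply the n highest-scoring scales in descending order; the function names, the example scale labels, and the tie-breaking rule (ties keep the mapping's own order) are assumptions for illustration only.

```python
from typing import Dict, Sequence, Tuple


def code_type(profile: Dict[str, float], points: int = 2) -> Tuple[str, ...]:
    """Return the `points` highest-scoring scales, highest first.

    Ties are broken by the order in which scales appear in `profile`.
    """
    ranked = sorted(profile, key=lambda scale: profile[scale], reverse=True)
    return tuple(ranked[:points])


def code_agreement(
    full: Sequence[Dict[str, float]],
    short: Sequence[Dict[str, float]],
    points: int = 2,
) -> Tuple[float, float]:
    """Proportion of patients whose short-form code matches the full-form code.

    Returns (same-order agreement, any-order agreement).
    """
    if len(full) != len(short) or not full:
        raise ValueError("need one full and one short profile per patient")
    same = any_order = 0
    for f, s in zip(full, short):
        cf, cs = code_type(f, points), code_type(s, points)
        same += cf == cs
        any_order += set(cf) == set(cs)
    return same / len(full), any_order / len(full)
```

Same-order agreement can never exceed any-order agreement, which is consistent with the pattern of percentages reported above for both short forms.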

Finally, comparisons were made between stan- 
dard MMPI, FAM, and Midi-Mult profiles and 
independent physician ratings of the amount of 
functional component to patients’ pain. Patients
judged by physicians to have a very large
functional component (n = 39) had standard MMPI
scores that were significantly higher than those of
patients judged to have little functional component
(n = 25) on the K, Hypochondriasis, Hysteria,
Schizophrenia, and Social Introversion scales
(p < .05) and Psychopathic Deviate scale (p < .01).
A similar pattern of differences was shown by the
FAM, except that there were no significant
differences between “functionals” and “organics” on
the K and Psychopathic Deviate scales, whereas
the “functional” patients scored significantly higher
on the Depression and Psychasthenia scales
(p < .05). The Midi-Mult also showed a pattern of
differences similar to the MMPI, except for no
significant difference on the K scale and
significant differences on Depression, Paranoia, and
Psychasthenia (p < .05).

That both short forms were found to
discriminate between “functional” and “organic” back
pain patients is encouraging evidence that these
measures may be useful with this population. Since
the FAM, in particular, was shown to provide a
very good estimate of standard MMPI scores,
further attention seems warranted in the direction
of its use in such a setting.


Practicing A and B Psychotherapists’ Responses to Schizoid
and Neurotic Patient Prototypes


William B. Goodwin, Donald M. Quinlan, and Jesse D. Geller
Yale University and Yale School of Medicine


Eighty-three practicing male psychotherapists completed the A-B scale, responded to
recordings of schizoid and neurotic patient prototypes, and rated their subjective
reactions to each type. Multivariate analysis revealed a significant (p < .013) overall
A-B Type × Patient Type interaction. Although liking and ease of responding were
higher in therapist-patient dyads, which the literature suggests are effective
(A-schizoid, B-neurotic), felt compatibility and desire to work with the patient were
higher in the opposite (“mismatched”) dyads. These results suggest that subjective
reactions underlying the A-B interaction effect are complex and that therapists
respond in a differentiated, not global, fashion.


The A-B therapist “type” distinction originated 
by Whitehorn and Betz (cf. Betz, 1967) is an 
intriguing but conceptually elusive variable as- 
sociated with a differential therapeutic aptitude. 
Based on items from the Strong Vocational In- 
terest Blank reflecting activities mostly technical, 
mechanical, or manual in nature, the A-B scale 
differentiated therapists more effective with 
schizophrenics (As) from those less effective 
(Bs). McNair, Callahan, and Lorr (1962) later 
found Bs to be more effective with neurotic out- 
patients, giving rise to the “A-B interaction hy- 
pothesis” that As and Bs are differentially effec- 
tive with schizophrenic and neurotic patients, re- 
spectively. Owing to the lack of clinical studies 
with the requisite factorial design, the interaction 
hypothesis has not yet been fully evaluated, al-
though one such study appears to provide sup-
port (Berzins, Ross, & Friedman, 1972).

Experimental studies using schizoid and neu- 
rotic patient stimulus materials have given rise 
to contradictory findings regarding therapists’
subjective reactions, however. Although Kemp
(1966), for example, found a “paradoxical
discomfort” on the part of As with schizoid
“patients” and Bs with neurotic patients, Berzins
and Seidman (1968) failed to replicate. As a
whole, such analogue studies have lacked realism
in numerous respects: in the nature of the
patient stimuli, in the mode of response, in the
social context, in the physical setting, and in the
sample of subjects used as “therapists.” Further,
the impact of therapist experience level on
the A-B variable has not been evaluated. Our
primary aim was to extend findings of subjective
reactions of A and B undergraduate
“quasi-therapists” to a sample of practicing A and B
psychotherapists, using the experimental analogue
approach.

Eighty-three out of 136 practicing male
psychotherapists with at least 1 year of experience
completed the Schiffman, Carson, and Falkenberg
(Note 1) 23-item version of the A-B scale.
The 20 highest scoring (A) and 20 lowest scoring
(B) therapists were presented realistic audio
recordings of avoidant of others (schizoid) and
turning against self (neurotic depressive)
patient prototypes used by Berzins and Seidman
(1968), listened to five 1-minute segments of
each patient in the privacy of their own offices,
and spoke their responses (which could include
interrupting the patient or silence) into a
cassette recorder. Order of presentation of
prototypes was counterbalanced across therapists.
After each 5-minute segment, therapists
completed a six-item questionnaire assessing
subjective reactions to the patient.


Results 


A four-factorial multivariate analysis of
variance (Therapist Type × Experience Level ×
Patient Type × Order) yielded a significant
(p = .013) Therapist Type × Patient Type
interaction. The standardized discriminant function
coefficients weighted the six components as
follows: (a) desire to work with the patient,
-1.165; (b) ease of responding, .832; (c) liking,
[...]; (d) felt compatibility, -.328; (e) felt
[...]1; and (f) satisfaction with spoken
[...], -.137. Two univariate analyses of
variance revealed trends for reversed Therapist
Type × Patient Type interactions: Higher scores
were obtained for “mismatched” (B schizoid, A
neurotic) therapists on desire to work with the
patient (p = .055) and felt compatibility
(p = [...]). No significant main effects were found,
nor were there any significant interactions with
experience level or order of presentation.


Discussion 


ly summarized, although these therapists 
jer to respond to and like the “matched” 
better, they felt more compatible with 
ed to work with the mismatched pa- 
results offer support for the hypothe- 
A and B psychotherapists can be dif- 
d by their subjective reactions to 
and neurotic patients. The direction and 
of these differences is complex. 

s investigators have hypothesized that 
Of therapeutic communication can re- 
n resemblances between patient defense 
s and therapist coping style, and that 
certain personological similarities be- 
s and schizoids and possibly between 
eurotic depressives (e.g., Berzins, Seid- 
elch, 1970). Taken together, these con- 
__ Suggest that (a) therapists may 
larity with compatibility and there- 
er to work with patients because they 
them as similar, hence compatible or 
herapists are aware of their limitations 
atched patients, their preferences may 
desire to grapple with and overcome 
nd spots” by means of massed prac- 
ere is some suggestion (Goodwin, in 
t inexperienced B therapists do see a 
‘Oportion of schizoid patients than in- 
ed As. If this is a replicable finding, 
bility that B (and/or A) therapists in- 
it clinical effectiveness with mismatched 
as they gain experience needs empirical 
third explanation for the present find- 
t some degree of felt incompatibility 
ivorable to therapeutic outcome. 

of the subjective reactions—liking, ease 
nding, felt compatibility, and so on—
showed the same pattern of higher scores for
matched dyads, we might have speculated that 
there is a simple underlying affective dimension 
describable as “good vibrations.” However, the 
different signs of the discriminant function coef- 
ficients and the contrasting patterns of means in- 
dicate that the subjective alternative reactions 
underlying the A-B interaction effect are com- 
plex, and that more broadly, therapists’ reac- 
tions to patients are differentiated rather than 
unidimensional. Finally, how therapists use such 
feelings—“countertransference” or otherwise—as 
a data base from which to formulate therapeutic 
responses is a question that deserves empirical 
study.

Comparisons of Rating Scales of Child Psychopathology in
Clinic and Nonclinic Samples


Susan B. Campbell 
McGill University and the Montreal Children’s Hospital 


Yvonne Steinert 
Jewish General Hospital 
Montreal, Canada 


Mothers of 45 control and 35 clinic children completed two factor-analytically de- 
rived rating scales of child psychopathology, the Behavior Problem Checklist and the 
Parent Questionnaire. Intercorrelations among factors indicated some factor overlap 
and some differences between apparently similar factors. Conduct problem ratings on
the Behavior Problem Checklist covaried with ratings of conduct problems, hyper- 
activity, and learning problems on the Parent Questionnaire. Patterns of correlations 
suggested that mothers of nonreferred children rated pathology per se, whereas 
mothers of referred children rated behaviors that fell into internalizing and external- 


The Behavior Problem Checklist (Quay & Peter- 
son, Note 1), the Parent Questionnaire (Conners, 
1970), and the Teacher Rating Scale (Conners, 
1969) are factor-analytically derived instruments 
based on common descriptors of child psychopath- 
ology. They are widely used in epidemiological, 
diagnostic-descriptive, and outcome studies with 
child populations (Quay, 1972). However, despite
overlap and variation in both item content and 
factor labels, it is unclear how the factor scores 
covary. 

Content analysis suggests that different behav- 
iors are covered by the same factor label and 
similar behaviors by different labels from scale to 
scale. Thus, all include a conduct problem factor 
reflecting discipline problems and defiance of au- 
thority. While one scale also includes hyperactive 
behavior under this rubric, the others isolate it
as a separate factor. Likewise, all include a neu-
rotic factor with items about fears and worries 
but differing in emphasis on social withdrawal and 
low self-esteem. Thus, the present study provides 
data on the relationships among those factors with 
overlap in content and/or label. Several factors on
the Parent Questionnaire that load on only a few
items and are infrequently rated were excluded, 
as was the socialized delinquency factor on the 
Behavior Problem Checklist, which has little rele- 
vance with younger age groups. 

Forty-five control and 35 clinic children (M 
ages = 7.6 and 7.5) took part in this study. Non- 
clinic subjects were at grade level and had never 
received psychological help. Clinic subjects were
perceived as inattentive and poor school achievers 
with a variety of additional behavior problems. 
There were five girls in each group. All children 
were of at least average IQ, from lower middle 
to middle class, and in kindergarten through Grade 
4 in school. Mothers completed both the Parent 
Questionnaire and the Behavior Problem Check- 
list by rating a series of statements as no problem, 
a mild problem, or a severe problem. In addition, 
six classroom teachers completed the Behavior 
Problem Checklist and the Teacher Rating Scale 
on a similar and partially overlapping sample of 
22 clinic and control boys (M ages = 8.3 and 8.5). 

Factor scores on the two scales completed by 
mothers were correlated for clinical and control 
groups separately. Similar analyses were carried 
out on teacher ratings. Data are summarized in 
Table 1. Ratings of clinic children by mothers sug- 
gest consistency between scales in that conduct 
problems, learning problems, and hyperactivity
intercorrelated, indicating a cluster of externalizing 
behaviors. It appears from the pattern of relation- 
ships that conduct problems as measured by the 
Behavior Problem Checklist include discipline

Table 1
Correlations Between Behavior Problem Checklist and Conners’ Parent and Teacher Rating
Scale Factor Scores

Peterson-Quay behavior problem checklist 


Control Clinic 
Conduct Personality Inadequacy- Conduct Personality I 
[ nadequacy- 
Scale problem problem immaturity problem problem 3% BE 
Parent Questionnaire 
Conduct Problem 498*** 283 310* 

d $ F : 147+ 266 .262 
Anxiety .354* 191 .387** 214 .562** sere 
Hyperactivity S330 .479%** .337* 824*** 257 "224 

l Learning Problem ‘4213+ 1253 "360*** ‘635 «279 268 
Psychosomatic "252 137 "264 177 134 "340* 

pier Rating Scale 
onduct Problem 155 —.023 506* 
; 7 i i ? .536** 135 .448* 

p ane Tasive Mp .162 .580** .416* .245 ase 
peo aey —.115 .372 -248 —.349 .456* 122 

| Hyperactivity 92104 .190 .215 569** —.124 567** 


Note. For the Parent Questionnaire, ns = 45 and 35 for control and clinic samples,
respectively. For the Teacher Rating Scale, n = 22 for both the control and clinic samples.
* p < .05.
** p < .01.
*** p < .001.


ae tional difficulties, and dislike of
w a y, immaturity, anxiety, and per-
Ris. on lems are interrelated, indicating a
7 ateari symptoms, with the anxiety
ity. ing behaviors measured by both the
PEPA and personality problem factors. Psy-
Pons “4 ects are relatively independent
ia a ors. Although control mothers’ ratings
an ee parems, additional correlations be-
Bele ane and externalizing factors, for
ik they ay uct problem and anxiety, suggest
“thology = au a more heterogeneous array of
eroe” aane than behaviors that cluster into
ENA or “acting out” forms of pathology.
sts distinction between internalizing and
the P E earm appears more evident in
Ee me ipe recognized pathology or a
Teachs r of deviant behaviors.
Eii ; ratings form similar patterns of cor-
; Bai both groups. Thus, inattentive and
ie ae ehaviors show strong relations to con-
an lems, However, conduct problem ratings
n pan somewhat different behaviors in
nting zs clinic groups. Teachers appear to be
ee imarily fidgety, impulsive, and restless
Problem a the control group, whereas conduct
tflet disci ae in the clinic group appear to also
Piave Sn e problems. Moreover, teachers, who
aae ‘lence with a large number of children,
oie group internalizing and externalizing be-
intuitively whether they are rating clinic
or control groups, something not evident in the
ratings of control mothers.

These data indicate that behaviors considered
indicative of child psychopathology differ in pat-
terning as a function of the child’s clinical status
and with the experiences of the rater. In addi-
tion, there is evidence that these scales measure
both overlapping and distinctive aspects of behav-
ior problems. This appears to be the case even for
factors with the same name. Thus, studies that 
define samples on the basis of high scores on fac- 
tors such as conduct problem or anxiety need to 
consider differences among scales as well as char- 
acteristics of raters and subjects.

script, University of Illinois, 1967. (Mimeo)

Buss-Durkee Assessment and Validation with Violent Versus
Nonviolent Chronic Alcohol Abusers 


Gisele J. Renson, John E. Adams, and Jared R. Tinklenberg 
Department of Psychiatry and Behavioral Sciences, Stanford School of Medicine, 
and Veterans Administration Hospital, Palo Alto, California 


Twenty-six chronic alcohol abusers who had been violent while intoxicated and 25 
nonviolent alcohol abusers were administered the Buss-Durkee Inventory. All sub- 
jects were Caucasian men with a reported daily intake of ethanol of 227 ml ± 89 ml
for at least the last 5 years. Violence was documented by police records and by 
patient and family reports. Violent drinkers scored significantly higher than control 
subjects on the inventory Total Hostility score and on subscales measuring Assault, 
Irritability, Verbal Hostility, Indirect Hostility, and Resentment. 


The earliest hostility inventories developed dur- 
ing the late 1950s, with few exceptions (Schulz, 
1954; Zaks & Walters, 1959), consolidated vari- 
ous aggressive behaviors into a single omnibus 
score (Buss, 1961). Buss and Durkee (1957) de-
veloped subclassifications of overt and covert hos-
tility. Normative data for this inventory have been
reported by Buss (1961) for adult normal and
psychiatric populations and by Morrison, Chaffin,
and Chase (1975) for normal adolescents. The
present study attempted to validate the Buss- 
Durkee as an inventory of hostility and to provide 
normative data for a population of violent and 
nonviolent alcohol abusers. We hypothesized that 
the Buss-Durkee Inventory would discriminate be- 
tween these two groups and that overt hostility 
would be higher in the scores of alcoholics who 
became violent while intoxicated. 

The experimental subjects were 26 Caucasian 
men (M age = 37 ± 10), who were seen as out-
patients at the Alcohol and Violence Clinic of the 
Stanford University Department of Psychiatry and 
Behavioral Sciences. All of these subjects had
demonstrated alcohol abuse, and virtually all (25
of 26) had demonstrated violent behavior only
while intoxicated. A control group of 25 Caucasian
men (M age = 43 ± 9) was selected from the
Stanford Inpatient Service for alcohol abusers at
the Palo Alto Veterans Administration Hospital.
None of the control subjects had demonstrated
violence while intoxicated. 

All subjects were screened by one of the au- 
thors and at least one other member of the Stan- 
ford psychiatric staff to exclude patients with 
neurological disorders or history of other forms of 
drug abuse. Experimental and control groups were 
matched for age, education, occupational level, 
and alcohol consumption. The majority of subjects 
in both groups were unskilled or semiskilled work- 
ers (65% of the experimental and 72% of the 
control subjects), according to the Hollingshead 
(Note 1) classification. All subjects had had a
daily ethanol consumption of 227 ml ± 89 ml for
the past 5 years or more, and most had begun
drinking heavily in their late teens (16 ± 3 years).

Experimental subjects had police-documented 
histories of violence while intoxicated. Twenty-four 
experimental subjects had documented physical as- 
saults against others. Both of the remaining ex-
perimental subjects had histories of verbal hos-
tility toward their families. In addition, one had
threatened his wife with a deadly weapon and the 
other had a police record for destruction of 
property. 

Within the first 2 weeks of admission to treat- 
ment, each subject completed and signed the 
Buss-Durkee Inventory under the staff psycholo- 
gist’s supervision. All subjects were assured com-
plete confidentiality. 

The Buss-Durkee Inventory (Buss & Durkee, 
1957) is a self-rating scale of 75 true—false items 
based on the rationale that hostility can usefully 
be divided into seven subgroups or scales: (a)

Table 1
Buss-Durkee Inventory Scores of Violent (Experimental) and Nonviolent (Control)
Alcohol Abusers

                       Experimental        Control
Scale                   M       SD        M       SD        t
Total Hostility       36.58    9.07     28.64    9.40     3.07**
Assault                6.15    2.33      4.40    1.76     3.02**
Indirect Hostility     5.54    1.79      4.36    2.00     2.22*
Irritability           6.88    2.45      4.92    2.56     2.79**
Negativism             2.35    1.23      2.48    1.60      .33
Resentment             3.32    1.86      2.60    2.00     1.33*
Suspicion              3.88    1.77      3.28    2.30     1.05
Verbal Hostility       8.58    2.76      6.36    2.58     2.96**
Guilt                  5.12    2.53      5.58    2.62      .64

* p < .05. ** p < .01.


OF direct physical violence against persons, 
ndirect Hostility either against persons, 
Bossip or practical jokes, or against ob- 
ch as slamming doors or breaking things; 
titability or explosiveness and exasperation 
slightest stimulus; (d) Negativism as either 
bellion or as passive compliance to rules 
thority figures; (e) Resentment, anger, 
and/or hate of others due to real or im- 
mistreatment; (f) Suspicion, varying from 
0 projections of hostility onto others and 
ief that these others are derogatory and 
and (g) Verbal Hostility in style or con- 
se seven scales yield a Total Hostility 
re is also a Guilt scale, which is inde- 
of the above. 

1 shows the scores obtained by the con- 
experimental groups. The Total Hostility 
veals that violent alcohol abusers scored 
ly higher (p <.01) than the nonviolent 
more refined presentation of the kinds 
tility experienced by the subjects is reflected 
subscale scores. Violent alcoholics scored
ntly higher (p <.01) on the scales mea- 
Assault, Irritability, and Verbal Hostility, 
(p< .05) on Indirect Hostility and Re- 
The two groups did not differ signifi- 
Negativism, Suspiciousness, or on the 
nt Guilt scale.

The data were analyzed for any relationship be-
tween age, individual scores, and/or Total Hos- 
tility score. Both groups were divided into younger 
(up to 40 years) and older (40 years and above) 
subjects. A linear regression procedure showed no 
relationship between age and any scores for the 
control subjects. For the experimental subjects, 
an age effect was shown for the Suspicion scale 
only, which tended to decrease with age (p < .05).
The five most verbally aggressive subjects were 
compared with the five least verbally aggressive 
subjects in each group to assess whether subjects 
with higher verbal aggressiveness scores tended to 
have lower assaultiveness scores. This was not
found to be the case; subjects with higher verbal 
aggressiveness scores did not have lower assaul- 
tiveness scores. All subjects had high scores on 
the Assault scale. 
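
The group comparisons in Table 1 can be checked from the summary statistics alone. The following is a minimal sketch, not the authors' procedure: it assumes an independent-samples t test with a pooled variance estimate, group sizes of 26 (experimental) and 25 (control) as given in the text, and the Total Hostility means and standard deviations from the table. The function name `pooled_t` is hypothetical.

```python
import math


def pooled_t(m1, sd1, n1, m2, sd2, n2):
    """Independent-samples t statistic using a pooled variance estimate.

    Assumes equal population variances; returns (t, degrees of freedom).
    """
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / standard_error, df


# Total Hostility: experimental (violent) vs. control (nonviolent) group.
t_total, df_total = pooled_t(36.58, 9.07, 26, 28.64, 9.40, 25)
print(f"Total Hostility: t({df_total}) = {t_total:.2f}")
```

Under these assumptions the result is close to the tabled value of 3.07, and the same call can be repeated for each subscale row.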

Our findings show a positive correlation be- 
tween violence while intoxicated and Buss-Durkee 
Inventory scores on Total Hostility as well as 
five of seven hostility subscales among two groups 
of alcohol abusers. This confirms the validity of 
the Buss-Durkee Inventory as a measure of hos- 
tile feelings and aggressive behavior in this sample 
and suggests that the Buss-Durkee Inventory may 
be useful in assessing potential for violence in
alcohol abusers.

On the Word Preferences of Suicidal Versus Nonsuicidal
College Students


Steven Thurber and David P. Torbet
Boise State University


A word preference format was used to investigate reactions to verbal stimuli of
suicidal and nonsuicidal persons. Six words with either aggressive or submissive de-
notative meanings significantly differentiated the two groups. In addition, the word
suicide was selected at a higher frequency level by suicidal individuals when com-
pared to their nonsuicidal counterparts.


Jung (1918) and Luria (1932) were notable
among several early clinicians who investigated
the premise that affective reactions to verbal
stimuli might yield information as to personality
characteristics and motivational states. The tra-
ditional approach was to base clinical inferences
on the denotative meaning of words eliciting
high levels of emotionality as judged by physio-
logic indicators (e.g., skin resistance; changes in
breathing). More recently, affective arousal (or
what has since been termed reinforcement
value) has been assessed by ratings on a like-
dislike continuum (Rychlak, 1966) or by noting
word preferences in a series of paired associates
(Torbet, Note 1).

The present study was designed to see
whether suicidal individuals could be distin-
guished from a nonsuicidal group on the basis of
word preferences. The participants were 21 per-
sons referred to a university counseling center
following documented instances of attempted
suicide (2 of these individuals were later success-
ful in suicide attempts). They were compared to
students from general psychology classes con-
trolling for age (M = 25.9) and sex (13 females;
8 males). The subjects selected the most pre-
ferred word for each of 233 word pairs. Ran-
domly distributed among these items were six
words expected to elicit differential reactions
from the a priori groups. They included the
words murder, kill, attack, submit, suicide, and
martyr. Together, they formed 30 word pairs
and in accord with circular unfolding scaling
procedures they constituted a unidimensional
metric scale or what was termed an
aggressive-submissive dimension (Cooper, 1971). Following
the designation of preferences across all items, 
the frequencies within this dimension were tal- 
lied, with each word having a possible score of 5. 

Wilks’ lambda criterion indicated that the equal-
ity of mean vectors for the two groups was un-
tenable, χ²(6) = 17.96, p < .006. Univariate
analyses indicated that the suicidal group showed 
a significantly higher preference for suicide (M 
=3.76 vs. M=1.90), t(20) = 4.47, p <.001, 
whereas the nonsuicidal subjects significantly 
preferred attack (M=3.38 vs. M = 2.62), t 
(20) = 2.61, p < .01. 

A linear discriminant analysis yielded a canoni- 
cal correlation of .62. The primary contributor 
to the discriminant function was suicide, with a 
standardized coefficient of 1.29 followed in order 
of magnitude by submit (.56), martyr (.52), kill 
(.48), murder (.29), and attack (.02). Evidence 
for the discriminating ability of the word scale 
is shown in the finding that 34 of the 42 par-
ticipants (81%) were correctly classified into
suicidal and nonsuicidal categories on the basis 
of discriminant scores. 
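
The classification figure can be put in context with a short calculation. The sketch below is illustrative only: it recomputes the hit rate from the reported counts (34 of 42) and, as an added assumption not made in the report, compares it with a chance rate of .50, which seems reasonable because the two groups are of equal size (21 each). The helper names are hypothetical.

```python
from math import comb


def hit_rate(correct, total):
    """Proportion of participants classified into the correct group."""
    return correct / total


def binomial_upper_tail(k, n, p=0.5):
    """Probability of k or more successes in n trials with success rate p."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k, n + 1))


acc = hit_rate(34, 42)
tail = binomial_upper_tail(34, 42)
print(f"hit rate = {acc:.1%}")
print(f"P(34 or more correct by chance) = {tail:.2e}")
```

Because the discriminant function was applied to the same participants from whom it was derived, the hit rate in a new sample could well be lower.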

The results suggest the potential clinical utility 
of a word preference technique in the assessment 
of suicidal probabilities. The denotative mean- 
ings of the stronger discriminating variables sup-
port the notion that attempted suicide may
represent a surrendering to hostile impulses by
turning them toward oneself in an intropunitive 
manner (Kisker, 1964). Finally, the data sug- 
gest that the word suicide has reinforcement 
value for those with suicidal inclinations.

Role of Locus of Control in Frustration-produced Aggression


Kiran Bhatia and Sanford Golin 
University of Pittsburgh 


This experiment tested the hypothesis that external locus of control subjects would 
exhibit greater frustration-produced aggression than internal locus of control subjects.
Incorrect responses by a frustrating or nonfrustrating confederate were punished by
electric shock. Analysis of the shock data indicated increased aggression after frustra-
tion by externals and less aggression by internals. The results support the hypothesis
and indicate that aggression is cognitively regulated by a personality-related belief in
uncontrollability; the less the believed control, the greater the aggression.


This experiment was concerned with the cog- 
nitive control of frustration-produced aggression. 
According to one view (e.g., Geen, 1972), frus- 
tration results in aggression when it increases 
arousal in the presence of cues associated with 
aggression. Recent reports indicate that the ex- 
tent to which frustrating events are perceived 
as aversive and thereby increase arousal is a 
function of the extent to which such events are 
perceived as controllable (Donnerstein & Wil- 
son, 1976; Glass & Singer, 1972). Perception of
controllability is related to the characteristics of 
the particular situation in which frustration oc- 
curs. Such perception, however, may have per- 
sonality as well as situational determinants. 
Specifically, one’s locus of control (see Phares, 
1976), that is, transsituational beliefs about one’s 
ability to exercise control over outcomes, may 
influence the extent to which frustration results 
in aggression. Those who generally do not be- 
lieve they can control outcomes (external locus 
of control) are expected to exhibit greater ag- 
gression in response to frustration than those 
who generally believe they can control outcomes 
(internal locus of control). The present experi- 
ment tested this hypothesis.

Forty-eight internal locus of control and 48 
external locus of control subjects were equally 
divided into frustration and nonfrustration ex- 
perimental subgroups. Using a standard proce- 
dure to measure aggression, subjects taught a con- 
federate a task by punishing incorrect responses
with electric shock, which was ostensibly ad-
ministered to the confederate. Shock durations 
and intensities were recorded as measures of 
aggression. Each subject was motivated to have 
the confederate learn the task as quickly as pos- 
sible; the confederate learned relatively poorly 
in the frustration condition and relatively well 
in the nonfrustration condition.

An analysis of variance of shock duration 
data showed a Locus of Control × Frustration
interaction, F(1, 88) = 5.86, p < .05. Scheffé’s
test (p < .05) showed that the mean shock dura-
tion of frustrated externals (M = .65 sec) was
greater than the means of nonfrustrated ex-
ternals (M = .49 sec) and frustrated internals
(M = .50 sec). Frustrated internals showed less
aggression than nonfrustrated internals (M =
.58 sec), but this difference was not significant.
As predicted, therefore, frustration resulted in 
increased aggression for externals but not for 
internals. 

Analysis of the shock intensity data showed the 
mean intensity of frustrated internals (M = 2.97) 
to be less (p < .05) than that of nonfrustrated 
internals (M = 4.37). The mean intensity of
frustrated externals (M = 5.10) was greater than 
that of nonfrustrated externals (M = 4.32), as 
predicted, but this difference was not significant. 

The prediction that externals would exhibit 
greater aggression in response to frustration was 
based on the view that aggression would be 
regulated by a generalized belief in uncontrol- 
lability. If, however, locus of control were con- 
sidered to be a measure of a need to control, a
motive said to be characteristic of internals (see
Phares, 1976), then frustration should have 
been more aversive and arousing for internals 
than for externals. A prediction of greater ag- 
gression in response to frustration by internals 
follows from this perspective. Internals, how-
ever, showed reduced rather than increased ag-
gression in response to frustration. Hence, the 
present results were not in accord with this mo- 
tivational interpretation of locus of control. 

In summary, the results showed that general-
ized expectancies about one’s ability to control
outcomes can influence aggression in response to
frustration in a manner similar to that previously
reported for situationally induced expectancies
about controllability (Donnerstein & Wilson,
1976): The less the believed control, the greater
the aggression.

A Path-Analytic Study


Paul M. Kohn 
York University, Downsview, Canada 
and Addiction Research Foundation 
Toronto, Canada 


Helen M. Annis 
Addiction Research Foundation 
Toronto, Canada 


Measures of the following variables were obtained from 193 high school students: 
marijuana use; attitude toward marijuana use; the peer-acceptance and symbolic- 
protest functions of marijuana use; sociopolitical outlook; and internal sensation 
seeking. Path analysis supported a model that assumes the following: (a) Only atti- 
tude affects use directly. (b) Both the peer-acceptance and symbolic-protest functions, 
as well as sociopolitical outlook, and internal sensation seeking influence attitude 
directly. (c) Sociopolitical outlook affects both functions directly. Support for the 
model came from very close correspondence between the observed and predicted 
correlations and the low, nonsignificant value of the overidentification test statistic. 


This study evaluates a multivariate model of 
youthful marijuana use. The model assumes that 
use or nonuse reflects a person’s attitude toward 
marijuana use, that is, his/her evaluative reac- 
tion to such behavior (cf. Fishbein, 1967a, 
1967b). Attitude, in turn, depends on the per- 
ceived functions of marijuana use. Our model 
includes two functions: the symbolic-protest 
function or value of marijuana use for symboliz- 
ing disidentification with conventional society 
and the peer-acceptance function or the value 
of marijuana use for gaining acceptance from 
permissive peers. Also, the model incorporates
two personality variables: general left-right so- 
ciopolitical outlook and internal sensation seek- 
ing, a propensity to seek satisfaction from fan- 
tasy, emotional change, and unusual perceptual 
experience (Pearson, 1970). Sociopolitical out- 
look should affect the symbolic-protest function 
because expressing disidentification with con- 
ventional society should appeal specifically to 
rather rebellious persons. Both personality fac- 
tors should influence the peer-acceptance func- 
tion because people’s own attitudes and prac-
tices should predispose them to seek acceptance
from similarly inclined peers (Byrne, 1971). Fi- 
nally, both personality variables should affect at- 
titude. 

To test our model, we administered question- 
naires to 96 male and 97 female high school 
seniors. The following measures were used: fre- 
quency of marijuana use over the past 6-month 
period; a brief, modified version of Wilson and 
Patterson’s (1968) Conservatism Scale, which 
measures sociopolitical outlook; the Internal 
Sensation-Seeking subscale from a modified ver-
sion of Pearson’s (1970) Novelty Experiencing
Scale (Kohn & Annis, 1975); and specially con-
structed measures for attitude to marijuana use,
peer-acceptance function, and symbolic-protest
function. All alpha or Kuder-Richardson 20 re-
liabilities were satisfactory, ranging from .76 to
.91. In addition, the attitude and function mea-
sures in which all items referred to marijuana
were shown to be factorially independent. 

Path analysis, a multiple-regression procedure 
for testing the implications of causal models 
(Kim & Kohout, 1975), was applied to the data. 
The resultant path diagram appears in Figure 1.

The path diagram strongly resembles that im-
plied by our model. The sole discrepancy is the
absence of a significant path coefficient between
internal sensation seeking and the peer-accep-
tance function. Possibly, internal sensation seek-
ing is not publicly visible enough a trait to affect
interpersonal attraction except in very close re-
lationships. The residuals, symbolized by & in
Figure 1, express the square root of the unex-
plained variance for each dependent variable.
Thus, the model explains about 33% of the vari-
ance in marijuana use and about 45% of the
variance in attitude. The overidentification test
statistic (Land, 1973) proved nonsignificant, L
(6) = 4.00, .70 > p > .50. This means that our
model fits the data well. Further evidence of
success comes from comparisons of the observed
correlations with those predicted by the model
by summing direct causal, indirect causal,
and noncausal contributions (Kim & Kohout,
1975). Predictive errors were generally modest
and in all cases nonsignificant. Finally, the re-
sults of separate path analyses conducted for
DO separately were highly similar. 
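
The two numerical statements in this paragraph can be illustrated with a short calculation. This is a sketch under stated assumptions rather than the authors' computation: it treats each residual as the square root of 1 minus the explained proportion of variance (.33 for use, .45 for attitude), and it assumes that the overidentification statistic is referred to a chi-square distribution with 6 degrees of freedom. The function names are hypothetical; the closed-form tail probability used here holds only for even degrees of freedom.

```python
import math


def residual_path(r_squared):
    """Residual coefficient: square root of the unexplained variance."""
    return math.sqrt(1.0 - r_squared)


def chi2_upper_tail_even_df(x, df):
    """Upper-tail chi-square probability, closed form for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive, even df")
    half = x / 2.0
    term = 1.0
    total = 1.0
    # Sum (x/2)^i / i! for i = 0 .. df/2 - 1, then scale by exp(-x/2).
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


e_use = residual_path(0.33)        # marijuana use
e_attitude = residual_path(0.45)   # attitude toward use
p_fit = chi2_upper_tail_even_df(4.00, 6)

print(f"residual for use      = {e_use:.3f}")
print(f"residual for attitude = {e_attitude:.3f}")
print(f"p for L(6) = 4.00     = {p_fit:.3f}")
```

On the chi-square assumption, the tail probability falls between .50 and .70, in line with the range reported in the text.
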
a Eoo and with the one exception noted, 
E. S support our a priori model. The vari- 
kut et for in attitude and behavior, 
k % and 33%, respectively, is high enough 
a at importance of the predictors se- 
“a a ow enough to suggest extending the 
ae example, the present model overlooks 
aa a dysfunction of the various health- 
a a social—legal risks ascribed to mari- 
' "e the related personality variable 
arch per ing propensity. Also, longitudinal re- 
awe mitting stronger causal inferences seems 
5 . Work along both these lines is pro-

Figure 1. Path analysis for marijuana use.


Fishbein, M. Attitude and the prediction of be- 
havior. In M. Fishbein (Ed.), Readings in attitude 
theory and measurement. New York: Wiley, 
1967 (a). 

Fishbein, M. A behavior theory approach to the 
relations between beliefs about an object and the 
attitude toward the object. In M. Fishbein (Ed.), 
Readings in attitude theory and measurement. 
New York: Wiley, 1967 (b). 

Kim, J., & Kohout, F. J. Special topics in general 
linear models. In N. H. Nie, C. H. Hull, J. G. 
Jenkins, K. Steinbrenner, & D. H. Bent (Eds.), 
SPSS: Statistical package for the social sciences 
(2nd. ed.). New York: McGraw-Hill, 1975. 

Kohn, P. M., & Annis, H. M. Validity data on a 
modified version of Pearson’s Novelty Experienc- 
ing Scale. Canadian Journal of Behavioural Sci- 
ence, 1975, 7, 274-278. 

Land, K. C. Identification, parameter estimation, 
and hypothesis testing in recursive sociological 
models. In A. S. Goldberger & O. D. Duncan 
(Eds.), Structural equation models in the social 
sciences. New York: Seminar Press, 1973. 

Pearson, P. H. Relationships between global and 
specified measures of novelty seeking. Journal of 
Consulting and Clinical Psychology, 1970, 34, 199- 
204. 

Wilson, G. D., & Patterson, J. R. A new measure of
conservatism. British Journal of Social and Clini-
cal Psychology, 1968, 7, 264-269.


Received May 16, 1977


Journal of Consulting and Clinical Psychology 
1978, Vol. 46, No. 2, 368-369 


Weight Loss and Behavior Change 1 Year After 
Behavioral Treatment for Obesity 
The study presents a 1-year follow-up of the first 108 clients to complete a behavioral
weight reduction program at Stanford University’s Eating Disorders Clinic. On the
average, clients maintained their in-treatment weight loss over the follow-up period,
but there was marked variability and a low correlation between in-treatment and
posttreatment performance. Clients reported significant changes in their eating
behavior after treatment, but these changes were only weakly related to weight changes.


Although behavioral treatments for obesity 
appear to produce favorable short-term results, 
results of follow-up studies have been weak
(Stuart, 1975). Maintenance has typically been 
assessed over short time intervals (e.g., Hagen, 
1974), and many studies suffer from small sam- 
ple size or high attrition (e.g., Hall, 1973; Ma- 
honey & Mahoney, 1976). In addition, although 
weight losses in behavioral programs are thought 
to be mediated by changes in eating behavior, 
there has been little research on the maintenance 
of new eating patterns.

The present article reports a 1-year follow-up 
of clients treated with behavioral techniques at 
the Stanford Eating Disorders Clinic (EDC), 
where maintenance of weight loss and behavior 
change are evaluated and related to the use of 
various weight control strategies.

Follow-up data for 88 of the first 108 clients 
to complete the weight control program at the 
EDC were derived from telephone interviews. 
Pretreatment weights averaged 257 pounds 
(116.8 kg) for males (n = 16) and 209 pounds
(95 kg) for females (n = 72), with a range from
141 (64.1) to 353 pounds (160.5 kg). Time from
completion of treatment varied from 12 to 18
months.

Treatment was conducted in small groups led 
by two experienced therapists, who met weekly 
for 14 hours for 20 weeks. Lessons, presented in
seminar format, covered the following topics 
(Ferguson, 1975): self-monitoring, stimulus control,
slowing the rate of eating, social support,
exercise, diet, preplanning, and individual problem
solving. Clients maintained daily records of
their eating and exercise behaviors, which were
checked weekly. All expenses were prepaid
($250), and a 10% refund was given contingent
on attendance and completion of homework assignments.

A subset of 25 clients completed a self-report 
questionnaire at the time of follow-up. In terms 
of demographic characteristics and weight loss, 
they were representative of the larger sample.
They rated their eating behavior on 5-point
scales at three points in time: before treatment,
immediately after treatment, and currently.
Twelve questions referred to behaviors taught
in therapy (e.g., restricting the location of eating),
5 to thoughts about food (e.g., feeling
guilty about eating), and 8 to specific eating
problems (e.g., binge eating). Clients also rated
the helpfulness of 21 techniques presented in
treatment and indicated whether each was used
immediately after treatment and/or at present.

Clients lost an average of 12.8 pounds (5.8 kg)
during the program and an additional .7 pounds
(.3 kg) during follow-up. Ninety percent lost
weight during treatment, and 43% lost additional
weight during follow-up. However, weight changes
by the end of follow-up were extremely variable,
ranging from 80 pounds (36.4 kg) lost to 40
pounds (18.2 kg) gained. Both the percent of
clients losing 20 pounds (9.1 kg) or more and the
percent gaining weight more than doubled between
the end of treatment and follow-up. Weight
losses during follow-up were unrelated to weight
losses during treatment (r = .002). Subject
characteristics, such as sex and age of onset of
obesity, were also unrelated to long-term success.
Clients reported substantial improvements in
eating behavior during treatment. Twenty of the
25 questionnaire items showed significant positive
changes. Although there was some deterioration
during follow-up, improvements were still
clearly evident in 19 items 1 year later.

Clients reported using an average of 9.4 of the
21 techniques immediately after the program,
but only 6.6 by the time of follow-up, t(22) =
[...], p < .005. The 21 behavioral techniques
were grouped according to the eight topics presented
in the program. In comparative analyses
social support techniques were used significantly
less than other techniques both immediately
after treatment and at follow-up, ts(22)
ranged from 2.14 to 4.19, all ps < .05. Self-monitoring
was rated significantly more helpful
than any other technique, ts(22) ranged from
[...] to 5.92, all ps < .01. Only self-monitoring
and preplanning did not decline in use significantly
after treatment.

The relationship between behavioral variables
and weight loss was weaker than expected.
Weight loss was significantly correlated with
only 5 of the 20 individual behaviors that
changed during treatment and with 4 of 19 overall.
Changes in thoughts about food were the
best predictors of success.

Weight loss was modestly related to the reported
use of behavioral techniques. Weight
loss was significantly correlated with the use of
[...] (r = .61, p < .01) and situational
restriction (r = .47, p < .05) immediately
after treatment. In addition, those clients
who used 11 or more techniques immediately
after treatment lost more weight overall than
those using fewer, t(22) = 2.09, p < .05. There
were no significant relationships between weight
loss and technique use at follow-up.

Weight loss following treatment in the EDC
is similar to that reported in other investigations
using self-control. The inability of clients
to lose additional weight on their own after be-
havioral treatments underscores the need for 
further study of maintenance processes. In the 
present program only 10% of the clients were 
within 20 pounds of their ideal weight a year 
after treatment and fully a third had enrolled in 
other weight loss programs. Almost all recom- 
mended continuing treatment contracts. 

The failure to find strong relationships between 
self-reported behavior change and weight change, 
despite significant changes in eating habits, can 
be interpreted in several ways. Positive relationships
may have been obscured by errors and biases
in retrospective self-reports. Changes in
eating behaviors may not be linearly related to 
changes in weight. Or, some clients may engage 
in idiosyncratic behaviors, such as traditional 
diets, that either enhance or retard their prog- 
ress. Further research is needed to systematically 
assess the effects of specific changes in behavior 
on food consumption and weight. 
The Minnesota Multiphasic Personality Inventories (MMPIs) of 30 spinal-cord-
injured (SCI) subjects, 30 hospitalized non-spinal-cord-injured controls (HC), and
30 hospital employee controls (EC) were studied here. MMPI items that significantly 
differentiated SCI subjects from controls were separately factor analyzed. The major 
factor resulting from the SCI-EC analysis was characterized by items of physical 
description, as was a similar SCI-HC factor. The SCI mean scale scores were then 
corrected for spinal cord injury by replacing the SCI response proportion for the 
factor-analytically identified physically descriptive items with the EC response pro- 
portion. This correction procedure was also done using Taylor’s correction items. 
Both correction procedures resulted in significantly less pathological SCI scale means 
(1, 2, 3, 4, and 8). The present results support the notion that spinal cord injury can
moderate MMPI item endorsement. Recommendations are discussed.


Taylor (1970) investigated the effects of spinal 
cord injury (SCI) as a moderator variable that 
affects the validity of the Minnesota Multiphasic 
Personality Inventory (MMPI) for persons with 
such injuries. Taylor contended that the physical
disability affects the subject’s test input by mod- 
erating the likelihood of a response endorsement 
and may substantially affect output and thus 
reduce the instrument’s validity generalization. 
The extent to which the disability affects the 
MMPI is related to the number of items that 
the disabled patient responds to on a physically 
descriptive basis rather than a psychologically 
descriptive one. 

Taylor’s (1970) evaluation of SCI as a mod- 
erator variable compared the MMPIs of SCI pa- 
tients with normal nonhospitalized cases and 
used the judgment of physical medicine specialists
to identify items that appeared to be descriptive
of the physical condition associated with
spinal cord injury. Taylor then scored SCI
MMPIs with a correction for the somatically
relevant items (the physical descriptors). Removing
the patient’s response proportion for the
somatically relevant items and substituting the
nonpatient response proportion produced dramatic
changes in profile elevation, configuration,
and consequent interpretation.

Whereas Taylor (1970) used nine “specialists”
to select the items that differentiated the
SCI group from the nonhospitalized group, the
present study intended to isolate those items that
contribute to a physical rather than a psychological
description by means of factor analysis.
In addition, since one source of somatic [...]
in patients appears to be hospitalization, with its
emphasis on symptoms and attention for [...]
difficulty, the present study includes a control
for hospitalization composed of hospitalized non-SCI
patients as well as nonhospitalized employee
controls. Moreover, while Taylor’s control group
consisted of college students (average of 2 years
more educated than the SCI group), the present
nonhospitalized control group consisted of
nursing aides with an educational level similar
to the SCI subjects.

Three groups of male subjects, SCI (n = 30),
hospitalized non-spinal-cord-injured controls
(HC; n = 30), and hospital employees (EC;
n = 30) participated in the present study. The
mean ages of the groups were, in the same order,
37.5, 43.4, and 37.3 years. Similarly, the mean
educational levels of the groups were 11.8, 10.5, 
and 12.0 years. The overall mean age was 39.4 
(SD = 4.86), and the overall educational level 
was 11.4 (SD = 4.67). Though the present groups 
were comparable, the present SCI group was 16.1 
years older and 1.9 years less educated than 
Taylor’s (1970) SCI group. Subjects in the 
present study individually completed the group 
form of the MMPI. 

Chi-square analyses of the true-false response 
cells were conducted for each item in two compari- 
sons (a) SCI-HC and (b) SCI-EC, resulting in 
56 and 65 items, respectively. Similarly, Taylor 
reported 60 items that differentiated SCI from 
control groups. 
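
The item-level screening described above can be sketched as follows. This is a minimal illustration rather than the authors' procedure: the article does not report the significance level, whether a continuity correction was applied, or the raw response counts, so the .05 criterion, the uncorrected Pearson statistic, the function names, and the example counts are all assumptions.

```python
# Hypothetical sketch: screen items with a 2 x 2 chi-square on true/false counts.
CRITICAL_05_DF1 = 3.841  # chi-square critical value, df = 1, alpha = .05 (assumed level)


def chi_square_2x2(a_true, a_false, b_true, b_false):
    """Uncorrected Pearson chi-square for a 2 (group) x 2 (true/false) table."""
    observed = [[a_true, a_false], [b_true, b_false]]
    row_totals = [a_true + a_false, b_true + b_false]
    col_totals = [a_true + b_true, a_false + b_false]
    n = sum(row_totals)
    if 0 in row_totals or 0 in col_totals:
        return 0.0  # an empty margin means the item cannot differentiate the groups
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed[i][j] - expected) ** 2 / expected
    return chi2


def differentiating_items(true_counts, group_size=30):
    """Return item numbers whose true/false split differs between two groups.

    true_counts maps item number -> (trues in group A, trues in group B);
    both groups are assumed to have group_size respondents.
    """
    selected = []
    for item, (a_true, b_true) in sorted(true_counts.items()):
        chi2 = chi_square_2x2(a_true, group_size - a_true,
                              b_true, group_size - b_true)
        if chi2 > CRITICAL_05_DF1:
            selected.append(item)
    return selected


# Made-up counts for three made-up item numbers (not MMPI data).
example = {1: (20, 10), 2: (15, 14), 3: (28, 12)}
print(differentiating_items(example))
```

Running the same function once per comparison (SCI-HC, then SCI-EC) would yield the two item pools that the separate factor analyses start from.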

Separate factor analyses were then conducted. 
The major factor resulting from the analysis of 
the items differentiating the SCI and EC groups 
was composed of 10 items (accounting for 26.1% 
of the variance) that gave a clear description of 
physical condition. Seven of these 10 items were 
among those identified by Taylor’s specialists. A 
similar factor resulted from the analysis of the 
SCI-HC items, although it emerged second to a
[...] complaint factor and contained fewer
items.

The SCI mean scale scores were then corrected
using the EC response proportion for items factor-
analytically identified and using Taylor's
items. Both the factor-item-corrected and Taylor-
item-corrected SCI profiles were significantly
less pathological than the uncorrected SCI profile
on five scales (1, 2, 3, 4, and 8), but they
were still significantly elevated when compared to
the EC group profiles.

In general, the emergence of a physical description
factor as the major component of
variance that distinguishes nonhospitalized pa-
tients from SCI subjects, and to a lesser extent 
the similar factor from the SCI-HC items, sup- 
ported the validity of both (a) the items selected 
in the Taylor study and (b) the nature of 
physical condition as a moderator variable. In 
the present study, the factor-identified correc- 
tion items (i.e., 273, 330, 9, 192, 63, 310, 51, 179, 
20, 62) and Taylor’s items resulted in significant
reductions of the MMPI scale means (see above)
of SCI patients. Even though both procedures
corrected for SCI physical description, the pro- 
files were still significantly different from the EC 
group. 

The present study, by using a more repre- 
sentative group of SCI patients, more appro- 
priate controls, and a more objective item-iden- 
tification procedure, supports Taylor’s contention 
that SCI can moderate MMPI item endorse- 
ments. To remove the unwanted effect of physi- 
cal description, it is recommended that the 
MMPI answer sheet be scored twice, once scor- 
ing the physical description items and once de- 
leting them. As noted by Taylor (1970), “this 
method specifies the minimum and maximum
limits for the patient on each of the affected 
scales” (p. 187). Moreover, it is recommended 
that the empirically derived (factor-analytically 
identified) items reported herein constitute the 
correction factor. 
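
A minimal sketch of the score-twice recommendation follows. The ten item numbers are the factor-identified items listed earlier in this report; the scale key and the response record are hypothetical placeholders, since the article does not reproduce the MMPI scoring keys, and the function names are illustrative only.

```python
# Factor-identified physical description items reported in the text.
PHYSICAL_ITEMS = frozenset({273, 330, 9, 192, 63, 310, 51, 179, 20, 62})


def raw_score(responses, scale_key, excluded=frozenset()):
    """Count keyed responses on a scale, skipping any excluded items.

    responses: item number -> True/False answer given by the patient.
    scale_key: item number -> answer that is scored on this scale.
    """
    return sum(
        1
        for item, keyed in scale_key.items()
        if item not in excluded and responses.get(item) == keyed
    )


def score_twice(responses, scale_key):
    """Return (minimum, maximum) raw scores: without and with the physical items."""
    with_items = raw_score(responses, scale_key)
    without_items = raw_score(responses, scale_key, PHYSICAL_ITEMS)
    return without_items, with_items


# Hypothetical key and answers for illustration; NOT an actual MMPI scale key.
TOY_SCALE_KEY = {9: False, 51: False, 100: True, 101: True}
toy_responses = {9: False, 51: True, 100: True, 101: False}

low, high = score_twice(toy_responses, TOY_SCALE_KEY)
print(low, high)
```

Because deleting items can only remove scored responses, the score without the physical items is never larger than the score with them, which is why the pair can be read as lower and upper limits on each affected scale.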
Meanings that Professionals Attach to Labels for Children


Charles F. Carroll
Yale University


N. Dickon Reppucci
University of Virginia


This study sought to identify relative meanings among professionals of three clinical 
labels for children—mentally retarded, emotionally disturbed, and juvenile delinquent 
—as well as an average or unlabeled choice. Forty regular classroom teachers and 32 
mental health workers responded on two questionnaires designed to measure their 
reactions to these labels with nine questions concerned with expectations for the
child’s success in school and work, implications for treatment strategies, and motivation
to work with the child. The labels conveyed clearly different relative meanings,
and the two professional groups differed in a consistent fashion.


Recent discussion and research regarding the 
labeling of children (e.g., Hobbs, 1975) has 
pointed to undesirable consequences of labels, for 
example, that labels affect professionals and 
trainees negatively (Foster, Ysseldyke, & Reese, 
1975). Nevertheless, labels continue in wide- 
spread use, since they facilitate both accounta- 
bility for governmental funding and determina- 
tion of eligibility for services. In addition, pro- 
fessionals in human service and education need 
shorthand methods of communicating with each 
other and appear to find current classification 
methods meaningful and helpful. The suggested 
use of more general labels, such as “children 
with special needs,” is problematic not only 
because many more children could be so labeled 
but also because more general labels do not fill 
the governmental and professional needs just 
mentioned. Therefore, current labels are likely to 
persist, and research that focuses on questions 
such as the following becomes critical. What are 
the relative meanings attached to various labels? 
Is one label more positive than another? Does a 
label have different meanings for different pro- 
fessional groups? 

The purposes of the present investigation were 
twofold: (a) to examine the relative meanings 
among professionals of nonlabeled or average 
children and three common labels for children— 
mentally retarded (MR), emotionally disturbed 
(ED), and juvenile delinquent (JD)—and (b) to 
determine what, if any, differences there are 
between teachers and mental health workers in 
response to these labels. Meanings investigated
included the professional’s expectations for the 
child’s success in school and work, implications 
for treatment and placement strategies, and mo- 
tivation to work with the child. Teachers and 
mental health workers were studied because (a) 
both groups come into contact with children 
having a range of labels; (b) mental health work- 
ers frequently consult with teachers; and (c) 
children labeled by one group usually retain that 
label as they contact the other group. 

Forty regular classroom (Grades 6-9) teachers 
from three middle schools and 32 mental health 
workers, including psychiatrists, psychologists, 
social workers and counselors, from four child 
guidance clinics and one mental health clinic 
participated in the study. The schools and clinics 
serve Connecticut areas of varied size, family 
income, and racial composition. Each subject
completed a case study questionnaire and a labels
only questionnaire. Subjects were first presented
with the case study questionnaire, a
standard six-sentence case study of a 13-year-old
boy either untitled or titled with one of the
three clinical labels, MR, JD, or ED. The same 
label, or no label, was repeated in the first sen- 
tence. Subjects answered on a 7-point scale (ex- 
tremely negative to extremely positive) nine 
questions about the boy that reflected the three 
attitude areas mentioned above. 

Ten teachers and eight mental health workers
were randomly assigned to each of the four case
study questionnaire conditions. The labels only
questionnaire consisted of a repeated measures
design; each subject was asked to answer the
same nine questions in regard to a 13-year-old
boy for each of the three clinical labels alone
and for “average, without any of these problems.”
The first author administered both questionnaires
in the subject’s own work setting.

When a significant F statistic was obtained
by a least squares solution to analysis of
variance (α = .05), multiple comparisons were
performed by t test (α = .05) or Newman-Keuls
(α = .05 for the case study questionnaire;
[...]1 for the labels only questionnaire). On
the case study questionnaire, teachers, in contrast
to mental health workers, rated the children
in all label conditions as less likely to
finish high school, F(1, 64) = 4.05, p < .05, and
less motivated to learn in school, F(1, 64) =
[...], p < .05, and they rated themselves lower
on knowledgeableness for working with the
child, F(1, 64) = 5.40, p < .05, and willingness to
[...], F(1, 64) = 6.96, p < .05. Similar results
were obtained on the labels only questionnaire.
Teachers also rated themselves as less knowledgeable
and less willing to work with all clinically
labeled children as compared with the
average, whereas the mental health workers
rated themselves as less knowledgeable and willing
[...] with the MR, being uniformly high for
the ED, JD, and average; knowledgeableness, F(3,
210) = [...], p < .001; willingness, F(3, 210) =
[...], p < .001. Similar results were found regarding
expectations for success at future skilled
employment, F(3, 210) = 4.06, p < .01. On both
questionnaires mental health workers were less
approving of primarily special class placement
than teachers. For the case study questionnaire,
F(1, 64) = 3.59, p < .06; for the labels
only questionnaire, F(1, 70) = 5.22, p < .05. All
analyses for professional groups were statistically
controlled for sex and age bias. In summary,
labels had more negative effects for teachers
than for mental health workers in the areas
of professional motivation and expectations for
the child’s success. For mental health workers,
[...] MR had more negative effects than the
[...]

There were differences among the labels that
were the same for the two professional groups.
On the case study questionnaire, subjects reported
themselves as less knowledgeable when the
[...] was involved than when no label was
[...], F(3, 64) = 2.87, p < .05. The labels
only questionnaire results (df = 3, 210, p <
[...]) indicated that subjects tended to view
[...] as less likely to finish high school
([...]) and as less motivated to learn in
school (F = 83.35). They also rated the JD and
ED as most likely to have serious need for all
three treatments: help for his family (F = 112.95),
someone to talk about his problems with (F= 
38.30), and professional help to work out diffi- 
culties he has getting along with other people 
(F = 52.58). The MR were rated in the middle, 
and the average, lowest. There was one contra- 
dictory result in this area on the case study 
questionnaire: Subjects viewed the families of 
the MR as more in need of professional help 
than the families of the JD or unlabeled boys, 
F(3,64) =3.19, p<.05. Finally, on the ques- 
tion, “How approving are you of primarily spe- 
cial class placement for John?” on the case study 
questionnaire, the MR received more approval 
than the JD or unlabeled and the ED, though 
high, were not statistically differentiated from 
the other labels, F(3, 64) =3.64, p<.05. On 
the labels only questionnaire, subjects rated the 
MR and ED highest, JD next, and average low- 
est, F(3, 210) = 84.35, p < .001. To summarize, 
professionals expressed less motivation to work 
with the MR, who were perceived as generally 
less in need of clinical services but likely candi- 
dates for special class placement. JDs were 
rated as low on expectations for success in school, 
usually as highly in need of clinical services but 
as inappropriate for special class placement. The 
ED were uniformly rated highly in need of 
clinical services and highly appropriate for spe- 
cial class placement. 

Although sex differences were sparse, there 
were five significant (df = 3, 210, p < .05) Sex
× Label interactions on the labels only questionnaire.
In essence, women were more positive
than men regarding treatment for the clinically 
labeled children, especially the ED and JD (fam- 
ily help, F=2.77; for talk about problems, 
F=3.19; help in interpersonal relations, F= 
5.16). At the same time, the women were less 
willing to work with the JD (F= 23.05) and 
more pessimistic about the MR finishing school 
(F =2.61). 

In conclusion, the present investigation fo- 
cuses attention on three neglected aspects of 
the labeling of children: (a) Professionals attach
different relative meanings to clinical labels,
(b) teachers and mental health workers
may respond differently to labeled children, and 
(c) professional groups may have different rela- 
tive responses to common clinical labels. The 
study suggests that blanket indictments of a 
specific label may miss various subtleties that 
can only be determined by detailing different 
relative meanings of common labels. Moreover, 
understanding differences between professional
groups may be even more important, as children 
labeled by one profession frequently keep that 
label when they come into contact with members 
of other professional groups. Therefore, it is 
imperative to delineate both the beneficial and 
the negative aspects of different labels among 
various professional groups. 
Effects of Anxiety and Sex on Neuropsychological Tests


Glen D. King, H. Julia Hannay, Bruce J. Masek, and Joan W. Burns
Auburn University


The effects of anxiety and sex on neuropsychological test performance were studied.
Thirty male and 30 female right-handed subjects responded to the Finger Tapping


Neuropsychological tests to detect and locate
brain pathology are being used increasingly by
psychologists, especially in psychiatric clinics.
Tests that are commonly used in neuropsychological
batteries include the Finger Tapping
Test (FT) and the Form Board Test (FB). For
the FT, the subject taps an armature connected
to a counter as fast as possible for 10
sec. The FB test requires that a blindfolded subject
fit 10 blocks of various sizes into their
proper places on a formboard, first with one
hand, then the other, and then both hands together.
The total time across the three administrations
is computed. The subject is asked to
draw [...] of the shapes on paper in their spatial
relationship after removal of the board. This
yields a measure of memory (shapes remembered)
and localization (number of correct shapes
correctly located).

From repeated experience with these tests in
[...] and demonstration situations, we have
[...] (a) performance differed by sex, but
[...] norms are not provided, and (b) anxiety
seemed to predict poor performance. To
test these notions, a study was designed using 30
male and 30 female right-handed college student
volunteers with no history of brain pathology.
Subjects were administered the State-Trait
Anxiety Inventory and the FT and FB
tests, with the order of FT and FB counterbalanced
within each sex. Analyses revealed no test
[...]. Original cutoff points for brain damage
([...], 1955) resulted in numerous false positives
across both sexes for all of the tests. However,
when the performance of our subjects was
compared to more recent normative data for
these tests (Kløve, 1974; Vega & Parsons, 1967),
we found that the number of false positives was
substantially reduced.

Requests for reprints should be sent to Glen D.
King, Department of Psychology, Auburn University,
Auburn, Alabama 36830.

In consideration of sex differences on FT, 
females (M = 43.1) tapped fewer times than 
males (M = 47.5) across hands, F(1, 56) = 15.6, 
p<.01, with the sex difference being much 
larger for the right hand than the left as indi- 
cated by a Sex X Hand interaction, F(1,56) = 
5.61, p < .05. These results are inconsistent with 
previous data showing women to be superior on 
manual dexterity tasks (Maccoby & Jacklin, 
1974) and warrant further study. Other compari- 
sons for sex were not significant. Considering 
anxiety, there was a negative correlation between 
trait anxiety (A-Trait) and FT performance for 
females; the higher the A-Trait, the lower the 
number of average taps on FT (r=—.41, p< 
.01). 

Similarly, for women only, A-Trait was posi- 
tively correlated with time to complete the FB 
test using the preferred hand (r = .38, p < .02)
and both hands together (r=.34, p< .03). It 
should be remembered that the subjects were col- 
lege students. Only a few had clinically elevated 
anxiety scores, and anxiety seemed more preva- 
lent among women than men. These results could 
underestimate the effects of anxiety on neuro- 
psychological tests, especially when the tests are 
used with patients from a psychiatric popula- 
tion or with patients who suspect the presence of 
a serious neurologic disorder. 

These results question the validity of these
neuropsychological tests. For the tests to be
valid, it appears that at least the FT needs to
have separate norms for each sex. In addition,
high anxiety in women had a deleterious effect
on their performance on the FT and FB tests. 
This relationship needs to be carefully consid- 
ered when neuropsychological tests are admin- 
istered to and interpreted for psychiatric pa- 
tients and neurological patients who often have 
high anxiety levels. The sensitivity of other 
neuropsychological tests to anxiety and sex dif- 
ferences needs to be investigated. 
Heterogeneity Among Lower-Class Therapy Clients: A Comparison of
Class IV and Class V Psychotherapy Recipients


Michael T. Nietzel, Matthew G. Hile, and Charles Y. Kondo
University of Kentucky


Demographic and treatment-relevant differences were studied among 73 lower socioeconomic
class psychotherapy clients at a predominantly rural comprehensive mental
health center. Data were drawn from the records of all adult (46 females; 27 males)
psychotherapy cases initiated and terminated in a recent 15-month period. Results
indicate several differences among Social Classes IV and V despite the literature’s
prevalent assumption of IV-V equivalency. In addition, sufficient heterogeneity exists
among Class V clients to justify a partitioning of that class into two distinct subgroups.
Class IV and Class V clients also differed on a global measure of final outcome.


A substantial literature has attempted to assess
the influence of client socioeconomic status
on the psychotherapeutic process. One
[...] opinion shared by much of this research
is an assumption of homogeneity among
[...] clients with regard to demographic
and treatment-related variables. This assumption
was examined in the present study.

Subjects were persons who presented themselves
for outpatient treatment at a federally
[...] Community Mental Health Center in
[...] Kentucky. All cases initiated and
terminated within a 15-month period were examined,
and clients classified as belonging to Class
IV or Class V (Hollingshead, Note 1) were selected
(excluding minors and persons in special
[...]). The final study group consisted of 73
[...] (females, n = 46; males, n = 27); 95% of
the sample was white.

Using Hollingshead’s weightings, three distinct
groups were obtained (SES IV, M = 55.6;
SES Va, M = 63.8; SES Vb, M = 74.2), which
[...] indicated differed at the p < .02 level.
[...] that Class V clients might be
differentiated in this manner in subsequent
[...]

Seventy-three files were examined by three
psychology graduate students who made committee
decisions on data drawn from the records.
Demographic variables were age, sex, location of
residence, education, occupation, income, marital
status, number of dependent children, and type
of family. Treatment-related information included
source of referral, diagnosis, treatment received,
number of sessions and weeks in treatment,
fee, person responsible for termination,
and final outcome. Contrasts of interest were
tested using the chi-square statistic.

Class IV and Class V (total) clients differed
significantly on occupation (p < .001) but not on
amount of education. Class V clients were less
likely to be married than Class IV clients (p <
.06), and, if married, they were less likely to
have children (p < .05). No differences were
noted between the two groups on other demographic
variables. Class IV and Class V clients
did differ significantly on final outcome of therapy
(p < .02). Class IV clients were more likely
to have terminated therapy prior to their therapists’
recommendations, whereas clients from
Class V were more likely either to have refused
service initially or to have improved (therapist
evaluation) at the time of mutual termination.
No other treatment variables differentiated the
two groups. Comparisons of Class IV clients with
Class Va and Vb persons evidenced the same
trends and differences. In addition, Class Vb
received the diagnoses of schizophrenia, personality
disorder, or transient situational disturbance
more frequently than Class IV clients (p < .05).
be defined by a locus of control score of 6 or
below, the more externally controlled alcoholic by 
a score of 7 or above. Thus, some degree of 
standardization is permitted, and researchers can 
test the means of their samples with the mean 
of the large normative group. The use of this 
base also permits ready comparison among other 
subpopulations of alcoholics. It is only when the 
most meaningful subgroups of alcoholics are so 
normed that useful comparisons and needed 
generalizations will emerge. 
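
The cutoff and the sample-versus-norm comparison described above can be sketched as follows. This is an illustration under stated assumptions: only the cutoff (6 or below versus 7 or above) comes from the text; the label "more internal" for the lower range is inferred from the contrast with "more externally controlled", the normative mean passed to the function is a placeholder because none is reported here, and the one-sample t statistic is one common way to compare a sample mean with a fixed norm, not necessarily the authors' choice.

```python
import math
import statistics


def classify_locus_of_control(score, cutoff=6):
    """Label a score using the stated cutoff: 6 or below vs. 7 or above."""
    return "more internal" if score <= cutoff else "more external"


def one_sample_t(sample_scores, normative_mean):
    """t statistic comparing a sample mean with a fixed normative mean.

    Returns (t, degrees of freedom). normative_mean must be supplied by the
    user; no value is given in the text.
    """
    n = len(sample_scores)
    mean = statistics.mean(sample_scores)
    sd = statistics.stdev(sample_scores)  # sample SD (n - 1 denominator)
    t = (mean - normative_mean) / (sd / math.sqrt(n))
    return t, n - 1


# Made-up scores and a made-up normative mean, for illustration only.
scores = [6, 8, 10]
print([classify_locus_of_control(s) for s in scores])
print(one_sample_t(scores, normative_mean=6))
```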
Distinguishing Learning-Disabled and Emotionally Disturbed Children
on the WISC-R


Raymond S. Dean
Arizona State University


This study concerned the isolation of distinctive Wechsler Intelligence Scale for Chil- 
dren-Revised (WISC-R) subtest patterns that would differentiate the performance of 
emotionally disturbed and learning-disabled children. A stepwise discriminant analysis 
was used to evaluate the subtest scores of 60 learning-disabled and a matched sample 
of 60 emotionally disturbed children. Four subtests of the WISC-R differentiated 
significantly between diagnostic categories. Learning-disabled children performed predictively
poorer on the Block Design, Picture Arrangement, and Object Assembly
subtests and higher on Vocabulary than their emotionally disturbed counterparts.
These results are interpreted as a deficit in perceptual organization for children with
specific learning problems.


A problem frequently encountered by the
clinician is the differentiation of emotionally
disturbed and learning-disabled children. With
the publication of the Wechsler Intelligence
Scale for Children (WISC; Wechsler, 1949)
came many attempts to aid such diagnosis with
subtest patterns. Despite this long-standing in-
terest, studies dealing with diagnostic implica-
tions of WISC scores have been inconclusive.
A close scrutiny of past research dealing with
characteristic WISC subtest patterns reveals
certain methodological shortcomings. Of concern
in the majority of studies is the univariate treat-
ment of subtests that are multiple variates. As
multiple measurements of the same individuals,
subtests are interdependent and should not be
[...] off and considered univariately. Rather,
[...] subtest patterns would be more heuris-
tically assessed using a multivariate procedure
that considers subtests in combination. Another
noteworthy concern in past research is the pos-
sible contaminating effects of inconsistently de-
[...] in the selection of subjects (Dean,
[...]). The present analysis concerned the isolation
of a distinct subtest pattern on the Wechsler
Intelligence Scale for Children-Revised (WISC-R;
Wechsler, 1974) that would discriminate
between learning-disabled and emotionally dis-
turbed children. In distinguishing between
groups, a stepwise discriminant analysis was used
as a base for evaluating the predictive power of 
subtests. Such an analysis is used to statistically 
discriminate between two or more groups on the 
basis of a number of variables. In the present 
situation, groups were defined in terms of their 
diagnostic classification. Subtest data and IQ 
composites were used as the discriminating 
variables on which the present groups were ex- 
pected to differ. 


Method 


The subjects were 120 Caucasian children from 
the Phoenix, Arizona, area who had been diag- 
nosed by two experienced psychologists as either 
learning disabled or emotionally handicapped. 
The emotionally disturbed group, composed of 48 
males and 12 females, displayed specific conduct 
disorders: aggressiveness, temper tantrums, anxi- 
ety states, and so forth. Moreover, the behavioral 
criteria proposed by Quay (1972) were used for 
selection. None of the children in this group was 
reported to have specific or generalized problems 
in learning. A comparable group of subjects di- 
agnosed as learning disabled comprised the sec- 
ond group. The twofold basis for selection of 
this group was (a) the criterion recommended by 
Chalfant and Scheffelin (1969) for the identifi- 
cation of children with specific learning disabili- 
ties and (b) their goodness of match with chil- 
dren in the emotionally disturbed group. This 
match was performed with respect to sex, age
(±2 months), present grade placement, and
socioeconomic status as determined by the occu-
pation of the family’s major wage earner. Chil-
dren in both groups ranged in chronological age
from 6.4 to 13.6 years, with a mean of 10.06 (SD 
=3.17) for the learning-disabled group and 
10.13 (SD = 3.21) for the emotionally disturbed 
children. 

Subjects were individually administered the 
regular subtests of the WISC-R during a 6- 
month period, which were scored in the usual fash- 
ion. Subtest scale scores, along with the WISC-R 
Verbal, Performance, and Full Scale IQ scores, 
were calculated for each child. 


Results and Discussion 


The discrepancy between the Verbal IQ and 
Performance IQ means for each diagnostic group 
was evaluated using a t comparison for depen-
dent samples. This analysis showed the emotion-
ally handicapped children to have a significantly
higher Performance mean than that obtained on
the Verbal scale, t(119) = 16.88, p < .01. The
differences between Verbal and Performance IQs 
were not significant for learning-problem chil- 
dren. Hence, this finding supports the conten- 
tion of Performance IQ superiority for acting-out 
children but not for those with learning prob- 
lems, replicating the findings of Dean (1977). A 
2 (sex) X 2 (diagnostic group) multivariate 
analysis of variance was computed to test for 
equality of subtest and IQ composite means. 
The analysis revealed a significant multivariate F 
ratio for groups (p<.01). The multivariate F 
for sex and the Sex X Group interaction was not 
significant. The significant diagnostic group mul- 
tivariate F yielded significant univariate com- 
parisons for the Block Design, F(1,118)= 
8.43, p<.01, and Object Assembly, F(1, 118) 
= 5.31, p < .05, subtests, with both favoring chil- 
dren in the acting-out group. 

A stepwise discriminant analysis using the 
WISC-R subtests and IQ composites as dis- 
criminator variables was computed. The selection 
of variables included was based on the signifi- 
cance of the function as measured by overall mul- 
tivariate F ratio with the addition of each vari- 
able. Using such a procedure results in the choice 
of an optimal set of discriminating variables. 
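
The forward-selection logic described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the procedure or software used in the study: the data are synthetic, the function names are hypothetical, and the F-to-enter threshold of 4.0 is an arbitrary choice. At each step the sketch enters the variable with the largest partial F, computed from the ratio of Wilks' lambda after and before entry, with denominator degrees of freedom n - g - p (118, 117, 116, ... for 120 children and two groups, the same pattern as the F ratios reported for this analysis).

```python
import random


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    size = len(a)
    det = 1.0
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for r in range(col + 1, size):
            factor = a[r][col] / a[col][col]
            for k in range(col, size):
                a[r][k] -= factor * a[col][k]
    return det


def sscp(rows, cols):
    """Sums of squares and cross products about the mean, for the chosen columns."""
    means = [sum(r[c] for r in rows) / len(rows) for c in cols]
    return [[sum((r[ci] - mi) * (r[cj] - mj) for r in rows)
             for cj, mj in zip(cols, means)]
            for ci, mi in zip(cols, means)]


def wilks_lambda(groups, cols):
    """det(within-group SSCP) / det(total SSCP); 1.0 when no variable is entered."""
    if not cols:
        return 1.0
    size = len(cols)
    within = [[0.0] * size for _ in range(size)]
    for rows in groups:
        for i, row in enumerate(sscp(rows, cols)):
            for j, value in enumerate(row):
                within[i][j] += value
    total = sscp([r for rows in groups for r in rows], cols)
    return determinant(within) / determinant(total)


def forward_stepwise(groups, n_vars, f_to_enter=4.0):
    """Enter variables one at a time by largest partial F (hypothetical threshold)."""
    n = sum(len(rows) for rows in groups)
    g = len(groups)
    entered, steps, current = [], [], 1.0
    while len(entered) < n_vars:
        best = None
        for v in range(n_vars):
            if v in entered:
                continue
            lam = wilks_lambda(groups, entered + [v])
            ratio = lam / current
            f = (n - g - len(entered)) / (g - 1) * (1 - ratio) / ratio
            if best is None or f > best[1]:
                best = (v, f, lam)
        if best[1] < f_to_enter:
            break
        entered.append(best[0])
        current = best[2]
        steps.append(best)
    return steps


rng = random.Random(0)
# Synthetic scores (not study data): variable 0 separates the groups strongly,
# variable 1 not at all, variable 2 only weakly.
group_a = [[rng.gauss(10, 3), rng.gauss(10, 3), rng.gauss(10, 3)] for _ in range(60)]
group_b = [[rng.gauss(16, 3), rng.gauss(10, 3), rng.gauss(11, 3)] for _ in range(60)]
steps = forward_stepwise([group_a, group_b], n_vars=3)
for variable, f_value, lam in steps:
    print(variable, round(f_value, 2), round(lam, 3))
```

Each printed row gives the variable entered, its F-to-enter, and the cumulative Wilks' lambda, which can only shrink as variables are added.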

Of the 14 possible variables, the Block De-
sign, F(1, 118) = 8.43, p < .01, Vocabulary,
F(1, 117) = 9.45, p < .01, Picture Arrangement,
F(1, 116) = 6.96, p < .01, and Object As-
sembly, F(1, 115) = 6.08, p < .01, subtests added
significantly to the amount of centroid separation
between groups. The Wilks' lambda chi-square
transformation showed this derived function of
variables to be highly reliable, χ²(4) = 2[...], p
< .001.

The present results provide an interesting and
somewhat unexpected view of children with
learning difficulties when compared with chil-
dren whose problems involve inappropriate be-
havior. In general, children diagnosed as learning
disabled scored predictively lower on the Block
Design, Picture Arrangement, and Object As-
sembly subtests and higher on Vocabulary than
emotionally disturbed children. With the excep-
tion of Vocabulary, these subtests correspond to
a WISC-R factor that Kaufman (1975) has in-
terpreted as Perceptual Organization, and they
involve visually guided motor activity. Prior evi-
dence indicates that these subtests also require
the capacity to overcome the embedding of an
item from its surrounding context (Goodenough
& Karp, 1961). In the main, then, children who
were learning disabled appeared more field de-
pendent in their performance on the WISC-R.
This finding suggests a disturbance on the part
of learning-problem children in perceptual i[...]-
tion, whereas children with behavior problems
displayed more of a verbal deficit. Generally,
learning-disabled children performed more ade-
quately on tasks within the performance [...]
that required verbal skills than those subtests
that required nonverbal visual constructive abil-
ities.

Application of Electromyographic Biofeedback to the Relaxation Training
of Schizophrenic, Neurotic, and Tension Headache Patients


This study examined the effects of electromyographic (EMG) biofeedback on tension
reduction by schizophrenic, neurotic, and tension headache patients. Fourteen patients
participated voluntarily in at least 10 weekly EMG biofeedback sessions at a public
outpatient clinic. All had complained of chronic tension. Patients showed significant
decreases in their muscle tension levels with successive biofeedback training sessions.
No significant differences were found between the schizophrenic, neurotic, and tension
headache groups. A further contribution was the finding that patients with diverse
socioeconomic and educational levels benefited similarly from EMG biofeedback
training.


Electromyographic (EMG) biofeedback of
frontalis muscles has recently been reported as
effective for nonpatient volunteers in treating
problems of chronic tension and tension head-
aches (Budzynski, Stoyva, & Adler, 1970; Raskin,
Johnson, & Rondestvedt, 1973). To date, no
study has compared the effectiveness of EMG
biofeedback on the frontalis muscles with differ-
ent diagnostic groups of patients in a public psy-
chiatric outpatient clinic. The present study was
done to determine if outpatients diagnosed as
schizophrenic, neurotic, or having tension head-
aches with serious and chronic tension would re-
spond to EMG biofeedback.

The subjects were 14 outpatients whose clini-
cal diagnoses were schizophrenia (6); neurosis
(5); and tension headache (3). Ten of the pa-
tients were women with a mean age of 38 years;
4 were men with a mean age of 40 years. All
subjects participated voluntarily in this study to
learn tension reduction.


This study was presented at the annual meeting
of the Western Psychological Association, Seattle,
Washington, April 23, 1977.

Appreciation is extended to Lupe Guzman for her
help in collecting the data.

Requests for reprints and for an extended report
of this study should be sent to Frank X. Acosta,
Department of Psychiatry and the Behavioral Sci-
ences, University of Southern California School of
Medicine, [...] Hospital Place, Los Angeles, Cali-
fornia [...].


EMG levels were recorded in microvolts
(range = 2-150 μV) by a Model BTF-401 Bio-
feedback Technology, Inc., instrument. Three
silver/silver chloride electrodes were placed
across the forehead, approximately 1 inch (2.54
cm) above the eyebrows. A modulated tone via
a table-mounted speaker provided auditory feed-
back to the patient. The pitch of the tone re-
flected the level of frontalis muscle tension.

Patients were seated in a reclining chair with
eyes closed. Patients were scheduled for 1 bio-
feedback session per week and had to complete 
10 or more sessions to be included in the study. 
The first visit by each patient to the biofeedback 
laboratory consisted of (a) taking baseline EMG 
measures; (b) the collection of demographic 
data; (c) administration of a brief IQ test, the 
Kent Intelligence Scale; and (d) viewing of a 
slide-tape cassette. Using cartoon characters and 
photographs, the slide tape explained the con- 
cept of biofeedback, its purpose, and how the 
equipment worked. Fifteen minutes of EMG 
baseline were recorded, which consisted of simply 
recording frontalis muscle EMG without any 
feedback being given; values were generated at 
30-sec intervals. The technician recorded the 
values and plotted them on semilog graph paper. 
At the end of the baseline session and on all sub- 
sequent sessions, the subject was shown the plot 
of his/her EMG values and was told whether it 
indicated high, moderate, or low muscle tension. 

The second visit by each subject was the first 
biofeedback training session. Each session con- 
sisted of 10 minutes of prebiofeedback EMG 
baseline, during which no biofeedback tone was
heard; followed by 15 minutes of biofeedback, 
during which the subject was instructed to re- 
duce his/her tension by reducing the pitch of 
the tone from the speaker; and then 5 minutes 
of postbiofeedback baseline, again with no tone. 
Subjects were instructed to try and relax with 
their eyes closed but were not to fall asleep. 

For data analysis, each biofeedback session 
was divided into four within-session measures: 
prebiofeedback baseline (10 minutes), first half 
of the biofeedback (7½ minutes), second half of
the biofeedback (7½ minutes), and postbiofeed-
back baseline (5 minutes). Mean EMG values
for each of the four measures were recorded for 
each biofeedback session. Several sessions were 
averaged to form three session blocks to even 
out some of the session-to-session within-subject 
variation. Block 1 consisted of Session 1 modi- 
fied by averaging 15 minutes of no-feedback 
baseline of the subject’s first visit with the 10 
minutes of prefeedback baseline of Session 1. 
This gave a total of 25 minutes of prefeedback 
baseline covering two different visits. Block 2 
consisted of averaging Sessions 4, 5, and 6. Block 
3 consisted of averaging Sessions 8, 9, and 10. 
The study consisted of a 3 X 3 X 4 factorial de- 
sign, with 3 levels of diagnosis, 3 levels of ses- 
sion block, and 4 levels of within-session mea- 
sures. 
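
The blocking scheme just described can be sketched as follows. This is a minimal illustration, not the authors' procedure: the EMG values are invented, the names are hypothetical, and the Block 1 baseline is assumed to be a time-weighted average of the 15-minute first-visit baseline and the 10-minute Session 1 prefeedback baseline (the text does not say whether the two were weighted by duration).

```python
from statistics import mean

MEASURES = ("pre_baseline", "feedback_first_half",
            "feedback_second_half", "post_baseline")


def session_blocks(first_visit_baseline, sessions):
    """Collapse sessions into three blocks of four within-session means.

    sessions maps session number -> {measure name -> mean EMG in microvolts}.
    """
    # Block 1: Session 1, with its prefeedback baseline merged with the
    # first-visit baseline (time-weighted over 15 + 10 minutes; an assumption).
    block1 = dict(sessions[1])
    block1["pre_baseline"] = (
        15 * first_visit_baseline + 10 * sessions[1]["pre_baseline"]
    ) / 25

    def average(session_numbers):
        return {m: mean(sessions[s][m] for s in session_numbers) for m in MEASURES}

    # Block 2 averages Sessions 4-6; Block 3 averages Sessions 8-10.
    return {1: block1, 2: average((4, 5, 6)), 3: average((8, 9, 10))}


# Invented values for one hypothetical subject: tension falls across sessions
# and across the four within-session measures.
sessions = {
    s: {m: 30.0 - 2 * s - i for i, m in enumerate(MEASURES)}
    for s in range(1, 11)
}
blocks = session_blocks(first_visit_baseline=33.0, sessions=sessions)
for number, values in blocks.items():
    print(number, values)
```

Each subject then contributes 3 blocks × 4 measures = 12 values to the repeated measures analysis, with diagnosis as the between-subjects factor.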

The results of a three-way analysis of vari- 
ance with repeated measures showed a significant 
decrease (p<.05) in EMG levels across the 
three session blocks. Duncan’s multiple-range 
test indicated that Blocks 2 and 3 both differed 
significantly from Block 1 (p < .05). This indi-
cated that the three patient groups did lower
their EMG levels with successive biofeedback 
training sessions. Although the schizophrenic (M 
= 18.0), neurotic (M = 16.1), and tension head- 
ache (M = 29.8) groups had different overall 
mean EMG levels, the analysis of variance 
showed no significant differences between the 
three diagnostic groups. No significant differences 
were noted for the within-session measures nor 
for any interactions. 

To see if factors other than diagnosis may 
have affected the biofeedback learning, subjects 
were grouped according to the following factors:
intelligence, education, social class, motivation, 
source of referral to the study, and final disposi- 
tion of the patient. Analyses of variance were 
performed on subject EMG levels for these fac- 
tors, and no significant effects were found. How- 
ever, the analyses of EMG values across Ses- 
sion Blocks 1, 2, and 3 were consistently signifi-
cant (varying from p < .01 to p < .10). This
persistent decline in EMG values across the 10 
sessions further supports the findings that the 
patients did, in fact, lower their frontalis muscle 
tension with successive biofeedback training ses- 
sions. 

The findings on IQ and education are of par- 
ticular importance when compared with Town- 
send, House, and Addario’s (1975) biofeedback 
study in a clinical setting, which screened out 
subjects who fell in the low IQ ranges or had 
less than an eighth-grade education. In contrast, 
our study suggests that learning to reduce ten- 
sion through EMG biofeedback is applicable to 
individuals with diverse backgrounds. 

Unlike other biofeedback studies that usually 
report findings based on 2-5 sessions per week 
(Budzynski et al., 1970; Raskin et al., 1973; 
Townsend et al., 1975), the patients in our study 
typically had only one biofeedback session per 
week. Interestingly, this less frequent rate of 
biofeedback proved to be sufficient to produce 
a significant reduction in frontalis muscle tension. 

Effects of Dopamine Synthesis Inhibition
on WAIS Comprehension 


Edward F. Donnelly, Henry A. Nasrallah, Richard Jed Wyatt,
J. Christian Gillin, and Llewellyn B. Bigelow


National Institute of Mental Health, St. Elizabeths Hospital, Washington, D.C. 


A previous study has shown that parkinsonian patients treated with L-dopa had 
decreased scores over time on the Comprehension subtest of the Wechsler Adult 
Intelligence Scale (WAIS), whereas scores on all other 10 subtests of the WAIS 
increased. It was hypothesized that if L-dopa treatment, which increases do- 
pamine activity in the brain, is directly related to an apparent deleterious effect 
on the WAIS Comprehension, then a drug such as alpha-methyl-para-tyrosine 
(AMPT), which decreases dopamine activity, might have an augmenting effect 
on this subtest. A therapeutic trial of AMPT in a group of chronic schizophrenic 
patients provided an opportunity to test this hypothesis. It was found that Com- 
prehension scores improved significantly with AMPT. Other clinical rating in- 
struments failed to show any changes. The implications of using a psychometric 
instrument to assess specific, but clinically obscured, drug effects on intellectual 
functioning are discussed. 


Requests for reprints should be sent to Edward
F. Donnelly, National Institute of Mental Health,
St. Elizabeths Hospital, William A. White Building,
Washington, D.C. [...]


Over the past 25 years, treatment of the
schizophrenic disorders has been dramatically
[...] ([...] Mann, & Margolis, 1964; Lasky,
[...] Caffey, Bennett, Rosenblum, & Hollis-
ter, 1962; National Institute of Mental
Health, 1964).

Frequent side effects of the treatment of
schizophrenic patients with antipsychotic
drugs are tremors, muscle rigidity, stiff pos-
ture, [...] gait, and drooling. Because of
the similarity of these side effects to the
symptoms of Parkinson’s disease, they have
been referred to as “parkinsonian” side ef-
fects. [...] it has been shown that these
side effects in schizophrenic patients are re-
lated to the blockade of dopamine receptors
by antipsychotic drugs (Snyder, Banerjee,
Yamamura, & Greenberg, 1974). Research in
parkinsonism itself has revealed that some
brain areas contributing to the control of body 
movements have lower concentrations of the 
neurotransmitter dopamine than normals. 
When parkinsonian patients were treated 
with L-dopa (dopamine itself cannot be used 
in replacement therapy because it does not 
cross between the blood stream and brain 
tissue, whereas its synthetic precursor, L- 
dopa, does), their symptoms often disap- 
peared or became less severe. 

Improvement of schizophrenic symptoms 
seems to be associated with drugs that de- 
crease brain dopamine activity, whereas im- 
provement in parkinsonian symptoms is asso- 
ciated with drugs that increase dopamine 
activity in the brain. 

It is generally agreed that the Comprehen- 
sion subtest of the Wechsler Adult Intelli- 
gence Scale (WAIS) is a measure of some 
form of “judgment.” A recent study (Don- 
nelly & Chase, 1973) of parkinsonian patients 
placed on therapeutic trials of L-dopa re- 
ported a decrease in the Comprehension scores 
after several months of treatment; in con-
trast, scores on all other 10 subtests of the
WAIS showed an increase. A possible drug- 
induced defect in this type of judgment is 
consistent with exacerbation of psychosis in 
chronic schizophrenic patients after L-dopa 
treatment (Yaryura-Tobias, Wolpert, Dana, 
& Merlis, 1970) and with other reports of 
agitation, confusion, paranoia, delusions, and 
depression in parkinsonian patients after L- 
dopa treatment (Cotzias, Papavasiliou, & 
Gellene, 1969; Yahr, Duvoisin, Shear, Bar- 
rett, & Hoehn, 1969; Barbeau, 1972). These 
findings, suggesting that L-dopa is detrimental 
to cognitive functioning, led us to hypothe- 
size that depleting the brain of its dopamine 
stores might be accompanied by an increase in 
the scores of the Comprehension subtest. 
Alpha-methyl-para-tyrosine (AMPT), a drug 
known to inhibit the rate-limiting enzymatic 
step in dopamine synthesis (Spector, Sjoerd-
sma, & Udenfriend, 1965), was used to test
our hypothesis. 

The main objective of the present study, 
a part of a more extensive study of the po- 
tential clinical effectiveness of AMPT in 
schizophrenia (Nasrallah et al., 1977), was 
to determine if the WAIS Comprehension 
scores of a small but relatively homogeneous 
group of chronic schizophrenic patients, al- 
ready receiving high doses of antipsychotic 
drugs (e.g., chlorpromazine, fluphenazine, and 
thioridazine), would increase with the ad- 
ministration of AMPT. 


Method
Subjects


[...] participated in the present [...]
age from 19 to 35 years, [...].
Length of hospitalization ranged from 2 to 15 years,
with [...] years. Each patient satisfied the
research diagnostic criteria (Spitzer, Endicott, &
[...]) [...] chronic undifferentiated schizo-
[...].


Procedure


The patients had been on the research [...]
4 to 13 months when consent [...] for
[...] of medication at the time of the study, [...]
indicated that greater doses of antipsychotic [...]
would not have enabled them to leave the hospital
for a sustained period of time. Each of the patients
had been stabilized on a constant dose of antipsy-
chotic drugs (average equivalent dose of chlorpro-
mazine = 800 mg/24 hr) for several months before the
study, and all were kept on the same dose through-
out the study. Patients entered the protocol [...]
12-month period as they became available.

AMPT and placebo were administered in a [...]
blind nonrandom design, with periods on [...]
placebo for each patient. A 4-week period of [...]
administration preceded and followed a 3-week period
of AMPT. A fixed number of capsules ([...]
12 placebo, placebo plus AMPT, or AMPT
alone) was given daily to each patient in di-
vided doses. To assess each patient’s clinical [...],
the Comprehension subtest of the WAIS was ad-
ministered and scored by the same psychologist at
the end of the first placebo period, the AMPT [...]
the last placebo period. Two other psychologists,
blind to the design of the study, also scored the
subtest (interrater reliability W = .939).

The patient’s clinical status was also monitored
daily by the National Institute of Mental Health
(NIMH) Inpatient Behavioral Rating Scale, a [...]-
item nurses’ rating scale characterizing a broad range
of psychopathology and showing good interrater re-
liability (Green, Bigelow, O’Brien, Stahl, & Wyatt,
1977). Nurses completing this form were blind to
the experimental design and medication status of the
patients. The Brief Psychiatric Rating Scale (BPRS;
Overall & Gorham, 1962) was completed by a psy-
chiatrist who also was blind to design and medica-
tion status.


Results 
WAIS Comprehension Subtest Scores 


A Friedman two-way analysis of variance
indicated significant differences (χ²r = [...],
p < .001) between the AMPT trial and the
two placebo periods. The Wilcoxon matched-
pairs signed-ranks test was then used to de-
termine the differences between the AMPT
trial and the two placebo periods. Means and
standard deviations from the first placebo
period to the AMPT trial were 6.56 ± [...]
and 10.33 ± 2.16, respectively (T = 1, p <
.01). Means and standard deviations from the
AMPT trial and the second placebo period
were 10.33 ± 2.16 and 8.00 ± 2.26, respec-
tively (T = 0, p < .01). However, there was
no significant difference between the first
[...] during the second placebo period, and all but
one improved during the AMPT trial.


Other Patient Ratings 


The means of the psychosis scores of the 
nurses’ NIMH Inpatient Behavioral Rating 
Scale did not show significant changes among 
the first placebo, AMPT, and second placebo 
periods (11.42 ± 3.11, 11.93 ± 2.54, and
11.23 ± 3.96, respectively). Similarly, the
BPRS schizophrenia scores did not show sig-
nificant changes among the first placebo,
AMPT, and second placebo periods (9.54 ±
3.39, 7.56 ± 3.11, and 8.43 ± 3.11, respec-
tively).


Discussion 


The findings of this study support the hy- 
pothesis that depleting the brain of its dopa- 
mine stores produces an increase in WAIS 
Comprehension scores. Thus, judgment, as 
measured by the Comprehension subtest, in 
schizophrenic patients appears to improve in 
a relatively “hypodopaminergic” state. More-
over, our results also suggest that Compre-
hension may be a more sensitive measure of 
subtle drug-induced intellectual changes in 
schizophrenic patients than the nurses’ rating 
scale (NIMH Inpatient Behavioral Rating 
Scale) or the interview-based BPRS. 

The possibility of artifacts arising conse- 
quent to repeated testing in each patient at 
relatively short intervals must be considered. 
It is conceivable that the increase in scores 
at the time of the second testing (AMPT 
trial) was a function of a practice effect. 
Likewise, the decrement in scores observed at 
the third testing (second placebo trial) could 
be attributed to a boredom effect. Such a 
coincidence of influences, however, seems un-
likely.

The possibility that improvement in the 
Comprehension scores was due to a placebo 
effect also must be considered. If it is due to 
such an effect, it would have to be operative 
only between the 4th and 7th week of the 
protocol and be sufficiently evanescent to dis-
appear at the time of the third testing, given 
at the close of the 11th week of the study. 

Results could not be due to a group trend
or overall shifts in the ward environment, be-
cause the patients were run at different times,
not as a cohort.


Table 1

Individual Comprehension Scores (Raw) of
Nine Chronic Schizophrenic Patients on
AMPT and Two Placebo Periods

           First                Second
Patient    placebo    AMPT      placebo
1          0          7         6
2          6          11        10
3          8          7         4
4          9          13        11
5          6          9         7
6          8          11        9
7          6          10        7
8          8          12        7
9          8          13        11

Note. AMPT = alpha-methyl-para-tyrosine.

To assess whether administration of AMPT 
affected central dopamine in this study, we 
measured plasma prolactin levels. The release 
of this hormone, secreted by the pituitary 
gland, is normally inhibited by dopamine. 
Since serum prolactin levels did increase sig- 
nificantly during this study, it appears likely 
that AMPT did inhibit synthesis of dopamine 
in the brain (Nasrallah, Rogol, Wyatt, & 
Gillin, Note 1). 

These results, together with the earlier find- 
ing that parkinsonian patients performed 
worse on the Comprehension subtest when 
treated with L-dopa than without, suggest 
that performance on this subtest is inversely 
related to central dopaminergic activity. 

At the present time, assessment of drug- 
induced changes in psychiatric populations 
relies heavily on rating scales, symptom 
checklists, and self-reports. Our data, how- 
ever, suggest that the relationship between a 
specific intellectual function and the specific 
brain neurotransmitter dopamine can be most 
accurately determined by quantitative assess- 
ment, that is, the patient’s scored responses 
to the Comprehension subtest. Several studies 
that have investigated the effects of antipsy-
chotic drugs on the WAIS scores of normal
and schizophrenic subjects reported signifi-
cantly improved scores (Abrams, 1958; Gard- 
ner, Hawkins, Judah, & Murphree, 1955; Gil- 
gash, 1957), whereas others reported no 
changes (Judson & Mac Casland, 1960; Pare-
des, Baumgold, Pugh, & Ragland, 1966). One
study (Gilgash, 1957) indicated that a group 
of schizophrenic patients showed a greater 
point average increase on the Comprehension 
subtest than on any other subtest of the Ver- 
bal scale subsequent to chlorpromazine medi- 
cation. Since most of these studies were re- 
ported in the late 1950s and similar assess- 
ment by the WAIS has been rarely reported 
in the past decade, it seems that psychometric 
evaluation has been a neglected resource as a 
clinical dependent variable in determining the 
impact of drug treatment on intellectual 
functioning.

The manner in which the Comprehension 
subtest and AMPT covaried in the design of 
the present study suggests that this subtest, 
or possibly other subtests of the WAIS, in 
contradistinction to conventional behavioral 
rating scales, offers some promise for under- 
standing the complexity of interacting varia- 
bles in cognitive psychopharmacology. Fur- 
ther studies of drug-induced intellectual 
changes in schizophrenic patients should be 
directed at determining whether any of the 
other subtests of the WAIS are as sensitive 
as Comprehension in assessing changes and 
whether Comprehension is sensitive to the ef- 
fects of drugs other than AMPT and L-dopa 
in designs similar to the present study. 


Reference Note 


1. Nasrallah, H., Rogol, A., Wyatt, R. J., & Gillin,
J. C. Potentiation of phenothiazine-induced pro- 
lactin secretion by alpha-methyl-para-tyrosine 
(AMPT) in schizophrenic males. Unpublished 
manuscript, National Institute of Mental Health, 
Washington, D.C., 1977. 

Psychophysiological Effects of Progressive Relaxation
in Anxiety Neurotic Patients and of
Progressive Relaxation and Alpha Feedback in Nonpatients


Paul M. Lehrer 
College of Medicine and Dentistry of New Jersey
Rutgers Medical School 


Ten anxiety neurotic patients were given four sessions of individual instruction 
in progressive relaxation, and 10 patients served as waiting list controls. Ten 
nonpatients were assigned to each of the same conditions, and an additional 10 
nonpatients were given four sessions of alpha feedback. Nonpatients showed 
more psychophysiological habituation over sessions than patients in response to 
hearing five very loud tones and to a reaction time task. Patients, however, 
showed greater physiological response to relaxation than did nonpatients. After 
relaxation, the autonomic responses of the patients resembled those of the non- 
patients. The effects of relaxation were more pronounced in measures of physio- 
logical reactivity than in measures of physiological activity. Defensive reflexes 
yielded to orienting reflexes more readily in nonpatients than in patients. There 
was also a tendency for progressive relaxation to generalize to autonomic func-
tions more than alpha feedback.


Progressive relaxation is one of the most 
venerable techniques of behavior therapy and 
one of the most investigated. Jacobson (1938) 
reviewed a number of the early case studies 
showing that progressive relaxation training 
produces extraordinarily low levels of muscle 
tension, and that patients suffering from a 
variety of psychological and psychosomatic 
disorders experience significant relief when 
they practice the technique. Gellhorn (1958) 
hypothesized that progressive relaxation re- 
duces physiological reactivity through reduc- 
tion in proprioceptive feedback from the 
muscles to the reticular system. However, 
Davison (1966) disputed this “peripheralist”
theory and argued that relaxation works on 
a more central, cognitive level, since subjects 
can still be anxious when their muscles are 
rendered almost completely flaccid by curare. 
A number of controlled studies have been 
done on the treatment effectiveness of relaxa- 
tion. Relaxation has been found to be effec- 
tive in treating the symptoms of insomnia 
(Borkovec & Fowles, 1973; Borkovec, Kalou- 
pek, & Slama, 1975; Lick & Heffer, 1977; 
Nicassio & Bootzin, 1974; Woolfolk, Carr- 
Kaffashan, McNulty, & Lehrer, 1976); hy- 
pertension (Deabler, Fidel, Dillenkoffer, & 
Elder, 1973; Shoemaker & Tasto, 1975); 
tension headaches (Cox, Freundlich, & Meyer, 
1975); and chronic anxiety (Townsend, 
House, & Addario, 1975). 

The first controlled studies on the psycho- 
physiological effects of relaxation were done 
on reflexes of the skeletal muscles. Miller 
(1926) tested the finger withdrawal reflex in 
subjects who had been trained in progressive 
relaxation for 6 months. The reflex was 
smaller in subjects who were instructed to 
relax than in those who were instructed not
to do so. A more sophisticated control was 
added by Freeman (1933), who instructed all 
subjects to relax but had some subjects sit 
in a seat that required them to maintain some
muscle tension in order to remain comforta-
ble. The finger withdrawal reflex was smaller 
in the relaxation group in this study also. 
More modern studies of skeletal muscle ac- 
tivity have indicated that progressive relaxa- 
tion can reduce frontalis electromyogram 
(EMG) activity more than control conditions 
(Cox et al., 1975). One study found that tape- 
recorded instructions in progressive relaxation 
reduced frontalis EMG activity more than 
did autogenic training (Staples, Coursey, & 
Smith, Note 1). 

Since progressive relaxation operates pri- 
marily on the skeletal muscles, it can be ex- 
pected to have a greater effect on this system 
than on other physiological systems. Davidson 
and Schwartz (1976) have argued that all re- 
laxation techniques are not alike and that 
they probably have different effects on differ- 
ent physiological systems, depending on the 
system at which they are most directly aimed. 
Nevertheless, a number of studies have shown 
that under certain conditions, autonomic 
measures are also affected by progressive re-
laxation training. For example, Paul (1969b)
found that brief live progressive relaxation 
training reduced heart rate and respiration 
rate more than a control condition, and it re- 
duced heart rate more than hypnotic sugges- 
tions to relax. 

However, results have not been consistent.
In another study, Paul (1969a) found that 
although live progressive relaxation training 
reduced a combined measure of physiological 
reactivity to an anxiety-provoking scene, no 
individual physiological measure was signifi- 
cantly reduced. Lehrer (1972) found that al- 
though relaxation reduced heart rate toward 
the end of a period of electric shock, it had 
no measurable effect on skin potential or res- 
piration rate. Other studies have found no 
differences in autonomic activity between pro- 
gressive relaxation and ordinary rest (Paul 
& Trimble, 1970; Grossberg, Note 2) or sug- 
gestions to relax (Edelman, 1970). Even EMG 
measures have not always significantly differ-
entiated progressive relaxation training from
control conditions (Haynes, Moseley, & Mc-
Gowan, 1975; Lehrer, 1972; Paul & Trimble,
1970). A number of factors may account for 
the discrepant results. Among these are popu- 
lation differences, techniques for indepen- 
dently assessing level of anxiety, intensity of 
training, and conditions of testing. 


Population Differences 


The effects of relaxation appear to be 
clearer among persons with high basal anxiety 
levels. Thus Wilson and Wilson (1970), 
studying patients in a general medical hos- 
pital, found that brief relaxation instructions 
reduced heart rate as compared with a con- 
trol group that listened to a talk on the his- 
tory of baseball, but only in high anxious 
subjects, as determined by responses to the 
IPAT Anxiety Scale. Also, Brandt (1974), 
studying college students, found that taped 
progressive relaxation instructions reduced 
the size and frequency of electrodermal re- 
sponses, heart rate, and EMG when measured 
during relaxation training, but when these 
measures were repeated for a rest period af- 
ter the second (and last) training session, 
only heart rate was different between the 
groups, and only among high scorers on the 
Fear Survey Schedule. It should come as no 
surprise that the effects of relaxation are 
easier to measure among anxious rather than 
nonanxious persons, since nonanxious per-
sons are presumably able to relax without
special training. Lader and Wing (1966) re- 
ported that anxiety neurotics do not show 
habituation of the skin conductance to re- 
peated presentations of a loud (100 dB, 1,000 
Hz) aversive tone. They also reviewed a num- 
ber of other studies showing that anxiety 
neurotics have higher levels of sympathetic 
activity, slower habituation of the electro- 
dermal response to noxious stimuli, and higher 
levels of some indicators of physiological 
arousal. 


Method of Assessment of Anxiety 


Although the studies by Brandt (1974) and 
by Wilson and Wilson (1970) suggest that 
the IPAT anxiety inventory and the Fear 
Survey Schedule hold promise as screening
devices for relaxation studies, most of the 
commonly used paper-and-pencil tests of anx- 
iety probably do not measure the kind of 
gross psychopathology that was studied by 
Lader and Wing (1966). For example, Edel- 
man (1970) used the Spielberger State-Trait 
Anxiety Inventory (STAI) as a screening de-
vice and found that neither anxious nor non-
anxious subjects manifested any significant 
autonomic changes after tape-recorded relaxa- 
tion instruction. Other paper-and-pencil tests 
may have the same problem, especially the 
commonly used Taylor Manifest Anxiety 
Scale (TMAS). In some instances the TMAS 
has been found to be significantly related to 
physiological measures, such as eyeblink con- 
ditioning (Spence & Taylor, 1966), condition- 
ing of finger withdrawal reflex (Sloane, 
Davidson, & Payne, 1965), the Palmar Sweat 
Index during a verbal conditioning task (Hay- 
wood & Spielberger, 1966), conditioned heart 
rate accelerations (Dube, 1966), and basal 
heart rate (Lehrer, 1969). However, most 
studies have found no relationship between 
the TMAS and autonomic measures (e.g.,
Beam, 1955; Bitterman & Holtzman, 1952;
Bursten & Russ, 1965; Calvin, McGuigan,
Tyrrell, & Soyars, 1956; Lewinsohn, 1956;
McGuigan, Calvin, & Richardson, 1959; 
Sloane et al., 1965; Spelman, 1966). 

A similar lack of significant relationships 
has also been found between various physio- 
logical measures and the Maudsley Neuroti- 
cism Scale (Spelman, 1966; Sloane et al., 
1965), Welsh’s A factor (Katkin, 1965), Mc- 
Reynold’s Assimilation Scale (McReynolds, 
Acker, & Brackbill, 1966), and several tests 
of situational anxiety (Katkin, 1966; Rosen- 
stein, 1960). A possible reason for the inade- 
quacy of these paper-and-pencil tests as mea- 
sures of anxiety is that these tests may mea-
sure other things besides anxiety, for ex-
ample, the willingness of people to describe
themselves negatively or to admit emotional
weaknesses. Thus Kimble and Posnick (1967)
reported a correlation of .73 between the
TMAS and a scale having nothing to do with
anxiety that contained items matched with
the TMAS on emotional importance and so-
cial acceptability.


Intensity of Training


None of the studies using brief relaxation 
instructions produced the size of changes im- 
plied in Jacobson’s work (1938), in which 
people receive training over a period of 
months. Perhaps the intensive training given 
by Jacobson and his more strict adherents has 
a more profound physiological effect than the 
brief training procedures that are convention- 
ally used, especially those that are adminis- 
tered in a single session and/or are given by 
tape recording. 


Conditions of Testing 


In almost all of the above cited studies that 
found brief relaxation training to be effective 
in altering psychophysiology, subjects were 
tested either while they were receiving relaxa- 
tion instruction or while the trainer was still 
in the room immediately after live instruction. 
These conditions are not analogous to the 
clinical situation in which the patient must 
apply the training at home and may have 
exaggerated the effects of the relaxation train- 
ing. This is suggested by Paul and Trimble’s 
(1970) findings that live relaxation training 
(with the therapist in the room during test- 
ing) produces a greater physiological effect
than tape-recorded instructions, and by
Brandt’s (1974) findings that the physiologi-
cal effects of brief tape-recorded relaxation
instructions are quickly attenuated after a
training session is over.


Alpha Feedback 


Electroencephalogram (EEG) alpha control 
has recently become a topic of broad interest 
and wide application. Early investigators
remarked on how the presence of alpha ap-
pears to be inversely related to anxiety (Jas-
per, 1937), and some modern work on alpha
feedback has found that among subjects who
do learn to control their alpha production,
the production of alpha is usually associated
with subjective feelings of relaxation, letting
go, and a lack of focused thought (Brown,
1970; Nowlis & Kamiya, 1970). More re-
cently, however, Orne and Paskewitz (1974)
reported a study showing that under the
threat of shock, subjects who are given oc-
cipital alpha feedback still show increases in 
subjective anxiety, frequency of skin con- 
ductance responses, and heart rate, despite 
the fact that their levels of occipital alpha 
production remain high. Thus, although high 
levels of alpha are, under some conditions, 
associated with diminished anxiety, the pro- 
duction of alpha appears to be at least par- 
tially a differentiable process from subjective 
anxiety and autonomic reactivity. 


Present Study 


This study compared the physiological ef- 
fects of progressive relaxation, alpha feed- 
back, and a no-treatment control condition. 
The responses of psychiatric patients who 
were clinically diagnosed as suffering from se- 
vere anxiety were compared with those of non-
anxious volunteer subjects who were not psy- 
chiatric patients. Subjects received approxi- 
mately the same intensity of training as is 
usually given to patients in behavior therapy, 
and the testing was done during a separate 
session by an experimenter who was not in- 
volved in training the subjects to relax.


Loud Tones 


The testing situation included a series of 
loud tones similar to those used by Lader and 
Wing (1966), who had found that tranquiliz- 
ing medication allowed anxiety neurotics’ skin 
conductance responses to the tones to habit-
uate at a rate equal to the rate in normal
subjects.


Reaction Time Task 


The question is frequently asked whether 
relaxation makes people somnolent and less 
reactive to the environment in general or 
whether, as suggested by Wilson and Wilson’s 
(1970) data, it reduces the specific effects of 
anxiety and enhances the ability to pay atten-
tion to the environment and to solve problems.
A test of reaction times was thus included to
test whether relaxation facilitates this type
of “alertness” response and whether, on a
physiological level, it affects the orienting re- 
flex (cf. Graham & Clifton, 1966). 


Alpha Feedback


The present study also included an alpha 
feedback condition that used a training pro- 
cedure similar to that described by Brown 
(1970). The clinical and experimental studies 
cited above suggest that progressive relaxa- 
tion would, especially among the anxious pa- 
tients, generalize to autonomic reactivity, 
whereas alpha feedback would not generalize 
as much. Because of the difficulty in finding 
a sufficient number of suitable anxious pa- 
tients who would be willing to participate in 
the study, I decided to test the alpha condi- 
tion only on nonpatient volunteers. Light was 
used as a feedback signal rather than the 
more conventional tone signal so as not to 
unintentionally habituate these subjects to 
the noxious tones used in the testing situation. 
Although light may depress occipital alpha, 
it has also been found that the presence of 
light in a room may increase the effects of 
alpha feedback training (Paskewitz & Orne, 
1973). 


Method 


Subjects. The subjects for this experiment were 20 
anxiety neurotic patients and 34 nonpatient volun- 
teers. The patients were recruited through their ther-
apists, who are professionals at the Rutgers Mental 
Health Center or members of the faculty of the De- 
partment of Psychiatry of Rutgers Medical School 
and/or of the Graduate School of Applied and Pro- 
fessional Psychology at Rutgers University. Patients 
were accepted whose symptoms were primarily
those of anxiety (according to the report of their 
therapists), who were not psychotic, and who were 
not regularly taking any medication other than minor 
tranquilizers. Most of the anxious subjects were 
diagnosed as anxiety neurotic. Patients with histories 
of neurological or cardiovascular disorders were ex- 
cluded, and testing sessions were held only after 
patients had not taken any medication for at least 
24 hours. Ten patients each were included in the
relaxation and control (ordinary rest) conditions. 
The nonpatients were recruited from newspaper ad- 
vertisements, notices to staff members at Rutgers 
Mental Health Center, and signs on bulletin boards. 
Subjects were accepted for the experiment only if 
they reported that they were in excellent physical 
and emotional health, were not taking any psycho- 
active medication, and were not currently receiving 
any form of treatment for an emotional problem. 
Subjects were assigned to groups on as random a 
basis as possible, given difficulties in scheduling.
Eleven subjects were assigned to the relaxation group, 
11 to the alpha feedback group, and 12 to the con-
trol group (ordinary rest). Both patients and non-
patients in the control group were offered relaxation
training after their participation in the two testing
sessions. They were not told that they were mem-
bers of a control group. The complete design is sum-
marized in Table 1.

Table 1

Group            Subjects                   Training
Relaxation       Patients and nonpatients   4-5 sessions of training in progressive
                                            relaxation during a 3-week period
Control          Patients and nonpatients   3-week wait for training
Alpha feedback   Nonpatients                4-5 sessions of alpha feedback during a
                                            3-week period

Pretest (all groups): 5-min rest; 5 tones, 100 dB, 1,000 Hz; 5 reaction time trials
with variable interval foreperiods; 3-min rest.

Note. Posttest consisted of the same procedures as the pretest.

Testing procedure. Testing sessions were held ap-
proximately 3 weeks apart. Subjects in the relaxation
and alpha feedback groups received their training
between testing sessions, whereas subjects in the con-
trol group were tested at the same intervals but
were not in contact with the project between testing
sessions.
In each testing session, subjects were first wired
to the equipment, and, in both conditions, before
and after training, they were asked to close their
eyes and to relax as deeply as possible in the re-
clining chair. They were told to expect to hear some
loud tones after several minutes but to try to re-
main relaxed through them.
After approximately 5 min of resting, five aversive
tones were presented for a duration of 1 sec each
and an interstimulus interval varying between 30
and [illegible] sec. They were presented from a speaker
located several feet behind the subject’s chair, at a
frequency of 1,000 Hz and an intensity of 100 dB(A).
Approximately 30 sec after the last aversive tone,
the experimenter reentered the subject room, put a
switch in the subject’s nondominant
hand, and told the subject that the next task con-
sisted of a series of reaction time trials. Subjects
were told to expect to hear “softer, deeper” tones
than they had heard before and that they would
not be aversive. Each tone was a “get ready” stimu-
lus. Subjects were told to press the button as quickly
as possible after the tone went off. The tones were
presented at 60 dB and approximately 300 Hz for
durations varying between 5 and 15 sec, with inter-
stimulus intervals of between 30 and 60 sec. Fol-
lowing the last trial, subjects were given an addi-
tional rest period.
Subjects were also administered the State Anxiety
scale of the STAI before and after each session.
Before each session they were asked to fill it out
with reference to how they felt “right now.” After
the session they were asked to fill it out as to how 
they felt “during the session.” All subjects were also 
administered the STAI Trait scale prior to their 
first session. 

Apparatus. Physiological measures for the first 10 
subjects tested, approximately equally distributed 
among groups, were taken on a Grass Model 5 poly- 
graph. The rest of the subjects were tested on a 
Beckman Type R Dynograph. The following mea- 
sures were taken: skin conductance, heart rate, and 
EEG from the dominant occipital area. 

Skin conductance was measured from a pair of 
silver-silver chloride electrodes, 1 cm² each, mounted
in a special holder to facilitate recording from the
palmar surface of a finger (Edelberg, Note 3). The
middle finger of the dominant hand was used. K-Y
jelly was used as the electrolyte. Resistance was mea-
sured in kilohms on the Grass polygraph and was
later converted to micromhos conductance for data
analysis. On the Beckman polygraph, a Lykken
coupler was used for direct measurement of skin
conductance. Heart rate was measured through a
cardiotachometer from standard electrocardiogram
electrodes attached to the right arm and left leg.
EEG leads were attached to the dominant occipital
area (O1 or O2) and to both earlobes as the refer-
ence point. A ground lead was attached to the fore-
head. Alpha was filtered by an instrument, similar 
to that described by Paskewitz (1971), which pro- 
vides a voltage output when a criterion amplitude
of filtered alpha is present. For the testing sessions,
the criterion level for alpha for all subjects was set at 19 µV.
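
The resistance-to-conductance conversion mentioned above is a simple reciprocal. The following is a minimal sketch, assuming Python; the function names are hypothetical and do not come from the original report, and the base-10 logarithm is also an assumption, since the base of the log transform used in the analyses is not stated.

```python
import math


def kilohms_to_micromhos(resistance_kohm: float) -> float:
    """Convert skin resistance (kilohms) to conductance (micromhos).

    Conductance is the reciprocal of resistance:
    1 / (R * 1e3 ohm) = (1e-3 / R) mho = (1e3 / R) micromho.
    """
    if resistance_kohm <= 0:
        raise ValueError("resistance must be positive")
    return 1000.0 / resistance_kohm


def log_conductance(resistance_kohm: float) -> float:
    """Log skin conductance; base 10 is an assumption, not stated in the text."""
    return math.log10(kilohms_to_micromhos(resistance_kohm))
```

For example, a reading of 100 kilohms corresponds to 10 micromhos, so readings from the two polygraphs can be placed on the same conductance scale before analysis.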

Relaxation training. Subjects in the progressive
relaxation group were administered four sessions of
individual training using an abbreviated form of
Jacobson’s (1938, 1964) technique. Subjects were
asked, alternately, to tense and to relax each of 
35 muscle groups. Subjects were instructed to tense 
each muscle as little as necessary to barely feel the 
proprioceptive sensations of muscle contraction. Each 
muscle group was worked with repeatedly until the 
subject could feel the muscle contraction without 
engaging in any overt movement. Each subject was 
trained at his or her own pace. In most subjects the
arms and some of the leg muscles were trained dur- 
ing the first session, the remainder of the leg muscles 
and the trunk were trained during the second session, 
and the facial muscles were trained during the third 
session. The fourth session was generally devoted to
review and general relaxation. Subjects were asked 
to rate their relaxation during each session from 0, 
“as relaxed as I ever felt,” to 100, “very anxious.” If 
a subject’s self-rating of relaxation did not fall be- 
low 10 by the fourth session, an additional session 
of training was given. Subjects were told to prac- 
tice the technique at home for 1 hour daily, but they 
were given no written or taped material to take 
home with them.

Alpha feedback. Dominant occipital alpha feedback 
was given in four sessions of individual training. The 
occipital area was chosen for feedback because of the 
high density of alpha often recorded from that site 
and because of previous reports of success in condi- 
tioning alpha from that region, including one study 
showing that increased alpha can persist during aver- 
sive stimulation (Chisholm, De Good, & Hartz, 1977). 
Dominance was assessed simply by asking subjects 
whether they are right-handed or left-handed. In 
each session the subjects were first wired to the 
equipment and were then asked to lie back on a 
reclining chair, to close their eyes, and to try to 
keep their minds blank. This, they were told, is the 
best way to stay in an “alpha state.” Through their 
closed eyelids, they were told, they would see a 
light flashing on and off. The presence of this light 
indicated that the subject was emitting alpha. A 
small flashlight bulb was mounted on the 8-foot 
(2.4 m) ceiling directly above the subject’s head, 
and it flashed on when the subject’s alpha am- 
plitude reached a specified criterion. Generally
this criterion was 19 µV. If the subject emitted less
than approximately 30% alpha, the criterion was
lowered and the response was shaped. Subjects were
given approximately 40 minutes of training during 
each session. They were told to try to put themselves 
into an alpha state at home in daily 1-hour practice 
sessions, without feedback. As in the relaxation group, 
alpha feedback subjects were asked to rate their 
level of relaxation after each session and were given 
a fifth session of training if they did not rate them- 
selves below 10 after the fourth session.


Results 


Most analyses reported below are on dif-
ference scores between values on the pretest
and posttest sessions. Computed this way, the F
for a main effect or interaction is equivalent
to the F for the interaction between that
effect and an explicit sessions factor
from an analysis done on the raw session
scores (vs. the session-difference scores used
here). Covariance adjustments were also used
in these analyses to adjust for some differ- 
ences between experimental groups in the 
first session that were determined by analyses 
of variance done on the pretest session alone.
The session-difference scores were adjusted 
for pretest (Session 1) levels of the same 
measure. Also, to test the significance of 
habituation across sessions and across trials, 
analyses of variance were computed on the 
raw scores for all data points within each of 
the sessions, and analyses of variance were 
computed with an additional between-groups 
measure for session. All analyses include a 
between-groups factor (treatment) and a 
within-group factor (time). The time factor 
consists of five trials each for the loud tones 
and reaction time task and eight epochs or 
time samples of 15 sec each for the two rest 
periods at the beginning and end of each ses- 
sion (i.e., before the first loud tone and after 
the last reaction time trial). Data were scored 
for 15 sec before and after each loud tone 
and for the duration of each get ready stimu- 
lus and an equivalent time period before each 
reaction time trial. 
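
The covariance adjustment of session-difference scores described above can be illustrated with a small sketch. This is a simplified illustration under stated assumptions, not the Data-Text procedure itself: it assumes Python, a single pretest covariate, and the usual analysis-of-covariance pooled within-group slope. The function names and the example scores are hypothetical.

```python
from statistics import mean


def pooled_within_slope(groups):
    """Pooled within-group slope of (posttest - pretest) on pretest.

    groups: dict mapping a group name to a list of (pretest, posttest) pairs.
    """
    sxy = 0.0
    sxx = 0.0
    for pairs in groups.values():
        pre = [p for p, _ in pairs]
        diff = [q - p for p, q in pairs]
        mean_pre, mean_diff = mean(pre), mean(diff)
        sxy += sum((x - mean_pre) * (d - mean_diff) for x, d in zip(pre, diff))
        sxx += sum((x - mean_pre) ** 2 for x in pre)
    return sxy / sxx


def adjusted_difference_means(groups):
    """Group mean difference scores adjusted to the grand pretest mean."""
    slope = pooled_within_slope(groups)
    grand_pre = mean([p for pairs in groups.values() for p, _ in pairs])
    adjusted = {}
    for name, pairs in groups.items():
        mean_pre = mean([p for p, _ in pairs])
        mean_diff = mean([q - p for p, q in pairs])
        adjusted[name] = mean_diff - slope * (mean_pre - grand_pre)
    return adjusted


# Hypothetical scores, for illustration only (not data from the study).
example = {
    "relaxation": [(40, 30), (44, 32)],
    "control": [(30, 29), (34, 31)],
}
result = adjusted_difference_means(example)
```

In this made-up example the relaxation group starts higher on the pretest, so part of its larger raw decrease is attributed to its starting level, and the adjusted group difference is smaller than the raw one.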

Wherever data points were missing due to 
equipment difficulties, they were filled in us- 
ing the unweighted means solution. Data were 
analyzed on an IBM 370 computer using the 
Data-Text program (Armor & Couch, 1972) 
Versions 3.0 and 3.1. Each measure was first 
tested for skewness, kurtosis, and hetero- 
scedasticity. Where necessary, normalizing 
transformations were used, and, in all cases, 
the assumptions of the analysis of variance 
and covariance were met. 

Combined analyses of patients and nonpa- 
tients in the relaxation and control conditions. 
To compare diagnostic groups and treatment 
effects in a single analysis, the alpha feedback 
condition, which was only given to nonpa- 
tients, was excluded in some analyses. In 
these analyses the covariance-adjusted ses-
sion-difference scores have two between-groups
measures, diagnosis (patients vs. nonpatients)
and treatment (relaxation vs. control), and
one repeated measure, time. 

Separate analyses of patients and nonpa-
tients. To test the significance of treatment

Table 2
Summary of Pretest and Posttest Means That Are Significantly Different Between Groups

                                 Nonpatients                          Patients
                      Relaxation   Control      Alpha     Relaxation   Control
Measure                 Pre  Post   Pre  Post   Pre  Post   Pre  Post   Pre  Post
State anxiety 42.5 34.8 32.3 31.2 
` z x 4 43.0 40. 
Cardiac decelerations* x Pe RERE 
Tones -.5 2.5 1 7 
¥ P a ‘ : 5 1.6 
RT trials 3" 2a e AOE Pes a te! 
Cardiac accelerations* { 
Tones 2 2.7 48 48 
7 < i 2 * 12 8 3.4 eh 1.6 5.6 
RT trials 9 2.6 6 1.4 : 4 
A È sl j: 6 —8 mee = = 
Percent alpha
  Initial rest         38.9  27.0  29.1  29.7  12.?  23.8  32.6  41.6  33.7  30.8
  Before tones         41.1  30.7  24.8  24.9  16.4  24.8  30.4  46.5  36.9  34.0
  After tones          34.5  28.5  17.7  24.4   9.3  20.9  21.4  42.8  30.9  26.8
  Before RT trials     40.9  30.9  27.4  24.4  20.8  22.1  34.1  47.2  38.9  35.6
  During RT trials     27.4  33.0  19.0  28.9   9.1  19.0  30.9  41.4  31.8  31.0
Maximum log skin conductance
aa rest 20 15 15% 13 1.50. 013 PAEA o 1.8 2.0 
en rest 23°) Se LSH 716 18 15 24 414 1.9 2:2 
erence tea? 24 1.9 2.0 1.8 21 1.8 25 1.8 2.3 2.5 
  During RT trials      2.5   1.9   2.0   1.9   2.1   1.6   2.5   1.7   2.1   2.3


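Note. RT = reaction time.
* A negative cardiac deceleration reflects a lower minimum heart rate before the stimulus than after the
stimulus. A negative cardiac acceleration reflects a higher maximum heart rate before the stimulus than
after the stimulus.
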
and habituation effects in each of the two
populations that were studied, the data from
each population were submitted separately to
the analyses of variance described above,
without the between-groups factor of diagnosis.


Subjective Anxiety 
Trait Anxiety 


As a check on our diagnostic procedures
and as a check on the randomness of assign-
ment to treatment conditions, a two-factor
analysis of variance was done on the STAI
Trait Anxiety scale scores, with diagnosis and
treatment as between-groups measures. As ex-
pected, patients scored significantly (p <
[illegible]) higher than nonpatients. The treatment
effect was not significant, indicating that on
this measure, the assignment of subjects to
groups was effectively random.


State Anxiety 


The STAI State Anxiety scale was analyzed
by an analysis of variance with two between-
groups measures (diagnosis and treatment)
and two repeated measures (time and ses-
sion). The two levels of the time factor were
presession and during session. The Diagnosis
x Time interaction was significant at p < 
.004, indicating that the patients reported 
themselves to be more anxious than the non- 
patients at the beginning of each session. The 
analysis of variance on Session 1 scores 
yielded a significant (p < .02) effect for 
treatment, indicating that on this measure, 
the method of random selection to groups was 
not adequate and that the subjects assigned 
to the control group admitted less anxiety 
than the subjects assigned to the treatment 
groups (Table 2). Thus an analysis of vari- 
ance was done on the covariance-adjusted 
difference scores, as described above. Reported
anxiety decreased significantly more in the re-
laxation group than in the control groups.
Separate analyses of the patient and non- 
patient groups revealed that patients and non- 
patients both habituated over sessions but 

Figure 1. Differences between the pretest and the posttest, adjusted for the level for each score during the pretest session.

Table 3
Summary of Effects


Habituation over trials Habituation over sessions Treatment effect* 
Non- Com- Non- Com- Non- Com- 
Measure Patients patients bined? Patients patients bined? Patients patients bined? 
State anxiety® * +++ ** eee Peery * „+ 
Percent alpha? 
Initial rest = — — ke 
Before tone — = — * ke 
After tone * +++ +++ ee ** *kke 
Before each RT trial — — — * He 
During RT trial + pred +*+ +++ +e me 
Alpha blocking 
Tones eee + ** ** 
RT trials ** +s+ KK eee 
Heart rate acceleration S 
Tones +*+ +k +s+ ** * 
Heart rate deceleration’ 
Tones see 
ee 
RT trials * x 
Log skin conductance a 
Initial rest — = — i h h 
End of session rest — — — An ie R 2 
Tone onset eet 3) ee + À 
RT onset — 2 pe +++ ee 
Maximum skin 
condition /epoch s we iy 
Initial rest = = — a Ge Hy 
End of session rest = — — 
Posttone + eee Seed * ee KK ** ka 
During RT trials +*+ ++ srt +k xe ** 
Amplitude of SCR i 
Tone® eee sr eee e 
RT trials? eee prea pred 2 ae 
SCR recovery rate 
ones eee ee 
Note. RT = reaction time.
a Unless otherwise specified, this refers to the treatment effect on the covariance-adjusted session-difference scores.
b This column refers to an analysis of both the patients and the nonpatients, for the relaxation and control conditions only.
c Patients emitted higher skin conductance responses (SCRs) to the first tone and rated themselves as more anxious than nonpatients at the beginning of each session. The Diagnosis × Time interactions are, respectively, significant at p < .03 and p < .001.
d Increases in alpha are scored as habituation.
e These values are for the Treatment × Diagnosis effect on the covariance-adjusted session-difference scores. The overall Treatment effect is not significant.
f Here an increase in deceleration is rated as habituation.
g The Trials × Sessions interaction is significant at p < .05, indicating that habituation over trials is faster in the posttest than in the pretest.
h Patients also emitted significantly smaller skin conductance responses than nonpatients.
* p < .10. ** p < .05. *** p < .01. **** p < .005.

that the covariance-adjusted treatment effects
were significant only for the nonpatients. A 
trend analysis on the session-difference scores, 
with the nonpatient relaxation group assigned 
a value of 1, the alpha feedback group a 
value of 2, and the nonpatient control group 
a value of 3, revealed a linear trend of border- 
line significance (p < .052), suggesting that 
relaxation was most effective in reducing anx- 
iety, the alpha feedback condition second, and 
the control condition least (Figure 1). 
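
The trend analysis described above, with the three nonpatient groups coded 1, 2, and 3, corresponds to the standard orthogonal polynomial contrasts for three equally spaced levels. The following is a minimal sketch, assuming Python; the function name and the example means are hypothetical, and the significance test for each contrast (which also needs the error mean square and group sizes) is omitted.

```python
# Orthogonal polynomial contrast weights for three equally spaced groups
# (relaxation = 1, alpha feedback = 2, control = 3).
LINEAR = (-1, 0, 1)
QUADRATIC = (1, -2, 1)


def contrast(means, weights):
    """Return the contrast value: the weighted sum of the group means."""
    if len(means) != len(weights):
        raise ValueError("means and weights must have the same length")
    return sum(w * m for w, m in zip(weights, means))


# Hypothetical adjusted difference-score means, for illustration only.
steady_change = [-8.0, -5.0, -2.0]   # changes evenly across the three groups
middle_stands_out = [1.0, 5.0, 2.0]  # middle group differs from the other two
```

A nonzero linear contrast with a near-zero quadratic contrast indicates that the groups are ordered 1, 2, 3 on the measure, whereas a large quadratic contrast indicates that the middle group (here, alpha feedback) departs from the other two.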


EEG Alpha 
Percent alpha 


Percent alpha of 19 µV or greater was
scored for each of the time periods described 
above. As can be seen in Figure 1, relaxation 
appears to have increased patients’ tendencies 
to emit alpha waves. However, paradoxically, 
percent alpha appears to have decreased over 
time in the nonpatient relaxation group rela- 
tive to the nonpatient alpha and control 
groups. The Diagnosis × Treatment effect
was significant for most time periods (Table 
3) in the combined analysis. Analyses of the 
pretest session scores revealed significant dif- 
ferences between treatment groups (p < .02) 
for the epochs following the loud tones, with
the alpha feedback group having less alpha
than the other two groups (Table 2). Co-
variance-adjusted analyses of variance on the
pretest versus posttest change scores revealed
that among the patients, differences between
treatments were significant for the epochs
following the loud tones and were of border-
line significance for the epochs preceding the
loud tones and reaction time trials. For the
nonpatients, none of the differences were sig-
nificant. A trend analysis was computed for
the nonpatients across groups on the session-
difference scores, with the relaxation group
assigned a value of 1, the alpha feedback
group a value of 2, and the control group a
value of 3. A significant (p < .05) quadratic
trend was found for the beginning-of-session
rest, before and after the loud tones, and be-
fore the reaction time trials, thus indicating
that the alpha feedback condition produces
greater increases in percent alpha than the
other two groups of nonpatients.


Table 4
Differences Between Patients and Nonpatients in Heart Rate

                                    M
Measure                  Patients   Nonpatients     F       p
HR at stimulus onset
  Tones                    77.7        71.2        2.50a   .12
  RT trials                76.4        70.4        2.10a   .16
Minimum HRs
  Pretone                  72.3        64.6        3.71a   .06
  Posttone                 72.4        64.0        4.52a   .04
  Pre-RT trials            72.2        64.7        3.73    .06
  During RT stimuli        71.2        63.7        3.33c   .08
Maximum HRs
  Pretone                  83.2        76.7        2.42a   .18
  Posttone                 86.1        80.2        2.08b   .16
  Pre-RT trials            81.2        76.2        1.46b   .2?
  During RT stimuli        80.7        77.3         .63c   .43

Note. Discrepancies in degrees of freedom are due to missing data. HR = heart rate; RT = reaction time.
a df = 1, ?.
b df = 1, 35.
c df = 1, 33.


Alpha Blocking


Alpha blocking was assessed by subtracting
the prestimulus percent alpha from the post-
stimulus value for each trial of the loud tones
and the reaction time “ready” stimuli. There
were no differences on this measure as a func-
tion of either diagnosis or treatment. Although
the combined analysis showed significant main
effects for sessions and for trials, the separate
analyses on the two populations indicate that
this effect was only significant for the non-
patients. Patients did not habituate over trials
on this measure, but nonpatients did. The
Diagnosis × Session interaction, which re-
flects the difference between habituation rates
among the two populations in the combined
analysis, was significant at p < .005 for the
reaction time “get ready” stimuli.
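
The percent alpha and alpha blocking scores defined in this section can be sketched as follows. This is an illustrative sketch only, assuming Python and assuming that filtered alpha amplitude is available as a list of evenly spaced samples in microvolts; the original study used an analog filter with a voltage output, so the sampling scheme and the function names here are hypothetical.

```python
def percent_alpha(amplitudes_uv, criterion_uv=19.0):
    """Percentage of samples whose filtered alpha amplitude meets the criterion."""
    if not amplitudes_uv:
        raise ValueError("at least one sample is required")
    above = sum(1 for a in amplitudes_uv if a >= criterion_uv)
    return 100.0 * above / len(amplitudes_uv)


def alpha_blocking(pre_uv, post_uv, criterion_uv=19.0):
    """Poststimulus percent alpha minus prestimulus percent alpha.

    Negative values mean alpha was reduced (blocked) after the stimulus.
    """
    return percent_alpha(post_uv, criterion_uv) - percent_alpha(pre_uv, criterion_uv)


# Hypothetical amplitude samples (microvolts), for illustration only.
pre_epoch = [20.0, 25.0, 10.0, 30.0]
post_epoch = [5.0, 21.0, 8.0, 9.0]
```

With these made-up samples, three of four prestimulus samples and one of four poststimulus samples meet the 19 µV criterion, so the blocking score is negative.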


Heart Rate 


Tonic heart rate was measured every 15 sec 
during the rest periods and immediately be- 
fore each trial during the loud tones and the 
reaction time task. There were no significant 
effects for treatment. However, the main ef- 
fect for diagnosis was significant for minimum 
heart rates in the epochs following the loud 
tones and approached significance (p < .07 
and p < .08) for minimum heart rates at 
other times (Table 4). This indicates that 
patients had higher “floor” heart rates than 
nonpatients. Although the pattern was 
similar for maximum heart rates in each epoch, 
the p values were much higher (p < .2-.5), 
thus strongly suggesting that peak heart rates 
were unaffected by anxiety neurosis. 

Heart rate accelerations to the tones and 
reaction time stimuli were assessed by 
subtracting the maximum prestimulus heart rate 
from the maximum poststimulus heart rate. 
This adjustment was used to control for the 
effects of sinus arrhythmia, which otherwise 
would obscure the effects of reactivity to the 
tones. It was used rather than the traditional 
calculation of the difference between heart 
rate at stimulus onset and poststimulus 
maximum or minimum levels (uncorrected for 
prestimulus variability) because of experience 
from a prior study on the effects of alcohol 
on cardiac reactivity (Lehrer & Taylor, 1974). 
That study also used loud 1-sec long noxious 
tones. In Lehrer and Taylor (1974), the heart 
rate accelerations and decelerations corrected 
for prestimulus maximum and minimum levels 
differentiated between drunk and sober 
conditions, whereas the uncorrected accelerations 
and decelerations did not. Drunk subjects 
showed greater corrected decelerations and 
smaller corrected accelerations in response to 
the five loud tones. Cardiac accelerations may 
be interpreted as components of a defensive 
reflex, and cardiac decelerations, as compo- 
nents of an orienting reflex to the stimuli 
(Graham & Clifton, 1966). 
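
The corrected scores described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name, the list-of-samples input format, and the sample values are assumptions made for the example, and the deceleration score (poststimulus minimum minus prestimulus minimum) is inferred by analogy with the acceleration score rather than spelled out in the text.

```python
def corrected_reactivity(pre_hr, post_hr):
    """Return (acceleration, deceleration) corrected for prestimulus levels.

    pre_hr, post_hr: heart rate samples (beats/min) from the prestimulus and
    poststimulus epochs of one trial (hypothetical input format).
    """
    if not pre_hr or not post_hr:
        raise ValueError("both epochs need at least one heart rate sample")
    # Acceleration: poststimulus maximum minus prestimulus maximum.
    acceleration = max(post_hr) - max(pre_hr)
    # Deceleration (assumed by analogy): poststimulus minimum minus
    # prestimulus minimum; negative values mean the heart rate fell
    # below its prestimulus floor.
    deceleration = min(post_hr) - min(pre_hr)
    return acceleration, deceleration


# Made-up values for a single loud-tone trial.
pre = [72.0, 75.0, 71.0, 74.0]
post = [78.0, 83.0, 76.0, 69.0]
acc, dec = corrected_reactivity(pre, post)
print(acc, dec)
```

Because both scores are taken relative to the prestimulus extremes, beat-to-beat swings that were already present before the stimulus (such as sinus arrhythmia) are not counted as a response.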


Accelerations 


The differences between treatment groups 
in the first session (Table 2) were not sig- 
nificant. Cardiac accelerations elicited by the 
loud tones habituated significantly over trials 
(Table 3). Also, relaxation significantly en- 
hanced habituation of this reflex over ses- 
sions, but only among the patient groups 
(see Tables 2 and 3 and Figure 1). There 
were no significant differences between groups 
for cardiac accelerations to the reaction time 
task. 


Decelerations 


The differences between treatment groups 
in the first session (Table 2) were not sig- 
nificant. Cardiac decelerations increased over 
sessions among the nonpatients but not 
among the patients. Significant treatment ef- 
fects on the covariance-adjusted session-dif- 
ference scores were obtained for the reaction 
time trials in the combined analysis and 
among the nonpatients. This effect was of 
borderline significance for the patients. 
Assuming that progressive relaxation would 
produce greater generalization to heart rate than 
alpha feedback, a linear trend analysis was 
done between groups of nonpatients, with the 
relaxation group assigned a value of 1, the 
alpha feedback group a value of 2, and the 
control group a value of 3. Significant linear 
trends were found for the tones (p < .05) 
and for the reaction time trials (p < .02), 
indicating that this effect was greatest in the 
relaxation group, least in the control group, 
and at an intermediate level in the alpha 
feedback group. 
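
A linear trend test of this kind can be sketched as a single-degree-of-freedom contrast across the three ordered groups. The sketch below is a simplification under stated assumptions: it runs a plain one-way contrast on raw scores, whereas the analysis reported here used covariance-adjusted session-difference scores and does not describe its computations. The function name and all data values are hypothetical.

```python
from statistics import mean


def linear_trend_f(groups, codes=(1, 2, 3)):
    """F ratio (df = 1, N - k) for a linear contrast across ordered groups.

    groups: one list of scores per group, ordered to match `codes`.
    codes:  the values assigned to the groups (here 1 = relaxation,
            2 = alpha feedback, 3 = control, as in the text).
    """
    if len(groups) != len(codes):
        raise ValueError("need one code per group")
    code_mean = mean(codes)
    weights = [c - code_mean for c in codes]  # 1, 2, 3 -> -1, 0, 1
    means = [mean(g) for g in groups]
    sizes = [len(g) for g in groups]
    df_within = sum(sizes) - len(groups)
    # Sum of squares for the contrast (1 degree of freedom).
    contrast = sum(w * m for w, m in zip(weights, means))
    ss_contrast = contrast ** 2 / sum(w ** 2 / n for w, n in zip(weights, sizes))
    # Pooled within-group error term.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    if df_within <= 0 or ss_within == 0:
        raise ValueError("not enough within-group variability to form an F ratio")
    return ss_contrast / (ss_within / df_within), df_within


# Made-up session-difference scores for three groups of four subjects.
relaxation = [-6, -5, -7, -4]
alpha_feedback = [-4, -3, -5, -2]
control = [-1, 0, -2, 1]
f_ratio, df_error = linear_trend_f([relaxation, alpha_feedback, control])
print(f_ratio, df_error)
```

Centering the codes 1, 2, 3 gives the weights -1, 0, 1, so the contrast compares the relaxation and control groups directly, with the alpha feedback group expected to fall between them.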


Skin Conductance 


Tonic Levels 


Skin conductance was measured every 15 
sec during the rest periods and immediately 
before each trial during the loud tones and 
reaction time task. At each of the periods 
studied, skin conductance appeared to 
decrease over sessions in the relaxation group 
and to increase in the control group. The 
effects appear to be more pronounced among 
the patients than among the nonpatients, but 
they are, at best, of only borderline 
significance (Table 3). Overall habituation across 
sessions was significant for the nonpatients 
but not for the patients. 


Reactivity 


The maximum skin conductance level dur- 
ing each epoch was scored and submitted to 
analysis. The differences between treatment 
groups (Table 2) in the first session were 
significant (p < .04) for the reaction time 
ready stimuli and the end-of-session rest 
periods in the combined analysis of relaxation 
and no treatment for patients and nonpatients. 
They were not, however, significant in either 
analysis of the diagnostic groups taken sepa- 
rately. The pattern of results was the same 
as that for tonic levels of skin conductance, 
but this time the covariance-adjusted treat- 
ment effects were significant for the patients 
(Tables 2 and 3, Figure 1). These results 
indicate that although the nonpatients tended 
to habituate over sessions regardless of treat- 
ment, the patients only habituated when 
treated. All subjects, however, significantly 
habituated over trials (within sessions) to the 
tones and reaction time task. The tendency 
of this measure to decrease more after 
relaxation instructions than after alpha 
feedback was only of borderline significance. The 
linear trend across groups, computed as 
described above for cardiac accelerations, 
approached significance during the end-of-session 
rest (p < .13) and during the reaction 
time trials (p < .07). 

Frequency of skin conductance responses 
was also analyzed. During the rest periods at 
the beginning and end of each session, skin 
conductance increases of greater than 1% 
of the prestimulus skin conductance were 
tallied. The reciprocal transformation rendered 
these scores normally distributed. There were 
no significant effects on this measure. 

Another measure of electrodermal reactivity 
that was studied is the amplitude of the 
skin conductance responses elicited by the 
loud tones and the reaction time task. This 
was defined as the difference between the 
skin conductance at stimulus onset and the 
maximum skin conductance reached during 
the 15 sec following each loud tone or 
during the get ready stimulus for each reaction 
time trial. A log transformation rendered this 
measure normally distributed. The skin 
conductance response habituated significantly 
over sessions and over trials for the loud 
tones and the reaction time task. Also, a 
significant (p < .05) Session × Trials 
interaction revealed that subjects habituated more 
quickly in the second testing session than in 
the first during the loud tones. The patients 
made a larger response to the first loud tone 
than the nonpatients, but otherwise the 
responses to the tones were similar for the two 
groups. The Diagnosis × Trial interaction 
was significant at p < .02. On the other hand, 
the patients emitted significantly (p < .02) 
smaller responses to the reaction time task 
than the nonpatients. There were no 
significant effects or interactions involving the 
treatment factor. 
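
The amplitude score defined in this paragraph can be illustrated with a short Python sketch. Several details are assumptions not given in the text: the function names, the sampled input format, the clipping of negative differences to zero, and the use of log(amplitude + 1) so that a zero amplitude has a defined value. The text states only that a log transformation was applied.

```python
import math


def scr_amplitude(onset_level, post_levels):
    """Amplitude of one skin conductance response.

    onset_level: skin conductance at stimulus onset.
    post_levels: samples from the scoring window after onset
                 (e.g., the 15 sec following a loud tone).
    """
    if not post_levels:
        raise ValueError("the scoring window needs at least one sample")
    # Maximum level reached in the window minus the level at onset.
    # Assumption: a window that never rises above onset scores zero.
    return max(max(post_levels) - onset_level, 0.0)


def log_amplitude(amplitude, offset=1.0):
    """Log-transform an amplitude; the offset of 1 is an assumption."""
    return math.log(amplitude + offset)


# Made-up conductance values for one trial.
amp = scr_amplitude(10.0, [10.2, 11.5, 11.0])
print(amp, log_amplitude(amp))
```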

A third measure of electrodermal reactivity 
was recovery rate of the skin conductance 
response. Edelberg (1972) has found that 
stress and threat of stress slow the recovery 
of the skin conductance response and that 
schizophrenic patients show slower recovery 
than normal subjects (Maricq & Edelberg, 
1975). He has also demonstrated that the 
recovery rate can be approximated by 
measuring the percent recovery during the 2 sec 
following the apex of each response (Edel- 
berg, Note 3). We computed this measure for 
the first observable skin conductance response 
emitted after each loud tone and after the 
onset of each get ready stimulus for the re- 
action time task. The log transformation 
rendered these data normally distributed. Log 
percent recovery increased significantly over 
trials during the loud tones. This parallels 
the findings for cardiac accelerations and 
appears to reflect a habituation of the 
defensive reflex over trials. However, there were 
no significant effects involving diagnosis, 
treatment, or session. 


Reaction Times 


The natural log transformation rendered 
the reaction time data normally distributed. 
There were no significant effects for treat- 
ment, diagnosis, or the interactions of these 
with sessions. 


Discussion 
Relaxation Effects 


Relaxation appeared to reduce physiologi- 
cal reactivity. This finding was most pro- 
nounced among the patients, for whom re- 
laxation was significant for maximum skin 
conductance elicited by both tasks, for car- 
diac accelerations (defensive reflexes) elicited 
by the loud tones, and for percent alpha levels 
after the loud tones. The effect was not 
present for measures of tonic physiological 
activity. This supports the notion that 
progressive relaxation of the muscles generalizes to 
relaxation in other physiological systems. It 
also confirms the findings reviewed above that 
physiological effects of brief relaxation in- 
structions can best be measured in an anxious 
population. The reason for this appears to 
be different for each of the three measures 
on which significant effects were found. For 
maximum skin conductance after the stimuli, 
the nonpatients all appeared to habituate over 
sessions, whereas the patients only habituated 
when they had been taught relaxation. For 
cardiac accelerations to the loud tones, sub- 
jects tended to sensitize over sessions rather 
than to habituate, although this overall 
effect was not significant. The patient 
population appears to have been more prone to this 
sensitization effect than the nonpatients, and 
this effect was attenuated only by relaxation 
training. After training, the patients’ pattern 
of autonomic reactivity became similar to that 
of the nonpatients. For percent alpha after 
the loud tones, the results are more complex. 
Percent alpha was significantly increased by 
relaxation among the patients. However, the 
nonpatient relaxation group had a seemingly 
paradoxical decrease in percent alpha, rather 
than an increase. One can only speculate on 
possible causes for this. The patient control 
group showed this same pattern, and the latter 
group apparently sensitized over sessions. It 
is possible that something similar happened 
to the nonpatient relaxation group, but this 
would be inconsistent with the data on au- 
tonomic functions. Perhaps the nonpatient 
relaxation group was emitting theta activity. 
The tape-recorded data were no longer 
available for checking this, but a cursory 
examination of the raw data indicates that this was 
not so. Another possibility is that the non- 
patients were “working too hard” at relaxing 
and were concentrating on their muscles so 
much that alpha was blocked. I have infor- 
mally tried concentrating on relaxing my own 
muscles while being given alpha feedback and 
have similarly experienced a decrease in al- 
pha production. But why, then, did this not 
happen with the patient relaxation group? 
Here we must speculate some more. Perhaps 
the patients had taken the task more seriously 
and had practiced more intensively at home. 
Thus, when they came for testing, the task 
could have been more automatic for the pa- 
tients than for the nonpatients. Informal 
contacts with the subjects indicated that pa- 
tients were more motivated to practice the 
task than nonpatients, since they expected 
some therapeutic benefit. 

There are two measures on which the re- 
laxation effects were somewhat more pro- 
nounced among nonpatients than among pa- 
tients: cardiac decelerations (orienting re- 
flexes) to the reaction time task and state 
anxiety. The slightly greater treatment effects 
among nonpatients for cardiac decelerations 
may partly be a mathematical artifact of 
changes in cardiac accelerations and partly a 
reflection of the higher heart rate “floor” for 
the patients, perhaps due to their higher state 
of general sympathetic tuning. Relaxation 
apparently did not affect heart rate floors. In 
a previous study we found that alcoholics also 
have higher floor heart rates than “normal” 
subjects (Lehrer & Taylor, 1974), and we 
interpreted this as reflecting changes in car- 
diac function due to chronic alcohol intake. 
It is possible that the effect in the present 
study was also due to drug intake in the pa- 
tient group, since most of the patients in the 
sample had histories of taking tranquilizers. 
It is also possible that the results of both 
studies are due to the effects of chronic 
anxiety in both patient groups. Perhaps chronic 
anxiety inhibits the individual from emitting 
orienting reflexes to situations that had pre- 
viously evoked defensive reflexes. This “shift” 
would be expected to occur over time among 
normal individuals (Sokolov, 1963). In this 
regard, note also that cardiac decelerations 
(orienting reflexes) to the loud tones in- 
creased significantly over sessions in the non- 
patients but not in the anxious patients 
(Table 3). 

The fact that state anxiety was decreased 
by relaxation more among the nonpatients 
than among the patients may result from the 
fact that many of the patients showed ob- 
vious evidence of “secondary gain” from their 
symptoms, including having a friend, parent, 
or spouse accompany them to each session 
because of their fears. To admit that their 
fear was reduced during the testing sessions 
would have justified their giving up this de- 
pendence. 


Relaxation, Alpha Feedback, and 
Generalization 


As revealed in the trend analyses on the 
nonpatients, alpha feedback produced greater 
increases in alpha but smaller changes in other 
physiological measures and subjective anxiety 
than progressive relaxation. These findings 
give some support to the notion of physio- 
logical specificity of alpha. These results are 
also consistent with the findings of Orne and 
Paskewitz (1974) that increases in alpha pro- 
duced by alpha feedback do not generalize 
well to other physiological functions or to 
subjective anxiety. Muscle relaxation appears 
to produce greater generalization. 


Physiological Activity Versus Reactivity 


It should be noted that relaxation effects 
were more pronounced for measures of physi- 
ological reactivity than for measures of tonic 
physiological activity. They were greater for 
maximum skin conductance level for each 
epoch than for initial skin conductance level 
in the epoch, greater for cardiac accelera- 
tions and decelerations than for tonic heart 
rate, and greater for percent alpha in the 
period following the loud tones than at any 
other time period. In terms of Routtenberg’s 
(1968) hypothesis of two separate arousal 
systems, perhaps progressive relaxation has 
greater effects on the type of arousal mediated 
by the limbic system than that mediated by 
the reticular system. 


Other Differences Between Patients 
and Nonpatients 


Another finding of interest is that the pa- 
tients appeared to be hyperresponsive to 
threat but somewhat underresponsive to a 
nonthreatening task, which generally produces 
an orienting reflex rather than a defensive 
reflex. The patients emitted larger skin con- 
ductance responses to the first loud tone and 
higher scores on the STAI at the beginning 
of each session than the nonpatients, but the 
patients emitted smaller skin conductance re- 
sponse amplitudes during the reaction time 
task than the nonpatients. Apparently, pa- 
tients respond with greater extremes of elec- 
trodermal reactivity than do the nonpatients, 
giving smaller responses to the nonthreatening 
reaction time task (which requires orienta- 
tion to the environment rather than defense 
from it) but a larger response to the threaten- 
ing first loud tone. This squares with the 
clinical picture of anxiety neurotics as easily 
upset by everyday stress and not always as 
“tuned in” to their environment because of 
their anxious preoccupations. 

Finally, Lader and Wing’s (1966) 
contention that people suffering from anxiety 
neurosis habituate more slowly than nonpatients 
to noxious stimuli is lent some support by 
this experiment. Patients showed less habitua- 
tion of alpha blocking and skin conductance 
levels than did nonpatients. 


Objective Measurement of Fear of Success 
and Fear of Failure: 
A Factor Analytic Approach 


Susan Sadd, Michael Lenauer, Phillip Shaver, and Noel Dunivant 
New York University 


Several objectively scored measures of fear of success and fear of failure have 
been designed in recent years, but there is little evidence that they measure two 
distinct, unidimensional constructs. The present study was undertaken primarily 
to answer two questions: Are fear of success and fear of failure operationally 
distinct? Do all fear of success measures tap a single unidimensional construct? 
Eight fear of success and fear of failure scales were administered to 415 male 
and female subjects, and the scores were intercorrelated. Results indicated that 
fear of success is not a unidimensional construct and that some of the measures 
of fear of success and fear of failure are highly related. Next, each scale was 
factor analyzed, and 37 new variables were created. These were in turn factor 
analyzed, and five highly stable orthogonal factors were obtained. One of these 
factors appears to be fear of success; another is clearly test anxiety (called fear 
of failure in the literature on achievement motivation). A third factor is con- 
cerned with sex-role-related attitudes toward success in medical school. A fourth 
seems to reflect neurotic insecurity, and the fifth has to do with the value of 
success. Indices of psychological well-being and psychosomatic illness related 
differently to each of the five factors. Implications and further questions are 
discussed briefly. 


Horner’s research on fear of success (Hor- 
ner, 1969a, 1969b, 1972, 1974) has generated 
well over 100 research articles in the past 
few years. According to Horner (1969a), fear 
of success (hereafter FOS) is “the fear that 
success in competitive situations will lead to 
negative consequences” (p. 38). Taken by it- 
self this definitional phrase is quite general. 
However, because Horner was primarily in- 
terested in a form of FOS that she thought 
was characteristic of females in American 
society, she included in her definition not only 
general negative consequences of success, such 
as social rejection (“unpopularity”), but also 
“loss of femininity.” Most subsequent studies 
have failed to support this specific definition, 
however. In a recent review of the burgeoning 
literature on fear of success, Tresemer (1976) 
concluded that the “hypothesis that there is 
a gender difference in FOS is not supported” 
(p. 217). Therefore, the more general defini- 
tion of FOS, leaving out loss of femininity, 
appears to be the most appropriate starting 
point for anyone undertaking studies in this 
area. Unfortunately, interested researchers 
will find little consensus concerning the best 
way to measure FOS. The present article 
takes a few steps toward remedying that 
problem. 

Horner’s earliest measure of FOS, and the 
one used most often by subsequent research- 
ers, was a Thematic Apperception Test 
(TAT) type of projective device based on the 
following story line: “After first term finals 
Anne (John) finds herself (himself) at the 
top of her (his) medical school class.” Sub- 
jects were asked to complete the story by de- 
scribing Anne or John in more detail, includ- 
ing how she (he) felt about this situation 
and what would probably happen in the fu- 
ture. This measure is problematic for several 
reasons. It is highly specific to a set of cir- 
cumstances—success in medical school— 
which is obviously not synonymous with 
success in general. Moreover, until recently, 
American medical schools were masculine in- 
stitutions, which meant that the hypothetical 
Anne was not only successful in a general 
sense but also highly deviant for her gender 
in terms of career choice (Lockheed, 1975). 
Add to this the fact that males were given 
the John cue and females the Anne cue, and 
it is easy to see that fairly specific attitudes 
about a successful male or female in medical 
school were being tapped, not necessarily a 
general personality trait, FOS. Finally, a 
present/absent scoring system based on one 
such story is not likely to be reliable (cf. 
Nunnally, 1967). 

Because of these and other problems, many 
new measures of FOS have been developed 
recently, one by Horner and her colleagues 
(Horner, Tresemer, Berens, & Watson, Note 
1) and several by other investigators (e.g., 
Cohen, 1975; Good & Good, 1973; Pappo, 
1973; Spence, 1974; Zuckerman & Allison, 
1976). Shaver (1976) reviewed the measures 
briefly and raised several questions about 
them. First, are all of the instruments mea- 
suring the same thing? The answer is almost 
certainly no. With some of the measures (e.g., 
Good & Good, 1973; Spence, 1974; Zucker- 
man & Allison, 1976), significant sex differ- 
ences have been obtained, but this is not the 
case with other measures (Cohen, 1975; 
Pappo, 1973). Even more disturbing, the sex 
difference between means is not always in the 
same direction for different measures; Spence’s 
instrument reveals more hostility among males 
toward a successful male, for example. Also 
disturbing are the low correlations between 
some of the new measures (e.g., Zuckerman 
& Allison, 1976) and Horner’s original one, 
even though these are purported to measure 
the same construct. 

A less familiar but even more serious 
problem is that FOS may not be different from 
what achievement motivation researchers 
(e.g., Atkinson & Feather, 1966) have called 
fear of failure. Shaver (1976) argued that 
most of the experimental results obtained by 
Cohen (1975), Pappo (1973), and Zucker- 
man and Allison (1976), which these authors 
interpreted as evidence for the construct 
validity of their FOS measure, could have 
been explained just as easily in terms of fear 
of failure. Compatible with Shaver’s argument 
is the highly significant correlation of .57, 
which Pappo reported between her measure 
and the Debilitating Anxiety scale of the 
Achievement Anxiety Test (AAT; Alpert & 
Haber, 1960), a commonly used measure of 
fear of failure. Jackaway and Teevan (1976) 
compared the new FOS measure developed 
by Horner et al. (Note 1) with the projective 
measure of fear of failure devised earlier by 
Birney, Burdick, and Teevan (1969) and 
found significant correlations between the two 
measures for both males and females (.42 and 
.57, respectively). This finding is not 
surprising given the considerable overlap between 
the two scoring systems. 

Believing that much of the inconsistency 
and confusion in the recent FOS literature 
is due to measurement problems, we under- 
took a factor analytic study of several mea- 
sures of FOS and fear of failure. The study 
was designed primarily to answer two ques- 
tions: (a) Are FOS and fear of failure mea- 
surably distinct? (b) To what extent do the 
recently devised objective tests of FOS 
measure a unitary (i.e., unidimensional) 
construct? To begin exploring another issue 
raised by Shaver (1976), namely, that 
avoidance motives proposed in the achievement 
literature might be related to psychosomatic 
and psychological symptoms of conflict and 
stress, in addition to or instead of being 
related to performance decrements, we also 
included questions concerning these symptoms. 

The present article focuses only on objec- 
tively scored measures of FOS and fear of 
failure, because these are likely to be more 
reliable and are obviously easier to administer 
and score. Further research will be necessary 
to discover whether one or more of the 
dimensions we discovered are present in the 
projective measures as well. We did, however, 
include one measure that is directly related 
to Horner’s original TAT-like device, Spence’s 
objectively scored version of the 
Anne/John cue. 


Measures Included in the Factor Analysis 


Before presenting our study in detail, each 
measure will be described briefly. Five 
measures of FOS and two measures of fear of 
failure were included. 

Spence (1974). The objective measure of 
FOS most similar to Horner’s original 
projective measure is the one developed by 
Spence, which requires that subjects 
complete three stories, including the one about 
Anne (or John) in medical school, and then 
answer a series of multiple-choice questions 
about the story. For the Anne story these 
questions include: “How likable do Anne’s 
classmates consider her?” In a study of 328 
male and female undergraduates, Spence 
(1974) found a significant relationship 
between her measure and Horner’s. Other 
analyses led Spence to conclude that her measure, 
as well as Horner’s, tapped attitudes toward 
kinds of achievement rather than a 
deeply rooted personality trait. It is also 
worth mentioning, because we replicated 
Spence’s results, that she found males to be 
more negative toward a successful male than 
females were toward a successful female. 
Although this is quite different from what 


S measure (Tresemer, 1974, 1976). 
In the research to be reported here, we 
used the Anne (or John) story lead followed 
by 15 of Spence’s multiple-choice questions. 

Good and Good (1973). These authors 
developed a 29-item self-report measure of 
FOS based on Horner’s (1969a, 1969b) 
notion that a person who fears success will be 
anxious and worried about other people’s 
negative reactions to her (or his) success. In 
a sample of 228 male and female college 
undergraduates, the mean FOS score for females 
was significantly higher than the mean score 
for males, as Horner expected. The measure 
contains items of the following sort, each of 
which is answered on a true-false scale: “I 
worry that I may become so knowledgeable 
that others will not like me.” “If I were to 
do well at something, I would worry that 
someone might try to undermine my success.” 

Zuckerman and Allison (1976). These 
authors’ 27-item scale contains statements 
concerning (a) benefits of success, (b) costs 
of success, and (c) attitudes toward success 
as compared with other alternatives. It was 
constructed on the basis of Horner’s theoriz- 
ing. The following items, answered on a 7- 
point agree—disagree scale, are representative: 
“The cost of success is overwhelming respon- 
sibility.” “When competing against another 
person, I sometimes feel better if I lose than 
if I win.” Based on Horner’s description of 
FOS, Zuckerman and Allison predicted that 
FOS would be greater for females than for 
males, and in three different samples of un- 
dergraduate college students, females scored 
higher than males. (The difference was sta- 
tistically significant in two out of three sam- 
ples.) In two of their samples, Zuckerman 
and Allison’s scale correlated weakly but sig- 
nificantly with Horner’s (1969b) measure. In 
a validation study, high-FOS subjects per- 
formed significantly less well on an anagrams 
task than did low-FOS subjects. 

Pappo (1973). The measures developed 
by Pappo (1973) and Cohen (1975) are dif- 
ferent in conception from the ones described 
thus far. Whereas Horner, Spence, Good and 
Good, and Zuckerman and Allison were all in- 
terested in a motivational construct related to 
sex role socialization, Pappo and Cohen were 
interested in a neurotic form of FOS that 
might be equally prevalent among males and 
females. The two researchers differed slightly 
in their conceptualization of the etiology of 
neurotic FOS, with Pappo favoring Sullivan- 
ian theory (Sullivan, 1953) and Cohen basing 
her conception on Freud’s discussion of oedi- 
pal conflicts. 

Pappo developed an 83-item true-false 
questionnaire to measure “academic fear of 
success.” The scale has been administered to 
large samples of high school and college stu- 
dents, and as expected no sex differences have 
been found. In an experimental study 
involving college students (Pappo, 1973), high 
scorers on the scale exhibited poor 
performance on a reading test following success 
feedback; low scorers, in contrast, improved. The 
following are typical items from Pappo’s 
scale: “I feel I need someone to push me to 
do the things I want to do.” “I find it dif- 
ficult to measure up to the standards I set 
for myself.” 

Cohen (1975). Cohen used Pappo’s (1973) 
scale as a model; consequently, many of the 
items in the two scales are quite similar, al- 
though Cohen’s are more general (i.e., not 
tied to academic situations). In a high-ability, 
achievement-oriented sample of high school 
students, Cohen’s 64-item true—false scale 
successfully predicted which subjects would 
perform poorly on a memory task following 
success feedback. (Both Pappo’s and Cohen’s 
scales can be found in Canavan-Gumpert, 
Garner, & Gumpert, 1977.) 

Measures of fear of failure. We included 
two measures of test anxiety or fear of fail- 
ure, the Test Anxiety Scale (TAS) used in 
many studies by Sarason (e.g., 1972, which 
includes the scale items) and the AAT, de- 
signed by Alpert and Haber (1960) within 
the framework of achievement motivation 
theory (Atkinson & Feather, 1966). The TAS 
is a 37-item true—false scale. The AAT con- 
tains two subscales, one measuring debilitat- 
ing anxiety (10 items, hereafter referred to 
as AAT—) and the other measuring facilitat- 
ing anxiety (9 items, AAT+); both refer 
explicitly to academic or intellectual testing 
situations. The following items are typical 
of the TAS and the AAT—: “While taking an 
important exam I find myself thinking how 
much brighter the other students are than I 
am.” “Nervousness while taking an exam or 
test hinders me from doing well.” (The 
AAT+ is not important to later conclusions; 
essentially, it measures a form of arousal in 
test situations that is negatively correlated 
with debilitating test anxiety.) 


Overview of the Present Study 


We administered all of the measures de- 
scribed above to 415 university students; in 
addition, each subject provided background 
information and answered questions 
concerning psychological problems and psychosomatic 
symptoms. Analysis of the data proceeded in 
stages. First, the reliabilities of the scales 
were computed for our sample, and, since 
these were acceptable, mean scores for males 
and females were compared. Second, the scales 
were intercorrelated, separately for males and 
females, to determine how strongly they re- 
lated to each other. Third, each scale was 
factor analyzed, and 37 subscales were cre- 
ated. These subscales were in turn factor 
analyzed, yielding five orthogonal factors. 
Finally, scores on these five factors were cor- 
related with background variables and mea- 
sures of psychological well-being. 


Method 
Subjects 


A letter was sent in the spring of 1976 to juniors 
and seniors at New York University inviting them 
to participate in a study of “achievement-related 
motivation and career choice.” Older students were 
preferred, because they were closer to graduation 
and career choice, and thus to real-world success 
or failure. To obtain a sufficiently large sample, 
subjects were also recruited in classrooms and dor- 
mitory lobbies. Each subject received $3 for filling 
out the battery of questionnaires. Data were collected 
from 430 subjects, but 15 of these failed to follow 
instructions or left out large numbers of items. Thus, 
166 males and 249 females (N = 415) were 
represented in the data analyses. 


Materials and Procedure 


Each subject received a large envelope containing 
a background information form and two booklets of 
questions (Parts 1 and 2). Included in the back- 
ground information form were standard demographic 
questions, questions about the subject’s relationship 
with his or her parents, and a list of psychological 
problems and psychosomatic symptoms. Part 1 con- 
tained Horner’s (1969b) medical school cue, fol- 
lowed by Spence’s (1974) objective questions. (Fe- 
males wrote about Anne in medical school; males, 
about John.) Instructions were the same as Horner’s,
except that subjects were asked not to look
at the multiple-choice questions following the story 
until after the story had been completed. Part 2 
contained questions from the objective measures of
FOS and fear of failure (Alpert & Haber, 1960;
Cohen, 1975; Good & Good, 1973; Pappo, 1973;
Sarason, 1972; Zuckerman & Allison, 1976). The
items from all six measures were shuffled and printed
in random order, and all were answered on a 4-point
scale from strongly disagree, disagree, agree,
to strongly agree. For items worded in the FOS
or fear of failure direction, strongly disagree was
scored as 1; disagree, as 2; agree, as 3; and strongly
agree, as 4. For items worded in the opposite direction
(to control for acquiescence response bias), scoring
was reversed. Scale scores were obtained by


Because of the complicated nature of the analyses
performed, each stage of the analysis and its
results will be described and discussed in turn. Gen-
eral discussion will be reserved until all analyses
have been presented.
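
The item scoring described under Materials and Procedure can be sketched in a few lines. This is only an illustration, not the authors' procedure: the function names are hypothetical, and summing item scores to form a scale score is an assumption, since the surviving text does not state how scale scores were obtained.

```python
# Scores for the four response alternatives, as described in the text.
RESPONSE_SCORES = {
    "strongly disagree": 1,
    "disagree": 2,
    "agree": 3,
    "strongly agree": 4,
}


def score_item(response, reverse_keyed=False):
    """Score one response; reverse-keyed items are flipped (1<->4, 2<->3)."""
    score = RESPONSE_SCORES[response]
    return 5 - score if reverse_keyed else score


def scale_score(responses, reverse_flags):
    """Sum the item scores (summation is an assumption of this sketch)."""
    return sum(score_item(r, rev) for r, rev in zip(responses, reverse_flags))


# Hypothetical two-item scale whose second item is worded in the opposite direction.
example_total = scale_score(["strongly agree", "disagree"], [False, True])
print(example_total)
```

Reversing with `5 - score` keeps every item on the same 1 to 4 range, so a high total always points in the FOS or fear of failure direction.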
Results 


Phase 1: Analyses Involving the
Original Scales


Because the scoring of the scales was modified
in the present study, so that all questions
could be answered on the same 4-point
continuum, internal-consistency reliability coefficients
were computed (see Table 1). Since
the number of items differs from scale to
scale, estimated reliabilities for tests of standard
length (20 items) were computed using
data from the present study. This allows
comparison of the reliabilities of the various
scales. These estimated reliabilities are
also presented in Table 1.
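
The text does not name the formula used to estimate the reliability of a 20-item version of each scale. The Spearman-Brown prophecy formula is the standard way to make such a projection, so the sketch below uses it as an assumption. The function name is hypothetical, and the 10-item length in the example is likewise assumed, because the text gives no item counts for individual scales.

```python
def spearman_brown(reliability, n_items, target_items=20):
    """Project a scale's reliability to a different test length.

    reliability  -- observed reliability (e.g., coefficient alpha)
    n_items      -- number of items in the observed scale
    target_items -- length of the hypothetical scale (20 in the text)
    """
    k = target_items / n_items  # factor by which the test is lengthened
    return k * reliability / (1 + (k - 1) * reliability)


# Assumed example: a 10-item scale with alpha = .84 projected to 20 items.
projected = spearman_brown(0.84, 10)
print(round(projected, 2))
```

Lengthening a scale raises the projected reliability and shortening it lowers the projection, which is why long scales with high observed coefficients can show lower 20-item estimates.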
Some of the scale constructors, following
Horner (1969b), predicted sex differences in
FOS (Good & Good, 1973; Spence, 1974;
Zuckerman & Allison, 1976); others did not
(Cohen, 1975; Pappo, 1973). In the present
study, the only measure that produced a
significant difference between males and females
was Spence’s measure of FOS. The results
of the Spence questionnaire were similar
to those reported by her in 1974; males
showed significantly higher FOS than females.
(Standard scores were computed for each of
the items on the Spence scale. These standard
scores were summed to yield a total scale
score for each subject; scores ranged from
...7 to 28.58.) Means, standard deviations,
and t tests are summarized in Table 2.
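
A comparison of male and female means can be approximated from summary statistics alone. The sketch below assumes an independent-samples t test with pooled variance; the text does not say which variant the authors used, and the function name is hypothetical. The example call uses the means and standard deviations printed for Spence's measure in Table 2. Because those values are rounded, the result should land near the tabled t but need not match it exactly.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Independent-samples t (pooled variance) from summary statistics."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error, df


# Males (n = 166) versus females (n = 249) on Spence's measure.
t_value, df = pooled_t(2.47, 8.12, 166, -2.15, 7.58, 249)
print(round(t_value, 2), df)
```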
Table 1
Reliabilities of the Original Scales for the
Present Sample

                                                 Reliability of
Item                            Coefficient α    a 20-item scale

Fear of success
  Spence (1974)                      .83              .87
  Good & Good (1973)                 .89              .84
  Zuckerman & Allison (1976)         .68              .61
  Pappo (1973)                       .90              .69
  Cohen (1975)                       .92              .78
Fear of failure
  Debilitating scale                 .84              .91
  Facilitating scale                 .64              .80
  Test Anxiety Scale                 .91              .85


Of major interest in this study were the
relationships among the various instruments
designed to measure FOS and fear of failure. A
matrix of correlations among the eight scales
was obtained separately for males and females.
As can be seen by comparing the values
for males and females in Table 3, there
are very few differences between the corre-
sponding correlations for the two sexes. A
number of interesting relationships appeared
for both males and females. For ease of un-
derstanding, these will be discussed under
three separate headings: (a) relationships
among two or more FOS measures; (b) re-
lationships among the fear of failure mea-
sures; and (c) relationships between FOS and
fear of failure. Each group of relationships
will be discussed in turn.

1. Spence’s (1974) measure of FOS cor- 
related most strongly with the measure de- 
veloped by Zuckerman and Allison (1976). 
(r_f = .41; r_m = .37; the subscripts f and m
refer to females and males.) This finding is
quite reasonable given that both Spence and 
Zuckerman and Allison designed their mea- 
sures with Horner’s work in mind. Neverthe- 
less, the two measures are quite different in 
design and item format, so it is not surpris- 
ing that the correlations between the two 
scales were not higher. It is impossible to 
determine the extent to which method vari- 
ance, as distinct from differences in content, 
contributed to lowering these correlations. 

The correlation between Pappo’s (1973) 
and Cohen’s (1975) measures approached the 
reliabilities of the scales (r_f = .88; r_m = .86).
The correlations between Pappo’s measure 
and other variables were quite similar to the 
Table 2
Mean Scores and Standard Deviations for Males and Females on the Original Scales

                                     Males              Females
Item                               M        SD        M        SD        t

Fear of success
  Spence (1974)                   2.47     8.12     -2.15     7.58     5.71*
  Good & Good (1973)             61.28    10.53     59.83    11.08     1.29
  Zuckerman & Allison (1976)     64.57     7.06     64.65     6.27      0
  Pappo (1973)                  197.40    20.72    196.74    22.81      .09
  Cohen (1975)                  157.90    18.83    157.80    20.39      0
Fear of failure
  Debilitating scale             22.93     4.72     23.53     4.97     1.20
  Facilitating scale             21.95     3.37     22.03     3.39      .22
  Test Anxiety Scale             87.85    13.14     90.02    15.17     1.47

* p < .001; all other t tests were nonsignificant.



correlations between Cohen’s measure and 
those same variables. According to Pappo and 
Cohen, their conception of FOS is quite dif- 
ferent from Horner’s, a claim that is sup- 
ported by the relatively low correlations be- 
tween the Pappo and Cohen measures on the 
one hand and the measures developed from 
Horner’s theory (Spence, 1974; Zuckerman 
& Allison, 1976) on the other. For example, 
Pappo’s scale correlated .39 for females (.41 
for males) with Zuckerman and Allison’s 
scale; r_f = .30 with the measure designed by
Spence (r_m = .27).

Good and Good’s (1973) scale appears to
measure elements common to the Horner-
based scales and the measures designed by
Pappo and Cohen. The correlations between
FOS as measured by Good and Good and
either the Horner-based measures or the
Pappo and Cohen measures were higher than
any correlation between a Horner-based mea-
sure and Pappo’s or Cohen’s scale. Appar-
ently, the Good and Good measure shares some
content with both kinds of FOS measures.

2. The two measures of fear of failure, TAS
and AAT-, were highly correlated with each
other (r_f = .83, r_m = .78), thus replicating
the findings of earlier investigators (Alpert &
Haber, 1960; Sarason, 1960). The relation-
ships between the TAS and other measures
were nearly identical to the correlations be-
tween the AAT- and those same measures,
which lends support to the contention that
the TAS and AAT- measure the same con-
struct. The AAT+ correlated negatively with
the TAS and AAT- at about the level
reported by Alpert and Haber (1960).


Table 3
Correlations Among Eight Original Scales

Item                              1     2     3     4     5     6     7     8

1. Spence (1974)                  --   .36   .41   .30   .26   .24   ...   ...
2. Good & Good (1973)            ...    --   ...   ...   ...   .54  -.06   ...
3. Zuckerman & Allison (1976)    .37   ...    --   .40   .39   .29  -.06   ...
4. Pappo (1973)                  .27   .69   .41    --   .88   .70  -.16   ...
5. Cohen (1975)                  ...   ...   ...   .86    --   ...   ...   ...
6. Debilitating scale            .19   .53   .20   .66   .62    --  -.40   .83
7. Facilitating scale            ...   ...   ...   ...   ...   ...    --   ...
8. Test Anxiety Scale            .08   .50   .17   .62   .58   .78   ...    --

Note. Correlations above the diagonal are for females (n = 249); those below the diagonal are for males
(n = 166). For n = 249, r.99 = .16; for n = 166, r.99 = .19.


3. As mentioned earlier, some authors have
suggested that FOS and fear of failure are
indistinguishable motives (Jackaway & Teevan,
1976; Shaver, 1976). The Pappo (1973)
and Cohen (1975) measures correlated quite
highly with the measures of fear of failure,
while the Horner-based measures did not. Since
the latter are based on the notion that FOS
develops through sex role socialization, one
would not expect the Horner-based measures
to be related to fear of failure. Nor
would one expect them to be related if the
Spence measure is tapping attitudes
toward success in medical school rather than
fear of success.

An examination of the correlations among
the scales suggests that FOS is not unidimensional
and that there is considerable commonality
between FOS and fear of failure as they
are currently operationalized. A factor analytic
study is needed to reveal the basic underlying
dimensions that have until now been


Phase 2: Factor Analyses


To conduct a reliable factor analysis of 254
items, about 1,300 subjects would be needed,
and this number was beyond our resources.
Therefore, an alternative two-step procedure
was followed: First, each of the eight scales
(treating AAT+ and AAT- as separate
scales) was factor analyzed; next, the resulting
factor scores were factor analyzed. Since
37 factors resulted from the first step of
the procedure, it was possible to produce
reliable factors in the second step using
data from 415 subjects.

First-order factor analysis. Each of the
scales was factor analyzed using a principal
components analysis with communalities
in the diagonal. (This and all other procedures
followed in the study have been described
by Gorsuch, 1974.) The number of
factors extracted from each scale was determined
using multiple criteria: the scree test,
percent of variance accounted for, and interpretability.
The extracted factors were rotated
to the direct oblimin criterion. An item
with a factor loading greater than .30 was
taken as an item representing that factor.
When an item loaded significantly on two or
more factors, the differences between loadings
were tested for significance. If the difference
was not significant, the item was assigned to
both factors in question. In the few cases in
which items did not load significantly on any
factor, they were eliminated from later
analyses.

The first-order factor analysis yielded 9 fac- 
tors for Pappo’s (1973) scale, 4 for Zucker- 
man and Allison’s (1976), 4 for Good and 
Good’s (1973), 8 for Cohen’s (1975), 6 for 
the TAS, 1 for AAT—, 1 for AAT+, and 4 
for Spence’s measure; thus, a total of 37 new 
variables were created. 

Second-order factor analysis. Scores for 
each subject were computed on the 37 factors 
identified in the first-order factor analyses.
All items that loaded .30 and above were as- 
signed unit weights and were summed to pro- 
duce total scores. To evaluate the stability 
of the factor solution, subjects were randomly 
assigned to two separate subsamples (Data 
Sets 1 and 2, or DS-1 and DS-2). Since there 
were significant mean differences between 
male and female scores on 2 of the 37 factors 
(1 from Spence’s measure and 1 from Pap- 
po’s), within-groups correlations were com- 
puted for the two subsamples as recommended 
by Cooley and Lohnes (1971). Following an 
initial principal components analysis of the 
within-groups correlation matrix for DS-1, 
which produced a six-factor solution, the anal- 
ysis was repeated specifying that it produce 
three-, four-, and five-factor solutions, each 
followed by both oblique and orthogonal 
(varimax) rotations. The choice of the final 
solution, 5 orthogonal factors, was based on 
clarity and interpretability. 
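
The unit-weight scoring described above (items loading .30 and above are summed with a weight of 1) can be illustrated as follows. This is a sketch under stated assumptions: the function name, the loading matrix, and the response data are all hypothetical, and negatively loading items are simply ignored here because the text does not say how they were handled.

```python
def unit_weight_scores(responses, loadings, cutoff=0.30):
    """Sum, for each subject and factor, the items loading at or above cutoff.

    responses -- one list of item scores per subject
    loadings  -- one list of factor loadings per item
    """
    n_factors = len(loadings[0])
    # For every factor, collect the indices of the items that define it.
    members = [
        [i for i, row in enumerate(loadings) if row[f] >= cutoff]
        for f in range(n_factors)
    ]
    return [[sum(subject[i] for i in items) for items in members]
            for subject in responses]


# Hypothetical three-item, two-factor example; item 2 loads on both factors.
loadings = [[0.62, 0.05], [0.41, 0.33], [0.10, 0.58]]
responses = [[4, 3, 1], [2, 2, 4]]
scores = unit_weight_scores(responses, loadings)
print(scores)
```

An item that clears the cutoff on two factors contributes to both sums, which parallels the assignment of some items to two factors in the first-order analysis.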

To determine the degree of stability of the 
factor structure from DS-1, we attempted to 
replicate it using data from the second half 
of the sample (DS-2). A principal compo- 
nents analysis of the within-groups correla- 
tions for DS-2 was performed. Then, taking 
the varimax rotated factor structure matrix 
from DS-1 as the target matrix, the initial 
factor solution for DS-2 was subjected to an 
orthogonal Procrustes rotation (Cliff, 1966).²


1 Factor analyses were computed using program 
FACTOR from the Statistical Package for the Social 
Sciences (Nie, Hull, Jenkins, Steinbrenner, & Bent, 
1975). 

2 These analyses were conducted using VARAN 2,
a linear model variance analysis program (Hall,
Kornhauser, & Thayer, 1973).
That this procedure succeeded in generating
highly congruent factor structures is evinced 
in the correlations between matching factors 
for the two data sets: .97, .98, .92, .85, and 
.84. Having replicated the five-factor solu- 
tion, we recombined the data from the two 
halves of the sample and computed a final 
five-factor varimax solution using all 415 
cases. Of course, the results of this analysis 
were virtually identical to the results for each 
half of the sample. 
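
The replication step can be sketched with synthetic data. The code below rotates one loading matrix toward a target with an orthogonal Procrustes rotation and then correlates matching factors. It is an illustration only: the matrices are randomly generated rather than taken from the study, the function names are hypothetical, the SVD solution is the textbook one rather than the program the authors ran, and whether the authors correlated loadings or factor scores is not stated.

```python
import numpy as np


def procrustes_rotate(loadings, target):
    """Rotate loadings orthogonally to best match target (least squares)."""
    u, _, vt = np.linalg.svd(loadings.T @ target)
    rotation = u @ vt  # orthogonal matrix minimizing ||loadings @ R - target||
    return loadings @ rotation


def matched_factor_correlations(a, b):
    """Correlate each column of a with the same column of b."""
    return [float(np.corrcoef(a[:, j], b[:, j])[0, 1]) for j in range(a.shape[1])]


# Synthetic stand-ins: 37 variables, 5 factors.
rng = np.random.default_rng(0)
target = rng.normal(size=(37, 5))              # plays the role of the DS-1 solution
q, _ = np.linalg.qr(rng.normal(size=(5, 5)))   # arbitrary orthogonal rotation
unrotated = target @ q.T + rng.normal(scale=0.05, size=(37, 5))  # plays DS-2

rotated = procrustes_rotate(unrotated, target)
correlations = matched_factor_correlations(rotated, target)
print([round(r, 2) for r in correlations])
```

Because the rotation is constrained to be orthogonal, it cannot manufacture agreement by stretching factors; high correlations after rotation indicate that the two solutions span nearly the same space.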

The new scales. The end result of the fac- 
tor analytic process can be viewed as five new 
scales composed of items from the original 
measures of FOS and fear of failure. To un- 
derstand the scales and their implications, one 
must know something about their composi- 
tion and meaning. 

Factor 1 is composed of first-order factors 
from the measures designed by Good and 
Good (1973), Zuckerman and Allison (1976), 
Pappo (1973), and Cohen (1975). We have 
named this factor Concern about the Nega- 
tive Consequences of Success, because all of 
the items loading highly on it reflect concern 
about jealousy, exploitation, criticism, sabo-
tage, rejection, burdensome responsibility, and
pressure following success. Also represented
on this factor are statements about feelings 
and behaviors that may result from such con- 
cern. The factor is characterized by the fol- 
lowing sorts of items: 


(a) If I were outstanding at something, I would 
worry about the possibility of others making fun of
me behind my back, and (b) the cost of success is 
overwhelming responsibility. 


Of the five factors, this one comes closest to 
the general definition of FOS offered by Hor- 
ner (1969a): “fear that success in competi- 
tive situations will lead to negative conse- 
quences” (p. 38). 

Factor 2 is composed solely of first-order 
factors from Pappo’s (1973) and Cohen’s 
(1975) measures. This factor has been labeled 
Self-deprecation and Insecurity, and it is 
characterized by failure to live up to one’s 
own standards, self-consciousness, unassertive- 
ness, and behavioral manifestations of inse- 
curity. This factor includes items such as: 
(a) I frequently find it difficult to measure 
up to the standards I set for myself, and (b) 
I often brood about something I’ve said which
may have been taken the wrong way by another
person. It is extremely important that
none of the items on this factor explicitly 
mentions negative consequences of success, 
and therefore the factor may not represent 
fear of success at all. 

Factor 3 is simply Test Anxiety. All six
first-order TAS factors, plus AAT- and
AAT+, load on this factor. None of the com-
ponents of the other scales loads significantly
on Factor 3. This is probably due in part to
the extreme specificity of the situation de-
scribed in the scales; all refer directly to anx-
iety in testing situations. Whether this factor 
deserves the more general label of fear of 
failure is disputable. 

Factor 4 contains all of the first-order fac- 
tors derived from Spence’s measure of FOS, 
with no contributions from any other mea- 
sure. This too is probably the result of the 
specificity of the situation described in the 
measure, which leads us to call it Attitudes 
Toward Success in Medical School. This des-
ignation is compatible with Spence’s descrip-
tion of Horner’s cue as a measure of attitudes 
rather than personality. 

Factor 5 is called Extrinsic Motivation to 
Excel. Items on this factor are concerned with 
the extreme importance of success, status, and 
power. There is no mention of fear or nega- 
tivity of success. The 19 items come from 
two factors of Zuckerman and Allison’s
(1976) scale and one factor from Pappo’s
(1973) scale; they include (a) When you’re
on top, everyone looks up to you, and (b) I
feel that it is important for people of higher
status to like me. Most of the items on this
factor were written to reflect the absence or
opposite of FOS; in fact, however, they seem
to measure an independent dimension.


Phase 3: Relationships Between the Factors 
and Psychological Well-Being and 
Psychosomatic Symptoms 


Five factor scores were computed for each
subject using the varimax factor weights from


3 A complete list of items, factor loadings, and
scoring procedures can be obtained from the authors
on request.


the factor analysis based on all 415
cases. These scores were then correlated
with demographic variables, family background
variables, grade point average, and
indices of psychological well-being including
psychosomatic symptoms.⁴ The latter indices
were based on a 4-point scale associated with
the following question: “How much have the
following problems bothered you during the
...?” (The alternatives were not at all,
a little bit, moderately, and quite a bit.) We
expected the first three factors (Concern
about Negative Consequences of Success, Self-
deprecation and Insecurity, and Test Anxiety)
to be related to psychological and psychosomatic
symptoms, since all of them involve
anxiety or conflict. This should be
especially true for the Self-deprecation factor,
which is very similar to measures of general
anxiety or neurosis. We did not expect Factor 4,
Attitudes Toward Success in Medical
School, to be related to psychological well-being,
although Horner’s (1969b, 1974) original
work might have led some investigators
to expect this. Factor 5, Extrinsic Motivation
to Excel, might be related to a few symptoms
of stress, since it connotes great drive and
ambition, but there is no certain basis for
this prediction. (Factor 5 was unanticipated
and is still the least well understood of the
five factors.)
The criterion for considering a correlation
coefficient as meaningful was set at .20. Although
small, a correlation of this size is
reliable (p < .001). All correlations
of .20 or higher are shown in Table 4.
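
The claim that a correlation of .20 is reliable at p < .001 can be checked with a short calculation. The sketch below uses the Fisher z approximation and assumes the full sample of 415 entered each correlation; the text does not state which test the authors applied, and the function name is hypothetical.

```python
import math
from statistics import NormalDist


def correlation_p_value(r, n):
    """Two-tailed p for H0: rho = 0, using the Fisher z approximation."""
    z = math.atanh(r) * math.sqrt(n - 3)
    return 2 * (1 - NormalDist().cdf(abs(z)))


p_value = correlation_p_value(0.20, 415)
print(p_value < 0.001)
```

With a sample this large, even a coefficient that explains only 4% of the variance is very unlikely to arise by chance, which is why the authors separate statistical reliability from the question of whether a correlation is large enough to matter.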
As expected, Factor 4 (Attitudes Toward
Success in Medical School) was not correlated
with any of the psychological or psychosomatic
variables. Factor 5, Extrinsic Motivation
to Excel, was correlated with only two,
both indicating sleep disturbances. Also
as expected, Factor 2, Self-deprecation and
Insecurity, was related to the largest number
of problems and symptoms (14) and yielded
the largest correlation coefficients. Notice
that the highest correlation in this list is the
one between Factor 2 and feelings of worthlessness,
which is quite consistent with the
name assigned to the factor, Self-deprecation
and Insecurity. Test anxiety (Factor 3) was
Table 4
Correlations of Factors with
Psychosomatic Symptoms

Symptom                                     Factor

Headaches                                   .20
Feeling tired or low in energy              .36
Poor appetite                               .20
Crying easily                               .20  .21
Feeling lonely                              .39
Worry and anxiety                           .39  .23
Irrational fears                            ...
Nausea or upset stomach                     .22
Trouble falling asleep                      .21  .22
Sleep that is restless or disturbed         .28  .28  .20
Waking up too early in the morning          .24
Feeling tense or keyed up                   ...
Overeating                                  .25
Feelings of worthlessness                   .53  .29
Feelings of guilt                           .40  .24
Trouble concentrating                       .40  .34
Feeling that you just can’t go on           .26  .31  .30
Difficulty making decisions                 .43


Note. None of the correlations with Factor 4 exceeds
.20. The following symptoms did not correlate above
.20 with any factor: faintness or dizziness; loss of
sexual interest or pleasure; stomach ulcers or colitis;
pains in heart or chest; pains in lower back; re-
curring diarrhea; chronic constipation; high blood
pressure; trouble getting your breath.

also correlated with several psychosomatic 
and psychological problems, although not as 
many as were related to Factor 2. The fact 
that the highest relationship was between test 
anxiety and poor concentration is quite com- 
patible with previous research on test anx- 
iety (e.g., Sarason, 1972; Wine, 1971). Fac-
tor 1, Concern over the Negative Conse- 
quences of Success, which we have said seems 
closest to Horner’s original conception of 
FOS, was also related to three psychological 
and psychosomatic problems: poor appetite, 
waking up too early in the morning, and feel- 
ing that one “just can’t go on.” Perhaps it is 
worth mentioning that all three are common 
symptoms of depression. 


4 Correlations with demographic and background 
variables will not be discussed in this article. A list 
and discussion of the significant relationships can be 
obtained from the authors. 


Discussion 


We began with two major questions: (a) 
Are all of the recently proposed objective tests 
of FOS measuring the same unidimensional 
construct? and (b) Are FOS and fear of fail- 
ure measurably distinct? The answer to the 
first question is no. The answer to the sec- 
ond question is more complex. FOS as mea- 
sured by Pappo (1973) and Cohen (1975) is 
closely related to measures of test anxiety 
(fear of failure), but FOS as measured by 
Spence is not. 

Seeking clearer answers to our two major 
questions, we factor analyzed each of the 
original scales, producing 37 new variables, 
These variables were then factor analyzed, 
and five orthogonal factors were produced. 
One of these, Factor 3, was clearly Test Anx- 
iety; all components of the original test anx- 
iety scales (TAS, AAT+, and AAT-) loaded
on this factor, but no other variables did. If 
one agrees with achievement motivation re-
searchers that test anxiety should be called
“fear of failure,” then Factor 3 represents
fear of failure. We are hesitant to accept this
general label, however. The items on Factor
3 are quite specific to academic or intellectual
tests and may not be good indicators of fear
of failing in other situations.

Factor 2, which we have called Self-depre- 
cation and Insecurity, might serve as a more 
general measure of fear of failure. The items 
reflect fear that the person cannot or will 
not live up to his or her own standards. Some 
of the items refer explicitly to low self-con- 
fidence and anticipation of failure. As defined 
by achievement-motivation researchers, fear 
of failure (or the “motive to avoid failure”) 
is a “capacity to experience shame given non- 
attainment of a goal (failure)” (Weiner, 
1972, p. 200). High scorers on Factor 2 surely 
display this capacity. 

There are two obstacles to accepting Fac- 
tor 2 as a general measure of fear of failure, 
however. First, many of the items that load 
highly on Factor 2 deal with neurotic inhibi- 
tion of assertiveness; high scorers are afraid 
to express their desires or to stand out in a 
group. We do not know for sure whether this 
should be included in a general measure of 
fear of failure. Second, Pappo (1973) and 
Cohen (1975) argued that FOS is unconscious
and hence is expressed only indirectly; it may 
even manifest itself to the success-anxious 
person as fear of failure. This argument is 
weakened somewhat by our finding that the 
items that explicitly mention negative reac- 
tions to success load on Factor 1, not on 
Factor 2, which contains the largest portions 
of Pappo’s and Cohen’s scales. Still, we can- 
not completely rule out the possibility that 
Factor 2 measures unconscious fear of suc- 
cess. Only careful experimental studies can 
clarify this matter. 

Factor 1, Concern over Negative Conse- 
quences of Success, fits very well with Hor- 
ner’s original conception of FOS, except that 
it does not include loss of femininity. Horner 
believed that FOS was significantly more 
prevalent in women than in men because it 
was the result of sex role socialization pro- 
cesses. According to the stereotypic image of 
males and females, independence, competence, 
intellectual achievement, and leadership are 
positive attributes for males, but they are 
inconsistent with femininity. In the present 
study, however, we found no sex differences 
in concern about negative consequences of 
success. Males are just as likely as females to 
exhibit concern over the negative conse- 
quences of success. None of the items that 
load on Factor 1 mentions conflicts with or 
loss of femininity. Rather, the concern is 
over jealousy, exploitation, social rejection,
and excessive pressure and responsibility. 
These are as likely to be concerns of men as 
they are of women. Thus, contrary to Hor- 
ner’s (1972, 1974) theory, sex role socializa- 
tion is not sufficient to explain the develop- 
ment of this form of FOS. The real develop- 
mental factors that cause some people, male 
or female, to score high on Factor 1 remain 
to be identified. 

We should not ignore the possibility that 
Horner was correct about the link between 
femininity and FOS in 1968. (Her data were 
collected in 1965.) The social climate has 
changed considerably since then, and this cer- 
tainly could have affected our subjects, who 
were college juniors and seniors in 1976. In 
fact, Tresemer (1976) presented evidence 
for a historical change. Despite changes in 
sex role orientation, and perhaps also in atti-
tudes toward achievement, however, we still
find many males and females who experience 
FOS as measured by Factor 1. 

The factor closest to Horner’s (1969b) 
original measure of FOS is Factor 4, which 
we have called Attitudes Toward Success in 
Medical School to indicate that we doubt 
whether it has much to do with fear of suc- 
cess. In our study, as in Spence’s (1974), 
males were actually more negative toward a 
successful male medical student than females 
were toward a corresponding female. Spence
showed that the reasons for the hostility were
not identical for males and females. Thus,
some of Horner’s arguments might be correct 
despite what seems to be a strongly contra- 
dictory pattern of results. However, given 
that the attitudes expressed in people’s stories 
are quite complex and probably specific to 
the medical school situation, it is not surpris- 
ing that research based on this measure has 
been confusing and contradictory (Tresemer, 
1976).

Factor 5, Extrinsic Motivation to Excel,
was not anticipated and remains somewhat of
a mystery. For the moment, the most impor-
tant point to be made about Factor 5 is that
it contains items from the scales designed
by Pappo (1973) and Zuckerman and Allison
(1976), who expected the items to indicate
absence of FOS; in fact, they appear to mea-
sure an entirely different dimension.

Directions for further research are clear.
(a) It would be useful to develop shorter
scales to represent Factors 1, 2, and 5.
Spence’s measure and the existing test anx-
iety scales are acceptable in their original
forms. (b) Experimental studies are needed
to assess the behavioral consequences of FOS
and fear of failure. FOS could be measured
by the items on Factor 1; either Factor 2 or
Factor 3 could be used to measure fear of
failure, whichever proves to have the most
construct validity. Presumably, a person for
whom FOS is the dominant motive would
perform more poorly following success than
following failure. A person with strong fear
of failure and little or no fear of success
should be disorganized by failure but quite
... by success. (c) The possibility that
achievement conflicts lead to psychosomatic
symptoms of specific kinds deserves much
more attention than it has received to date.
(d) It is important to determine the develop- 
mental antecedents of the kind of FOS mea- 
sured by Factor 1. (e) Since we did not in- 
clude Horner’s most recent measure (Horner 
et al., Note 1), it would be useful to conduct 
a study explicitly to determine how her new 
measure, which is beginning to be widely used, 
is related to the five dimensions discussed 
here. If Jackaway and Teevan (1976) are 
right, Horner’s new measure will be substan- 
tially correlated with our fear of failure di- 
mensions (Factors 2 and 3). (f) Finally, it 
is important to reevaluate Atkinson’s model 
of achievement motivation (Atkinson & 
Feather, 1966) in light of FOS research. Hor- 
ner intended for the concept to play a role 
in a revised comprehensive model of achieve- 
ment behavior, but so far no one has pro- 
posed such a model. 
WISC-R Factor Structures Among Anglos, Blacks,
Chicanos, and Native-American Papagos 


Daniel J. Reschly 
Iowa State University 


Wechsler Intelligence Scale for Children-Revised (WISC-R) factor structures
were compared for samples of Anglo, Black, Chicano, and Native-American
Papago children from Pima County, Arizona. The samples were randomly se-
lected from school enrollment rosters and stratified by ethnicity, grade level,
sex, and urban-rural residence (N = 950). Application of two objective proce-
dures for determining the appropriate number of factors for each group sug-
gested a three-factor solution for Anglos, a two- or three-factor solution for
Chicanos depending on procedure used, and two-factor solutions for Blacks
and Native-American Papagos. The two-factor solutions were highly similar for
the four groups. The three-factor solutions were similar for Anglos and Chicanos
but were substantially different for the other groups. The groups were highly
similar in terms of the proportion of variance accounted for by a general factor,
and the Verbal-Performance scale distinction appeared equally appropriate for
all groups.


Examination of the construct validity of a 
test in samples of diverse sociocultural groups 
provides evidence concerning the appropriate- 
ness and fairness of the use of the test with 
different groups. Comparability of factor 
analysis results for different groups, and the 
degree to which the results of the factor anal-
ysis are consistent with the major scores and
common interpretations of the test are neces-
sary conditions for fairness in use of the test 
with culturally diverse persons. Indeed, if a 
test is not measuring the same underlying 
abilities or if the commonly used scores from 
the test represent varying abilities depending
on group membership, then use of the test 
with culturally different persons is probably 
inappropriate and unfair, and the predictive
validity of the test is likely to be lower for 
specific groups. Since the Wechsler Intelli- 
gence Scale for Children (WISC) and the 
Wechsler Intelligence Scale for Children-Re- 
vised (WISC-R) have been the most fre- 
quently used measures of general intelligence 
in schools and clinics, data from diverse 
groups concerning the construct validity of 
the WISC-R are needed for practitioners to 
make judgments concerning its appropriate- 
ness and possible fairness. 

Although numerous investigations of the 
factor structure of the WISC appeared in the 
literature (Sattler, 1974), only three studies 
compared factor similarity among diverse so-
ciocultural groups, and no studies on the
WISC-R of this nature have appeared to 
date. Generally, these studies found high simi- 
larity for Blacks and Anglos (Lindsey, 1967) 
and for Blacks, Anglos, and Chicanos (Silver- 
stein, 1973) when two-factor solutions for the 
WISC were used. However, when the three-
factor solutions were applied (Semler & Iscoe, 
1966), significant differences resulted in the 
second and third factors for Blacks and
Anglos.

The present study attempted to extend
Kaufman’s (1975) factor analysis of the 1974 
WISC-R to three non-Anglo groups. Kauf- 
man interpreted three factors in an analysis 
of the standardization data by age level. 
These factors were labeled Verbal Compre- 
hension (VC), formed by the subtests of In- 
formation (I), Similarities (S), Vocabulary 
(V), and Comprehension (C); Perceptual 
Organization (PO), formed by Picture Com- 
pletion (PC), Picture Arrangement (PA), 
Block Design (BD), Object Assembly (OA), 
and Mazes (M); and Freedom from Distrac- 
tibility (FD), formed by Arithmetic (A), 
Digit Span (DS), and Coding (Co). In addi- 
tion, strong support from the factor analysis 
data was reported for Wechsler’s use of the 
Full Scale IQ score as an index of general 
intelligence and for the organization of the 
test into the Verbal and Performance IQ 
Scales. Silverstein (1977) also factor analyzed 
the standardization data, and, although a 
slightly different method of rotating factors 
was used, the same three-factor pattern 
emerged. Thus, the WISC-R factor structure, 
in contrast to the WISC, appears to be highly 
stable across both age levels and different 
methods of conducting the analysis. 

The purposes of the present study were to 
examine the appropriateness and fairness of 
the WISC-R for four sociocultural groups in 
terms of (a) comparability of factor struc- 
tures; that is, does the WISC-R measure the 
same underlying abilities for Anglo and non- 
Anglo groups? and (b) construct validity 
evidence for the Full Scale IQ and the verbal- 
performance organization of the test; that is, 
does the Full Scale IQ measure general intelli- 
gence and do the Verbal and Performance IQ 
scores represent somewhat different but over- 
lapping abilities for the four groups? 


Method 
Sample 
In November 1973 the Division of Special Education, 
Arizona State Department of Education funded a 
comprehensive study of handicapping conditions among 
school-age children. The Pima County [...] County is 
geographically large (9,200 square miles), ethnically 
diverse (approximately 68% Anglo, 25% Chicano, 4% Black, 
and 3% Native American), and largely urban in population 
(Tucson) with extensive and sparsely populated rural areas. 

A stratified random sample of 1,040 children was selected 
with equal numbers from each of the stratification 
variables of group (Anglo, Black, Chicano, and 
Native-American Papago; n = 260 per group), sex, 
urban-rural residence, and grade (first, third, fifth, 
seventh, and ninth; n = 208 per grade). The entire sample 
of Black children was urban, and the entire sample of 
Native-American Papago children was rural due to the very 
low proportions of urban Indians and rural Blacks in Pima 
County. 

The cooperation of Tucson District 1, which enrolls about 
two thirds of all school-age children in the county, and 
all of the rural school districts in the county was 
obtained through contacts with district authorities. 
School district enrollment rosters were used to randomly 
select the sample. Ethnicity was determined by school data 
and in some cases by contacting the parents. Tucson 
District 1 was regarded as urban. Outlying districts, 25 
miles or more from Tucson, were regarded as rural. Parents 
of children selected in the initial sample were contacted 
by letter or phone to explain the nature of the study and 
to solicit written permission. If parent permission was 
not obtained due to refusal (4%), no reply (18%), or 
because the child withdrew from school, parents moved, and 
so forth (11%), another child was selected from an 
alternative sample constituted by the above process. There 
were no appreciable differences among the groups in 
percentages of parents granting permission, refusing 
permission, no reply, or no address (family moved, and so 
forth). Due to various logistical problems, for example, 
delays in return of parental permission, school scheduling 
problems, and availability of examiners to travel to 
remote areas, WISC-R scores were obtained for 950 of the 
original sample of 1,040 students (Anglo, n = 252; Black, 
n = 235; Mexican American, n = 223; and Native-American 
Papago, n = 240). The average age of the final sample was 
10.63 years (range from 6.28 to 15.87 years) including 468 
males and 482 females. Additional information concerning 
the sample is provided in Reschly and Jipson (1976). 

As soon as parental permission was obtained, appointments 
were made with school officials to administer the various 
assessment procedures. The WISC-R was administered by 
appropriately trained examiners, and all WISC-R protocols 
were further checked by me for clerical and scoring 
errors. 


Analysis 


For each of the four ethnic groups, two procedures were 
used to provide a guide to the appropriate number of 
factors needed to efficiently and thoroughly describe the 
WISC-R. Following Silverstein’s (1977) and Kaufman’s 
(1975) methods, a principal components analysis was 
conducted with [...] diagonals. An eigenvalue greater than 
1 was [...] for determining the appropriate number of 
factors. Second, an unrestricted maximum likelihood 
analysis was conducted for the two-, three-, and 
four-factor solutions. At each step in the analysis, a 
chi-square goodness of fit test was conducted [...] the 
resulting factor matrix and the [...] correlational matrix. 

[...] factor analysis with squared multiple correlations 
in the diagonals was then conducted separately for each 
group followed by varimax rotation of the two-, three-, 
and four-factor solutions. [...]tation of the two-, 
three-, and four-factor solutions was conducted regardless 
of the outcomes of the tests for appropriate number of 
factors to [...] as closely as possible to the previous 
[...] 

Results 
Number of Factors 


The techniques used as guides to the appropriate number 
of factors yielded somewhat inconsistent results. The 
principal components criterion (eigenvalues > 1) suggested 
three-factor solutions for Anglos and Chicanos and 
two-factor solutions for Blacks and Native-American 
Papagos. The chi-square tests of goodness of fit of the 
two-factor solutions, testing the hypothesis that the 
resulting factor matrix accounts for the variance in the 
original matrix of simple correlations among subtests, 
suggested that more than two factors were required only 
for Anglos (p < .07) and that two factors were sufficient 
for the other groups (p > .15). Using Kaufman’s (1975) 
reasoning, a three-factor solution might be explored for 
all of the groups, since there was evidence for the 
insufficiency of a two-factor solution for some groups, 
and the eigenvalues for the third factor in the 
three-factor solutions were close to 1.0 for all groups. 
Although the [...] criterion for the appropriate number of 
factors is psychological meaningfulness (including 
descriptive efficiency, statistical [...], and predictive 
utility), the objective guides clearly suggested only two 
factors for Blacks and Native Americans and perhaps [...] 
two factors for Chicanos. 


Comparison of Two-Factor Solutions 


The two-factor solutions yielded the now familiar pattern 
of a first factor composed of the Verbal scale subtests 
and a second factor constituted by the Performance scale 
subtests.¹ The subtests with the highest loadings 
on the first factor for all groups were Vocabu- 
lary (V), Information (I), Comprehension 
(C), and Similarities (S), although all the 
verbal subtests were significantly related to 
the first factor. (Loadings were .40 or above 
for all six subtests except for the loading of 
Digit Span (DS) for one group, Mdn =.65.) 
The second factor formed by Performance 
scale subtests was again highly similar across 
the four ethnic groups. The Block Design 
(BD) and Object Assembly subtests had the 
highest loadings for all groups. All Perform- 
ance scale subtests were significantly related 
to the second factor (loadings of .40 or above) 
except for Coding (Co), which was not a sig- 
nificant component of the second factor for 
any of the groups (loading < .30 for all 
groups). Examination of coefficients of con- 
gruence between matching factors among the 
four groups further supported the judgment 
of very high similarity among the factors. 
The 12 coefficients of congruence varied from 
.97 to .99. 
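The coefficient of congruence used above to compare matching factors can be illustrated with a short sketch. It assumes the usual definition (the sum of cross-products of two loading vectors divided by the square root of the product of their sums of squares); the function name `congruence` and the loadings below are hypothetical and are not taken from the study’s tables.

```python
import math


def congruence(x, y):
    """Coefficient of congruence between two factor-loading vectors."""
    if len(x) != len(y):
        raise ValueError("loading vectors must have the same length")
    cross = sum(a * b for a, b in zip(x, y))
    norm = math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))
    return cross / norm


# Hypothetical loadings of five subtests on a "matching" factor in two groups.
group_a = [0.63, 0.59, 0.43, 0.74, 0.64]
group_b = [0.66, 0.59, 0.61, 0.75, 0.71]
print(round(congruence(group_a, group_b), 2))
```

Values near 1.0 indicate that the two groups show nearly proportional loading patterns on the factor; unlike a correlation, the index does not subtract the mean loading first.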


Comparison of Three-Factor Solutions 


Three- and four-factor solutions were ana- 
lyzed for all groups even though the objective 
criteria unequivocally supported three factors 
for Anglos only, either two or three factors 
for Chicanos depending on the criterion used, 
and only two factors for Blacks and Native- 
American Papagos. In addition to inspection 
of the size and pattern of the loadings on the 
factors, coefficients of congruence were com- 
puted among matching factors for the groups 
in this study and between the varimax median 
loadings reported for the standardization sample 
(Kaufman, 1975, Table 4, p. 141). The 
three-factor solution for Anglos (see Table 1) 
was virtually identical to the three-factor pattern 
reported for the standardization sample by Kaufman (1975). 
The first factor was formed by the V, C, I, and S 
subtests. The A and DS subtests also had substantial 
loadings on the first factor, but their highest loadings 
were on the third factor. The second factor for Anglos was 
the familiar Perceptual Organization (PO) factor formed by 
the BD, OA, Picture Arrangement (PA), Picture Completion 
(PC), and Mazes (M) subtests. A third factor, previously 
described as Freedom From Distractibility (or memory, 
number, sequential, etc.), was formed by the A, DS, and Co 
subtests. Coefficients of congruence between the 
three-factor solutions for Anglos and the data reported by 
Kaufman were .98, .98, and .97, respectively, for the 
three factors, further supporting the conclusion of 
near-perfect replication of the three-factor patterns 
reported for the standardization sample. 

The first two factors in the three-factor solution for 
Chicanos were highly similar to the above pattern, but the 
third factor was slightly different in that the loading on 
PA was slightly higher on the third rather than the second 
factor, and the loading of DS on the third factor was 
somewhat lower. Coefficients of congruence for the three 
factors, respectively, between the Chicano and Anglo 
loadings were .99, .98, and .86, and .98, .99, and .93 
between the loadings for Chicanos and the standardization 
sample. 


Table 1 
Wechsler Intelligence Scale for Children-Revised Subtest Loadings in 
Three-Factor Solution for Four Ethnic Groups 

                                                              Native-American 
                         Anglo         Black        Chicano        Papago 
Subtest                I  II III     I  II III     I  II III     I  II III 
Information           63  32  26    66  40  18    66  20  33    68  22  21 
Similarities          59  26  26    59  44  13    67  15  22    58  ??  ?? 
Arithmetic            43  26  45    61  34  27    40  13  45    42  37  09 
Vocabulary            74  23  12    75  20  16    67  26  30    74  15  05 
Comprehension         64  22  21    71  24  09    61  20  06    70  23  ?? 
Digit Span            35  02  40    49  08  36    33  14  31    30  35  09 
Picture Completion    20  49  09    25  52  12    ??  ??  12    21  53  14 
Picture Arrangement   20  53  00    29  ??  ??    ??  38  39    23  44  03 
Block Design          17  60  22    20  33  58    20  59  16    14  69  05 
Object Assembly       07  59  18    ??  ??  58    14  58  09    07  ??  ?? 
Coding                12  16  40    33  20  ??    14  16  37    17  17  37 
Mazes                 18  42  10    23  44  30    06  47  20    14  51  28 

Note. All decimal points have been omitted. Roman numerals refer to the 
factor number: Factor I = Verbal Comprehension; Factor II = Perceptual 
Organization; Factor III = Freedom from Distractibility. 


¹ Tables of the subtest loadings for each group in 
the two-factor solutions and the coefficients of 
congruence for matching factors in the two- and 
three-factor solutions are available from the author. 
The loadings of each subtest on the first unrotated 
factor are also available in these tables. 


The three-factor solutions for Native-American Papagos 
and Blacks were clearly different from the previously 
reported three-factor solutions and should, perhaps, not 
be interpreted at all, since the objective guides to 
appropriate number of factors suggested the sufficiency of 
two-factor solutions. For both groups, the first two 
factors in the three-factor solutions were formed by the 
Verbal and Performance scale subtests, respectively. 
However, for Blacks the second factor was formed by PA, 
PC, and M and the third factor by BD and OA. Thus the 
third factor for Blacks appeared to involve a splitting of 
the second factor into two factors and clearly did not 
match the previously reported patterns for the third 
factor. A similar result was obtained for Native-American 
Papagos for whom the third factor was formed by one 
subtest only (Co), which suggests subtest-specific or 
method variance rather than a factor as such. However, it 
should be noted that the third factor for Native-American 
Papagos was similar to what Cohen (1959) called Factor E 
(Quasi-Specific), which also was formed primarily by Co. 
Coefficients of congruence for the three-factor solutions 
among all groups and the standardization data were very 
high for the first factor (>.96) and only slightly lower 
for the second factor (>.89). However, for the third 
factor, the coefficients for Blacks and Native-American 
Papagos were significantly lower (.72-.78) in all 
comparisons, further supporting the conclusion of not 
interpreting a third factor for these groups. 

The fourth factor in the four-factor solutions was 
uninterpretable for all groups. (All subtest loadings were 
<.30 with most near zero on the fourth factor.) 


General Intelligence 


The data were further analyzed to determine the degree to 
which a general factor of intelligence was measured by the 
WISC-R for the different groups. In addition to the 
obvious evidence of a general factor, that is, the 
significant and positive correlations of subtests with 
each other and with Full Scale score, three indices of a 
possible general factor were examined. First, following 
Kaufman’s (1975) analysis, which resulted in a median of 
82% of the common factor variance attributed to the 
general factor, the loadings on the unrotated first 
principal factor were analyzed separately for each group. 
The percentage of common factor variance accounted for by 
the general factor was nearly the same regardless of group 
(79, 83, 79, and 77, respectively, for Anglos, Blacks, 
Chicanos, and Native-American Papagos). A second analysis 
of the general factor using the first unrotated principal 
component resulted in about the same percentages for each 
group. 

A third index, which I preferred, was based on a 
restricted maximum likelihood analysis that used a target 
matrix based on the results of the two-factor solutions 
for each group. This method provided a lower but more 
realistic estimate of the variance attributable to the 
general factor, because the unique variance associated 
with the Verbal Comprehension (VC) and PO factors (clearly 
the strongest factors for all groups) was taken from the 
general factor and apportioned to the respective VC and PO 
factors. This analysis yielded estimates of 61%, 63%, 59%, 
and 61% of the variance attributable to a general factor 
for Anglos, Blacks, Chicanos, and Native-American Papagos, 
respectively. Regardless of the index used, the 
proportions of variance attributable to a general factor 
were approximately the same for all groups and similar to 
the standardization sample. 
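The first index can be made concrete with a small sketch. It assumes that the share of common factor variance credited to the general factor is the sum of squared loadings on the first unrotated factor divided by the sum of the subtest communalities; that reading, the function name `general_factor_share`, and the numbers below are assumptions for illustration, not values or procedures taken from the study.

```python
def general_factor_share(first_factor_loadings, communalities):
    """Share of common variance carried by the first unrotated factor.

    first_factor_loadings: loading of each subtest on the first unrotated factor.
    communalities: each subtest's total common variance across all factors.
    """
    if len(first_factor_loadings) != len(communalities):
        raise ValueError("one loading and one communality per subtest")
    general = sum(loading ** 2 for loading in first_factor_loadings)
    return general / sum(communalities)


# Hypothetical four-subtest example.
loadings = [0.70, 0.65, 0.60, 0.55]
communalities = [0.60, 0.55, 0.50, 0.45]
print(f"{100 * general_factor_share(loadings, communalities):.0f}%")
```

Under this reading, the third index in the text is lower because variance specific to the VC and PO factors is removed from the numerator before the ratio is taken.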


Verbal-Performance Organization 


The first and second factors, VC and PO, were highly 
similar to the respective Verbal and Performance scales in 
the two-factor solutions for all groups and in the 
three-factor solutions for all groups except Blacks. The 
closeness of the VC and PO factors to the 
verbal-performance dichotomy is even greater when the 
supplementary subtest for the Verbal scale (DS) is ignored 
and when M is substituted for Co on the Performance scale. 
If these stipulations are followed, then the respective 
scales conform almost perfectly to the first and second 
factors for all groups. 


Discussion 


Some caution regarding the results of this 
study must be expressed, since in addition to 
race or ethnicity, the groups also varied sig- 
nificantly on socioeconomic status and level 
of intelligence. Due to limitations in sample 
size in this and most other investigations, it 
was impossible to analyze separately the pos- 
sible effects of socioeconomic status, level of 
intelligence, and group membership. The well- 
known fact that these variables are not inde- 
pendent requires caution in interpreting the 
differences reported among the groups. A 
slight relationship between level of intelligence 
and factor patterns on the WISC and 
WISC-R has been reported previously (Van 
Hagen & Kaufman, 1975), and may partially 
account for the group differences reported 
here. 

The major differences in factor patterns among the groups 
were essentially restricted to the question of the 
existence and composition of the third (Freedom from 
Distractibility, FD) factor. The number of interpretable 
factors on the WISC-R varied for these samples, and 
perhaps varies for these groups generally. The previously 
reported pattern of three factors for Anglos was clearly 
replicated, and the results for Chicanos were generally 
consistent with this pattern. However, the objective 
guides to the appropriate number of factors and the 
pattern of factor loadings in the three-factor solutions 
both failed 
to support the existence of the third (FD) 
factor for Blacks and Native-American Pa- 
pagos. The existence of the third factor on 
the WISC was the subject of some debate 
(Silverstein, 1969), and the “meaning” of this 
factor has never been altogether clear. Some 
researchers (e.g., Cohen, 1957) regard the 
third factor as a measure of intellectual skills 
such as immediate memory, numerical or se- 
quencing. Others, including Cohen (1959), 
have described the third factor in more be- 
havioral or nonintellectual terms such as at- 
tention-concentration, or more commonly now, 
Freedom from Distractibility (Kaufman, 
1975). If the third factor is viewed as a nonintellectual 
factor, then the construct validity 
of the WISC-R as an intellectual measure for 
different groups is supported even more 
strongly by these data. 

Interpretation of the factor scores of VC 
and PO appears to be equally appropriate for 
all of the groups included in this study. Fol- 
lowing Kaufman (1975), the Verbal Scale 
IQ can be used for the VC factor, and if M 
is substituted for Co, the Performance IQ 
can be used for the PO factor. However, 
the FD factor for Chicanos should probably 
involve four subtests (adding PA to the usual 
three), and the FD factor scores should prob- 
ably not be used with Blacks and Native- 
American Papagos unless other data suggest 
its existence and/or predictive utility for 
these groups. 

Confidence in the appropriateness of the 
WISC-R as a measure of intellectual ability 
for different groups is increased by the fact 
that a large general factor was clearly 
apparent in about the same form and amount 
for all groups. Thus, the usual interpretation 
of the Full Scale IQ as an index of general 
intelligence appears to be equally appropriate 
for Anglo and non-Anglo groups. Further, the 
Verbal-Performance scale distinction appears also to be 
equally appropriate for the four groups. 
Finally, the conclusions of this study pro- 
vide increased confidence in the construct 
validity of the WISC-R for different groups. 
Construct validity evidence is certainly a 
necessary but not a sufficient condition for 
fairness in test use. These conclusions, of 
course, do not reveal whether or not the com- 
mon predictions and classifications based on 
the WISC-R are equally valid for diverse 
sociocultural groups. 


References 


Cohen, J. A factor-analytically based rationale for 
the Wechsler Adult Intelligence Scale. Journal of 
Consulting Psychology, 1957, 21, 451-457. 

Cohen, J. The factorial structure of the WISC at 
ages 7-6, 10-6, and 13-6. Journal of Consulting 
Psychology, 1959, 23, 285-299. 

Kaufman, A. Factor analysis of the WISC-R at 11 
age levels between 6½ and 16½ years. Journal of 
Consulting and Clinical Psychology, 1975, 43, 135- 
147. 

Lindsey, J. The factorial organization of intelligence 
in children as related to the variables of age, sex, 
and subculture. (Doctoral dissertation, University 
of Georgia, 1967). Dissertation Abstracts, 1967, 27, 
3664A-3665A. (University Microfilms No. 67-3567) 

Reschly, D., & Jipson, F. Ethnicity, geographic locale, 
age, sex, and urban-rural residence as variables in 
the prevalence of mild retardation. American Jour- 
nal of Mental Deficiency, 1976, 81, 154-161. 

Sattler, J. Assessment of children’s intelligence. 
Philadelphia, Pa.: Saunders, 1974. 

Semler, I., & Iscoe, I. Structure of intelligence in 
Negro and white children. Journal of Educational 
Psychology, 1966, 57, 326-336. 

Silverstein, A. An alternative factor analytic solu- 
tion for Wechsler’s intelligence scales. Educational 
and Psychological Measurement, 1969, 29, 763-767. 

Silverstein, A. Factor structure of the Wechsler In- 
telligence Scale for Children for three ethnic groups. 
Journal of Educational Psychology, 1973, 65, 408- 
410. 

Silverstein, A. Alternative factor analytic solutions 
for the Wechsler Intelligence Scale for Children- 
Revised. Educational and Psychological Measure- 
ment, 1977, 37, 121-124. 

Van Hagen, J., & Kaufman, A. Factor analysis of the 
WISC-R for a group of mentally retarded children 
and adolescents. Journal of Consulting and Clinical 
Psychology, 1975, 43, 661-667. 


Received December 6, 1976 


Journal of Consulting and Clinical Psychology 
Vol. 46, No. 3, 423-431 


General Versus Specific Traits in the Assessment of Anxiety 


Martin Mellstrom, Jr., Marvin Zuckerman, and George A. Cicala 
University of Delaware 


The study compared the predictive validity of anxiety 
measures based on the general trait approach and 
approaches that include the situation dimension in the 
test. The subjects were 56 male and 58 female 
undergraduates. The subjects were pretested with general 
and specific trait anxiety measures and were later exposed 
to three situations involving a rat, a test, and social 
anxiety. The results indicated that the predictive 
validity of the specific measures was significantly 
greater than that of the general measures in 7 of 32 
comparisons, whereas the reverse never occurred. Other 
analyses showed some support for the presence of 
behavioral consistency; for example, the generalizability 
coefficient for persons across situations and response 
modes was about .35, but little support was found for the 
traditional general trait tests that purport to measure 
them. 


During the past decade there has been considerable debate 
in the field of personality and assessment on the relative 
importance of traits and situations. The traditional 
“trait” approach, which calls attention to enduring 
intrapsychic dispositions and deemphasizes the role of the 
environment, has been attacked on both theoretical and 
empirical grounds, and alternative models have been 
proposed. The behavioral or “situationist” approach (e.g., 
Kanfer & Saslow, 1965; Mischel, 1968), which has been one 
of the most widely accepted of these alternatives, can be 
characterized by an emphasis on the role of the current 
environment or situation in determining behavior. More 
recently, proponents of the interactional approach have 
suggested that neither trait nor situationist accounts of 
behavior are adequate, and that the person and the 
situation must be considered in their interaction to 
provide an adequate conceptualization of behavior (cf. 
Bowers, 1973; Ekehammar, 1974; Endler & 
Magnusson, 1976). 

Although reviews of the evidence on which 
of these approaches holds the most promise 
as a model for the study of personality (e.g., 
Bowers, 1973; Ekehammar, 1974; Endler & 
Magnusson, 1976) seem to support some form 
of interactional approach, it is not yet clear 
that an application of this approach to assess- 
ment problems will yield gains in our ability 
to understand and predict behavior. Studies 
of the problem in the area of anxiety (e.g., 
D’Zurilla, 1965; Hodges & Spielberger, 1966; 
Lamb, 1973; Mellstrom, Cicala, & Zucker- 
man, 1976; Paul, 1966) have compared the 
predictive validity of “anxiety trait” (A- 
Trait) tests that included the situation vari- 
able in the design of the test with those that 
did not. In these studies, “general A-Trait” 
tests, like that of Spielberger, Gorsuch, and Lushene 
(1970), which are intended to measure relatively stable 
individual differences in general anxiety proneness, are 
evaluated against “specific A-Trait” measures, which 
assess a person’s disposition to be anxious in a 
particular situation or class of situations, for example, 
classroom examination situations. The idea of specific 
traits can be viewed as a redefinition of the trait 
concept so that 
it takes into account the situation and the 
Person × Situation interaction. 

Although the results, on the whole, have 
indicated that specific A-Trait measures are 
more predictive of anxiety in the correspond- 
ing criterion situation, there have been in- 
stances in which the general A-Trait tests 
were nearly or equally as predictive. This has 
seemed to happen when there was some threat 
to self-esteem (“ego threat”) in the criterion 
situation (Mellstrom et al., 1976). Thus, the 
issue of whether anxiety assessment proce- 
dures based on a general trait model of per- 
sonality are inferior to those that introduce 
the situation variable into assessment has 
been unresolved because the variable of ego 
threat has not been adequately explored and 
only a small number of anxiety situations 
have been evaluated. The present study at- 
tempted to help clarify this issue by having 
individuals who were pretested with general 
and specific A-Trait tests experience three 
anxiety situations, two of which posed a clear 
ego threat (social and test anxiety situations) 
and one of which did not (a rat situation). 


Method 
Subjects 


Fifty-six male and 58 female students enrolled in 
introductory psychology at the University of Dela- 
ware were selected for the study. They participated 
to fulfill a course requirement. 


Predictor Tests 


Since all the predictors have been described in the 
Mellstrom et al. (1976) study, they will be only 
briefly described here. 

Two of the pretests can be regarded as traditional 
A-Trait measures: (a) The trait form of the State-Trait 
Anxiety Inventory (STAI A-Trait; Spielberger et al., 1970) 
measures individual differences in anxiety proneness and 
has been found to correlate .73 with the Neuroticism scale 
of the Eysenck and Eysenck (1964) Personality Inventory 
(Mellstrom et al., 1976); (b) the Neuroticism (N) scale of 
the Eysenck Personality Inventory (EPI) was the second 
general anxiety measure used. 

Two pretests provided indices of both general and [...] 
as a general fearfulness or A-Trait measure, and 
their scores on the particular items were used as 
specific A-Trait measures. (b) Another test that pro- 
vides indices of both general and specific A-Trait 
is a modified form of the Zuckerman Inventory of 
Personal Reactions (ZIPERS; Zuckerman, 1976, 
1977). It consists of 12 situations in which subjects 
indicate the degree to which each situation elicits in 
them each of 13 reactions. Factor analysis of the 
response dimensions of the scale has yielded several 
factors including the fear arousal factor used in the 
present research. By summing a subject’s scores 
across all situations on the responses comprising the 
fear factor, a measure of general A-Trait can be 
derived. A subject’s score in response to any one of 
the 12 situations yields a specific A-Trait measure. 
The test was modified by adding social, test, and 
rat situations, which matched the real situations in 
the study, to the usual form of the test. Responses 
to these situations were not included in subjects’ 
total scores, which were obtained by summing responses 
across the other situations. 
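The scoring rule described for the modified ZIPERS can be sketched as follows. This is a hedged illustration only: the data layout (a mapping from situation to reaction ratings), the function names, the reaction labels, and the ratings are all hypothetical, and the actual items making up the fear factor are not given in the text.

```python
def specific_a_trait(responses, situation, fear_items):
    """Specific A-Trait: fear-factor ratings summed within one situation."""
    return sum(responses[situation][item] for item in fear_items)


def general_a_trait(responses, fear_items, excluded=()):
    """General A-Trait: fear-factor ratings summed across situations.

    Situations listed in `excluded` (here, those added to match the real
    criterion situations) do not count toward the total score.
    """
    return sum(
        specific_a_trait(responses, situation, fear_items)
        for situation in responses
        if situation not in excluded
    )


# Hypothetical ratings for one subject: situation -> {reaction: rating}.
subject = {
    "situation_1": {"fearful": 2, "tense": 3, "happy": 4},
    "situation_2": {"fearful": 1, "tense": 1, "happy": 5},
    "rat": {"fearful": 4, "tense": 5, "happy": 1},
}
fear_factor = ["fearful", "tense"]  # assumed fear-factor reactions

print(general_a_trait(subject, fear_factor, excluded=("rat",)))
print(specific_a_trait(subject, "rat", fear_factor))
```

Keeping the added situations out of the total prevents the general score from sharing items with the specific scores it is compared against.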


Anxiety Situations 


Rat. In the rat situation, an albino laboratory 
rat with clipped front incisors was housed in a wire 
cage located on a table in the corner of the 4 m 
square × 3.5 m high experimental room. 

Test. The test situation was similar to that used 
by Sarason (1957) to study test anxiety and in- 
volved a memory drum used to provide a serial 
verbal learning task. Ten nonsense syllables were 
presented at the rate of one every 1 sec, and the 
subject’s task was to say when a given syllable ap- 
peared what the next one to appear would be. Ac- 
cording to Sarason, when subjects are told that this 
task is a measure of intelligence, the resulting evalua- 
tion anxiety disrupts the performance of those who 
are “test anxious.” Subjects spoke into a tape re- 
corder connected to an audio oscillator so that a 
“beep” (inaudible to the subject) was recorded on 
the tape each time a new word was presented on 
the memory drum. Using this arrangement, sub- 
jects could be left alone to perform the task, and 
their tapes could be scored later for the number of 
trials required to learn the list. Thus, anxiety in this 
situation would not be contaminated by social anxiety 
related to the presence of other people. 

Social situation. This situation consisted of a 
nondirective interview during which subjects were asked 
to “talk about themselves” for 2 minutes. The first 
author conducted the interview while he and two other 
experimenters, one male and one female, gazed directly and 
continuously at the subject. This procedure was followed 
because of Marks’s (1969) observation that one of the 
cardinal features of social anxiousness is a fear of the 
gaze of others. Although it never occurred, experimenters 
were instructed to stop gazing if it appeared that the 
subject was becoming very distressed by it. A videotape 
camera, focused on the subject, was also located in the 
room, but subjects were given no indication of whether it 
was operating until the end of the experimental session. 
[...] the situation could be observed from an adjoining 
room, where the videotape deck was located, [...] one-way 
mirror. 


Subjective Situational Anxiety Measures 


The three tests used to measure subjective anxiety in 
each situation were the Anxiety State (A-State) scale of 
Spielberger et al.’s (1970) STAI; the A-State [...] of the 
ZIPERS, which consists of the [...] of the 13 reactions on 
the ZIPERS, with instructions to indicate the reactions 
being experienced [...]; and the fear thermometer (FT) of 
Walk [...] which subjects place a check mark on a [...] to 
indicate the amount of fear or anxiety experienced at that 
moment. 

Subjects were administered the predictor tests during the 
first 4 weeks of the semester. Six weeks later, each 
subject participated in each anxiety situation with the 
temporal order of the situations determined randomly for 
each subject. To minimize [...] effects from one situation 
to the next, [...] 


Rat Situation Procedure 


[...] instructions directed the subject to approach [...] 
and lift it up. In this, as in the other two situations, 
subjects were told that they did not have to participate 
if they did not wish to, but no subject [...] this option. 
Following the instructions, subjects were administered the 
first two A-State scales (STAI A-State and ZIPERS-State). 
While subjects completed these scales, the experimenter 
rated them on a modified form of Paul’s (1966) Behavior 
Checklist in which the experimenter rates the degree to 
which each of the checklist behaviors is [...]. An extra 
item of “overall anxiety” was added to the checklist. 

The behavioral test was identical to the one used in the 
Mellstrom et al. (1976) snake situation and was basically 
a behavior approach test, yielding a “task score” and a 
latency measure as the behavioral fear indices. After this 
task was completed, the subject filled out a FT as he or 
she stood in the location of the closest approach to the 
rat. In all three situations, subjects were told that they 
[...] to others to prevent possible loss of [...] 


Test Situation Procedure 


[...] instructions told subjects that they would [...] to 
perform a difficult memory task that [...] indicator of 
one’s intelligence. Then the first two A-State scales were 
completed while the experimenter filled out the Behavior 
Checklist. Next, the [...] task was begun, and the 
experimenter left the room. At the end of 15 minutes, he 
or she returned, terminated the task, and administered the 
FT. 

GENERAL VERSUS SPECIFIC TRAITS 




Social Situation Procedure 


Taped instructions asked subjects to talk about
themselves for 2 minutes. The first two A-State
scales were then administered, the “interview” was
conducted, and the FT was completed. Before sub-
jects left, they were told that they had been video-
taped but that if they objected, the tapes would
be erased. No subjects requested this. After sub-
jects left, the experimenters watched the tape of the
interview and completed the Behavior Checklist.
The videotapes provided additional behavior ratings 
made by a separate set of observers, but these 
ratings were uncorrelated with the other measures 
in the study, so they will not be reported. 


Results 


Before the main results are discussed, it 
should be noted that there was a truncation 
of range in the behavioral measure of the rat 
situation (task score), since only 6 males 
and 15 females could not perform the task. 
As a consequence, the correlations of this 
variable with the predictors were expected to 
be somewhat low. 


Comparative Validity of the General and 
Specific Predictors 


In the previous study (Mellstrom et al., 
1976), the correlations among the predictor 
measures indicated three fairly distinct 


Table 1
Intercorrelations Among the Three Classes of Predictor Measures

                                                  A-Trait
Scale             Omnibus    Neuroticism      Rat      Test
Neuroticism         .49
Rat A-Trait         .56         .28
Test A-Trait        .80         .40           .46
Social A-Trait      «13         .48           .43      .65

Note. Omnibus = summed standard scores of the total scores on the
Geer Fear Survey Schedule and Zuckerman Inventory of Personal
Reactions. Neuroticism = summed standard scores on the State-Trait
Anxiety Inventory Trait scale (A-Trait) and Eysenck Personality
Inventory Neuroticism scale. Rat A-Trait = summed standard scores of
the rat A-Trait measures of the Geer Fear Survey Schedule and the
Zuckerman Inventory of Personal Reactions. Test and social A-Trait
composites were similarly derived. All correlations were significant
at p < .01.
M. MELLSTROM, M. ZUCKERMAN, AND G. CICALA

Table 2
Validity Coefficients for Three Classes of Predictors for Males

                                     Predictor                      t-test comparison
Situational anxiety       I:         II:            III:
measure                   Omnibus    Neuroticism    Specific    I vs. II    I vs. III    II vs. III

Rat
  Self-report             .38**      .34**          npa         ns          III > I      III > II
  Observational rating    .36**      .13            gee         ns          ns           ns
  Behavioral              -.09       .06            .15         ns          ns           ns
Test
  Self-report             Eyad       .38**          .48**       ns          ns           ns
  Observational rating    .04        -.05           .02         ns          ns           ns
  Behavioral measure      -.02       -20            -03         ns          ns           ns
Social
  Self-report             733%       .40**          .44**       ns          ns           ns
  Observational rating    .26*       .29*           .19         ns          ns           ns

% significant r           38         38             50

Note. Omnibus = summed standard scores of the total scores on the Geer Fear Survey Schedule and
Zuckerman Inventory of Personal Reactions. Neuroticism = summed standard scores on the State-Trait
Anxiety Inventory Trait scale (A-Trait) and Eysenck Personality Inventory Neuroticism scale. Rat A-Trait
= summed standard scores of the rat A-Trait measures of the Geer Fear Survey Schedule and the Zuckerman
Inventory of Personal Reactions. Test and social A-Trait composites were similarly derived.

*p < .05.
**p < .01.


classes: neuroticism, omnibus, and specific 
A-Trait measures as represented by measures 
like the EPI-N, total score on the FSS, and 
a single item on the FSS, respectively. Mea- 
sures within a given class were found to cor- 
relate more highly with each other than with 
other measures, and to show very similar 
patterns of correlation with the situational 
criteria. Therefore, to simplify interpretation 
of the data, tests comprising one class were 
combined to form one composite measure by 
summing subjects’ standard scores on each 
test. For example, subjects’ standard scores
on the STAI A-Trait and EPI-N scales were 
summed to provide a composite labeled neu- 
roticism. An additional justification for this 
procedure is that the focus of the study was 
not on the validity of any single test, which 
may be affected by the amount of effort put 
into its construction, but on the validity of 
different strategies of assessment as repre- 
sented by more than one measure. 
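
The composite procedure described above (standardize each test, then sum the standard scores within a class) can be illustrated with a minimal sketch. The scores below are invented for illustration and are not data from the study; the function names `standardize` and `composite` are likewise hypothetical, and the sample standard deviation is assumed because the text does not say which was used.

```python
from statistics import mean, stdev


def standardize(scores):
    """Convert raw scores to standard (z) scores using the sample SD."""
    m = mean(scores)
    sd = stdev(scores)
    return [(x - m) / sd for x in scores]


def composite(*tests):
    """Sum each subject's standard scores across the tests of one class."""
    z_by_test = [standardize(t) for t in tests]
    return [sum(subject_scores) for subject_scores in zip(*z_by_test)]


# Hypothetical raw scores for three subjects on two tests of one class,
# e.g. the STAI A-Trait and EPI-N scales for the neuroticism composite.
stai_a_trait = [30, 40, 50]
epi_n = [5, 10, 15]
neuroticism = composite(stai_a_trait, epi_n)
print(neuroticism)
```

Summing standard scores rather than raw scores gives each test equal weight in the composite regardless of its original scale.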

The response data of the present study 
were prepared for analysis in the same way. 
On the basis of an a priori decision, based 
on the foregoing theoretical rationale, sub- 


jects’ standardized total scores on the FSS 
and ZIPERS were summed to provide a com- 
posite labeled omnibus. The STAI A-Trait 
and EPI-N scales were combined to form a 
neuroticism composite, and the specific A- 
Trait measures derived from the ZIPERS and 
FSS were similarly combined for each specific 
disposition, yielding specific composites for 
“social anxiety,” “test anxiety,” and “rat 
anxiety.”

As in the prior study, situational criteria 
of the same type were combined to form self- 
report, observers’ rating, and behavioral index 
composites. STAI A-State, ZIPERS-State, 
and the FT comprised the “self-report” com-
posite, total score on the Behavior Checklist
and the additional “overall anxiety” item
comprised the “observers’ rating” composite,
and, in the rat situation, the task score and
latency formed a “behavioral index” com-
posite.¹


1 Anyone interested in the correlations between the
single predictor measures and the separate fear re-
sponse measures can obtain a table of these corre-
lations from the second author.
Table 1 shows the correlations among the
composite measures representing the several
classes of predictors for the entire sample of
subjects. The omnibus composite showed mod-
erate to strong correlations, ranging from .49
to .80 with all the other predictors, whereas
the neuroticism composite showed somewhat
weaker but significant correlations. The spe-
cific A-Trait composites for test and social
anxiety correlated highly (.65) and were more
highly related to the neuroticism composite
than was the rat A-Trait composite.

Tables 2 and 3 show the correlations between the three classes of
predictors and the situational fear measures for the male and female
samples, respectively. The validity coefficients of the specific
composites represent the relations between the situational anxiety
measures and the specific composite designed to predict that anxiety.
To compare the validity coefficients of the three types of A-Trait
predictors, a t test for the significance of the difference between
two correlation coefficients in correlated samples (Ferguson, 1971)
was used. The right-hand portion of the tables shows the results of
these comparisons.

For the rat situation, there were several instances in which the
specific composite had significantly more accuracy in the prediction
of criteria than either the omnibus or neuroticism composites. Also,
the specific composite correlated significantly with all three types
of situational measures except the behavioral criterion for the males,
which was limited by range restriction, whereas the neuroticism
composite correlated significantly with self-reports and observer’s
ratings but not with behavioral measures.

In the test situation, there was little or no evidence of differential
predictive power for the three types of predictors. For the females in
the social situation (Table 3), the specific measure was significantly
better than the omnibus one in predicting both types of situational
responses. Overall, the specific measures were significantly more
predictive in 7 of 32 comparisons, whereas the reverse never occurred;
that is, the general measures were never significantly more
predictive. Also, the percentages of significant (p < .01)
correlations for the omnibus, neuroticism, and specific predictors
(Tables 2 and 3) were 38%, 38%, and 50%, respectively, for the males
and 25%, 38%, and 75% for the females. The results suggest that the
specific predictors had somewhat greater overall predictive accuracy
than the other two types of A-Trait measures.

Table 3
Validity Coefficients of Three Classes of Predictors for Females

                                     Predictor                      t-test comparison
Situational anxiety       I:         II:            III:
measure                   Omnibus    Neuroticism    Specific    I vs. II    I vs. III    II vs. III

Rat
  Self-report             .25        .34**          .52**       ns          III > I      ns
  Observational rating    .21*       .18            .41**       ns          ns           ns
  Behavioral              02         “02            .34**       Boece       aval         oall
Test
  Self-report             .54**      .51**          .50**       ns          ns           ns
  Observational rating    -16        -.07           .23         ns          "s           i
  Behavioral              .07        AS             .08         ns          ns           ns
Social
  Self-report             .49**      i bad          .66**       ns          III > I      ns
  Observational rating    .19        .26*           .38**       ns          III > I      ns

% significant r           25         38             75

Note. Omnibus = summed standard scores of the total scores on the Geer Fear Survey Schedule and
Zuckerman Inventory of Personal Reactions. Neuroticism = summed standard scores on the State-Trait
Anxiety Inventory Trait scale (A-Trait) and Eysenck Personality Inventory Neuroticism scale. Rat A-Trait
= summed standard scores of the rat A-Trait measures of the Geer Fear Survey Schedule and the Zuckerman
Inventory of Personal Reactions. Test and social A-Trait composites were similarly derived.

*p < .05.
**p < .01.
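
The comparison of validity coefficients relies on a t test for two correlations computed on the same subjects. The text cites Ferguson (1971) but does not reproduce the formula, so the sketch below assumes the Hotelling-type t commonly given in textbooks for two correlations that share one variable (here, the criterion), with N - 3 degrees of freedom. The function name `dependent_r_t` and the input values are hypothetical and are not taken from the study.

```python
import math


def dependent_r_t(r_ay, r_by, r_ab, n):
    """t for the difference between two dependent correlations.

    r_ay, r_by: correlations of predictors A and B with the same criterion Y.
    r_ab: correlation between the two predictors.
    n: number of subjects. Returns (t, degrees of freedom).
    Assumed formula (Hotelling type), not quoted from the source article.
    """
    # Determinant of the 3 x 3 correlation matrix of A, B, and Y.
    det = 1 - r_ay**2 - r_by**2 - r_ab**2 + 2 * r_ay * r_by * r_ab
    t = (r_ay - r_by) * math.sqrt(((n - 3) * (1 + r_ab)) / (2 * det))
    return t, n - 3


# Hypothetical example: a specific predictor (r = .50) versus a general
# predictor (r = .30) of the same criterion, predictors correlating .40.
t_value, df = dependent_r_t(0.50, 0.30, 0.40, 103)
print(round(t_value, 3), df)
```

Because both correlations come from the same sample, this test takes the correlation between the two predictors into account, which an independent-samples comparison of correlations would ignore.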


Proportions of Variance: Persons 
Versus Situations 


To assess the relative importance of per- 
sons, situations, and their interaction, the 
ZIPERS data were subjected to three-way 
analyses of variance. In the first analysis, each 
subject represented a level of the person 
source, whereas the social, test, and rat situa- 
tions described in the inventory represented 
levels of the situation source. The three re- 
sponse items that represented the mode of 
response source were “Your heart beats 
faster,” “You feel fearful,” and “You get out 
of the situation or avoid it.” The rationale for 
using these three responses was that each 
represents one of the three basic human re- 
sponse channels: physiological, phenomenal, 
and behavioral. Also, it was thought prefer-
able to have the same number of response
modes as situations.

After the analyses of variance were per- 
formed, the proportions of variance explained 


Table 4
Percentages of Variance Accounted for by Sources, with Individuals
Comprising Levels of the Person Source

                            Type of situation
Source                   Hypothetical     Actual
Person (P)                   35.2          29.0
Situation (S)                 0              .6
Response mode (R)             4.6           5.7
P × S                        14.0          21.5
P × R                        16.8          11.7
S × R                        RS            ath
Residual                     26.8          30.8

Note. The hypothetical situations were the rat, test,
and social
Table 5
Correlations of Responses Across Actual Situations for Each Measure

Situation                         Test        Social
STAI State scale
  Rat                             .34**       VASES
  Test                                        30st
ZIPERS State scale
  Rat                             A had rii
  Test                                        Rebs
Fear Thermometer
  Rat                             Vind ea pi}
  Test                                        .40**
Behavior Checklist composite
  Rat                             .24*        Heath ed
  Test                                        .16

Note. STAI = State-Trait Anxiety Inventory; ZIPERS = Zuckerman
Inventory of Personal Reactions.
*p < .05.
**p < .01.


by each source were calculated, following 
guidelines described by Endler (1966). The
proportions were calculated twice under a
mixed model (persons random, situations and
response modes fixed), once assuming a mini-
mum and once a maximum triple interaction.
Since the two sets of results were quite simi- 
lar, it was possible to make the assumption 
of a zero triple interaction, permitting a 
unique solution for each component. 
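
To make the person, situation, and response-mode partition concrete, the sketch below decomposes a three-way layout with one score per cell into sums of squares and reports each source as a percentage of the total. This is a deliberately simpler stand-in: it is not the variance-component estimation from expected mean squares that Endler (1966) describes, and its percentages will generally differ from those estimates. The function name `ss_percentages` and the data are hypothetical.

```python
from itertools import product


def _mean(values):
    values = list(values)
    return sum(values) / len(values)


def ss_percentages(x):
    """x[p][s][r] holds one score per person x situation x response mode.

    Returns each source's share of the total sum of squares, in percent.
    With one observation per cell, the residual is the triple interaction
    confounded with error.
    """
    P, S, R = range(len(x)), range(len(x[0])), range(len(x[0][0]))
    grand = _mean(x[p][s][r] for p, s, r in product(P, S, R))
    mp = [_mean(x[p][s][r] for s, r in product(S, R)) for p in P]
    ms = [_mean(x[p][s][r] for p, r in product(P, R)) for s in S]
    mr = [_mean(x[p][s][r] for p, s in product(P, S)) for r in R]
    mps = [[_mean(x[p][s][r] for r in R) for s in S] for p in P]
    mpr = [[_mean(x[p][s][r] for s in S) for r in R] for p in P]
    msr = [[_mean(x[p][s][r] for p in P) for r in R] for s in S]
    ss = {
        "P": len(S) * len(R) * sum((m - grand) ** 2 for m in mp),
        "S": len(P) * len(R) * sum((m - grand) ** 2 for m in ms),
        "R": len(P) * len(S) * sum((m - grand) ** 2 for m in mr),
        "PxS": len(R) * sum((mps[p][s] - mp[p] - ms[s] + grand) ** 2
                            for p, s in product(P, S)),
        "PxR": len(S) * sum((mpr[p][r] - mp[p] - mr[r] + grand) ** 2
                            for p, r in product(P, R)),
        "SxR": len(P) * sum((msr[s][r] - ms[s] - mr[r] + grand) ** 2
                            for s, r in product(S, R)),
    }
    total = sum((x[p][s][r] - grand) ** 2 for p, s, r in product(P, S, R))
    ss["Residual"] = total - sum(ss.values())
    return {source: 100 * value / total for source, value in ss.items()}


# Hypothetical ratings: 3 persons x 3 situations x 3 response modes.
ratings = [
    [[1, 2, 1], [2, 2, 1], [1, 3, 2]],
    [[3, 4, 3], [4, 5, 3], [3, 4, 4]],
    [[2, 2, 3], [5, 4, 4], [1, 2, 2]],
]
for source, pct in ss_percentages(ratings).items():
    print(f"{source}: {pct:.1f}%")
```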

Regarding statistical significance, this n = 1
model permits F tests on only the situation
and response mode main effects and the Situa-
tion × Response Mode interaction effect. Of
these three, only the situation effect was not
significant in either the hypothetical or actual
situation data sets.

Table 4 shows that the person source ac- 
counted for 35% and 29% of the variance 
in the responses to the hypothetical and actual 
situations, respectively. In contrast, situations 
accounted for only 0%-1%, a finding related 
to the fact that the means of the three situa- 
tions were nearly identical. Mode of response 
explained only 5%-6% of the responses to 


the hypothetical and actual situations, and


son X Situation interaction accounted 
Z ot 22%, depending on the data set. 
idual accounted for 27% or 31%, de- 
y on the data set. Females had signifi- 
higher anxiety levels than the males 
ZIPERS specific A-Trait predictors, 
y did not differ from males on any of 
tate responses in the actual situations. 
assess the importance of individual dif- 
s when defined by traditional general 
it tests, more analyses of variance were 
ed on the ZIPERS data when the 
source was represented by groups of 
s scoring high, medium, and low on 
uiroticism composite. The situation and 
sources were the same as before, 
e analysis was done on both the hypo- 
l and actual situation data sets. 

results of this change were that the 
it or neuroticism source explained only 
nd 9% for the hypothetical and actual 
n data sets, respectively. As before, 
tion explained little or no variance, but 
this analysis the error term accounted for 
81% of the variance. 


ual Consistency Across Situations 


it out by Endler and Magnusson 
6), one of the main postulates of the 
Position is that the rank ordering of 
uals on a given trait should remain 
Me from situation to situation. To de- 
e the validity of this postulate for the 
f the present study, the correlations of 
ven situational anxiety measure with the 
Measure taken in other situations were 
hined. Table 5 shows that the size of the 
tions varied considerably from mea- 
Measure, ranging from .16 to .51, with 
Median correlation at about .36. 

Nother index of consistency, the general- 
lity coefficient (Cronbach, Gleser, Nanda, 
jatatnam, 1972), was computed on the 
RS data and submitted to analysis of 
ce. The person source had the largest 
ents, 38 and .31, respectively, for the 
Othetical and actual situations, whereas 
situation source had coefficients near zero, 
Onstrating no consistent or generalizable 
t across persons. 
Discussion


The findings on the comparative validity 
of the general and specific A-Trait measures 
corroborate those of the earlier study (Mell- 
strom et al., 1976) and suggest that the spe- 
cific measures may be more useful for pre- 
diction of fear responses. The specific com- 
posites surpassed the general ones in the rat 
situation for both sexes and in the social 
situation for females. In both of these in- 
stances, it would seem that subjects were 
capable of imagining how they would respond 
in the situation, and that this estimate was 
more accurate than one based on a measure 
of the person’s general anxiousness. 

Instances in which the general tests equaled 
the predictive accuracy of the specific ones 
(i.e., in the social situation for the males and 
in the test situation for both sexes) can be 
explained in several ways. First, the predic- 
tive power of the specific predictors may have 
been limited by two factors: (a) subjects
probably had no prior experience with any-
thing similar to the test situation of the study,
making it difficult for them to predict on the
ZIPERS how they would react, and (b)
there was a lack of correspondence between
the test “situation” described in the FSS test 
item and the actual situation confronted; that 
is, performing a serial verbal learning task in 
an experiment differs considerably from “fail- 
ing a test,” the situation described in the 
FSS test item. Second, the specific predictors 
did not surpass the general ones in predicting 
the self-report criteria of the test situation, 
because the general ones achieved relatively 
high levels of predictive accuracy for this 
situation. It may be that the ego threat pres- 
ent in such situations is the main cause of 
anxiety, permitting the neuroticism compos- 
ite to be predictive of these self-reports. 
Third, the general and specific predictors may 
have shown equal predictive validity for the 
males in the social situation because many 
males may be less likely than females to ac- 
curately imagine their responding in social 
situations. That is, in line with traditional 
sex roles, females may be more attentive to 
subtle social cues and to their own responding
in such situations.

Thus, while the idea of including the situa-
tion variable in assessment seems to be theo-
retically sound and often yields increments
in predictive validity, variables like the “ego 
involvement” of subjects, their prior experi- 
ence (or lack of it) with situations like the 
ones of interest, and the similarity between 
the criterion situation and the situation de- 
scribed in the specific measure may attenuate 
the validity of the specific measures.

It should be noted that the rat situation 
may have aroused ego threat in addition to 
rat fear, since many males, and perhaps fe- 
males as well, may consider it childish or 
foolish to show fear in such a situation, 
whether the experimenter is female or male.
This may explain the significant correlations 
between the neuroticism composite and the 
self-report criterion in this situation. Although 
the effect of mixed-sex experimenter-subject 
dyads in this situation, as well as in the other 
situations, is not known, it seems reasonable 
to assume, without evidence to the contrary, 
that subjects would feel equally threatened 
by potential failure in the presence of a same- 
sex peer as they would in front of one of the 
opposite sex.

The findings on the consistency and per- 
son-versus-situation issues provided some sup- 
port for the idea of behavioral continuity 
(cf. Block, 1971). The indices of consistency
suggest that persons’ anxiety levels were 
somewhat generalizable across situations. In 
addition, analyses of variance on the Mell- 
strom et al. (1976) data, presented in Zucker- 
man and Mellstrom (1977), also showed this 
pattern; the person source explained about 
29% of the variance. However, in both stud- 
ies, when the person source was defined by 
levels of the traditional A-Trait measures, the 
percentages of explained variance dropped. 
In the present study, they dropped from 35% 
to 12% and 29% to 9% for the hypothetical 
and actual situation data sets, respectively. 


degree of individual 


explanation is that 
traditional A-Trait tests are not valid mea- 
sures of the construct that they purport to 
measure. Alternately, the trait may be more 
consistent in some people than in others;
that is, some people may be consistent while
others are “interactive,” being differentially
responsive to situations (cf. Bem & Allen,
1974). The specific measures should be pre-
dictive for both types of persons as long as
the situation is similar to one the person has
reacted to on past occasions.

One problem that remains to be resolved
is the optimal degree of specificity for predic-
tion. Too broad a definition of situations in
the tests may attenuate the predictive poten-
tial of the tests, whereas too narrow a defini-
tion may limit their usefulness to extremely
narrowly defined situations.
Effect of Subliminal Stimulation of Symbiotic Fantasies
on Behavior Modification Treatment of Obesity


Lloyd H. Silverman, April Martin, Roseann Ungaro, and 
Eric Mendelsohn 
New York Veterans Administration Regional Office, New York, and 
Research Center for Mental Health, New York University 


In two studies, obese women were treated in a behavior modification program 
for overeating, in Study 1 for 8 weeks and in Study 2 for 12 weeks. In both 
studies, the behavior programs were accompanied by subliminal stimulation, 
with half of the subjects receiving the verbal message MOMMY AND I ARE ONE, 
intended to stimulate symbiotic gratification fantasies, and the other half, a 
control message. Weight loss was measured at the end of the program and at 
follow-up times; in Study 1, 4 weeks after termination and in Study 2, at 4 and 
12 weeks posttermination. In both studies the symbiotic condition gave evidence 
of enhancing weight loss, though it was only at follow-up that the difference 
between the groups attained statistical significance. This finding, when viewed 
in conjunction with results from earlier studies of schizophrenics and insect 
phobics, supports the proposition that the subliminal stimulation of symbiotic 
fantasies can enhance the effectiveness of therapeutic interventions of various
kinds.


During the past 12 years, a research method 
termed subliminal psychodynamic activation 
has been used in the experimental study of 
a critical aspect of psychoanalytic theory: 
the relationship between psychopathological 
behavior and unconscious libidinal and ag- 
gressive wishes. In most of this work, the 
stimuli were designed to stir up these wishes 
with the prediction that their subliminal pre- 
sentation (as compared to the subliminal pre- 
sentation of relatively neutral stimuli) would 
measurably intensify particular kinds of psy- 
chopathology. In over 20 studies completed 
to date (summarized in Silverman, 1976), 
this expectation has been borne out. The sub-
liminal presentation of a wish-related stimu-
lus produced pathological reactions that did
not appear after the subliminal presentation
of a neutral stimulus; and in a number of
studies, they also did not appear after the
wish-related stimulus was presented supra-
liminally and in the subject’s awareness.

In another aspect of the research, however,
instead of using stimuli designed to stir up
unconscious wishes and intensify psychopath-
ology, a stimulus was used that was intended

1 These studies should be distinguished from the
more traditional experiments in the “subliminal area”
that have aimed at “finding” the tachistoscopically
exposed stimulus (in transformed guise) in the sub-
sequent productions of the subject, rather than
observing its pathological effects. For the reader
whose contact with the subliminal area ended with
the early skeptical critiques of the phenomenon, see
the recent exhaustive and detailed review of Dixon
(1971). Finally, for a discussion of why the supra-
liminal presentation of the same stimuli usually fails
to trigger a pathological reaction and why subliminal
presentation is then the method of choice for the
laboratory study of psychodynamic aspects of psy-
choanalytic theory, see Silverman (1972).


domain. 


; 


; gratify a particular wish and reduce 

. This stimulus was the verbal mes- 
AND I ARE ONE. Its use was based 
following two interrelated assump- 
1) The fantasized gratification of the 
for oneness with mommy—the good 
of infancy—can ameliorate psycho- 
ogy of various kinds (cf. Silverman, 
and (b) the subliminal presentation 
ds MOMMY AND I ARE ONE has the 
to activate this fantasy. 
upport of the above assumptions, the 
ing can be cited: First, in experiments 
out with eight groups of male schizo- 
cs (Bronstein, 1976; Kaplan, 1976; 
1975; Leiter, 1973; Silverman & Can- 
1970; Silverman, Spiro, Weissberg, & 
ll, 1969; Spiro, 1975), the subliminal 
tation of this “symbiotic gratification 
s? when compared with the (sub- 
al) effects of a neutral control message 
found to reduce the degree to which 
thology” is manifested within a lab- 
/ session. Second, with two groups of 
omosexuals (research volunteers), an- 
ameliorative effect has been found—a 
se in anxiety and defensiveness within 
fatory session after the subliminal ex- 
of this same symbiotic gratification 
e (Silverman, Kwawer, Wolitzky, & 
1973), 
‘Addition to the above findings, there 
een three studies in which more than a 
tory effect” has been demonstrated. In 
ilverman, Frank, & Dachinger, 1974), 
Ctiveness of the symbiotic gratification 
$ as an aide in the behavioral treat- 
Of insect phobias was demonstrated. 
y women with insect phobias were seen 
Weekly for six sessions. The first and 
ssions were for pretreatment and post- 
lent assessments of the degree of pho- 
ith the intervening four sessions for 
ment—a variant of systematic desen- 
on. At each treatment session, subjects 
= Scenes of insects that they had pre- 
sy arranged hierarchically for their anx- 
Ousing effects. Subjects began with the 
earful image and progressed to more 
hing scenes, After each image, the sub- 
i Bave a subjective rating of the degree 
of discomfort that they experienced. When
discomfort ratings exceeded a specified level,
the subjects looked into the tachistoscope for
subliminal stimulation. Stimulation was re-
peated until the discomfort ratings for a par-
ticular image were below the criterion level,
and subjects then progressed to the next image 
in the hierarchy. In the usual systematic de- 
sensitization paradigm, the imaging is accom- 
panied by deep muscle relaxation. This study 
substituted subliminal stimulation for the 
muscle relaxation technique, with the experi- 
mental group receiving MOMMY AND I ARE ONE 
while subjects in the control group received 
the stimulus PEOPLE WALKING, intended as a 
(relatively) neutral verbal message. Follow- 
ing the four intervention sessions, on measures 
of both avoidance and anxiety, the experi- 
mental subjects showed a significantly greater 
degree of improvement of their phobic symp- 
toms than the control subjects. 

A second experiment (Silverman, Levinson, 
Mendelsohn, Ungaro, & Bronstein, 1975) in- 
vestigated the effects of stimulating symbiotic 
fantasies during brief therapy with recently 
hospitalized male schizophrenics. Forty sub-
jects were seen individually, three times a
week, over a 6-week period. Treatment con- 
sisted of a “fantasy expression” procedure, in 
which the subjects were shown pictures and 
were encouraged to fantasize about them, with 
special emphasis on deriving pleasure from 
the fantasy and stressing the distinction be- 
tween fantasy and reality. Half of the sub- 
jects were subliminally stimulated with 
MOMMY AND I ARE ONE several times during 
each fantasy expression session. The other 
half received as a control the stimulus PEOPLE 
ARE WALKING. Pretreatment and posttreat- 
ment assessments were made of “ego pathol- 
ogy” on the basis of cognitive and projective 
tests, interview ratings, and ratings of ward 
behavior made by the nursing staff. Both 
groups showed a reduction in the amount of 
ego pathology in evidence after treatment, 
but those who had received the experimental 
stimulus showed a significantly greater re- 
duction. 

In the third study (Parker, 1977), two
groups of male and female college under-
graduates (n = 20 in each group), matched
for academic performance, were given tachis-
toscopic stimulation at the beginning of a
class four times a week over a 6-week sum-
mer term. For one group the stimulus was
MOMMY AND I ARE ONE, whereas for their
matched counterparts it was PEOPLE ARE
WALKING. The students in the former group
received grades on their final exam (“blindly”
marked) that were significantly and substan-
tially higher than did the controls (average
marks of 90.4% and 82.7%, respectively).²

The present study was intended as an ex- 
tension of this earlier work with a new sub- 
ject population—obese women—and accom- 
panying a different intervention—behavior 
modification training in weight control. It was 
hypothesized that for persons being treated 
for obesity with this form of therapy, those 
whose treatment was accompanied by the sub- 
liminal presentation of the symbiotic stimulus 
would lose more weight than similar persons 
in the same treatment who were presented 
with subliminal neutral stimulation. In addi- 
tion to the point cited earlier about the gen- 
eral ameliorative effects of symbiotic grati- 
fication fantasies, the following specific ra- 
tionale can be offered for predicting such a 
finding with this population: From psycho- 
analytic clinical observations of obese patients 
(Bruch, 1973; Bychowski, 1950), it can be 
inferred that overeating for the obese person 
is often motivated by ungratified unconscious 
wishes for a symbiotic experience. Thus, the 
fantasied gratification provided by repeated 
subliminal exposure to the MOMMY AND I ARE
ONE stimulus was expected to make their over-
eating less necessary and thus aid them in
successfully using the weight control program.³


Study 1⁴
Method 


Subjects. Thirty obese women were recruited
through an advertisement in a local newspaper. All
subjects were at least 15% overweight, based on
the 1959 Metropolitan Life Insurance norms for
desirable weights for women (U.S. Department of
Health, Education, and Welfare, 1967). Percentage
of overweight was determined using the middle
weight of the range given for a woman of medium
frame at a given height as a baseline. To be eligible
for the study, each subject had to state that she felt
herself to be an overeater, that she was not currently
involved in any organized program of treatment for
obesity, and that she had the time and motivation
to attend the treatment sessions. In addition, a brief
interview was conducted with each potential sub-
ject to screen out psychotic and borderline-psychotic
applicants. Four women were excluded on this basis.
During the course of data collection, three subjects
dropped out after the first session (two of them
had been assigned to the experimental group and
one to the control), and they were replaced by other
applicants.
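
The percentage-overweight criterion described above amounts to simple arithmetic against the midpoint of a desirable-weight range. The sketch below shows that calculation; the weight range and the subject's weight are invented for illustration and are not values from the 1959 norms, and the function names are hypothetical.

```python
def percent_overweight(actual_weight, range_low, range_high):
    """Percent above the midpoint of the desirable-weight range."""
    baseline = (range_low + range_high) / 2
    return 100 * (actual_weight - baseline) / baseline


def meets_weight_criterion(actual_weight, range_low, range_high, cutoff=15.0):
    """True if the person is at least `cutoff` percent overweight."""
    return percent_overweight(actual_weight, range_low, range_high) >= cutoff


# Hypothetical medium-frame range of 120 to 135 lb, so the baseline is 127.5 lb.
print(percent_overweight(153, 120, 135))
print(meets_weight_criterion(153, 120, 135))
```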

Subjects were randomly assigned to an experi-
mental and control group. The overall sample ranged
in age from 22 to 59, with mean age of 30.7 for the


2 In two other recent studies, subliminal symbiotic
stimulation has been used with college students.
Sackeim (1977) found that within a laboratory ses-
sion, the MOMMY AND I ARE ONE stimulus heightened
self-esteem (as measured by a semantic differential
scale). On the other hand, Condon (1976) obtained
negative results in attempting to replicate the find-
ings of Silverman, Frank, and Dachinger (1974).
Here, it may be important that the population Con-
don used, unlike the original population, did not
consist of persons seeking treatment for their pho-
bias. Instead, the sample was comprised of students
who, although manifesting a certain degree of phobic
symptomatology, entered the study to fulfill a psy-
chology class requirement. It thus may be that for
subliminal symbiotic stimulation to enhance the ef-
fectiveness of a treatment intervention, individuals
must be motivated to overcome whatever behavior
the treatment is intended to address. Further research
is planned to test out this and other possibilities to
account for Condon’s nonreplication.

3 The “wishes for a symbiotic experience” referred
to above can be seen as related to the “symbiotic
phase of development” (cf. Mahler, Pine, & Berg-
man, 1975), defined as the period of infancy when
differentiation from mother and a sense of separate-
ness from her are minimal and most incomplete.
The “oneness” with her at this time can serve a
number of needs: her presence is guaranteed; her
“omnipotence” is shared; nurturance is always avail-
able; and she can offer both protection against ex-
ternal dangers and assistance in mastering internal
dangers, that is, helping to control unacceptable
impulses of various kinds. A legacy of this symbiotic
phase of development is the merging wishes referred
to above that are viewed as characterizing, to vary-
ing degrees, different people throughout their lives.
That is, to the extent that needs for protection,
omnipotence, nurturance, and so on, are powerful
and still sought in the manner of early infancy,
wishes for oneness with “mommy” arise. The extent
to which they are then gratified depends on the de-
gree of internal conflict such wishes generate as well
as on external circumstances.

4 This study was part of a doctoral dissertation
submitted by the second author (Martin, 1975) in
partial fulfillment of the requirements for a doctoral
degree at New York University.
perimental subjects and 31.8 for the control sub-
jects. The percent overweight for the sample ranged
from 15% to 94%, with means of 40.4% for the
experimental group and 42.7% for the control.
Neither of these differences approached significance,
nor did the difference between the groups in racial
composition (93% white). The subjects were asked
to estimate the length of time that they have been
overweight. The mean number of years estimated
for the experimental group was 18.3 and for the
control group 19.0, a difference that proved to be
nonsignificant (t < 1).
Stimuli and tachistoscope. The symbiotic gratifica-
tion stimulus consisted of the verbal message MOMMY
AND I ARE ONE, printed in ink on a white 3 × 5
card in capital letters with the words MOMMY AND I
on one line and the words ARE ONE on a second line.
Another card contained the neutral control message
PEOPLE WALKING presented on one line. The stimuli
were shown through an electronically controlled
mirror tachistoscope. The subject looked through an
eyepiece at a blank field, and the stimulus was ex-
posed from a second field. The viewing distance was
4 inches (1.3 m), and the surface brightness of a
white card for the intensity settings of both fields
was 5 ftL (17.1 cd/m²). Exposure time for the
stimulus was 4 msec. In previous experiments under
these conditions, no subject was able to recognize
the content aspects of any stimulus, and less than
M% could discriminate between flashes of light pro-
duced by different stimuli (cf. Silverman, 1976).
Procedure. Subjects were randomly assigned to
the experimental (symbiosis) or the control group.
Two “interventionists,” graduate students in psy-
chology, conducted the behavioral treatment ses-
sions, each working with an equal number of sub-
jects in each stimulus group. The interventionists
identified the stimuli by code letters appearing on
the back of the stimulus cards and remained “blind”
throughout the data collection as to which group
each subject was in.
Each subject met with her interventionist indi-
vidually, once a week for 8 consecutive weeks. At
each session the subject’s weight, in indoor clothing
[...], was measured using a conventional
[...] scale. Treatment sessions were [...] hour
long. The 8-week program was designed to be simi-
lar to a behavior modification treatment program
developed by Wollersheim (1970). Subjects
were instructed on how to keep records of the food
they ate and its caloric content, how to systemati-
cally reduce the number of situations in which they
ate, how to eat more slowly and with more aware-
ness, and how to reward themselves for
[...] behavior. (The structure of the
program and the specific techniques used are de-
scribed more fully in Martin, 1975.)
At the beginning and end of each treatment ses-
sion, the subject was instructed to look into the
tachistoscope for a presentation of the subliminal
stimulus. The first presentation was introduced in
the following manner: The subject was asked to
imagine herself in a situation in which she felt
tempted to overeat. She was asked to describe this
situation fully—the place, time, circumstances, the 
food she was craving, and so on, until she reported 
that the image was very vivid. For example, she 
might describe her feeling when seeing cupcakes in 
the bakery window as she passed it. The experi- 
menter would then say: 


People usually feel a kind of tension when they 
want to eat something but are trying to tell them- 
selves to resist it because they want to lose 
weight. This is a machine [the tachistoscope] 
which presents flashes of light. Researchers have 
found that these flashes can be useful in helping 
people to relax. If you are able to relax yourself 
at moments when you are craving something to 
eat which you know you shouldn’t, you will be 
able to make a calmer decision to resist it. So I 
want you to look into the viewer, holding that 
image of the cupcakes trying to say “no” to them. 
I will say, “Ready, get set,” and then you will 
see a flash of light. I will repeat this for a sec- 
ond flash a few seconds later. I want you to use 
these flashes to help you relax and walk calmly 
away from the bakery window. 


The subject also was instructed that outside of the 
treatment sessions, whenever she found herself about 
to overeat, she should form a mental image of the 
flash of light she had seen in the machine and then 
try to refrain from eating. She was told that this 
would get easier when she had more experience with 
the flashes and that she would have a chance to see 
them again at the end of the session and at the 
beginning and end of every subsequent session. 
Eventually, it was explained, she would find that 
as she reached for food in an inappropriate way, she 
would automatically remember the flash of light 
and be able to remind herself to return to appro- 
priate eating. The rationale for presenting the sub- 
liminal stimulus in this manner was that it served 
to arouse the subject’s tension concerning eating be- 
havior, which then could be reduced by the symbiotic 
fantasies that were being tachistoscopically activated. 
Thus, it was analogous to the procedure used in the 
desensitization study earlier described (Silverman et 
al., 1974).

Four weeks after the program ended (i.e., 12 weeks
after the first session), the subjects returned for a 
follow-up weigh-in and were debriefed. The de- 
briefing consisted of explaining the rationale for the 
study, showing both stimuli that had been used and 
informing each subject of the stimulus to which 
she had been exposed. Without exception, the sub- 
jects expressed surprise and in many instances dis- 
belief that any stimulus had been exposed, insist- 
ing that they consciously saw nothing more than 
flickers of light during the experiment. 


Results 


Table 1 presents the mean weights of both 
groups initially, at the end of treatment (8th 
Table 1

Means and Standard Deviations of Weight in
Pounds for Study 1

                     Experimental group    Control group
Time                    M        SD          M        SD
Pretreatment          181.8     24.6       183.8     40.6
Posttreatment
  (8th week)          173.8     27.0       178.5     40.7
Follow-up
  (12th week)         170.9     27.2       179.4     40.9

Note. 1 pound = .4536 kg.


week), and at follow-up (12 weeks). Anal- 
yses were carried out in which the initial 
weights were covaried out of the 8th and 
12th week weights.⁵ At the end of treatment,
the results, although in the hypothesized di- 
rection, were not significant, F(1, 27) = 1.84, 
p < .18. At follow-up, however, the difference 
between the two groups was significant, F(1, 
27) = 7.08, p < .01. As Table 1 indicates, 
significance was obtained at the latter time 
because during the 4-week follow-up period, 
the two groups behaved differently. The ex- 
perimental group, on the average, continued 
to lose weight while the control subjects 
gained. Since the subjects had not been de- 
briefed prior to the follow-up weigh-in, the 
difference in their eating behavior during the 
follow-up period can be ascribed to the con- 
tinuing effects of the differential subliminal 
stimulation. 
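
The pattern described above can be read directly off the raw means in Table 1. The following is a minimal sketch in Python, assuming only the group means printed in the table; the dictionary layout and the function name are illustrative and not part of the original analysis. It works with unadjusted means, not the covariance-adjusted values on which the F tests were based.

```python
# Group mean weights in pounds, copied from Table 1 (Study 1).
means = {
    "experimental": {"pre": 181.8, "post": 173.8, "follow_up": 170.9},
    "control": {"pre": 183.8, "post": 178.5, "follow_up": 179.4},
}


def mean_changes(group):
    """Return (change during treatment, change during follow-up) in pounds."""
    during_treatment = round(group["post"] - group["pre"], 1)
    during_follow_up = round(group["follow_up"] - group["post"], 1)
    return during_treatment, during_follow_up


changes = {name: mean_changes(group) for name, group in means.items()}

for name, (treatment, follow_up) in changes.items():
    print(f"{name}: {treatment:+.1f} lb during treatment, "
          f"{follow_up:+.1f} lb during follow-up")
```

A negative value is a mean loss and a positive value a mean gain, so the sign of the follow-up change is what separates the two groups.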

Table 2 presents the analogous findings for 
percent overweight. These results closely par- 
allel the results for actual weight, with the 
difference between the two groups approach- 
ing significance after 8 weeks, F(1, 27) = 


Table 2

Means and Standard Deviations of Percent
Overweight for Study 1

                     Experimental group    Control group
Time                    M        SD          M        SD
Pretreatment           40.4     20.1        42.7     26.6
Posttreatment
  (8th week)           34.1     31.9        39.8     26.9
Follow-up
  (12th week)          32.0     19.9        40.5     27.7

3.75, p < .06, and reaching significance after
12 weeks, F(1, 27) = 9.77, p < .004.


Study 2 


This study was intended as a replication
and extension of Study 1, with several small
changes and additions instituted.


Method 


Subjects. Subjects again were recruited through
newspaper advertisements. The criteria for selection
were the same as in the initial study, except that
a more extensive procedure was used for screening
out psychotic and borderline-psychotic subjects. Ror-
schach and figure-drawing protocols were collected
in an initial intake assessment, which together with
the impression made by the subjects in a brief inter-
view were reviewed by one of the authors (LHS)
who has had extensive clinical experience. Six sub- 
jects were eliminated on this basis. Of the subjects 
who began the program, 11 dropped out (5 from 
the experimental and 6 from the control group) 
and were replaced. The final sample consisted of 13 
subjects in each of the two groups, with mean ages 
of 31.7 and 36.1 (ages ranged between 22 and 57) 
and mean percent overweight of 37.9 and 37.4 (per-
cent ranged between 15 and 118) for the experi-
mental and control groups, respectively. Neither of 
these differences approached significance, nor did 
differences between the groups in racial composition 
(92% white) or in the number of years overweight 
(mean experimental = 20.1 and control = 21.0). 

Stimuli and tachistoscope. The tachistoscopic con-
ditions were identical to those of the initial experi-
ment. However, the control stimulus was slightly
altered so that it now consisted of the words PEOPLE


5 It is to be noted in Table 1 that with regard
to the initial weights of the two groups, although
their means were comparable, the standard deviation
for the control group was considerably larger than
the standard deviation for the experimental group.
However, an analysis of covariance still could be
carried out, since a test for homogeneity of regres-
sion for the two groups revealed that homogeneity
was in evidence. Furthermore, with regard to the
comparability of the experimental and control groups
for percent overweight (to be discussed) the stan-
dard deviations for the two groups were not dis-
crepant.

6 In this study (in contrast to Study 1), percent
overweight was calculated taking into account the
subject’s frame (designated as “small,” “medium,”
or “large”). Also, in assigning subjects to the experi-
mental and control groups in Study 2, an effort was
made to keep the groups equivalent for actual weight
and percent overweight. In Study 1, however, sub-
jects were randomly distributed.
ARE WALKING (instead of PEOPLE WALKING) and was
printed on two lines instead of one. This change was
made so that the control stimulus would be more
structurally similar to the experimental stimulus
MOMMY AND I ARE ONE, which was printed on two
lines.

Procedure. Four graduate psychology students not 
associated with the first study served as interven- 
tionists. Three were female and one was male, and 
each saw an equal percent of subjects in the ex- 
perimental and control groups. As in the first study, 
the interventionists were blind to the tachistoscopic 
condition for each subject. 
The procedure was identical to that used in Study
1, with the exception that it was extended from 8
weeks to 12 weeks. Also, a second follow-up weigh-in
was conducted 8 weeks after the first. Thus, weights
were recorded initially, at 12 weeks (when the be-
havior program and subliminal stimulation ended),
after 16 weeks (follow-up 1), and after 24 weeks
(follow-up 2). Additionally, to determine whether
weight loss would be accompanied by symptom sub-
stitution, the subjects also were given a symptom
rating scale to fill out at the same four points in
time. This was a variant of the Symptom Check
List (90 items) (Derogatis, Lipman, Rickels, Uhlen-
huth, & Covi, 1974) in which they were asked to
indicate on a 5-point scale the degree to which each
of 49 psychiatric symptoms was present.⁷

Finally, at the time of the second follow-up, in- 
formation was elicited from the subjects about the 
extent to which they had made use of the two as- 
pects of the therapy program during the prior 24 
weeks, outside of the treatment sessions. That is,
they were asked to rate on a 10-point scale, ranging
from “not at all” (1) to “extremely frequently” (10),
their weekly use of (a) the behavior modi-
fication techniques and (b) the practice of forming
mental images of the flashes of light when they were
trying to refrain from eating. 


Results 


Tables 3 and 4 present the data on the ac-
tual weight and percent overweight for Study
2. Analyses of covariance revealed that these
results closely parallel those from Study 1.
At the end of treatment, the differences be-
tween the experimental and control groups
were not significant on either measure, F(1,
23) = [...], p = .130, and F(1, 23) = 1.86,
p = [...], for actual weight and percent over-
weight, respectively. However, at the first fol-
low-up 4 weeks later (or 16 weeks after treat-
ment began), both measures attained sig-
nificance, F(1, 23) = 5.03, p = .033, and
F(1, 23) = 4.65, p = .039, respectively. At
the second follow-up 8 weeks after the first
(i.e., 24 weeks after the program began), the

Table 3

Means and Standard Deviations of Weight in
Pounds for Study 2

                     Experimental group    Control group
Time                    M        SD          M        SD
Pretreatment          168.6     25.7       171.8     33.5
Posttreatment
  (12th week)         156.8     25.9       164.7     32.5
Follow-up 1
  (16th week)         154.6     26.7       165.1     26.1
Follow-up 2
  (24th week)         153.5     27.5       165.9     28.2

Note. 1 pound = .4536 kg.


difference between the two groups was again 
significant on both measures, with the means 
even further apart than they were at the time 
of the first follow-up, F(1, 23) = 7.46, p= 
.011, and F(1, 23) = 7.40, p = .012, for ac- 
tual weights and percent overweight, respec- 
tively. As in Study 1, the significant differ- 
ences at the time of follow-up were due to 
the experimental group subjects’, on the aver- 
age, continuing to lose weight after the be- 
havior treatment sessions ended, whereas the 
control group gained back some of the weight 
that had been lost. 

On the symptom rating scale, both groups 
showed a significant reduction in pathology 
reported from the initial assessment to the 
16-week assessment (t = 2.84, p = .015, and 
t = 3.03, p = .011 for the experimental and


7 However, sufficient data were available for com- 
paring the experimental and control subjects only 
initially and at Follow-up 1. At these times the sub- 
jects filled out the rating scale in our laboratory, 
whereas for the postassessment and for Follow-up 
2, many subjects took the scales home with them 
but never returned them. It also should be noted 
that at Follow-up 2, five subjects (two experimental 
and three control) could not return to our labora- 
tory for a weigh-in either because they were out of 
town or because their job situations would not allow 
it. We contacted them by telephone and asked them 
to weigh themselves elsewhere and then phoned them 
a day later for their weights. Although we cannot 
be certain that the weights they reported would co- 
incide with what their weights would have registered 
on our own scale, they followed the same pattern 
as the subjects in their stimulus groups who came in
for the weigh-in.
Table 4

Means and Standard Deviations of Percent
Overweight for Study 2

                     Experimental group    Control group
Time                    M        SD          M        SD
Pretreatment           37.9     26.7        36.3     26.6
Posttreatment
  (12th week)          28.5     25.5        30.3     26.8
Follow-up 1
  (16th week)          26.5     27.9        30.5     26.3
Follow-up 2
  (24th week)          25.9     28.3        31.6     25.0


control groups, respectively). The difference 
in symptom reduction between the two groups 
was nonsignificant. (With the premeasure co- 
varied out, F = .024, p = .99.) 

Finally, for their ratings (on a 10-point 
scale) of the degree to which they used the 
two aspects of the therapy program outside 
of the treatment sessions, the differences be- 
tween the two groups were also negligible. 
With regard to the behavior modification tech- 
niques, the average ratings for the experi- 
mental and control groups were, respectively, 
7.9 and 7.1, and for the use of the mental 
images of the flashes, 2.8 and 3.2 (t < 1 in
both instances). 


Discussion 


It seems warranted to conclude that the 4-
msec exposure of MOMMY AND I ARE ONE
over several weeks aided both groups of ex- 
perimental subjects in losing weight. It is 
true that this conclusion is based on the 
follow-up weights, but this does not detract 
from its significance, since there is nothing to 
indicate that anything other than the sub- 
liminal stimuli exposed during the treatment 
program acted differentially on the groups of
experimental and control subjects. In both 
studies the experimental subjects were given 
the same behavioral therapy program as the 
controls, and they had the same number of 
tachistoscopic exposures by interventionists 
who were blind to the stimulus messages. In 
the second study, there were self-report data 
that indicated that there was no difference in 
the frequency with which the two groups, 
outside of the treatment sessions, used the
techniques that they had been taught. Fur-
ther, since the subjects themselves did not
know what stimuli they were receiving, de- 
mand characteristics cannot be implicated to 
account for the significant differences at fol- 
low-up. Only the actual content of the stimuli 
that were subliminally exposed was different 
for the experimental and control subjects. 

The findings from the two studies reported 
here should be considered in combination with 
analogous results from investigations with (in- 
sect) phobics (Silverman et al., 1974), schizo- 
phrenics (Silverman et al., 1975), and under- 
graduate college students (Parker, 1977) 
cited earlier. In toto, the results indicate that 
the subliminal stimulation of the fantasy 
MOMMY AND I ARE ONE not only can lead to 
improved adaptation within a laboratory ses-
sion, as earlier studies of subliminal psycho-
dynamic activation have demonstrated (sum-
marized in Silverman, 1976), but when com-
bined with an intervention that is effective in 
its own right and when given over a prolonged 
period of time, this effectiveness is enhanced 
outside the laboratory. Thus, this intervention 
can be viewed as having practical utility. 

As far as the intervention’s utility as an 
aid in weight control is concerned, it should 
be noted in the tables that the significant dif- 
ferences that were found were a function of 
the fact that while the control groups, on the 
average, regained weight during the follow-up 
periods, both experimental groups showed 
further weight loss; and in Study 2, even 
more weight was lost during the second fol-
low-up period than during the first. This is
noteworthy, since as Hall, Hall, Hanson, and 
Borden (1974) pointed out, it is unusual for 
obese patients to continue to lose weight dur- 
ing follow-up periods. An examination of the 
individual data reveals that although only 
32% of the control subjects (in both studies 
combined) accomplished this, 84% of the ex- 
perimental subjects continued to lose weight. 
Thus, this experimental intervention may be
able to reverse a trend that typically limits 
the effectiveness of behavior modification 
treatment of obesity. On the other hand, it 
also should be noted that the largest (aver-
age) weight loss reported for an experimental
group in this pair of studies (15.1 pounds
(6.8 kg) or a 12% reduction in weight in
Study 2 over a 24-week period) must be 
viewed as modest and not beyond the range 
of what has been reported by other (rela- 
tively) successful intervention programs for 
weight control. Thus, if subliminal symbiotic 
stimulation is to be of substantive value in 
the treatment of obesity, ways should be 
sought to increase its potency. 
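
The arithmetic behind these figures can be checked against Tables 3 and 4. The following is a minimal sketch in Python, assuming the Study 2 experimental-group means at pretreatment and at the second follow-up as given in those tables, and the conversion factor from the table notes; the variable names are illustrative only.

```python
LB_TO_KG = 0.4536  # conversion factor given in the table notes

# Study 2 experimental-group means at pretreatment and at Follow-up 2.
weight_pre, weight_follow_up_2 = 168.6, 153.5               # pounds (Table 3)
pct_overweight_pre, pct_overweight_follow_up_2 = 37.9, 25.9  # percent (Table 4)

loss_lb = round(weight_pre - weight_follow_up_2, 1)
loss_kg = round(loss_lb * LB_TO_KG, 1)
loss_as_pct_of_initial_weight = round(100 * loss_lb / weight_pre, 1)
drop_in_pct_overweight = round(
    pct_overweight_pre - pct_overweight_follow_up_2, 1
)

print(f"mean loss: {loss_lb} lb ({loss_kg} kg)")
print(f"loss as share of initial weight: {loss_as_pct_of_initial_weight}%")
print(f"drop in percent overweight: {drop_in_pct_overweight} points")
```

The sketch reproduces the 15.1-pound (6.8 kg) mean loss; over the same 24 weeks the mean percent overweight in Table 4 falls by 12 points.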

A comparison of the data from Study 1 
with that from Study 2 underscores the fact 
that what led to the differential behavior of 
the experimental and control subjects during 
the follow-up periods was not the additional 
time that elapsed but their response to the
nonavailability of treatment. By extending the
program an additional 4 weeks in Study 2,
the time between the preweights and post-
weights (12 weeks) was identical to the time
between the preweights and follow-up weights
in Study 1. Yet even though this extra 4 
weeks led to increased weight loss for both 
groups (note in the tables the approximately 
30% greater posttreatment weight loss in 
Study 2 than in Study 1), it did not differ- 
entially affect the experimental and control 
subjects. Thus in Study 2, as in Study 1, the 
difference between the experimental and con-
trol groups did not reach significance until
the subjects were on their own for 4 weeks.
Apparently, the crucial benefit of the sym-
biotic stimulation was in allowing the sub-
jects to retain their ability to diminish food
intake in the absence of weekly treatment
contact. Or in psychoanalytic terminology, the
presumed activation of the MOMMY AND I ARE
ONE fantasy may have allowed the subjects
to better internalize the techniques that they
had been taught. Whether this was because
the therapist was unconsciously equated with
mommy (whom they now felt more at one
with) or for some other reason remains a mat-
ter for further study.

A number of other questions also remain
to be addressed. First, it would be important
to determine from continued follow-ups the
duration of the “boost” that the subliminal
symbiotic stimulation gives to behavior modi-
fication methods in controlling overeating.
Second, there is the question of whether the
diminished overeating brought about by the
symbiotic stimulation is at the expense of 
personality change that could be viewed as 
maladaptive. In Study 2, the fact that there 
was no difference in symptom reduction be- 
tween the experimental and control groups 
and that, in fact, both groups reported sig- 
nificantly fewer symptoms at 16 weeks than 
they did initially strongly argues against the 
operation of symptom substitution. However,
as one of us has argued elsewhere (Silverman, 
1974) in evaluating whether symptom reduc- 
tion is “gained at a price” as a result of any 
kind of intervention, more than symptom 
substitution must be evaluated. From what 
can be observed clinically, the disappearance 
of symptoms is sometimes accompanied by 
the emergence of maladaptive behavior that 
is not experienced as a “symptom.” In evalu- 
ating any kind of therapeutic intervention, one 
should investigate whether, and if so to what 
degree, asymptomatic as well as symptomatic 
negative personality changes occur when 
symptoms remit, a practice that we plan to 
follow in investigations using subliminal sym- 
biotic stimulation. 

Third, there is the question of what the 
precise aspects are of the MOMMY AND I ARE 
ONE stimulus that account for its effectiveness 
in weight control. In work with other subject 
groups, data from several studies have indi- 
cated that in order for this stimulus to be 
ameliorative (when subliminally presented), 
it must contain a reference to “oneness,” but 
the inclusion of mommy as the person with 
whom the oneness is achieved is not essential.⁸


8 With regard to the results on “oneness,” there 
were two relevant studies. Kaplan (1976) found that 
while MOMMY AND I ARE ONE reduced pathology in 
schizophrenics, other reassuring messages involving 
mommy (e.g., MOMMY FEEDS ME WELL and MOMMY 
IS ALWAYS WITH ME) did not have this effect. Bron- 
stein (1976) investigated the effectiveness of other 
“internalization” of mother messages (e.g., MOMMY
IS INSIDE ME and MOMMY AND I ARE ALIKE) as well
as MOMMY AND I ARE ONE and found that only the 
latter reduced pathology for schizophrenics. Even 
though the reference to oneness thus seems to be 
essential for an adaptation-enhancing effect, data 
from two other studies indicate that the fantasy of 
oneness does not have to involve mommy. Kaye 
(1975) found that the stimulus MY GIRL AND I ARE 
Whether the same applies to obese individuals
is something we are currently investigating. 

There is also an ethical question that can 
be raised about subjecting people to sub- 
liminal stimulation designed to affect their be- 
havior. This issue was dealt with in the cur- 
rent studies by debriefing the subjects at the 
experiment’s conclusion as to the content of 
their subliminal message and providing them 
with the opportunity to talk about their re- 
actions. More recently, we have been inform- 
ing subjects from the beginning that they will 
be receiving subliminal messages and have 
found that this does not prevent the MOMMY
AND I ARE ONE message from producing its 
ameliorative effect (Parker, 1977). A further 
step toward complete openness with subjects 
that currently is being explored is telling the 
subjects beforehand which subliminal mes- 
sages are being used, without telling them 
which particular message he or she is re- 
ceiving. 

Finally, a question can be raised about 
what mediates the effectiveness of the MOMMY
AND I ARE ONE message as an aid in weight 
control. Does it act synergistically with the 
behavior modification techniques, making it 
easier for subjects to learn and use the lat- 
ter? Or does it act in a more direct manner 
by strengthening behavior controls, reducing 
anxiety, or even by diminishing unconscious 
symbiotic longings? To make this determina- 
tion, measures will have to be obtained of the 
variables just cited in each treatment session, 
a research strategy that we plan to pursue. 


Reference Note 


1. Silverman, L. H. The unconscious symbiotic fan- 
tasy as a ubiquitous therapeutic agent. Paper pre- 
sented at the meeting of the American Psycho- 
analytic Association, Baltimore, Maryland, May 
1976. 


ONE also produced an amelioration of symptoms in
male schizophrenics; and Parker (1977) in the earlier
cited study [...]
MOMMY AND I ARE ONE, led to significantly higher
grades than a neutral control stimulus.
University of Iowa


This study compared two behavioral treatments for marital discord with a non-
specific control and a waiting-list control. The behavioral treatments combined
training in problem-solving skills with training in contingency management pro-
cedures, differing only with respect to the contracting form: One group learned
to form good faith contracts, and the other, quid pro quo contracts. Thirty-two
couples were randomly assigned to one of these treatment conditions and one
of three therapists. Improvement was assessed by two observational measures
and by two self-report questionnaires. On all measures, both behavioral groups
improved significantly more than waiting-list couples. On three of the four mea-
sures, behavioral couples improved significantly more than nonspecific couples.
The two behavioral groups did not differ from one another on any of the
measures.


Although an increasing number of reports 
have attested to the effectiveness of using so- 
cial-learning principles in treating marital 
problems, most of these reports consist of un- 
controlled case studies (cf. Jacobson & 
Martin, 1976). Those few controlled studies
attempting to evaluate the effectiveness of 
such approaches have often been limited 
either by equivocal findings (e.g., Harrell &
Guerney, 1976), methodological inadequacies
(Tsoi-Hoshmand, 1976), or the analogue na-
ture of the study (Margolin, Note 1).

Despite the lack of definitive evidence in 
favor of a behavioral approach to marital 
therapy, the proliferation of suggestive find- 
ings warrants cautious optimism. Particularly 
promising is the combination of training in 
communication and problem-solving skills with 
instructions in contingency management pro- 
cedures, especially contingency contracting 
(Weiss, Birchler, & Vincent, 1974). A series of 
impressive case reports (Weiss, Hops, & 
Patterson, 1973), in addition to Margolin’s 
(Note 1) study, attest to the effectiveness of 
this combined treatment package.

Jacobson (1977a) evaluated the effective- 
ness of such a treatment program, comparing 
it with a minimal treatment, waiting-list con- 
trol group. On both observational and self- 
report measures, the couples receiving the be- 
havioral treatment package improved signifi- 
cantly more than did control couples. Within- 
subject analyses, using data collected by 
spouses in the home, tended to corroborate the 
group comparisons. A 1-year follow-up sug- 
gested that these positive changes had been 
maintained. 
However, only limited conclusions can be drawn
from this initial inquiry. In addition to the
small sample size and the inclusion of only
one therapist, it is unclear which factors ac-
counted for the effectiveness of the experi-
mental treatment.

The present study served three purposes,
[...] of which followed directly from the limita-
tions of the prior study (Jacobson, 1977a).
First, the prior study was replicated, using
three therapists instead of one. Second, a
group was included that was designed to con-
trol for nonspecific factors in the experimental
treatment. The group was carefully con-
structed to control for as many possible com-
peting explanations for change as was feasible,
following some of the guidelines suggested by
[...] and Baucom (1977).

The third purpose of the study was to
shed light on a controversy regarding the type
of contingency contracting that is most ef-
ficacious in the treatment of marital discord.
Some (cf. Jacobson & Martin, 1976) have ad-
vocated a contracting procedure that Weiss
et al. (1974) have referred to as “quid pro
quo” contracting. Both partners agree to
make a change that the other has deemed de-
sirable; one partner’s change serves as a re-
inforcer for the other’s change. Weiss et al.
(1974) cautioned against this form of con-
tracting in severely distressed relationships:
They argued that the contingent relationship
between each spouse’s behavior change agree-
ment creates a “who goes first” problem;
neither partner is likely to change under such
circumstances, given the degree of mistrust in
severely distressed relationships. Another po-
tential hazard of quid pro quo contracting is
that a failure to change by one spouse sanc-
tions the abdication of the other’s contractual
responsibilities.

Weiss et al. (1974) advocated an alterna-
tive form of contracting, referred to as “paral-
lel” or “good faith” contracting. Here each
spouse independently agrees to initiate a
change in behavior desired by their partner;
rather than the changes being cross-linked to
one another, each change is independently re-
warded and/or punished.

In Jacobson’s initial study (1977a), good
faith contracts were used. However, there is
no empirical evidence as yet to recommend
the good faith option. Good faith contracts
carry with them a distinct disadvantage: They 
are less efficient, requiring the pinpointing of 
effective reinforcers and punishers outside the 
set of target problems defined by the partners. 
The specification of such consequences can be 
a cumbersome task. Thus, the present study 
sought to compare the effectiveness of quid 
pro quo and good faith contracting procedures. 

Couples seeking assistance for relationship 
problems were randomly assigned to one of 
three therapists and one of four treatment 
conditions: a behavioral treatment program 
using good faith contracts (GF group), a be- 
havioral treatment program using quid pro 
quo contracts (QPQ group), a nonspecific 
treatment condition (NS), and a waiting-list
control group (WL).


Method


Subjects 


Client couples were solicited through three sources: 
(a) advertisements placed in local newspapers; (b) 
public service announcements on local radio stations; 
and (c) referrals from mental health agencies and 
individual clinicians. The vast majority of couples 
(n=32) who eventually participated in the study 
responded to one of the advertisements. The adver- 
tisement offered a new experimental treatment for 
couples experiencing marital problems. Couples were 
interviewed provided that they had completed the 
Marital Precounseling Inventory (Stuart & Stuart, 
1973), which was mailed to them before the sched- 
uled time of the initial interview. 


Therapists 


Three therapists treated couples in the study. Two
were advanced graduate students in clinical
psychology; the other was a master’s level social worker.
All three had between 2 and 3 years of clinical
experience.

I was one of the advanced graduate student
therapists. The other two received about 20 hours of
training each from me, along with 1-1½ hours of
weekly supervision once the study began.

During the first 4 months of the study, couples
were randomly assigned either to Therapist 1 or
Therapist 2. During the second 4 months of the
study, couples were randomly assigned to either
Therapist 1 or Therapist 3.


Measures 


Marital Interaction Coding System (MICS; Hops,
Wills, Patterson, & Weiss, 1971). This was the same
behavioral rating system used and described by
Jacobson (1977a). A modified version was used in
the present study. Instead of using all 30 behavioral 
categories, couples’ behavior was coded according to 
1 of 12 categories; only verbal behavior was coded. 

The behavior coded by the MICS consisted of in- 
teraction by couples in problem-solving situations be- 
fore and after therapy. On each occasion, couples 
were given 5-10 minutes to solve a hypothetical mari- 
tal problem from the Inventory of Marital Conflicts 
(Olson & Ryder, 1970) and a “minor” problem in 
their relationship, which couples chose themselves. 
These discussions were videotaped and later coded 
by trained raters who were kept blind to experimental 
conditions. Reliability was assessed by dividing the 
frequency with which the two raters agreed that a 
particular category should be coded by the sum of 
agreements and disagreements, that is, the number of 
times a category was marked by one rater but not 
by the other. This quotient was then multiplied by 
100. Both raters coded each discussion for each couple
in the study, so that reliability could be continuously 
assessed. The reliability quotients for particular prob- 
lem-solving discussions ranged from .56 to 1.00, with 
an average reliability quotient of .84. 
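
The agreement index described above can be sketched in a few lines of Python. This is only an illustration: the article does not say how the discussions were segmented, so the per-interval sets of codes and the single-letter category labels below are assumptions, and the function names are hypothetical.

```python
from typing import List, Set, Tuple


def tally_codes(rater_a: List[Set[str]], rater_b: List[Set[str]]) -> Tuple[int, int]:
    """Count agreements and disagreements between two raters.

    Each list holds one set of category codes per scoring interval
    (ASSUMPTION: interval-by-interval comparison; the article does not
    specify the unit of comparison).
    """
    agreements = 0
    disagreements = 0
    for codes_a, codes_b in zip(rater_a, rater_b):
        agreements += len(codes_a & codes_b)      # marked by both raters
        disagreements += len(codes_a ^ codes_b)   # marked by one rater only
    return agreements, disagreements


def reliability_quotient(agreements: int, disagreements: int) -> float:
    """Agreements divided by the sum of agreements and disagreements."""
    total = agreements + disagreements
    if total == 0:
        raise ValueError("no codes to compare")
    return agreements / total


# Hypothetical codes for four intervals of one discussion.
rater_a = [{"A"}, {"B"}, {"C"}, {"D", "E"}]
rater_b = [{"A"}, {"B"}, {"C"}, {"D"}]
agreements, disagreements = tally_codes(rater_a, rater_b)
print(round(reliability_quotient(agreements, disagreements), 2))  # prints 0.8
```

The quotient is left as a proportion here, matching the range of .56 to 1.00 reported in the text.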

For the purposes of analyses, the 12 categories were 
further collapsed into 3: positive behavior, nega- 
tive behavior, and neutral behavior. Positive and 
negative behavior, recorded in terms of rate per 
minute, were the two dependent variables derived 
from the MICS. Scores for husbands and wives 
were combined so that each couple was analyzed as a 
unit. Similarly, for a given session, the hypothetical 
and “real” problems discussed were combined for 
analysis. 
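
As a rough sketch of how the two MICS dependent variables could be derived, the Python below pools both spouses' codes, collapses them into positive, negative, and neutral, and converts the counts to rates per minute. The category names and their assignment to the three classes are hypothetical placeholders; the article does not list the 12 categories or their grouping.

```python
from collections import Counter
from typing import Dict, Iterable

# ASSUMPTION: illustrative category-to-class mapping, not the study's actual one.
CATEGORY_CLASS: Dict[str, str] = {
    "agree": "positive",
    "approve": "positive",
    "criticize": "negative",
    "interrupt": "negative",
    "question": "neutral",
}


def rates_per_minute(husband_codes: Iterable[str],
                     wife_codes: Iterable[str],
                     minutes: float) -> Dict[str, float]:
    """Pool both spouses' codes and return positive and negative rates per minute."""
    if minutes <= 0:
        raise ValueError("discussion length must be positive")
    pooled = list(husband_codes) + list(wife_codes)
    counts = Counter(CATEGORY_CLASS[code] for code in pooled)
    # Neutral behavior is counted but is not a dependent variable.
    return {cls: counts[cls] / minutes for cls in ("positive", "negative")}


# Hypothetical 10-minute discussion.
example = rates_per_minute(
    ["agree", "criticize", "question"],
    ["agree", "approve", "interrupt"],
    minutes=10,
)
```

Combining the hypothetical and "real" problem discussions would amount to pooling the codes and adding the minutes before calling the function.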

Marital Adjustment Scale (MAS; Locke & Wallace,
1959). This traditional self-report index of
marital adjustment was administered to spouses
before and after therapy. The scale provides an overall
rating of marital satisfaction for each spouse, with a
score of 100 (or 200 for a couple) as the cutoff
between a satisfying and unsatisfying relationship. For
the purposes of analysis, the combined score for the
couple was used. This test was administered to couples
before and at the conclusion of therapy.

Marital Happiness Scale (MHS). This rating
scale is a subscale of Stuart and Stuart’s (1973)
Marital Pre-Counseling Inventory; couples are
required to rate their degree of happiness in regard to
12 general categories of marital life. Since not all 12
items were applicable to all couples, a score for a
given spouse was derived by dividing the sum total
of the rating for each item by the total number of
applicable items, yielding an average rating per item.
Then spouse scores were combined, and each couple
was analyzed as a unit. Lower scores indicated greater
happiness; the minimum score was 1.00, whereas
a maximum score (suggesting “maximum
unhappiness”) was 5.00.
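
The MHS scoring rule can be illustrated with a short Python sketch. The per-spouse score follows the description above (sum of ratings divided by the number of applicable items). How the two spouse scores were "combined" is not stated; averaging them is an assumption made here so that the couple score stays on the 1.00 to 5.00 scale. Function names are hypothetical.

```python
from typing import List, Optional

Rating = Optional[int]  # 1-5 rating, or None when the item is not applicable


def spouse_mhs_score(ratings: List[Rating]) -> float:
    """Average rating over the applicable items only."""
    applicable = [r for r in ratings if r is not None]
    if not applicable:
        raise ValueError("no applicable items")
    return sum(applicable) / len(applicable)


def couple_mhs_score(husband: List[Rating], wife: List[Rating]) -> float:
    """Couple-level score.

    ASSUMPTION: spouse scores are averaged; the article says only that
    they were combined.
    """
    return (spouse_mhs_score(husband) + spouse_mhs_score(wife)) / 2


# Hypothetical ratings on the 12 categories; two items do not apply to the husband.
husband = [2, 4, None, 3, 3, 2, None, 4, 2, 3, 3, 4]
wife = [1] * 12
```

With these made-up ratings the husband's score is 30 / 10 = 3.0, the wife's is 1.0, and the couple score is 2.0.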


Manipulation Checks 


Posttherapy session evaluation. After each therapy
session, all spouses rated both the therapy session and
the therapist on a variety of dimensions, using a
questionnaire adapted from Lazarus (1971). Clients’
perceptions of the therapist were used as a means of
evaluating the degree of stylistic and procedural
similarity across therapists. In addition, certain items
were used as indirect indices of the credibility of the
various treatment conditions.

Descriptions of treatment procedures rated by 
undergraduates. A group of undergraduate students 
were randomly presented one of two written treatment
descriptions: One described the behavioral treatment
condition (without specifying the type of contracting
form used), and the other described the nonspecific
treatment condition. Subjects then completed
a brief questionnaire designed to assess the
credibility of the procedure as perceived by each 
subject. 


Procedure 


At the conclusion of an initial interview, in which 
a brief history was taken, a set of problems was de- 
fined, pretesting was administered (MICS, MAS), 
and couples were randomly assigned to one of four 
treatment conditions: GF, QPQ, NS, or WL. WL 
couples were told that simply experiencing the 
evaluation and the initial interview had been helpful 
to some couples, and that they should wait for 8 
weeks and then return to decide whether or not they 
wanted further treatment. Such treatment was offered 
to couples on their return for posttesting.

For the couples in the other three conditions, a
brief description of the research aspects of the
program was offered. Procedures subsequently varied
depending on the condition to which the couple had
been assigned. All treatment groups met weekly for
1-1½ hours, and there were eight treatment sessions
in addition to the initial interview.

NS group. Couples in this condition received a
treatment devoid of specific instructions in
problem-solving and communication skills and without
contingency contracting procedures. In all other respects,
this condition was designed to duplicate the two
behavioral treatment conditions. The condition was
roughly equivalent to the behavioral treatment
program on the following stylistic and procedural
variables:

1. Attention.

2. Expectancies of therapeutic gain. Optimistic and
confident statements were identical to those directed
toward GF and QPQ couples. The degree to which
therapists succeeded in this endeavor was determined
by clients’ ratings of their therapist following each
therapy session. Couples were asked to rate the
degree to which their therapists seemed “optimistic”
and “confident.”

3. Credibility of treatment procedures. Every
effort was made to design a treatment condition that
would be perceived as credible. To obtain comparative
credibility ratings of the various treatment
conditions that would not be confounded with
outcome, brief descriptions of each treatment procedure
were rated as to their credibility by undergraduates.

4. Therapist activity level. In the experimental
treatment conditions, the therapist spoke frequently. 
Similarly, in the NS condition therapists were asked 
to attempt to intervene as often as they did in GF 
and QPQ conditions. 

5. Directiveness-nondirectiveness. One frequent 
source of confounding in comparisons between ex- 
perimental and nonspecific treatment conditions in- 
volves the extent of therapist directiveness (cf. 
Jacobson & Baucom, 1977). Often, therapists in non- 
specific conditions are less directive than in cor- 
responding experimental conditions; this difference 
might be responsible for any discrepancies in treat- 
ment effectiveness observed between two groups. 

Since in the experimental treatment groups the
therapy sessions were structured and the therapist
actively initiated topics of conversation, such
conditions were maintained for NS couples. Clients’
perceptions of therapy sessions were used to evaluate
the therapists’ level of directiveness; in posttherapy
session questionnaires, clients were asked, for
example, “To what extent did the therapist initiate
topics of conversation?”

In addition, therapy sessions in the NS condition
were structured. In fact, procedurally, the formats of
the various conditions were virtually identical.
Couples in all treatment conditions were asked to
record their change agreements in writing. Also, NS
couples engaged in homework assignments between
treatment sessions.

6. Opportunity to discuss marital problems. 

7. Presentation of a “rationale.” 

At the conclusion of the initial interview, NS
couples were asked to record all “affectionate” acts
initiated by the other and to designate each as either
“pleasing” or “displeasing.” This assignment was
ongoing until the end of therapy.

During the initial treatment session, couples were
presented with an explanation of the treatment
package. The program was presented as an attempt to
maximize the effectiveness of many behavior change
strategies, each of which by itself has proven
somewhat effective in bringing about desirable changes.

The treatment sessions were structured such that
problems specified during the initial interview were
discussed. Each problem was discussed by the couple
until some agreement was reached. The therapist
participated actively in such discussions, but at no time
did he (she) suggest strategies for solving a particular
problem. Nor did he (she) present couples with
specific behavioral feedback regarding their
performance.

The therapist’s responses were of four types:
First, he (she) asked factual questions; second, he
(she) restated clients’ verbalizations with an
emphasis on the apparent “affect” underlying them;
third, he (she) made interpretative comments on
the interaction process between the spouses;
fourth, the therapist could emit self-disclosing
remarks, either by offering them a personal reaction
in therapy or by offering examples from his
(her) own life.

The structure of treatment sessions was identical to
that of the behavioral conditions. At the conclusion
of the 8-week program, all couples were offered the 
opportunity to receive further therapeutic assistance; 
referrals were arranged by therapists in the study if 
requests for further therapy were made. 

GF group. The treatment program that these 
couples received replicated the experimental treat- 
ment used in the prior study (Jacobson, 1977a) and 
is described in detail by Jacobson (1977a). Briefly, 
the first session began with an explanation of the 
treatment program, including a theoretical rationale. 
For the remainder of that session and Sessions 2 and 
3, couples were taught communication and problem- 
solving skills, according to guidelines described by 
Jacobson (1977b, 1977c). Beginning with Ses- 
sion 4, good faith contracting strategies were intro- 
duced; for the duration of the treatment program, 
couples were taught to end their problem-solving ses- 
sions with a written agreement specified in the form 
of a good faith contract. The last treatment session
served as a posttest, during which the MAS, MHS, 
and problem-solving discussions (coded using MICS) 
were repeated. 

Throughout the treatment procedures, couples were 
assigned problem-solving practice sessions at home 
between therapy sessions. They kept a notebook to 
record the details of each practice session. 

Skills such as problem solving and good faith con- 
tracting were taught using coaching or modeling, and 
behavior rehearsal (cf. Jacobson, 1977c). 

QPQ group. Couples in this condition received 
treatment identical to that received by GF couples, 
with one exception: Change agreements taught to 
couples in this group took the form of quid pro quo 
exchanges (cf. Weiss et al., 1974). From Session 4
until the end of the treatment program, couples were
taught to make exchanges, whereby the husband
would make a change desired by the wife in return
for a change by the wife desired by the husband.
The specified reinforcer in every instance was the 
change agreed to by the other spouse. 


Results 


Data Analysis 


The primary statistical technique used was 
multivariate analysis of covariance, with pre- 
test scores on each of the criterion variables 
used as covariates. Only couples who com- 
pleted all treatment sessions and were avail- 
able for posttesting were included in the anal- 
ysis for treatment effects. Therapist effects as 
well as Therapist × Treatment interactions
were analyzed with the WL condition ex- 
cluded. Dropouts were included in this analy- 
sis; if such couples were unavailable for post- 
testing, no change was assumed and posttest 
scores were assigned that were identical to 
pretest scores. Given the small number of
couples in each treatment cell, to exclude 
dropouts would have falsified actual therapist 
performance, since these were the couples who 
responded least favorably to treatment. The 
inclusion of dropouts added two couples to 
the total sample, adding one GF couple to 
Therapist 3 and one NS couple to Therapist 2. 
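
The handling of dropouts in the therapist analysis amounts to a no-change imputation, which can be sketched as follows. The function name and the numeric values are hypothetical and serve only to show the rule: a missing posttest score is replaced by the couple's own pretest score, so the difference score is zero.

```python
from typing import Optional


def impute_no_change(pretest: float, posttest: Optional[float]) -> float:
    """Return the posttest score, or the pretest score when posttest is missing."""
    return pretest if posttest is None else posttest


# Hypothetical MAS couple scores: one completer and one dropout.
completer_post = impute_no_change(pretest=160.0, posttest=205.0)
dropout_post = impute_no_change(pretest=150.0, posttest=None)
dropout_change = dropout_post - 150.0  # 0.0 under the no-change assumption
```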

Planned comparisons were used for the 
three individual comparisons of greatest in- 
terest (see below), using nonorthogonal con- 
trasts with the alpha level set at .02. Planned 
comparisons were analyzed using analysis of 
covariance, with pretest scores serving as co- 
variates. Post hoc comparisons were analyzed 
using Tukey’s honestly significant difference 
test. For these comparisons, pre-post differ- 
ence scores were used. Other analyses are described
below.


Pretreatment Characteristics of Couples 


Demographic characteristics. Couples in 
the various treatment conditions were com- 
pared on age of husband, age of wife, dura- 
tion of marriage, number of children, and 
education level for both husband and wife. A 
multivariate analysis of variance failed to un- 
cover differences between treatment conditions
on these various measures. 

The age of husbands averaged 32.9 years; 
wives’ age averaged 31.8 years. Couples had 
been married for an average duration of 7.9 
years. The number of children averaged 1.32
per couple. 

Pretreatment differences on criterion mea- 
sures. Table 1 shows pretreatment means for 
all treatment conditions on the four criterion 
measures used in the study. The overall multi- 
variate test comparing the groups on these 
measures was nonsignificant. 


Therapist Effects 


Therapist effects were analyzed by con- 
ducting a multivariate analysis of covariance 
with pretest scores on all dependent measures 
serving as covariates. WL couples were ex- 
cluded from this analysis. The test for ther- 
apist differences was nonsignificant, F(8, 20) 
= .971, p > .50. 

The overall multivariate test for the Therapist
× Treatment interaction was similarly
nonsignificant, F(16, 31) = .41, p > .75.

Thus, there was no evidence of differential
therapist performance. Nor was there any
indication of change in relative therapist
performance as a function of treatment condition.
Subsequent analyses ignored the therapist factor.


Table 1
Pretreatment and Posttreatment Scores on the Four Dependent Measures for the Four Treatment
Groups

              Negative behavior(a)  Positive behavior(a)        MAS               MHS
Group    n      Pre     Post          Pre     Post          Pre     Post      Pre    Post
GF       8
  M             6.60    3.27          3.08     --         173.50   214.13    2.99    2.02
  SD            2.20    1.75          1.00    1.50         29.67    31.56     --      .50
QPQ      9
  M              --      --            --      --         158.56   205.56    2.99    2.31
  SD             --      --            --      --          20.73    33.37     .49     --
NS       7
  M             6.27    5.64          3.73    3.23        167.57   173.56     --     2.53
  SD            2.44    2.49           --      --          44.27    50.33     .68     --
WL       --
  M              --      --           3.88    2.23        140.68   123.5     3.40    3.47
  SD             --      --           1.69    1.28         23.01    25.34     --      --

Note. GF = good faith; QPQ = quid pro quo; NS = nonspecific treatment; WL = waiting list;
MAS = Marital Adjustment Scale; MHS = Marital Happiness Scale. Dashes mark entries that
could not be recovered.
(a) Based on the combination of a husband's and a wife's scores.


Treatment Effects 


A multivariate analysis of covariance was
conducted to determine whether an overall
treatment effect existed; in this analysis,
pretest scores on the four dependent measures
were used as covariates. The analysis
indicated that a treatment effect did exist, F(12,
50.56) = 3.36, p < .001. Subsequent analyses
specifying the nature of the treatment effect
are presented below for each of the criterion
measures separately.

Negative behavior. Table 2 indicates the
significance tests for each of the four treatment
groups in regard to the null hypothesis
that the pretest-posttest difference scores
were equal to zero for negative behavior. As
Figure 1 indicates, the two behavioral treatment
groups (GF and QPQ) as well as the
NS group manifested, on the average, fewer
negative behaviors at posttest than during
pretest; in contrast, the WL group deteriorated.
For both the GF and QPQ groups, the
changes were highly significant, whereas for
the NS group the changes were nonsignificant.
The WL couples’ trend toward deterioration
bordered on significance. Thus, the only groups
that changed significantly in the desired
direction, in terms of the frequency of negative
behavior, were the GF and QPQ groups.
Planned comparisons of greatest interest
were formed. As Table 3 indicates, the two
behavioral groups combined were significantly
more effective than the WL condition, F(1,
25) = 31.17, p < .001. Also, the two behavioral
groups combined were significantly more
effective than the NS condition, F(1, 25) =
8.80, p < .007. However, there were
no significant differences between the two
behavioral groups, F(1, 25) = .49, p < .45.
Post hoc analyses were used for the remaining
paired comparisons.* For each comparison,
a q statistic was computed using Tukey’s
honestly significant difference test. Both GF
couples and QPQ couples improved significantly
more than did WL couples (p < .01).
However, only GF couples improved significantly
more than did NS couples (p < .01);
although QPQ couples also changed more than
did NS couples, the differences did not reach
statistical significance. Differences between NS
and WL conditions were nonsignificant.


Figure 1. Changes in negative behavior for each of
the four treatment conditions.


Table 2
Significance Tests Regarding Pretest-Posttest
Difference Scores for Each Treatment Condition
on Each of the Four Criterion Measures

Group and measure            MS            F
GF
  Negative behavior        88.511       22.507***
  Positive behavior        25.49         6.983*
  MAS                  13,203.145       23.938***
  MHS                       1.644       55.806***
QPQ
  Negative behavior        50.505       12.842***
  Positive behavior        34.496        9.451**
  MAS                  20,258.711       36.730***
  MHS                      30.881       30.881***
NS
  Negative behavior          .167         .042
  Positive behavior         1.760         .482
  MAS                     206.281         .374
  MHS                      18.310       18.310***
WL
  Negative behavior        15.974        4.062
  Positive behavior        16.236        4.448
  MAS                   2,128.173        3.858
  MHS                        .028         .205

Note. MAS = Marital Adjustment Scale; MHS
= Marital Happiness Scale. For all tests, df = 1, 26.
* p < .01.
** p < .005.
*** p < .001.

Positive behavior. Figure 2 depicts increases
in positive behavior from pretest to
posttest for each of the treatment conditions.
Again, both behavioral groups changed
substantially in the desired direction; this time
trends toward deterioration occurred in both
the NS and WL conditions. As Table 2
indicates, the changes were statistically significant
for both the GF group (p < .01) and
the QPQ group (p < .005).

Table 3 lists the planned comparisons for
this measure. The analysis of covariance
revealed that the behavioral groups were
significantly more effective in increasing positive
behavior than either the WL group (p <
.001) or the NS group (p < .01). Using post
hoc comparisons, the GF group alone was
significantly more effective than the WL group
(p < .01) and the NS group (p < .05).
Similarly, the QPQ group was more effective than
either the WL (p < .01) or the NS group (p
< .05). Neither the two behavioral groups
(p > .90) nor the two control groups (p >
.05) differed significantly from one another.


Table 3
Planned Comparisons between Treatment Groups
on the Four Criterion Measures

Comparison and measure       MS            F
GF vs. QPQ
  Negative behavior         1.376         .486
  Positive behavior          .020         .007
  MAS                     107.855         .192
  MHS                        .356        2.635
GF & QPQ vs. WL
  Negative behavior        88.214       31.171****
  Positive behavior        41.052       14.131****
  MAS                  15,752.785       27.982****
  MHS                       3.443       25.463****
GF & QPQ vs. NS
  Negative behavior        24.910        8.802**
  Positive behavior        20.118        6.925*
  MAS                   7,031.145       12.490***
  MHS                        .246        1.817

Note. MAS = Marital Adjustment Scale; MHS =
Marital Happiness Scale. For all comparisons,
df = 1, 25.
* p < .01.
** p < .007.
*** p < .002.
**** p < .001.


Figure 2. Changes in positive behavior for each of
the four treatment conditions.

MAS. Figure 3 shows that on this self- 
report measure the two behavioral groups 
again improved substantially from pretest to 
posttest. With husbands’ and wives’ scores 
combined, a score of 200 is generally consid- 
ered to be the cutoff for a normal degree of 
marital adjustment. All four of the group 
means were well below this figure prior to 
therapy. Only the two behavioral groups aver- 
aged above 200 subsequent to therapy. The 
NS group scores remained virtually the same, 
but again there was a tendency for WL 
couples to deteriorate. The only changes that
were statistically significant, as indicated by
Table 2, were those manifested by GF couples
(p < .001) and QPQ couples (p < .001).


*For more details, the reader is urged to write to 
the author, who will provide a way of obtaining a
copy of the complete doctoral dissertation. 


As with the two observational measures, on
the MAS there were no significant differences
between GF and QPQ groups. Taken
together, the two behavioral groups improved
significantly more than did either the WL
group (p < .001) or the NS group (p <
.002). Post hoc comparisons revealed that each
of the behavioral groups improved significantly
more than did the waiting-list group.
(In each instance, p < .01.) Similarly, each
of the behavioral groups improved more than
did the nonspecific group (p < .05). There
were no significant differences between the
WL and NS groups.


Figure 3. Changes in Marital Adjustment Scale
(MAS) for each of the four treatment conditions.


MHS. In addition to the GF and QPQ
groups, which again showed substantial
increases from pretest to posttest on reported
happiness, the NS group also showed positive
changes on this measure; only the WL failed
to report such changes. As Table 2 indicates,
in each case the improvement manifested by
GF, QPQ, and NS groups was statistically
significant.
Similarly, as Table 3 indicates, all three of
the treatment groups improved significantly
more than did the WL group. However, they
did not differ significantly from one another.
There was a trend favoring the GF group over
the QPQ group, and the QPQ group over the
NS group, but neither of these trends reached
statistical significance.
Follow-up. An MAS form was mailed to
each spouse in the GF, QPQ, and NS groups
at 1-month, 3-month, and 6-month intervals
following the final treatment session. All
couples who returned the forms on at least
two of the three occasions were included in
the follow-up analysis. Of the original sample,
only one couple in each of the three treatment
conditions was excluded from the analysis.
All three of these couples were treated by
Therapist 3.
A follow-up score was derived for each
couple by averaging the scores on forms
returned by each spouse. These averages were
then combined so that each couple could be
analyzed as a unit.
Difference scores were then computed for
each treatment condition by subtracting
pretest scores from follow-up scores. An analysis
of variance that compared the combined mean
of the two behavioral groups with the mean of
the nonspecific group indicated that the
groups were significantly different in the
direction favoring the behavioral groups, F
(1, 24) = 13.01, p < .002.

Within-group t tests comparing posttest
scores with follow-up scores revealed no sig- 
nificant differences between posttest scores 
and follow-up scores for any of the three 
treatment conditions. Thus, the differences 
favoring the behavioral groups relative to the 
nonspecific group on the MAS were main- 
tained at the time of the follow-up reports. In 
addition, group averages on the MAS did not 
change in any of the three treatment condi- 
tions from posttest to follow-up. 
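
The derivation of the follow-up scores can be sketched in Python under stated assumptions. The article does not say whether an occasion counted when only one spouse returned a form; the sketch assumes both forms were needed. It also assumes that spouse averages were summed into the couple score, as with the pretest MAS couple score (cutoff of 200). All names and numbers are hypothetical.

```python
from statistics import mean
from typing import List, Optional

Form = Optional[float]  # MAS score on a returned form, or None if not returned


def followup_couple_score(husband_forms: List[Form],
                          wife_forms: List[Form],
                          min_occasions: int = 2) -> Optional[float]:
    """Couple follow-up score, or None when the couple is excluded.

    ASSUMPTION: an occasion counts only if both spouses returned a form.
    """
    occasions = sum(
        1 for h, w in zip(husband_forms, wife_forms)
        if h is not None and w is not None
    )
    if occasions < min_occasions:
        return None
    husband_avg = mean([s for s in husband_forms if s is not None])
    wife_avg = mean([s for s in wife_forms if s is not None])
    # ASSUMPTION: spouse averages are summed into a couple score.
    return husband_avg + wife_avg


# Hypothetical forms at the 1-, 3-, and 6-month occasions.
included = followup_couple_score([100, None, 110], [90, 95, 100])
excluded = followup_couple_score([100, None, None], [90, None, 95])
difference = included - 170.0  # follow-up minus a hypothetical pretest couple score
```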


Manipulation Checks 


Clients’ posttherapy session evaluations.
After each therapy session, each spouse
completed an extensive questionnaire rating the
therapist on a variety of dimensions. It was
hoped that these ratings would aid in
determining whether (a) therapists performed
similarly on relevant stylistic dimensions, (b)
behavior was consistent across treatment
conditions, (c) the various treatment conditions
were equivalent on various nonspecific stylistic
and procedural dimensions, and (d) the
various treatment conditions seemed equally
credible to couples.

A subset of items from the questionnaire
was analyzed by a multivariate analysis of
variance. The units of analysis were means for
each of the 12 items analyzed separately.
Husbands’ and wives’ questionnaires were
analyzed separately, and each mean reflected
an average rating for a given spouse across
therapy sessions. For husbands, the multivariate
test was nonsignificant for a treatment
effect, F(24, 8) = .92, p > .50; a
therapist effect, F(24, 8) = 1.57, p > .25;
and a Therapist × Treatment interaction,
F(48, 17.45) = 1.27, p > .30. Similarly, for
wives, there was no significant effect for
treatment, F(24, 8) = .95, p > .50; therapist,
F(24, 8) = 1.58, p > .25; or Therapist ×
Treatment interaction, F(48, 17.45) = 1.96,
p > .05. Thus, statistical analyses failed to
support the notion that clients’ posttherapy 
ratings of their therapists depended on who 
their therapist was or which treatment condi- 
tion they were in. 

Evaluations of treatment rationales by undergraduate
psychology students. After reading
a brief description of either the behavioral
treatment (with type of contracting unspeci- 
fied) or the nonspecific treatment, undergrad- 
uates answered a seven-item questionnaire de- 
signed to assess the apparent credibility of the 
treatment description. Each of the seven 
questions was scored separately. Responses for 
the two groups of subjects were compared 
using a Hotelling’s T² test; the overall test
was nonsignificant, F(7, 121) = .63, p > .70. 
On the basis of these responses, it appears 
that undergraduates found the two treatment 
rationales equally credible. 


Discussion 


In the present study, two variants of a
behavioral treatment program significantly
improved the quality of marriages for couples
who received it. The findings replicated a
prior study (Jacobson, 1977a). In addition, a 
new behavioral condition, using contracts of 
the quid pro quo variety, was found to be 
significantly improved relative to the WL 
group on all criterion measures. The changes 
achieved by QPQ couples were comparable to 
those achieved by GF couples. 

A nonspecific control group was included in 
the present study to shed light on the im- 
portance of those factors that were theoreti- 
cally assumed to be the active ingredients of 
the treatment program. Despite the evidence 
provided by the manipulation checks suggest- 
ing that the NS group was perceived as cred- 
ible, on three of the four dependent measures 
the behavioral groups improved significantly 
more than did the NS group. Only on one of 
the self-report measures, the MHS, did dif- 
ferences between behavioral and NS condi- 
tions fail to reach statistical significance. The 
MHS asks couples to rate their degree of
happiness on a 5-point scale in regard to 12
categories of married life. The purposes of
this questionnaire are obvious to clients; the 
way to score “happy” is clear. Intuitively it 
would appear that of the four measures, the 
MHS is most susceptible to demand charac- 
teristics. This might explain why the dis- 
crepancies between the NS and the behavioral 
groups were smaller on this measure than on
any other. This hypothesis is supported by 
the finding that the nonspecific group im- 
proved only on this measure. 

The findings of the present study indicate 
that improvement on the part of behavioral 
couples cannot be entirely a function of non- 
specific factors. On the basis of the manipula- 
tion checks, it seems clear that NS couples’ 
expectancies were equivalent to those of be- 
havioral couples. They were as “confident” 
and as “optimistic” regarding their treatment 
and their therapist as were behavioral couples. 
NS couples perceived their therapists as in- 
terested and involved to the same degree as 
did behavioral couples. This was true despite 
the differential effectiveness of the procedures; 
although NS couples were not improving, they 
seemed to be satisfied that they were receiv- 
ing a competent treatment program admin- 
istered by involved, concerned, competent 


These results were corroborated by [...]uate students' ratings of therapy 
[...] for behavioral and NS conditions. 

[For] couples receiving a behavioral form of treatment, it seemed to make 
little difference whether they used GF contracts or QPQ [contracts]. 
Although these results cannot be [viewed] as conclusive, they tentatively 
suggest that the two contracting forms are interchangeable. However, it is 
still possible that GF contracts are preferable for severely distressed 
couples, a hypothesis consistent with Weiss et al. (1974). A better test 
of this hypothesis would be to cross severity of marital [distress] with 
type of contract used in a 2 × 2 design. Weiss et al. might predict a 
significant interaction such that the contract forms would be equally 
effective for moderately disturbed couples like those in the present study, 
but for severely disturbed couples GF contracting would yield more 
effective results. 

One possible problem with the present study was the necessity of using the 
principal investigator as one of the therapists. However, the [data do] 
not support the hypothesis of therapist bias influencing the results. 
First, the [...] posttherapy ratings did not suggest that any of the 
therapists behaved differently depending on which treatment program they 
were implementing. In addition, the absence of Therapist × Treatment 
interactions argues against therapist bias. The performance of the 
principal investigator as therapist, relative to that of the other two 
therapists, was if anything slightly greater in the NS condition than in 
the two behavioral conditions. 

Another limitation of the present study was the small number of couples per 
treatment cell [in the] analysis of therapist effects. Although in all but 
one instance, at least two couples were treated by each therapist in each 
treatment condition (Therapist 2 treated only one couple in the NS 
condition, and this couple dropped out prior to completing the program), 
statistical tests for therapist differences were not powerful. It was also 
unfortunate that dropouts had to be included in the analysis for therapist 
effects (see Data Analysis section). To not include such couples would 
have seriously biased the appearance of therapist performance. The 
assumption of no 
change from pretest to posttest for the two 
dropouts was felt to be a cautious estimate, 
since all other couples receiving treatment, 
even those who were considered treatment 
failures, changed in the positive direction on 
at least some of the measures. 

Future studies should investigate the effec- 
tiveness of behavioral marital therapy with 
severely distressed couples; especially needed 
are studies involving couples in which one or 
both spouses present severe behavioral prob- 
lems apart from relationship distress. Com- 
ponent analyses should also be undertaken to 
evaluate the particular contributions of prob- 
lem solving and contingency contracting to 
treatment efficacy. There is already some con- 
vergent evidence that the communication 
training component (problem solving) is both 
necessary and sufficient for positive change 
(Jacobson, in press-b), as well as growing 
controversy regarding the importance of con- 
tingency contracting in a behavioral ap- 
proach to marital problems (e.g. Jacobson, in 
press-a). One admittedly speculative but 
plausible interpretation of the apparent equiv- 
alence of good faith and quid pro quo con- 
tracts is that contracting per se is unnecessary. 
Only systematic research can provide a defini- 
tive answer to this question. 


Reference Note 


1. Margolin, G. A comparison of marital interventions: 
Behavioral, behavioral-cognitive, and nondirective. 
Paper presented at the meeting of the Western 
Psychological Association, Los Angeles, April 1976. 


Toward the Assessment of Social Competence 


John M. Gottman 
University of Illinois 


Two studies directed toward development and validation of a self-report 
measure of social competence in dating and assertion situations are 
described. An 18-item questionnaire consisting of items that assessed the 
likelihood of certain specific behaviors occurring and the degree of 
discomfort and expected incompetence in specific situations was derived. 
This questionnaire discriminated between client and normal populations and 
between clients with dating and assertion problems, has psychometric 
properties of reliability and validity, and measures differential 
improvement following a variety of 8-week intervention programs. 


The authors wish to thank Dave Schlundt, John [...], Nancy [...], and 
Jim Barrett for their help in carrying out this [...]. 


There has been a great deal of recent interest in social skills training, 
which has been extended from the skill of refusing unreasonable requests 
(McFall & Lillesand, 1971; McFall & Marston, 1970; McFall & Twentyman, 
1973) to more general assertion skills by a number of investigators 
(Eisler, Hersen, & Miller, 1973; Hersen, Eisler, & Miller, 1973). 

The social skills training literature has also expanded to include general 
social skills training for lower-income clients in mental health centers 
(Goldstein, 1973), male psychiatric [inpatients] (Goldsmith & McFall, 
1975), and [dating] skills (Curran, 1975; Curran & Gilbert, [...]; Glass, 
Gottman, & Shmurak, 1976; Twentyman & McFall, 1975). 

In a review of social skills training as [...] to heterosexual social 
anxiety, Curran [...] reviewed 13 studies, concluding that a [...] in the 
social skills training literature [...] assessment of social skills. He 
noted that "little data exist with regard to the [...] properties and 
construct validity [...] of the instruments used in previous 
[hetero]sexual-social anxiety research" (p. 154). 

Goldfried and Linehan (1977) called for measures that demonstrate content 
validity by empirical generation of a content domain (rather than relying 
on face validity), with attention to the situational context of the 
behavioral referents assessed. They suggested that discriminant validity 
studies that demonstrate the separateness of two behavioral concepts will 
clarify the conceptual ambiguity in behavioral concepts such as assertion. 

The present series of investigations is an 
attempt to develop a self-report assessment 
measure of social competence that has demon- 
strated psychometric properties of reliability 
and validity. Despite the fact that there is a 
general suspicion of all self-report measures 
among behavioral scientists, recent research 
has indicated that under certain specific condi- 
tions self-report measures may meet psycho- 
metric standards of reliability and validity 
(Goldfried & Kent, 1972). 

Mischel’s (1968) review of personality 
assessment literature led him to conclude that 
although observation of past behavior in 
situations with similar role requirements is the 
best predictor of future behavior in a specific 
situation, the next best predictor of future 
behavior is obtained from self-predictions. 
Furthermore, the research investigations of 
McFall and his associates have found that 
although global self-assessments of competence 
do not relate well to judges’ ratings of tapes of 
behavioral role-playing assessment, self-reports 
of discomfort and incompetence in specific 
situations (as measured by the Conflict 
Resolution Inventory) do correlate well with 
behavioral assessments. For example, McFall 
and Lillesand (1971) wrote: 

The results obtained on the assertive score, nonassertive 
score, and difference score measures [of the Conflict 
Resolution Inventory], all of which assessed responses 
in specific refusal situations, were in sharp contrast to 
the nonspecific effects obtained on the global measure. 
(pp. 316-317). 

This general finding has been replicated by 
other investigators (e.g., Schwartz & Gottman, 
1976). In a social skills training study with 
male psychiatric inpatients, Clark (1975) used 
a global self-assessment of improvement, a 
situationally specific self-assessment, and a 
behavioral role-playing assessment. The con- 
trol group, which received didactic lectures, 
showed no improvement on the behavioral 
assessment measure and no improvement on 
the situationally specific assessment measure 
but did show improvement on the global self- 
assessment measure. The social skills training 
group showed improvement on all three 
measures. There is thus some initial evidence 
suggesting that a situationally specific self- 
report measure of social competence would 
have validity with respect to laboratory role- 
playing assessments. 

The current investigation requires a self- 
report measure of social competence to demon- 
strate several specific kinds of validity. First, 
it must discriminate between competent and 
incompetent populations, with competence 
independently defined. Second, it must dis- 
criminate among specific types of social in- 
competence; for example, nonassertive sub- 
jects should show a different scale pattern 
profile than subjects with heterosexual dating 
problems. Third, in cases in which treatment is 
used, the self-report measure must predict dif- 
ferential improvement in treatments designed 
for the amelioration of specific problems. For 
example, nonassertive subjects should generally 
improve on assertion items but not on dating 
items, compared to dating-problem subjects, 
who should improve on dating but not assertion 
items, compared to nonassertive subjects. This 
[third] criterion of validity is dependent on 
intervention programs that target specific 
skills for training, and will not [hold] 
to the extent that dating skills training 
programs and assertion training programs 
overlap in the skills they teach. 

The present series of investigations was 
undertaken to design a self-report measure 
that meets the three criteria of validity de- 
scribed above, as well as internal consistency 
and test-retest reliabilities. The present in- 
vestigations also followed the recommendation 
of Goldfried and D’Zurilla (1969) in empirically 
constructing a domain of problematic social 
situations. From this domain, items that 
involved two specific self-reports were con- 
structed: (a) self-report of discomfort or in- 
competence—dimensions that have shown 
validity with behavioral assessments in 
McFall’s Conflict Resolution Inventory—and 
(b) self-report of the likelihood of engaging in 
specific behaviors. Items were selected from 
the larger domain in the two subdomains of 
assertion and heterosexual dating. A series of 
reliability and validity studies were undertaken 
using these items. 


Study 1 
Method 
Subjects 


During the second week of the fall 1976 semester, a 
notice announcing the availability of social skills train- 
ing programs for students having problems in dating 
and assertion situations was placed in the student 
newspaper and posted on dormitory bulletin boards. 
The approximately 200 students who responded to the 
notice were mailed a package that included information 
about the training programs and three questionnaires 
(described below). Respondents were requested to 
complete the three questionnaires and return them 
along with a $5 deposit if they wished to be included 
in a training program. They were informed that the 
deposit would be refunded when they completed a 
second set of questionnaires at the end of the program. 
When registration was terminated 3 weeks after the 
notice first appeared, 92 students had completed the 
pretest materials, and these students became the 
“client” population for the study. 

At the same time, a group of 69 students who had 
not signed up for the training program were recruited 
from the introductory psychology classes and were 
given the complete set of questionnaires. These students 
were the “normal” population for the first experiment. 


Procedure 


Three questionnaires were administered to the client 
and normal populations: (a) a situations questionnaire, 
(b) a behavior inventory, and (c) a symptom checklist. 
A description of these questionnaires follows. 

Situations questionnaire (40 items). A domain of 
items was generated by eight undergraduates [...] 
seminar on interviewing. Each member of the seminar 
interviewed 10 undergraduates and obtained a description 
of four social situations that the interviewee had 
recently found to be “difficult to handle.” A description 
of each situation, written by the interviewer, summarized 
the situational context, the roles of the principal 
[actors] in the situation, the action, and the time of 
difficulty that preceded a response demanded of 
the interviewee. The original list of 320 situations was 
used to generate 97 nonredundant items that could be 
[...] between being overly general or overly specific. 
Items were sorted into seven a priori scales by the 
content of the task posed by the situation: (a) refusing 
unreasonable requests, (b) getting what you want, 
(c) expressing how you feel, (d) requesting behavior 
change from someone, (e) dealing with formal situations 
([...] dinner party), (f) initiating and continuing 
conversations, and (g) dating situations (such as asking 
[...] and getting close to someone of the opposite 
[sex]). 

Durham (Note 1) tested these a priori scales with 
[...] undergraduates. He used three phrasings of the 
[...] question: (a) a phrasing that confounded 
discomfort with incompetence,¹ (b) a discomfort 
phrasing, and (c) an incompetence phrasing. The 
confounded phrasing showed the best a priori scale 
test-retest reliabilities (.75) between administrations 3 
weeks apart and the best Cronbach alpha coefficient 
(.97) and split-half reliability coefficient (.94). Using 
an item analysis of the correlation of items with a priori 
subscale totals, Durham reduced the original 97-item 
questionnaire to 40 items. Durham also conducted 
analyses of selected subject characteristics and found 
no differences between subjects’ scores as a function 
of sex, year in college, or marital status. 


Table 1 

                                        M 
Test                              Client   Normal    F(1, 157) 

Situations 
  Overall                          3.0      3.6      55.575*** 
  Refusal                          3.5      3.7       4.127* 
  Getting What You Want            3.1      3.4       5.521** 
  Expressing Feeling               2.8      3.6      56.364*** 
  Requesting Behavior Change       3.2      3.7      17.494*** 
  Formal Situations                2.8      3.5      53.734*** 
  Conversation Skills              2.6      3.6      75.947*** 
  Close Interpersonal Situations   3.3      3.7      [illegible] 
  Dating                           2.6      3.3      40.133*** 
Behavior 
  Overall                          1.0      2.5      84.222*** 
  Friendship                       1.9      2.7      71.480*** 
  Self-confidence                  1.9   [illegible] 32.613*** 
  Assertiveness                    2.4      2.6      [illegible].171*** 
  Intimacy                         1.9      2.3      32.290*** 
  Dating                           1.8      2.5      54.271*** 
Symptom 
  Overall                          2.2      1.8      20.917*** 


The following excerpt from the social situations 
questionnaire illustrates the format used: 


After each situation, circle one of the numbers from 
1 to 5 which best describes you using the following 
scale: 

1 = I would be so uncomfortable and so unable to 
    handle this situation that I would avoid it if 
    possible. 
2 = I would feel very uncomfortable and would 
    have a lot of difficulty handling this situation. 
3 = I would feel somewhat uncomfortable and 
    would have some difficulty in handling this 
    situation. 
4 = I would feel quite comfortable and would be 
    able to handle this situation fairly well. 
5 = I would feel very comfortable and be able to 
    handle this situation very well. 


Your friend’s relatives invite you over for dinner. 
You accept, then begin to feel nervous about making 
a good impression. You arrive at their house, and 
everyone sits down to talk before dinner. One of the 
relatives smiles at you and seems to expect you to 
say something. 1 2 3 4 5 


¹ This was the phrasing used in the Conflict Resolution 
Inventory. 


Behavior inventory (26 items). Construction of the 
behavior inventory was considerably less formal than 
that of the social situations questionnaire. Five a priori 
subscales were established based on five social skills 
training groups that had been offered by clinical 
psychology graduate students supervised by us during 
the spring 1975 semester. The five groups were (a) 
friendship, (b) self-confidence, (c) assertion, (d) in- 
timacy, and (e) dating. Behaviors that were seen as 
being particularly difficult for participants in each 
group were converted into items on the inventory. The 
inventory was constructed to assess the likelihood of 
a respondent to exhibit these behaviors. The following 
excerpt illustrates the nature of the inventory, with 
examples from the self-confidence, assertion, and dating 
subscales: 


We are interested in finding out something about the 
likelihood of your acting in certain ways. Below you 
will find a list of specific behaviors you may or may 
not exhibit. Use the following rating scale: 

1 = I never do this 
2 = I sometimes do this 
3 = I often do this 
4 = I do this almost always 

Now after each of the items on the following list, 
place the number which best indicates the likelihood 
of your behaving in that way. Be as objective as 
possible. 

Volunteer to do something where there is 
a good chance you might fail. 

Say “no” when you feel like it. 

Start a conversation with a member of 
the opposite sex you would like to date. 


Symptom checklist (90 items). A questionnaire 
normally used with hospital inpatients² was adopted 
for use. Items on this questionnaire reflect anxiety, 
depression, and somatic symptoms. Subjects rated each 
item according to the extent to which they were 
troubled by that problem on a 1 to 5 scale. The 
following excerpt illustrates the nature of the checklist: 


How much were you bothered by: 

                                  Not at all  A little bit  Moderately  Quite a bit  Extremely 
Headaches                             1            2            3            4           5 
Nervousness or shakiness inside       1            2            3            4           5 
Trouble remembering things            1            2            3            4           5 


² This symptom checklist was used as part of a 
standard clinical intake procedure by the Illinois State 
Psychiatric Institute and the Family Institute of 
Chicago. 


Results and Discussion 


The presentation of the results is divided 
into three sections: One section is related to 
the first validity claim, namely, discrimination 
of clients from nonclients; one section is related 
to the second validity claim, namely, 
discrimination of assertion clients from dating 
skills clients; and a third section is related to 
psychometric properties of the measures. 


Clients and Nonclients 


Data were analyzed separately for a priori 
subscales, for individual items (on the situations 
questionnaire and the behavior inventory), for the 
overall average item score for each questionnaire, 
and for the total symptom checklist 
score. These data were analyzed in a two-way 
(clients vs. normals) analysis of variance. An 
unweighted means solution was used because 
of the unequal sample sizes. These analyses 
revealed that clients had greater difficulty on 
all subscale scores for both the situations 
questionnaire and the behavior inventory, 
greater difficulty on the overall average item 
score on all three questionnaires, and greater 
difficulty on 21 of 26 items on the behavior 
inventory and 35 of 40 items on the situations 
questionnaire. In all cases, the significance 
level of these differences was less than .05. 
Table 1 presents means, F ratios, and p levels 
for the subscale and overall average item 
scores for the clients and normals. The F ratios 
indicate a considerable degree of discriminative 
power. Note that the means refer to the item 
scale values described above for each questionnaire 
and that smaller numbers indicate greater 
difficulty on the situations questionnaire and 
behavior inventories, whereas larger numbers 
indicate greater difficulty on the symptom 
checklist. 

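The client-versus-normal comparisons reported here rest on F ratios. As a rough illustration of how such a ratio is formed, the following sketch computes an ordinary one-way F for two groups. The scores are hypothetical and the function name is ours. The sketch uses the standard weighted solution, not the unweighted means solution applied in the study, although the two coincide when group sizes are equal, as they are in this toy example.

```python
from statistics import mean


def one_way_f(*groups):
    """Return (F, df_between, df_within) for a one-way analysis of variance."""
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)
    # Between-groups sum of squares: group size times squared deviation
    # of each group mean from the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-groups sum of squares: squared deviations from each group's mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Hypothetical average item scores (1-5 scale); NOT data from the study.
clients = [2.6, 3.0, 3.1, 2.9, 3.4]
normals = [3.5, 3.8, 3.6, 3.4, 3.7]

f_ratio, df_b, df_w = one_way_f(clients, normals)
print(f"F({df_b}, {df_w}) = {f_ratio:.2f}")
```

A larger F indicates that the difference between group means is large relative to the variation within groups, which is the sense in which the F ratios are said to show discriminative power.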
Table 2 
Dating and Assertion Subjects’ Pretest Data on Overall and Subscale Scores 

                                        M 
Test                              Dating   Assertion   F(1, 86)      p < 

Situations 
  Overall                          3.0       3.0         - 
  Refusal                          [illegible] 
  Getting What You Want            [illegible] 
  Expressing Feelings              2.9       2.9       [illegible] 
  Requesting Behavior Change       3.3       3.2       [illegible] 
  Formal Situations                2.9       2.8       [illegible] 
  Conversational Skills            2.5       2.8       [illegible] 
  Close Interpersonal Situations   3.2       3.4       [illegible] 
  Dating                           2.4       2.8        8.224       .005 
Behavior 
  Overall                          [illegible] 
  Friendship                       [illegible] 
  Self-confidence                  2.0       1.8        8.635       .005 
  Assertiveness                    [illegible] 
  Intimacy                         [illegible] 
  Dating                           1.6       2.0       11.934       .001 
Symptom 
  Overall                          2.1       2.3        3.420       .064 


[These data] clearly indicate that students 
who signed up for social skills training reported 
[greater] difficulty across the range of 
[...] measured by our instruments 
[than did other] students. Interestingly, they 
[also reported] a greater prevalence of “psychiatric” 
[and] somatic symptoms. A picture 
[emerges] of a subpopulation that may present 
itself as [...] less socially competent and 
[more ...] ridden than its peers. 
[These results] offer some initial validation of 
the a priori subscales used in the situations 
[and behavior] questionnaires: The subscales 
[...] discriminated between client and 
[normal] populations. The first validity criterion 
was therefore satisfied. 


Assertion Clients and Dating Skills Clients 


An analysis was carried out using the same 
[procedures, except] that only the 92 clients were used: 
the assertion groups clients (n = 46) and the 
dating skills clients (n = 46). 
[...] (dating subjects vs. assertion 
subjects) analyses of variance were performed 
for [...] questionnaire scores, a priori subscales, 
and individual items. Results indicate 

that dating subjects showed significantly 
greater difficulty as compared to assertion 
subjects on the dating subscales of both the 
situations questionnaire and the behavior 
inventory. Assertion subjects showed signifi- 
cantly greater difficulty on the “self-confidence” 
and “assertiveness” subscales of the behavior 
inventory. Means, F ratios, and p levels are 
presented for these differences in Table 2. 
Analysis of the individual items revealed that 
8 of 40 items on the situations questionnaire 
and 8 of 26 items on the behavior inventory 
significantly differentiated dating subjects and 
assertion subjects at p < .05. 

Knowing the training program for which 
clients had registered allowed a second em- 
pirical test of the validity of several of our 
subscales. The results of this experiment 
indicate that clients with dating and assertion 
problems tend to score accordingly on dating- 
related and assertion-related subscales. More- 
over, the use of two different kinds of self- 
report measures (i.e., the situations question- 
naire and behavior inventory), and the tend- 
ency of clients to score appropriately on both, 
provided us with convergent evidence that 
true differences existed between the dating and 
assertion subpopulations and that these were 
measurable independently using self-report 
measures. 

To enhance the validity of the dating and 
assertion scales, we decided to focus on the 
dating and assertion subpopulations, to con- 
centrate on developing one questionnaire 
containing only the dating and assertion sub- 
scales, and to subject this new questionnaire 
to standard reliability tests prior to continuing 
with additional validation procedures. 


Psychometric Properties 


An 18-item questionnaire was developed 
with a 9-item dating subscale and a 9-item 
assertion subscale. This new questionnaire was 
tested scalewise for internal consistency and 
test-retest reliability. In addition, previous 
validity tests for discriminating clients versus 
normals and dating versus assertion problems 
were recomputed using these new subscales. 

The original 92 clients (46 dating, 46 asser- 
tion) and 69 normals were studied again for 
computing internal consistency and for per- 
forming concurrent validity checks. Seventy 
additional subjects who had not registered for 
the training programs were recruited from the 
introductory psychology classes to serve as a 
sample for performing a test-retest reliability 
analysis. 

The original 26-item behavior inventory and 
40-item situations questionnaire were trans- 
formed into an 18-item questionnaire by 
selecting only those items that both success- 
fully discriminated clients from normals and 
successfully discriminated dating clients from 
assertion clients. Of the 18 items that met 
these criteria, dating clients indicated having 
greater difficulty with 9 of the items (5 from 
the original behavior inventory and 4 from the 
original situations questionnaire), whereas 
assertion clients indicated greater difficulty 
with the other 9 (4 from the original behavior 
inventory and 5 from the original situations 
questionnaire). Thus, these sets of items be- 
came our 9-item dating and assertion subscales 
(see Appendix), which were tested for their 
psychometric properties as follows: (a) A 
Cronbach alpha was computed for assessing 
the internal consistency of the dating and 
assertion subscales using the data from the 
original clients and normals; (b) comparisons 
of clients versus normals and of dating clients 
versus assertion clients were made on the two 
subscales using the original client and normal 
sample; (c) to assess test-retest reliability, 
6 weeks prior to the end of the semester 40 
normal subjects were administered the original 
test battery. An additional 30 normal subjects 
took the test battery 4 weeks later. All 70 subjects 
took the battery again 2 weeks later. 
Usable data were obtained from 28 subjects 
for the 2-week test-retest interval and from 
an independent group of 39 subjects for the 
6-week interval. 

Analysis of internal consistency yielded a 
Cronbach alpha of .92 for the dating subscale 
and an alpha of .85 for the assertion scale. 
Concurrent discriminant validity analyses 
revealed clients to have significantly greater 
difficulty than normals on both the dating 
subscale, F(1, 159) = 52.60, p < .001, and 
the assertion subscale, F(1, 159) = 34.33, 
p < .001. Dating clients had more difficulty 
than assertion clients on the dating subscale, 
F(1, 86) = 17.55, p < .001. Assertion clients 
had more difficulty than dating clients on the 
assertion subscale, F(1, 86) = 21.00, p < .001. 

To assess test-retest change, a 2 × 2 (2 
Week vs. 6 Week × Pretest vs. Posttest) 
analysis of variance was computed for the 
2-week and 6-week groups. The results indicated 
no change at retesting at either interval 
for either the dating subscale or the assertion 
subscale. The test-retest correlations for both 
subscales at both testing intervals (ns = 28 
and 39, respectively) were: For dating at 2 
and 6 weeks, rs = .71 and .62. For assertion, 
rs = .71 and .70 (p < .001). 

The results indicate that the dating and 
assertion subscales have demonstrable psychometric 
qualities of reliability and validity. Of 
particular interest was the finding that these 
scales had internal consistency despite the fact 
that items were selected on the basis of their 
ability to discriminate between populations. 
This suggests that the scale items are in fact 
measuring the same dimension and that this 
dimension is one for which salient differences 
do exist between the populations in question. 

The test-retest experiment was performed 
to determine whether the subscales would 
fluctuate greatly over the measurement period 
in question. Especially important for Study 2 
was the determination of whether significant 
changes in self-report of dating and assertion 
difficulties would occur as the end of semester 
approached. On the basis of our findings, there 
is no reason to expect these kinds of difficulties 
to spontaneously increase or decrease over the 
course of our testing intervals. However, these 
test-retest data were obtained using normal 
subjects, and their applicability to client 
populations was not tested. 
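
The two reliability statistics used in this section, Cronbach's alpha for internal consistency and the Pearson correlation for test-retest stability, can be sketched as follows. This is a minimal illustration using the standard textbook formulas; the function names and the small data set are hypothetical and are not the study's data or the authors' computing procedure.

```python
from statistics import mean, variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item ratings."""
    k = len(responses[0])  # number of items on the scale
    item_variances = [variance([row[i] for row in responses]) for i in range(k)]
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


# Hypothetical ratings: five respondents on a three-item subscale.
ratings = [[1, 2, 1], [2, 2, 3], [3, 4, 3], [4, 4, 5], [5, 5, 4]]
alpha = cronbach_alpha(ratings)

# Hypothetical subscale means for the same respondents at test and retest.
test_scores = [1.3, 2.3, 3.3, 4.3, 4.7]
retest_scores = [1.7, 2.0, 3.7, 4.0, 4.3]
r = pearson_r(test_scores, retest_scores)

print(f"alpha = {alpha:.2f}, test-retest r = {r:.2f}")
```

Alpha rises as items covary more strongly relative to their individual variances, which is why a high alpha for items chosen only for their discriminating power is treated above as evidence that they measure a common dimension.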


Study 2 


In this section the results of an 8-week 
intervention directed toward amelioration of 
specific social skills problems are presented. This 
intervention was used to test the abilities of 
our instrument to measure differential changes 
as a function of the type of social skills training 
program. 


Method 


Procedure 


The 46 dating clients who had completed the pretest 
[battery] in Experiment 1 were assigned to one of three 
[treatment conditions]: (a) group meeting (n = 11), 
(b) self-help manual plus consultant (n = 11), or 
(c) self-help manual (n = 24). 

In a similar manner, the 46 assertion clients were 
[assigned to] either group meeting (n = 8), self-help 
manual plus consultant (n = 12), or self-help manual 
[n = 26] conditions. A description of the three 
treatment [conditions] follows. 

Group meeting. Clients assigned to this condition 
[attended] weekly, 90-minute sessions for 8 weeks under 
the [leadership] of male and female cotherapists. The 
[focus in] these groups was on behavioral rehearsal, 
role-[playing, and] skill acquisition exercises. 

Self-help manual plus consultant. Original manuals 
[were written] that contained information and exercises 
[relating] to assertion skills and dating skills.³ These 
[manuals were] divided into eight sections, with each 
[section containing] information and exercises for 1 week. 
[In addition] to the manual, clients in this condition 
[were assigned] an undergraduate “consultant” who met 
with the client at the start of the 8-week period, called 
[...] periodically to check on their progress, and was 
available for phone consultation if the client so desired. 

Self-help manual. Clients in this condition received 
the [appropriate] self-help manual as in the previous 
condition but were not assigned a consultant. 

At the end of the 8-week period, all clients in all 
[conditions] were mailed a package of posttest 
questionnaires and were reminded that their $5 deposits would 
be [refunded] as soon as the materials were completed. 

Table 3 
Pretest and Posttest Subscale Scores for Dating 
and Assertion Clients 

                      Subscale 
Group              Dating     Assertion 

Dating 
  Pre (46)          1.95        2.91 
  Post (38)         2.41        3.02 
  t                 5.55*       1.34 
Assertion 
  Pre (46)          2.45        2.48 
  Post (35)         2.76        3.02 
  t                 3.15 [marker illegible]   6.51* 

Note. Numbers in parentheses are ns. 
*p < .001. 

Results and Discussion 


We were able to obtain a fairly high rate of 
return from clients completing the program for 
our posttest materials. There was also a small 
number of clients who chose to drop out of the 
program prior to its completion. The overall 
return rate was 79%, and by treatment was 
group meeting (89%), self-help manual plus 
consultant (78%), and self-help manual (76%). 
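
The overall return rate can be checked against the pretest and posttest ns listed in Table 3. The short sketch below does this arithmetic; it assumes that the posttest ns in Table 3 correspond to the clients who returned posttest materials, and the variable and function names are ours.

```python
# Numbers of clients at pretest and at posttest, as listed in Table 3.
pretest_n = {"dating": 46, "assertion": 46}
posttest_n = {"dating": 38, "assertion": 35}


def return_rate(returned, enrolled):
    """Proportion of enrolled clients who returned posttest materials."""
    return returned / enrolled


overall = return_rate(sum(posttest_n.values()), sum(pretest_n.values()))
by_group = {g: return_rate(posttest_n[g], pretest_n[g]) for g in pretest_n}

print(f"overall: {overall:.0%}")
for group, rate in by_group.items():
    print(f"{group} clients: {rate:.0%}")
```

With 73 of 92 clients returning materials, the computed overall rate agrees with the 79% reported in the text.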

Data obtained from these clients were 
analyzed in a 2 × 2 (Dating Clients versus 
Assertion Clients × Pretest versus Posttest) 
analysis of variance for the dating and assertion 
subscales. Results indicated a significant pre- 
test versus posttest main effect for both the 
dating subscale, F(1, 67) = 48.31, p < .001, 
and the assertion subscale, F(1, 67) = 37.87, 
p < .001. Significant Client × Test interac- 
tions were obtained for the dating subscale, 
F(1, 67) = 4.40, p = .037, and for the asser- 
tion subscale, F(1, 67) = 21.11, p < .001. 

Of greatest interest for the present investi- 
gation were the data concerning changes on 
the dating and assertion subscales for clients 
working in dating and assertion training 
programs. Pretest and posttest means for these 
clients and subscales are presented in Table 3. 
These results indicate that significant improve- 
ment occurred only for the dating subscales 
for dating clients. Assertion clients improved 
on both subscales but showed more improve- 
ment on the assertion scale than on the dating 
scale. 

³ The authors are extremely grateful to John Embry, 
Jennifer Parkhurst, and David Schlundt, who helped 
write and edit the manuals. 

The results obtained from Study 2 indicate 
measurable improvement in both client popu- 
lations over the 8-week period. Despite the 
fact that a no-treatment control was not 
included, it can be argued for several reasons 
that these changes are most readily attribut- 
able to the interventions that occurred during 
this period. First, the most pronounced change 
occurred on the subscale related to the targeted 
problem. This was especially true for the dating 
clients, who showed no change on the assertion 
subscale. In addition, test-retest data on 
normal subjects in Study 1 over a similar time 
period indicated no change on either subscale. 
There is little reason to expect that client 
populations would spontaneously improve over 
this time period. 
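
The differential pattern of change can be read directly from the pretest and posttest means in Table 3. The sketch below simply subtracts pretest from posttest means for each client group and subscale; it uses only the means printed in the table, and the dictionary layout and names are ours.

```python
# Pretest and posttest subscale means from Table 3: (pre, post).
table3 = {
    "dating clients": {"dating": (1.95, 2.41), "assertion": (2.91, 3.02)},
    "assertion clients": {"dating": (2.45, 2.76), "assertion": (2.48, 3.02)},
}

# Gain = posttest mean minus pretest mean (higher scores mean less difficulty).
gains = {
    group: {scale: round(post - pre, 2) for scale, (pre, post) in scales.items()}
    for group, scales in table3.items()
}

for group, scales in gains.items():
    for scale, gain in scales.items():
        print(f"{group}, {scale} subscale: +{gain:.2f}")
```

For each client group the larger gain falls on the subscale matching the targeted problem (0.46 versus 0.11 for dating clients, 0.54 versus 0.31 for assertion clients), which is the pattern the Client × Test interactions reflect.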

Study 2 provides an extension of the utility 
of the dating and assertion subscales. Prior to 
initiating these pretreatment versus post- 
treatment comparisons, we were not optimistic 
about the likelihood that these subscales would 
be useful for measuring change following an 
8-week intervention. It had seemed to be the 
case that the utility of a scale to register 
changes on a personality dimension was quite 
independent of its ability to satisfy static 
criteria of reliability and validity. The addi- 
tional expectation of differential changes as a 
function of specific types of intervention thus 
serves as an additional validity check. The 
discriminant validity of the dating and asser- 
tion scales may clarify the frequent ambiguity 
inherent in the behavioral concept of social 
skills; it would seem reasonable to hypothesize 
that social competence consists of a set of 
relatively independent skills. 

This article is a step toward the assessment 
of specific aspects of social competence. We 
should add that these two scales should be 
used cautiously; the two questionnaires con- 
tain a narrow sampling of items from a larger 
domain (cf. Durham, Note 1) and should not
be equated with social competence. We also
stress the limitation of this article in only using
self-report measures in the validation procedure.
Still to be demonstrated is that these
scales correlate with relevant extralaboratory
criteria and with measures obtained by
coding behavior samples.

Our primary interest was the construction
of measures that successfully differentiate 
people who have a given difficulty from those 
who do not, that discriminate among people 
who have different kinds of related difficulties, 
and that indicate change in the level of this 
difficulty differentially as a function of the 
treatment received. The results of the studies
presented here indicate that one kind of self-report
measure that satisfies all of these
criteria can be constructed by assessing the
likelihood of certain behaviors occurring and
the degree of discomfort and expected incompetence
in specific situations. A useful assessment
device was thus constructed from items
that combined behavioral specificity with the 
phenomenology of expected difficulty and 
discomfort. 


Reference Note 


1. Durham, R. The social questionnaire: A new measure
of social competence among college students. Unpublished
manuscript, Indiana University, 1976.


Appendix


Dating and Assertion Questionnaire (18 items)


1. Stand up for your rights (A)

2. Maintain a long conversation with a member
of the opposite sex (D)

3. Be confident in your ability to succeed in a
situation in which you have to demonstrate your
competence (A)

4. Say “no” when you feel like it (A)

5. Get a second date with someone [...] (D)

6. Assume a role of leadership (A)

7. Be able to accurately sense how a member
of the opposite sex [...] (D)

8. Have an intimate emotional relationship with
a member of the opposite sex (D)

9. Have an intimate physical relationship with
a member of the opposite sex (D)


The following questions describe a variety 
of social situations that you might encounter. 
In each situation you may feel “put on the
spot.” Some situations may be familiar to you,
and others may not. We'd like you to read each
situation and try to imagine yourself actually
in the situation. The more vividly you get a
mental picture and place yourself into the
situation, the better.

After each situation circle one of the numbers
from 1 to 5 which best describes you using
the following scale:

1 = I would be so uncomfortable and so
unable to handle this situation that I
would avoid it if possible.

2 = I would feel very uncomfortable and
would have a lot of difficulty handling
this situation.

3 = I would feel somewhat uncomfortable and
would have some difficulty in handling
this situation.

4 = I would feel quite comfortable and would
be able to handle this situation fairly well.

5 = I would feel very comfortable and be able
to handle this situation very well.


1. You're waiting patiently in line at the
checkout when a couple of people cut right in
front of you. You feel really annoyed and want
to tell them to wait their turn at the back of
the line. One of them says, “Look, you don’t
mind do you? But we're in a terrible hurry.”

1 2 3 4 5 (A)

2. You have enjoyed this date and would
like to see your date again. The evening is
coming to a close and you decide to say
something.

1 2 3 4 5 (D)

3. You are talking to a professor about
dropping a class. You explain your situation,
which you fabricate slightly for effect. Looking
at his grade book the professor comments that
you are pretty far behind. You go into greater
detail about why you are behind and why
you'd like to be allowed to withdraw from his
class. He then says, “I'm sorry, but it’s against
university policy to let you withdraw this late
in the semester.”

1 2 3 4 5 (A)

4. You meet someone you don’t know very
well but are attracted to. You want to ask
them out for a date.

1 2 3 4 5 (D)

5. You meet someone of the opposite sex at
lunch and have a very enjoyable conversation.
You'd like to get together again and decide to
say something.

1 2 3 4 5 (D)

6. Your roommate has several obnoxious
traits that upset you very much. So far, you
have mentioned them once or twice, but no
noticeable changes have occurred. You still
have 3 months left to live together. You decide
to say something.

1 2 3 4 5 (A)

7. You're with a small group of people who
you don’t know too well. Most of them are
expressing a point of view that you disagree
with. You'd like to state your opinion even if
it means you'll probably be in the minority.

1 2 3 4 5 (A)

8. You go to a party where you don’t know
many people. Someone of the opposite sex
approaches you and introduces themself. You
want to start a conversation and get to know
him/her.

1 2 3 4 5 (D)

9. You are trying to make an appointment
with the dean. You are talking to his secretary
face-to-face. She asks you what division you
are in and when you tell her, she starts asking
you questions about the nature of your
problem. You inquire as to why she is asking
all these questions and she replies very snobbishly
that she is the person who decides if
your problem is important enough to warrant
an audience with the dean. You decide to
say something.

1 2 3 4 5 (A)
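
The keying of the nine situation items can be illustrated with a short scoring sketch. The (A)/(D) key below is taken from the labels printed after each situation; the use of a subscale mean (rather than a sum) and the function name `score_situations` are assumptions made for illustration, not the authors' stated scoring procedure.

```python
from statistics import mean

# Subscale key for the nine situation items, as printed after each item:
# "A" = assertion, "D" = dating.
SITUATION_KEY = {1: "A", 2: "D", 3: "A", 4: "D", 5: "D",
                 6: "A", 7: "A", 8: "D", 9: "A"}


def score_situations(ratings):
    """Return mean ratings per subscale for the nine situation items.

    ratings maps item number (1-9) to the circled rating (1-5).
    Averaging within a subscale is an assumption of this sketch.
    """
    if set(ratings) != set(SITUATION_KEY):
        raise ValueError("ratings must cover items 1 through 9 exactly")
    for item, rating in ratings.items():
        if rating not in (1, 2, 3, 4, 5):
            raise ValueError(f"item {item}: rating must be an integer 1-5")

    scores = {}
    for label, name in (("A", "assertion"), ("D", "dating")):
        keyed = [ratings[i] for i, k in SITUATION_KEY.items() if k == label]
        scores[name] = mean(keyed)
    return scores


# Hypothetical respondent: comfortable asserting, uncomfortable in dating.
example = {1: 5, 2: 1, 3: 5, 4: 1, 5: 1, 6: 5, 7: 5, 8: 1, 9: 5}
print(score_situations(example))
```

Higher values indicate greater reported comfort and competence on that subscale, following the 1 to 5 anchors given above.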
This study investigated crucial aspects of behavioral programs for obesity including
(a) the assumption that subjects actually engage in requested behaviors
and that these behaviors mediate weight loss, (b) the effect of exercise on weight
loss, and (c) the problem of long-term maintenance and generalization to the
clinically obese. Exercise and self-managed contingency components were compared
in a 2 × 2 factorial design on 44 obese subjects and were evaluated after
10 weeks of treatment and 3-month and 1-year follow-ups. Significant weight
loss was observed for all groups at program termination (p < .001) and the
3-month follow-up (p < .001), with only those exposed to exercise and/or contingency
management maintaining weight loss after 1 year. There were no main
effects or interactions at program termination or at the 3-month follow-up. However,
the influence of exercise at the 1-year follow-up was noticeable (p < .10).
Assessment of program adherence indicated that subjects engaged in program
behaviors, yet only 1 of 10 such behaviors was related to weight loss.


Recent reviews of behavior modification for
obesity support the superiority of this approach
relative to other treatments (Leon, 1976;
Stunkard, 1975). However, several important
issues regarding the evaluation of the behavioral
approach remain. These include (a) a
common and perhaps erroneous assumption
that subjects actually engage in program
behaviors and that these behaviors mediate
weight loss, (b) the failure to evaluate the
influence of exercise, (c) long-term evaluation,
and (d) the use of clinical as opposed to student
populations.

The first issue of subject compliance is highlighted
by a survey of over 20 recent reports
on the behavioral management of obesity
(Johnson & Stalonas, Note 1). Only 3 of these 
reports supplied information relevant to 
whether subjects actually engaged in the 
behavioral changes suggested. For example, 
studies by Hall, Hall, Hanson, and Borden
(1974); Horan, Baker, Hoffman, and Shute
(1975); Jeffrey (1974); and Mahoney, Moura,
and Wade (1973), while demonstrating weight
loss, either provided no information on program
adherence or were limited to post hoc reports
(Bellack, Rozensky, & Schwartz, 1974;
Mahoney, 1974) or pre-post questionnaires on
eating habits (Wollersheim, 1970). Thus, information
supporting the claim that participants
actually use behavioral strategies is
notably lacking in the published literature.
Investigators may be incorrectly inferring the 
operation of independent variables embodied 
in these programs on the basis of weight change 
alone (Mahoney, 1975). 

A second issue in the treatment of obesity is
the influence of exercise. Low levels of energy
expenditure are typical of the obese, and Mayer
(1968), among others, indicated that this relative
inactivity is a major factor in the development
of obesity. Exercise not only increases
caloric expenditure and the metabolism of fat,
but it also decreases appetite, aids in cardiovascular
conditioning, and generally promotes
a sense of psychological well-being (Björntorp,
1976; Horton, 1974). Thus, the direct health
benefits of exercise are obvious and strongly
implicated as an important adjunct to weight
reduction.

In spite of its appeal, the contribution of 
regular exercise to weight loss has been evalu- 
ated in only one study. In this study, Harris 
and Hallbauer (1973) found no difference at 
the termination of a 12-week program between 
controlled eating and controlled eating plus 
exercise groups. However, the group that 
exercised lost significantly more weight at a 
7-month follow-up than the comparison group. 
This study stands in virtual isolation, and a 
replication of the exercise effect is indicated 
particularly in light of its potential contribu- 
tion to the maintenance of weight loss. 

The third issue is the evaluation of subjects 
over an extended period of time. The over- 
whelming majority of studies on weight 
reduction follow subjects for 3 months or less. 
For the clinically obese who are often 50% over 
their ideal weight, long-term follow-up of at 
least 1 year is necessary to consider program 
efficacy. 

Finally, past research on obesity has been
largely limited to subjects in college settings.
Evaluation of the effects of a behavioral weight
loss intervention on community members of
varying ages and socioeconomic classes is
necessary if we are to address the generalized
utility of such programs.

The present study confronted these crucial
aspects of behavioral approaches to weight
reduction by attempting to (a) ascertain the
relationship between ongoing reports of program
adherence and weight loss; (b) assess
the effectiveness of exercise; (c) plan long-term
evaluation; and (d) use a clinical, nonstudent
population.


Method 
Subjects 


Forty-eight subjects responded to local advertisements
of a free program for weight loss. During the
first 3 weeks of the program, 4 subjects dropped out
due to scheduling or health problems, leaving 44 sub- 
jects who met the following requirements: (a) at least 
15% overweight based on figures tabulated by the 
Metropolitan Life Insurance Company (1959), (b) free 
of complicating health disorders and medication, and 
(c) no prior experience with formal behavioral programs 
for weight loss. 

Subjects averaged 31.5 years of age (range = 16-62), 
181.1 pounds (82.1 kg) (range = 130-275 pounds) 
(59.0-124.7 kg), and 40.2% overweight (range = 15%- 
112%). There were 37 female and 7 male participants 
who were from socioeconomic status Levels I to IV 
(Hollingshead & Redlich, 1958) with 2, 19, 21, and 2 
members per level, respectively. 
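
The unit conversions and the overweight criterion in this paragraph can be checked with a small sketch. The pound-to-kilogram factor is the standard definition; the helper names and the example ideal weight of 140 pounds are hypothetical, since the Metropolitan Life table values themselves are not reproduced here.

```python
LB_TO_KG = 0.45359237  # definition of the avoirdupois pound in kilograms


def lb_to_kg(pounds):
    """Convert pounds to kilograms, rounded to one decimal place."""
    return round(pounds * LB_TO_KG, 1)


def percent_overweight(actual_lb, ideal_lb):
    """Percent by which actual weight exceeds the tabulated ideal weight."""
    return (actual_lb - ideal_lb) / ideal_lb * 100


# Values reported for the sample: mean weight and the weight range.
for pounds in (181.1, 130, 275):
    print(pounds, "lb =", lb_to_kg(pounds), "kg")

# Hypothetical subject: 161 lb against an assumed ideal weight of 140 lb
# sits at the 15% inclusion threshold.
print(round(percent_overweight(161, 140), 1))

# The socioeconomic breakdown should account for every participant.
print(sum([2, 19, 21, 2]))
```

The three conversions reproduce the kilogram figures given in the text, and the four socioeconomic levels sum to the 44 participants.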


Procedure 


Subjects were randomly assigned to four groups
(matched on age, sex, absolute weight, and overweight)
in a 2 × 2 factorial design. Also, baseline levels of
activity were assessed, and mean expenditures for the
four groups were equivalent. All groups received a
similar weight loss program (P) with the following
exceptions: One group received the addition of both
exercise and contingency components (PEC), a second
group received the addition of just exercise (PE), a
third group received the addition of just a contingency
component (PC), and the final group was exposed to
the basic program alone (P).

The basic program used in all four groups consisted
of 10 written lessons (Johnson & Stalonas, in press)
that describe a sequence of behavioral tasks including
monitoring all information related to eating; advice on
the components of a balanced diet; making salient
those activities that might inhibit eating; eating at
specific times, places, and situations; manipulating
elements in the eating chain; imagining aversive stimuli
contingent on inappropriate urges to eat; and graphing
the use of program behaviors.

The exercise component consisted of specific attempts
to increase weekly levels of physical activity. PEC and
PE groups were given a list relating activities to caloric
expenditures and were encouraged to increase their
physical activity over a 10-week period from 150 to
400 kcalories (630-1,680 kJ) per day above their
basal level.

The contingency manipulation consisted of instruction
in the use of self-reinforcement for successfully
applying the strategies of the program. Subjects in the
PEC and PC groups compiled lists of activities that
could serve as self-administered rewards. In the third
week, these subjects completed daily checklists in
which program activities were converted into points
that could be exchanged for daily (e.g., reading a
chapter of a book) and weekly rewards (e.g., buying
a new dress).


Group Meetings


The four groups of subjects attended 10 weekly
1-hour sessions, which were conducted by two experienced
coleaders. To maximize attendance, subjects
deposited $10, of which $1 was returned each week.
This procedure resulted in an average attendance for
all subjects of nine meetings.

The first part of each meeting was directed toward
[...]ive, individual review of performance records,
[...], and weights with one of the two therapists.
Thereafter, a group discussion ensued regarding
[...] areas of difficulty, and suggestions with liberal
applications of verbal encouragement and support.
Toward the end of the group meeting, the next lesson
was distributed and explained.


Results 


There were two main sources of data: weight
change and program behaviors. For weight
change, data were available for Week 1 through
program termination at Week 10, a 3-month
follow-up, and a 1-year follow-up. Data regarding
program behavior adherence are
available for Week 1-Week 10. Of the 44 subjects
completing the program, one subject was
unavailable at the 3-month follow-up due to
pregnancy, and two additional subjects had
left the country at the 1-year follow-up.


Weight Loss


Mean weight changes for each of the four
groups over three time periods are presented
in Table 1. These changes were analyzed in
separate 2 × 2 analyses of variance for
each time period.¹ The main effects of contingency
management and exercise and the
interactions did not reach significant levels in
any analysis. There was a nonsignificant
tendency, however, toward a main effect of
exercise for Week 1 to the 1-year follow-up,
F(1, 37) = 3.0, p < .10.

Correlated t tests were performed for each of
the four groups separately. As Table 1 reveals,
each group lost a significant amount of weight
at program termination (M = 10.7 pounds
[4.8 kg] for the four groups taken together,
p < .001). This weight loss amounts to a
decrease of 8% in excess weight.

At the 3-month follow-up, average weight
changes for each group remained significant at
the .001 level, with the total weight loss
averaging 12.50 pounds (5.7 kg). Although
there were no significant differences between
Week 10 and the 3-month follow-up for any
group, most subjects maintained the weight
loss engendered during the program or continued
to lose weight.


BEHAVIOR MODIFICATION FOR OBESITY


Table 1
Mean Weight Loss

                        Follow-up
Group   Week 10       3 mo.      1 yr.
PEC     [illegible]   12.4***    13.1**
  n     10            10         10
PE      13.1***       14.9***    16.3*
  n     10            9          9
PC      10.0***       13.4***    11.4***
  n     12            12         10
P       10.3***       9.9***     5.2
  n     12            12         12

Note. PEC = program with exercise and contingency
components; PE = program with just exercise;
PC = program with just the contingency component;
and P = the basic program alone.
*p < .05.
**p < .01.
***p < .001.

Data available at the 1-year follow-up 
revealed a similarly striking yet less consistent 
pattern of maintained weight loss. As Table 1 
illustrates, the three groups exposed to con- 
tingency and/or exercise components were able 
to maintain their weight loss. In contrast, the 
P group, exposed to only the basic training in 
stimulus control, displayed an average weight 
increase of 4.7 pounds (2.1 kg) from the 
3-month to the 1-year follow-up. 
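
The pooled figures quoted in the text can be cross-checked against the group means in Table 1 with a weighted average. This is only a consistency sketch: two of the cell values used below (9.9 pounds for the P group at 3 months and 13.1 pounds for the PEC group at 1 year) are illegible in the scanned table and are inferred here from the reported 4.7-pound regain and the pooled means, so they should be read as assumptions.

```python
def pooled_mean(means, ns):
    """Weighted mean of group means, weighting each group by its n."""
    if len(means) != len(ns):
        raise ValueError("means and ns must have the same length")
    return sum(m * n for m, n in zip(means, ns)) / sum(ns)


# 3-month follow-up, groups PEC, PE, PC, P (P value of 9.9 is inferred).
three_month = pooled_mean([12.4, 14.9, 13.4, 9.9], [10, 9, 12, 12])

# 1-year follow-up: groups that received exercise (PEC, PE) and groups
# that received contingency management (PEC, PC). PEC value is inferred.
exercise_1yr = pooled_mean([13.1, 16.3], [10, 9])
contingency_1yr = pooled_mean([13.1, 11.4], [10, 10])

print(round(three_month, 2))      # reported in the text as 12.50
print(round(exercise_1yr, 1))     # reported in the Discussion as 14.6
print(round(contingency_1yr, 2))  # reported in the Discussion as 12.25

# The P group's inferred 3-month loss minus its 1-year loss gives the regain.
print(round(9.9 - 5.2, 1))        # reported in the text as 4.7
```

That all four figures agree with the text suggests the inferred cell values are at least consistent with the authors' own summary statistics.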


Program Adherence 


The performance of program behaviors was 
assessed on the basis of weekly reviews of each 
subject’s recordings and graphs. Ratings were 
made of stimulus control, notebook recordings, 
self-monitoring with graphs, chaining, time- 
outs, one portion of food per meal, caloric 
monitoring, between-meal uncontrolled eating, 
exercise, and self-administered reinforcement. 
With the exception of exercise and uncontrolled
eating, program behaviors were rated on a
3-point scale based on the number of days per
week in which the behavior was performed
correctly. One point was awarded for 6 or 7
days, ½ point for 4 or 5 days, and no points
were given for less than 4 days of proper
execution of the behavior in question.

1 Statistical analyses were also performed on percent
overweight and Feinstein’s (1959) weight reduction
index, with results virtually identical to those for weight
change.

Ratings of exercise and uncontrolled eating
were based on actual frequency counts obtained
from graphs. For exercise this consisted
of the proportion of the total caloric expenditure
expected during the program that was
actually performed. Subjects who ate
uncontrollably once in a given week received
1 point, twice earned ½ point, and more than
twice in a given week earned no points.
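
The rating rules just described can be written out as a short sketch. It reads the two partially garbled fractions as half points, which matches the stated 3-point scale. Two details are assumptions of this sketch rather than anything stated in the text: a week with no uncontrolled eating is scored 1 point, and the exercise proportion is capped at 1.0 so that scores stay within the reported 0 to 1 range.

```python
def daily_behavior_points(days_correct):
    """Weekly rating for a behavior scored by days performed correctly."""
    if not 0 <= days_correct <= 7:
        raise ValueError("days_correct must be between 0 and 7")
    if days_correct >= 6:
        return 1.0
    if days_correct >= 4:
        return 0.5
    return 0.0


def uncontrolled_eating_points(episodes):
    """Weekly rating from the count of uncontrolled eating episodes.

    Zero episodes is scored 1.0 here by assumption; the text only
    specifies the ratings for one, two, and more than two episodes.
    """
    if episodes < 0:
        raise ValueError("episodes cannot be negative")
    if episodes <= 1:
        return 1.0
    if episodes == 2:
        return 0.5
    return 0.0


def exercise_score(kcal_performed, kcal_expected):
    """Proportion of expected caloric expenditure actually performed."""
    if kcal_expected <= 0:
        raise ValueError("kcal_expected must be positive")
    return min(kcal_performed / kcal_expected, 1.0)  # cap is an assumption


print(daily_behavior_points(7), daily_behavior_points(5), daily_behavior_points(3))
print(uncontrolled_eating_points(1), uncontrolled_eating_points(2), uncontrolled_eating_points(4))
print(exercise_score(364, 400))
```

Because most behaviors can take only three values per week, scores cluster near the ceiling whenever subjects comply on most days, which is relevant to the limited variability discussed below.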

These program adherence scores varied from 
1.0 to .0. The actual mean scores ranged from 
a low of .57 for uncontrolled eating to a high 
of .94 for eating one portion, with the mean 
for all program behaviors together being .82. 
It is clear that within the limitations of these 
self-report data, subjects performed the re- 
quired program behaviors at or near an optimal 
level. 

Because the distributions of adherence were 
positively skewed, an arc sine transformation 
was used to achieve more variability. The 
transformed measures of program behavior 
adherence were correlated with weight loss at 
Week 10, and only uncontrolled eating was 
significantly correlated with weight change 
(r = .29, p < .05). 
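
A minimal sketch of this step follows, assuming the common arcsine square-root form of the transformation for proportions (the text says only "arc sine transformation") and a plain Pearson correlation. The adherence and weight-loss values in the usage example are invented for illustration and are not the study's data.

```python
import math


def arcsine_transform(p):
    """Arcsine square-root transform of a proportion in [0, 1]."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    return math.asin(math.sqrt(p))


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# The transform stretches differences among scores near the ceiling:
# the gap between .82 and .94 grows from .12 to roughly .19 radians.
print(arcsine_transform(0.94) - arcsine_transform(0.82))

# Hypothetical adherence proportions and weight losses (pounds).
adherence = [0.57, 0.75, 0.82, 0.90, 0.94, 1.00]
weight_loss = [6.0, 12.5, 9.0, 14.0, 8.5, 13.0]
transformed = [arcsine_transform(a) for a in adherence]
print(pearson_r(transformed, weight_loss))
```

The transform changes the spacing of scores but not their rank order, so it cannot create a relationship that is absent in the ordering of the raw scores.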

With the exception of exercise and self- 
reinforcement, which were administered to 
only half of the subjects, the combined rela- 
tionship of the program adherence measures 
to weight change was evaluated in a multiple
regression analysis. This multiple correlation
accounted for an insignificant amount of the
variation in weight change at Week 10
(R = .229, R² = .042). In addition, average
adherence scores generated for each subject
correlated insignificantly with weight change.


Discussion 


Our subjects lost an average of 10.7 pounds 
(4.8 kg) at program termination, which repre- 
sents a significant reduction over their initial 
weight. These results are consistent with 
previous reports on behavioral approaches to 
weight reduction, which indicate that subjects 


lose an average of 1 pound (.45 kg) per week
(Leon, 1976; Stunkard, 1975).

Even though these data are noteworthy,
other weight reduction regimens such as
anorectic drugs and special diets also produce
weight loss during their implementation. In
most cases, however, once the drug or diet is
relinquished, weight gain ensues (Chlouverakis,
1975). So, although behavioral approaches are
similar to other forms of weight reduction in 
demonstrating weight loss during active treat- 
ment, whether they foster maintenance of 
weight loss following program termination is 
another question. 


Maintenance of Weight Loss 


To evaluate maintenance, we assessed our 
results with both 3-month and 1-year follow-ups.
The data at the 3-month follow-up show
the efficacy of all treatment conditions. Regardless
of whether a subject was exposed to
the basic program, exercise, contingency
management, or some combination thereof,
weight loss, on the average, exceeded that at
program termination. Of the four conditions,
only the P group displayed a slight increase
(.4 pounds [.2 kg]) over this 3-month period.
More importantly, at the 1-year follow-up, 
treatment gains were maintained in our PEC, 
PE, and PC groups. 

In the studies surveyed by Leon (1976), 
only four are reported with at least a 6-month 
follow-up. In Stuart’s (1967) report, a mean 
weight loss at 1 year of 37.7 pounds (17.1 kg) 
was observed for eight subjects. Comparison 
with our data is difficult due to the variability 
in subject sessions (16-41) and the scheduling 
of maintenance sessions as needed. Also, 
Mahoney (1974) reported that subjects in an 
eating habit change group lost 8.3 pounds 
(3.8 kg) after an 8-week program. At the 1-year 
follow-up, 70% of these subjects maintained 
their lower weights or continued to lose. More 
recently, McReynolds, Lutz, Paulsen, and 
Kohrs (1976) reported over 17 pounds (7.7 kg) 
lost after a 15-week program, 19 pounds (8.6 
kg) at 3 months, and 17 pounds (7.7 kg) at a
6-month follow-up. Subjects in our PEC, PE,
and PC groups performed similarly to those in
the McReynolds et al. and Mahoney studies
both in terms of the maintenance of weight
loss and the percent of subjects maintaining
lower weights. Seventy percent of our subjects
(PEC, PE, PC) maintained their lower weights
or continued to lose at 3 months and 65% at
the 1-year follow-up.

Although it is clear that most subjects
lose weight when involved in the program and
many are able to keep it off for extended
periods, their ability to continue losing weight
after program termination is more questionable.
Twenty-nine percent of our subjects lost
as much as 5 pounds (2.3 kg) more after
program termination, whereas only 20% of
them were able to lose as much as 10 pounds
(4.5 kg) more. Thus, although our subjects
were able to maintain their weight loss, less
than one third lost a substantial amount of
additional weight. Since they averaged approximately
40% overweight, few of them reached
their desirable weights over the 1-year time
period. In terms of a decrease in percent overweight,
the PE group was 12.8% (p < .001),
the PEC was 9.5% (p < .01), the PC was
[...]% (p < .01), and the P group was a nonsignificant
3.8% below the initial values at
the 1-year follow-up. Perhaps continued weight
loss is best effected through continued in-therapy
treatment, booster sessions à la Stuart,
or a gradual phasing out of therapy with concomitant
buildup of environmental supports.


Generalized Use of Behavioral Programs


Given the overall weight loss and the
characteristics of our population (age, socioeconomic
status, percent overweight), it does
appear that we are able to generalize findings
to a nonstudent population. Our subjects
ranged in age from 16 to 62 years with
a mean of 31.5 years. We observed no significant
relationship between weight change and
age or socioeconomic status, although admittedly
the socioeconomic status range was [...]


A major focus of this experiment was to
evaluate the influence of exercise. A potential
problem was our ability to convince subjects
of varying ages and degrees of overweight to
gradually increase their energy output from
150 to 400 kcalories above their basal levels.
Data taken from exercise charts indicated that
the subjects averaged 91% of the required
energy output. One might expect older and 
more obese subjects to have a more difficult 
time exercising, yet no such relationship was 
found. Many subjects commented that the 
gradual increase in exercise, starting at a low 
150 kcalories per day, helped them get their 
previously inactive bodies in shape at a pace 
that was not too painful. In addition, as the 
exercise requirement was gradually increased, 
it gave them time to build up environmental 
supports for such activity (buying bikes, join- 
ing formal exercise programs, etc.). Graphing 
was viewed as helpful in visualizing progress 
over time. Also, therapists urged group mem- 
bers to form exercise cohorts among them- 
selves. Finally, on two occasions, therapist 
models demonstrated appropriate exercises 
and levels of exertion. 


Exercise Versus Contingency Management 


This study also provided a contrast of 
exercise with contingency management whose 
efficacy has been previously demonstrated 
(Bellack, 1976; Mahoney, 1974). As previously 
noted, the main effects of exercise, contingency 
management, and the interactions were not 
significant. However, there was a tendency 
toward an effect for exercise at the 1-year 
follow-up with mean weight losses of 12.25 
and 14.6 pounds (5.5 and 6.6 kg) for those 
exposed to contingency management and 
exercise, respectively. In general, this influence 
of exercise is consistent with the findings of 
Harris and Hallbauer (1973) and supports its
further implementation in weight reduction
programs.


Program Adherence 


As previously noted, data on program
adherence are sparse and limited to self-report
questionnaires. Wollersheim (1970) reported
differences in eating behavior but did not
relate these data to weight loss. However,
Mahoney (1974) did show a significant correlation
between weight loss and the ability to
eliminate inappropriate eating habits. Also,
Bellack et al. (1974) found that one of nine
program behaviors was significantly related to
weight loss.

In contrast to these questionnaire data, we
rated program adherence during weekly struc- 
tured interviews. These data indicated that 
our subjects engaged in the required behaviors 
at a reasonably high level. However, as 
noted, of the 10 behaviors, only uncontrolled 
eating was significantly related to weight loss, 
and the combined influence of all behaviors 
accounted for a small and insignificant propor- 
tion of the weight loss variance. 

These data on program adherence are some- 
what perplexing. The combined effect of our 
scoring procedure and the response of the 
subjects provided little variation, which was 
corrected via transformed scores. Whether this 
transformation sufficiently enlarged the varia- 
tion to achieve a significant relationship is 
mere speculation. If so, the data appear to 
undermine the presumed influence of the 
behavioral approach to weight loss. If subjects 
apply behavioral strategies and change their 
eating and activity patterns but these variables 
do not correlate with weight loss, then other 
so-called nonspecific factors may be operative. 


Behavioral Interventions: A Social Influence 
Process? 


The changes in eating and activity patterns 
may operate within a broader social influence 
framework to mediate weight loss. In this 
study, social influence variables were ade- 
quately controlled within the therapeutic con- 
text. Wollersheim’s (1970) comparison of 
behavioral (focal), nonspecific, and social 
pressure groups also controlled for the influence 
of extraneous social variables during “in- 
therapy” time. However, a characteristic of 
behavioral approaches is the extremely large 
amount of “out-of-therapy” time requested of 
subjects. They record, graph, count, and 
monitor ad infinitum. These tasks require a 
great deal of time and activity in the change 
process. 

This effort is unique to behavioral approaches
and in marked contrast to other
weight reduction regimens. For example, the
performance of highly visible “out-of-session” 
behaviors (e.g., exercise) may increase social 
reinforcement for adherence well above that 
available for other less obvious weight reduction
regimens such as drugs or diets. It may
well be that any procedure with a credible
rationale, behavioral or otherwise, that engages
subjects and continuously prompts their attention
to weight loss will be effective.


Reference Note 


1. Johnson, W. G., & Stalonas, P. M. Review of program
adherence in obesity research: The case for missing
independent variables. Unpublished manuscript,
University of Mississippi Medical Center, 1977.


A Longitudinal Study of the
Personality Correlates of Marijuana Use 


Edwin J. Kay, Arthur Lyons, William Newman, 
Donald Mankin, and Roger C. Loeb 
Lehigh University 


Two hundred and fifty-one male students completed the California Psychological 
Inventory, the Adjective Check List, and a drug-use questionnaire in the fall of 
their freshman year and in each of one, two, or three succeeding springs. Three 
prevailing patterns of drug use were identified. Continuous nonusers never dis- 
closed marijuana use; switched nonusers did not disclose marijuana use initially 
but did so on a later questionnaire, and users disclosed marijuana use both 
initially and later. Enduring differences between users and continuous nonusers 
were found. The switched nonusers generally had scores between those of the 
users and continuous nonusers. On several scales, switched nonusers were similar
to users both before and after their use of marijuana. It is concluded that
marijuana use, both present and future, can be predicted by a certain pattern
of reported personality characteristics.


A number of investigators have compared
users and nonusers of marijuana on a variety
of psychosocial and personality measures.
Several (Brill & Christie, 1974; McAree,
Steffenhagen, & Zheutlin, 1972; Richek, Angle,
McAdams, & D’Angelo, 1975) have found few
or no significant differences between the two
groups. In other cases (Grossman, Goldstein,
& Eisenman, 1974; Zinberg & Weil, 1970),
group differences have been found only when
marijuana use was chronic. However, many
researchers have reported significant differences
between users and nonusers (Cunningham,
Cunningham, & English, 1974; Fisher,
1974; Graham & Cross, 1975; Hogan, Mankin,
Conway, & Fox, 1970; Jessor, Jessor, & Finney,
1973; McLaughlin, 1974; Simon, Primavera,
Simon, & Orndoff, 1974). The great majority
of the studies cited above have used data
collected at one testing only and thus do not
show whether the personality differences
sometimes found were antecedent or consequent
to marijuana use. In addition, they could
provide no information on the consistency of
the measures over time and on the characteristics
of individuals who change their patterns
of drug use.

Two studies (Brill & Christie, 1974; Jessor 
et al., 1973) did contain a longitudinal design. 
Brill and Christie found only slight differences 
between users and nonusers and therefore did 
not report longitudinal data. In the Jessor et al. 
(1973) study, high school students were
classified into one of three groups: nonusers of
marijuana initially and a year later (NU-NU);
nonusers of marijuana initially who became
users of marijuana 1 year later (NU-U); and
users initially and 1 year later (U). On a number
of personality measures, the NU-U subjects
initially fell between the NU-NU and U
groups, differing significantly from NU-NU
subjects on a number of dimensions. For the
NU-NU subjects there were no significant
changes over time. In contrast, for the NU-U
subjects there were significant changes in
which they became more similar to the
U subjects on numerous measures. In summary,
high school respondents who switched from
nonuse to use of marijuana were initially between
nonusers and users, and they became
more like users over time. Jessor et al. also
studied a sample of college students in their
study, but unfortunately they did not collect
longitudinal data on these subjects.

The present study is a modified replication
of Hogan et al. (1970) in that the subjects were
from Lehigh University and some of the same
personality measures were used. In that study,
users and nonusers differed on such measures
as socialization, flexibility, and empathy. The
addition of a longitudinal design (similar to
Jessor et al., 1973) makes it possible to
assess the consistency of the reported differences
in characteristics of users and nonusers
over time and to check for personality characteristics
and changes in nonusers who became
users. Concurrent validity could also be
evaluated, since more than one personality
questionnaire was used.


Method 
Subjects


Each fall during the years 1971, 1972, and 1973, 200 randomly
chosen male freshmen entering Lehigh University
were contacted for voluntary participation. The various
groups of students were then contacted for follow-up
testing in the spring following initial contact and for
successive springs up to and including the spring of
1974. Thus, freshmen entering in the fall of 1971 were
contacted then and in the springs of 1972, 1973, and
1974; freshmen entering in the fall of 1972 were contacted
then and in the springs of 1973 and 1974; freshmen
entering in the fall of 1973 were contacted then
and in the spring of 1974. Initially, 130 (65%) of the
students who were contacted in the fall of 1971
agreed to be in the study; the corresponding numbers
for freshmen entering in the fall of 1972 and 1973 were
[illegible] and 112 (56%), respectively. The numbers
of subjects participating in the initial testing and all
follow-up testing were 68, 85, and 98 for the freshmen
entering in the fall of 1971, 1972, and 1973, respectively.
Each subject was paid for each testing session in which
he participated.

[...] And third, some students who were contacted
were unwilling or unable to participate in all testing
sessions. This resulted in the following percentages of
initial volunteers completing all testing sessions:
52% of those entering in 1971, 69% entering in 1972,
and 87% entering in 1973. These figures are similar to
those of other longitudinal research in the area (Brill & Christie,
1974; Jessor et al., 1973).


Procedure 


For each test session, the subjects completed three
instruments: a drug questionnaire (including demo- 
graphic as well as drug use information); the Adjective 
Check List (ACL; Gough & Heilbrun, 1965); and the 
California Psychological Inventory (CPI; Gough, 
1964). Subjects reported varying levels of marijuana 
use, and a small minority reported the use of other 
illegal drugs. The clearest distinction among subjects 
was whether or not any marijuana use was disclosed 
for a given test session. 


Results 


The analyses reported below are only for 
those subjects who participated in all of the 
appropriate testing sessions of the experiment. 
According to their responses on the various 
administrations of the drug questionnaire, the 
subjects were assigned to one of four categories. 
A subject was categorized as a continuous non- 
user (CNU) if he did not disclose use of mari- 
juana on any administration of the drug 
questionnaire. A subject was categorized as a 
user (U) if he disclosed use of marijuana on 
the initial administration of the drug questionnaire.
A subject was categorized as a switched
nonuser (SNU) if he did not disclose marijuana 
use at the initial administration of the drug 
questionnaire but did disclose such use on 
some subsequent administration of the drug 
questionnaire. No subjects were found in the 
fourth possible category, switched user (dis- 
closure of marijuana use on the initial adminis- 
tration of the drug questionnaire but denial of 
marijuana use on later questionnaires). 
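
The assignment rule just described can be sketched in a few lines of Python. This is only an illustration, not the authors' procedure: the function name `classify_subject` and the representation of a subject's record as a list of booleans (one per questionnaire administration, fall first) are assumptions introduced here, as is the reading that a "switched user" is a subject who disclosed use initially and on no later questionnaire.

```python
def classify_subject(disclosures):
    """Assign a drug-use category from a subject's questionnaire history.

    disclosures: list of bools, one per administration in order
    (fall first); True means marijuana use was disclosed.
    """
    if len(disclosures) < 2:
        raise ValueError("need the initial test and at least one follow-up")
    initial, later = disclosures[0], disclosures[1:]
    if not initial:
        # No initial disclosure: switched nonuser if use appears later,
        # continuous nonuser otherwise.
        return "SNU" if any(later) else "CNU"
    # Initial disclosure: assumed to be a 'switched user' only if every
    # later questionnaire denies use (the study found no such subjects).
    return "U" if any(later) else "switched user"
```

For example, a 1971 entrant who first discloses use in the second spring would be passed as `[False, False, True, True]` and fall in the SNU category.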


Replication of Hogan et al. 


Our first aim was to see if our data replicated 
the study of Hogan et al. (1970), who adminis- 
tered the CPI to comparable subjects. Table 1 


shows the raw score means on the CPI scales
for U and CNU subjects for various administrations
of the test. The data from the fall and
first spring are for subjects who participated
in the experiment for 1, 2, or 3 years; the data
from the second spring are for those subjects
who participated in the experiment for 2 or 3


Table 1
Mean Responses on the California Psychological Inventory Scales for User (U) and Continuous
Nonuser (CNU) Subjects

                                              Time of administration
                          Fall                1st spring          2nd spring          3rd spring
Scale                     U         CNU       U         CNU       U         CNU       U         CNU
                          (78)      (115)     (78)      (115)     (38)      (68)      (12)      (27)
Dominance                 25.3      24.2      26.4      26.0      26.3      27.6      24.9      28.4
Capacity for Status       17.6      17.2      18.7      17.4*     18.8      18.7      20.4      18.5
Sociability               23.9      22.4[*?]  24.1      22.6      24.3      24.0      25.5      24.7
Social Presence           37.4      33.4***   39.5      35.6***   39.0      36.1*     41.6      36.2[*?]
Self-acceptance           21.8      20.6*     21.9      21.5      21.6      21.4      21.9      21.8
Well-Being                32.6      33.4      32.8      33.9      31.7      33.3      33.2      33.0
Responsibility            25.0      28.1***   25.3      28.1***   25.3      28.6***   24.7      28.8[*?]
Socialization             33.7      38.0***   32.8      37.7[*?]  [illeg.]  36.8[*?]  30.2      37.1[*?]
Self-control              23.0      27.3[*?]  22.5      26.4***   [illeg.]  [illeg.]  24.1      27.0
Tolerance                 18.9      19.4      20.4      20.1      20.2      20.7      22.4      20.5
Good Impression           12.9      [illeg.]  13.6      14.3      13.6      15.6      13.8      15.2
Communality               24.1      24.3      23.1      24.7**    23.0      23.6      23.8      24.0
Achievement via
  Conformance             22.8      26.1***   23.4      26.5***   [illeg.]  [illeg.]  24.1      26.1
Achievement via
  Independence            19.1      19.4      19.5      19.8      20.3      20.2      22.3      20.2
Intellectual Efficiency   35.6      36.6      37.0      37.1      36.7      37.9      39.1      38.1
Psychological-mindedness  11.7      11.4      11.6      11.8      10.8      11.9      12.2      12.0
Flexibility               12.7      9.8[*?]   12.7      10.2***   13.8      10.7***   14.3      10.9**
Femininity                16.7      17.5      15.7      16.7      16.9      17.0      16.2      17.6

Note. Asterisks indicate significant t tests between U and CNU means. Numbers in parentheses are ns.
* p < .05.
** p < .025.
*** p < .01.


Table 2
Mean Scores of User (U), Switched Nonuser (SNU), and Continuous Nonuser (CNU) Subjects for
Selected California Psychological Inventory Scales

                                                              Time of administration
                          Fall                          1st spring                    2nd spring                    3rd spring
Scale                     U         SNU       CNU       U         SNU       CNU       U         SNU       CNU       U         SNU       CNU
                          (78)      (58)      (115)     (78)      (58)      (115)     (38)      (47)      (68)      (12)      (29)      (27)
Social Presence           37.4      37.3      33.4[*?]  39.5      36.7      35.6      39.0      38.0      36.1      41.6      39.4      [illeg.]
Responsibility            [illeg.]  [illeg.]  [illeg.]  [illeg.]  [illeg.]  [illeg.]  [illeg.]  [illeg.]  [illeg.]  24.7      26.3      28.8
Socialization             [illeg.]  37.2      38.0      32.8*     35.6      37.7      31.7      34.4      36.8      30.2**    36.5      [illeg.]
Self-control              [illeg.]  26.0      27.3      22.5      25.2      26.4      22.0      25.0      27.1      24.1      26.5      [illeg.]
Achievement via
  Conformance             [illeg.]  [illeg.]  [illeg.]  [illeg.]  [illeg.]  [illeg.]  [illeg.]  [illeg.]  [illeg.]  [illeg.]  [illeg.]  [illeg.]
Flexibility               12.7      11.1      9.8       12.7      11.6      10.2*     13.8      12.9      10.7      14.3      14.0      [illeg.]

Note. Asterisks next to scores indicate significant t tests when compared to SNU means. Numbers in parentheses are ns.
* p < .05.
** p < .01.

years; and the data from the third spring are
for those subjects who participated in the 
experiment for 3 years (i.e., those who entered 
Lehigh in 1971). If the four drug use categories 
of Hogan et al. (frequent users, occasional 
users, nonusers, and principled nonusers) are 
collapsed into two categories (users and nonusers),
then their Table 3 data can be compared
with our fall data. The comparison
indicates that the results of our initial fall
testing closely replicated Hogan et al. Furthermore,
the pattern of results endured over time
as assessed by our three retests. Differences
that both replicated Hogan et al. and endured
over time occurred on six scales. Compared
to CNU subjects, U subjects were significantly
higher in Social Presence and Flexibility and
lower on Responsibility, Socialization,
Self-control, and Achievement via Conformance.


Table 3
Mean Responses on the Adjective Check List Scales for User (U) and Continuous Nonuser (CNU)
Subjects

                                                    Time of administration
                                Fall                1st spring          2nd spring          3rd spring
Scale                           U         CNU       U         CNU       U         CNU       U         CNU
                                (78)      (115)     (78)      (115)     (38)      (68)      (12)      (27)
No. adjectives checked          49.4      49.4      50.8      51.2      51.6      50.8      50.7      55.1
Defensiveness                   48.8      50.9      49.1      51.0      48.5      51.0      46.9      50.5
Favorable adjectives checked    47.0      48.4      47.6      49.8      46.1      50.7*     45.5      50.8
Unfavorable adjectives checked  51.7      50.2      51.6      50.4      51.6      49.6      51.7      48.1
Self-confidence                 46.2      44.4      46.7      45.3      46.1      46.4      40.5      44.6
Self-control                    44.9      50.8**    [illeg.]  [illeg.]  45.5[*?]  50.4      45.2      50.9
Lability                        54.9      49.6**    [illeg.]  [illeg.]  57.6      51.3**    58.5      49.7[*?]
Personal Adjustment             46.6      49.0*     [illeg.]  [illeg.]  45.8      49.7[*?]  42.5      52.3[*?]
Achievement                     48.2      51.9**    [illeg.]  [illeg.]  [illeg.]  [illeg.]  [illeg.]  [illeg.]
Dominance                       48.8      48.8      50.6      48.9      48.4      49.5      45.7      48.7
Endurance                       47.2      53.0**    [illeg.]  [illeg.]  46.6      52.5[*?]  [illeg.]  [illeg.]
Order                           46.6      53.5**    [illeg.]  [illeg.]  46.6      [illeg.]  44.2      [illeg.]
Intraception                    50.3      51.8      [illeg.]  51.7      50.6      54.6      49.8      56.6[*?]
Nurturance                      49.1      49.2      [illeg.]  50.4      49.1      51.1      50.3      53.1
Affiliation                     48.4      48.1      48.7      50.1      47.7      49.5      47.8      49.0
Heterosexuality                 53.1      45.6**    54.0      46.9[*?]  51.9      50.4      53.6      50.4
Exhibition                      50.0      45.1**    51.7      45.7      51.4      47.6      50.5      44.6
Autonomy                        51.8      47.3**    53.7      47.6      53.4      48.1**    50.8      46.7
Aggression                      49.6      47.0      [illeg.]  [illeg.]  49.8      46.8      48.4      45.3
Change                          51.6      45.3**    [illeg.]  [illeg.]  [illeg.]  [illeg.]  [illeg.]  [illeg.]
Succorance                      48.3      49.3      46.5      48.2      47.9      47.7      50.1      46.1
Abasement                       49.1      51.9      47.2      50.8[*?]  48.0      50.6      51.5      51.0
Deference                       47.2      52.5**    [illeg.]  [illeg.]  [illeg.]  [illeg.]  [illeg.]  53.6
Counseling Readiness            50.2      53.5      49.0      52.3      50.1      51.4      53.5      53.4

Note. Asterisks indicate significant t tests between U and CNU means. Numbers in parentheses are ns.
* p < .05.
** p < .01.


In Table 2, we present the mean CPI scores
for all three categories of subjects on the six
scales that consistently differentiated U and
CNU subjects over time. SNU subjects
generally fell between the scores of U and CNU
subjects. They were consistently similar to U
subjects on the Social Presence scale and
similar to CNU subjects on the Socialization,
Self-control, and Achievement via Conformance
scales. SNU subjects shifted on only one
CPI scale; over time they came to resemble
the U subjects on Flexibility.


Adjective Check List


The mean responses on the ACL by U and
CNU subjects for various administrations of
the test are presented in Table 3. Again the
t test was used to compare the scale means for



U and CNU subjects. On the initial administration,
U and CNU subjects differed significantly
at the .01 level on 10 of the 25 scales;
for one scale the difference was significant at
the .05 level. Users were significantly higher
than CNU subjects on Lability, Heterosexuality,
Exhibition, Autonomy, and Change. CNU
subjects were significantly higher than U
subjects on Self-control, Personal Adjustment,
Achievement, Endurance, Order,
and Deference.

In examining Table 3, we see that the 
original differences on these 11 scales held up 
consistently over the 3 years of the study. In 
the spring of the third year, the magnitude of 
the differences was the same, although some 
differences were not statistically significant. 
Only subjects who entered school in the fall 
of 1971 were measured in the spring of the 
third year; this drop in sample size substan- 
tially reduced the power of the statistical test. 
Nonetheless, it is impressive that 2½ years
after the first test, the direction of the differ- 
ences remained the same. 

The data for the 11 ACL scales for the SNU 
subjects as compared to U and CNU subjects 
are presented in Table 4. For the first two test
sessions, with only one exception (Personal
Adjustment for first spring), SNU subjects
fell between U and CNU subjects on these
11 scales. However, in the last two test sessions,
this pattern did not occur. Thus, the differences
among the three groups on the 11 scales
changed over time. A careful examination of
Table 4 reveals a fairly consistent pattern.
Over time, many of the ACL scale scores of
the SNU subjects become more like those of
the U subjects. The shift from similarity with
CNU subjects to similarity with U subjects is
clear on the following scales: Socialization
through Conformity, Lability, Autonomy, and
Change. However, on the Order, Heterosexuality,
and Exhibition scales, the SNU
subjects were consistently similar to U subjects.


Table 4
Mean Scores of User (U), Switched Nonuser (SNU), and Continuous Nonuser (CNU) Subjects
for Selected Adjective Check List Scales

                                                              Time of administration
                          Fall                          1st spring                    2nd spring                    3rd spring
Scale                     U         SNU       CNU       U         SNU       CNU       U         SNU       CNU       U         SNU       CNU
                          (78)      (58)      (115)     (78)      (58)      (115)     (38)      (47)      (68)      (12)      (29)      (27)
Self-control              44.9[*?]  48.5      50.8      44.0      47.1      51.6      45.5      44.6      50.4*     45.2      45.2      50.9
Lability                  [illeg.]  52.3      [illeg.]  [illeg.]  [illeg.]  [illeg.]  [illeg.]  54.4      51.3      58.5      58.9      49.7
Personal Adjustment       46.4      47.1      49.9      48.8      46.9      49.9      46.8      42.4      49.7*     42.5**    48.2      52.3
Achievement               [illeg.]  [illeg.]  [illeg.]  [illeg.]  [illeg.]  [illeg.]  47.3      48.0      52.2      44.5      52.7      50.6
Endurance                 47.2      50.5      53.0      47.3      49.8      53.0      46.6      46.0      52.5**    45.9      49.1      51.3
Order                     [illeg.]  [illeg.]  [illeg.]  [illeg.]  [illeg.]  [illeg.]  [illeg.]  [illeg.]  53.8**    44.2      46.7      53.7[*?]
Heterosexuality           53.1      52.1      45.6**    54.9      52.2      46.9**    51.9      52.5      50.4      53.6      57.7      50.4
Exhibition                [illeg.]  [illeg.]  [illeg.]  51.7      50.3      45.7      [illeg.]  52.9      47.6      50.5      56.9      44.6[*?]
Autonomy                  51.8      48.4      47.3      53.7      49.5      47.6      53.4      54.0      48.1**    50.8      56.2      46.1
Change                    [illeg.]  46.6      45.3      53.6      48.0      46.5      52.6      48.8      46.7      54.2      54.0      44.4[*?]
Deference                 47.2      50.2      52.5      45.3[*?]  49.1      51.9      47.2      45.5      52.2*     49.2      43.2      53.6

Note. Asterisks next to scores indicate significant t tests when compared to SNU means. Numbers in parentheses are ns.
* p < .05.
** p < .01.


Finally, to assess concurrent validity, we
hypothesized that the significant differences
found between U and CNU subjects would be
in the same direction for the CPI and the ACL
when the two scales are positively correlated
(Gough & Heilbrun, 1965) and in opposite
directions when the two scales are negatively
correlated. Fifty-eight of the 66 possible cases
supported this hypothesis; that is, the CPI
and ACL scores were in the predicted direction.
The probability of finding this amount of
agreement by chance is extremely low
(p < .001). Furthermore, on the correlations
between CPI and ACL scores, 32 of the 58
correlations in agreement with our hypothesis
were significant, whereas none of the 8 cases
in disagreement with our hypothesis was
significant.
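
The claim that 58 agreements out of 66 are very unlikely by chance can be checked with a simple sign test. The sketch below is offered only as an illustration: the text does not say which test the authors used, and the calculation assumes that each of the 66 cases agrees with the hypothesis independently with probability .5 under chance, which is a simplification because the scales are intercorrelated. The function name `upper_tail` is introduced here for the example.

```python
from math import comb

def upper_tail(successes, trials, p=0.5):
    """P(X >= successes) for X ~ Binomial(trials, p)."""
    return sum(
        comb(trials, k) * p**k * (1 - p)**(trials - k)
        for k in range(successes, trials + 1)
    )

# 58 of the 66 CPI-ACL comparisons fell in the predicted direction.
p_value = upper_tail(58, 66)
print(f"one-tailed sign-test probability = {p_value:.2e}")
```

Under these assumptions the tail probability is far below .001, in line with the reported significance level.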


Discussion 


Comparing our data to those of Hogan et al.
(1970), we are impressed that the personality
correlates of marijuana use among the reputedly
“straight,” apathetic, job-oriented college
youth of the early 1970s are strikingly similar
to the personality correlates associated with
marijuana use among the “hippie” activist
college youth of the late 1960s. Although the
present sample of college males at a relatively
[illegible] and conservative university is not likely
to be representative of American youth, the
subjects are probably similar to a large number
of college students. This successful replication
of Hogan et al. provides support for the
findings of both studies and implies that the
personality patterns associated with marijuana
use and nonuse have not changed significantly
over the years.

The consistency between studies and the
consistency across measures and across years
[...] nonusers, users, and switched nonusers.
The distinction between users and nonusers is
[...] group of scales described by Gough as “measures
of socialization, maturity, responsibility,
and intrapersonal structuring of values” (1964,
p. 10). These three scales (Responsibility,
Socialization, and Self-control) all reflect a
concern with social standards. The low scores
of the drug users reflect relative irresponsibility,
rebelliousness, and hostility to rules and
conventions compared to nonusers. It is interesting
to note that even nonusers in our sample
scored below Gough’s (1964) male norms on
two (Responsibility, Self-control) of the scales.
Conformity among nonusers is also reflected
on the ACL scales of Deference, Order, and 
Self-control. On the other hand, the individu- 
ality and lack of conformity of the user is 
evidenced in high ACL scores on Autonomy, 
Change, Exhibition, and Lability. 

The second distinguishing personality cluster 
is related to the issue of socialization and 
conformity. It is a set of characteristics that 
increases the likelihood of success through 
conventional means. Nonusers are relatively 
high in Achievement via Conformance
on the CPI. This scale indicates efficiency, 
organization, and industriousness. Similarly, 
on the ACL, nonusers were higher on Achieve- 
ment and Endurance (persistence). They also 
demonstrated a more positive attitude toward 
their place in society by their superior scores 
on Personal Adjustment. 

The third cluster of traits involves adventure
seeking. On the CPI, users evinced more
spontaneity (measured by Social Presence) 
and adventuresomeness (Flexibility). Simi- 
larly, users appeared high on adventure 
(Heterosexuality) and seeking novelty 
(Change) according to the ACL. 

In summary, nonusers appear to be well 
socialized. They conform to norms, respect 
authority, strive for traditional goals, and 
rarely act on impulse. Users show a strikingly 
different picture. They are nonconforming, 
independent, adventurous, and spontaneous. 
This pattern generally supports contemporary 
myths about college-age drug users and non- 
users. It should be remembered that these data 
are based on self-report measures; that is, 
they tell us how the subjects perceive, or claim 
to perceive, themselves. Thus, marijuana users 
and nonusers may in part be reflecting social 
expectations. How their self-perceptions relate 
to others’ perceptions or to reality (however 
it is defined) has not been assessed.

At any rate, that users and nonusers reliably 
report self-perceptions that sharply differ is 
both interesting and potentially useful. The 
usefulness of these data becomes clearer when 
we examine the third group, the switched
nonusers. It is noteworthy that approximately
90% of the SNU subjects had switched to
marijuana use by the second spring testing.
Thus, the data in the last two testings reflect
postmarijuana use personality characteristics.

In general, this group fell between the users
and the continuous nonusers on the various 
personality scales; this outcome replicates the 
study by Jessor et al. (1973). If we are inter- 
ested in predicting which nonusers have the 
greatest chance of becoming users, there are 
several CPI and ACL scales that appear to 
be useful. Specifically, if SNU subjects show 
certain personality characteristics similar to 
U subjects (and dissimilar to CNU subjects) 
both before and after their drug use, those 
characteristics can be used as predictors of 
future drug use. SNU subjects were con- 
sistently similar to U subjects on the Social 
Presence scale of the CPI and on the Order, 
Heterosexuality, and Exhibition scales of the 
ACL. The SNU group can thus be described 
as consistently similar to the U group in being 
outgoing, socially self-confident, and spon- 
taneous. One might say that the “extraverted 
personality” is susceptible to the use of 
marijuana. 
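
The reasoning in this paragraph, that a scale is a useful predictor only if switched nonusers already resemble users before they begin using, can be made concrete with a small sketch. It is an illustration under stated assumptions, not the authors' analysis: similarity is judged here simply by which group mean the SNU mean lies nearer to (the authors relied on t tests), the function names `nearer_group` and `scale_role` are introduced for the example, and the means are the fall and second-spring CPI values as they can be read from Table 2.

```python
def nearer_group(snu, u, cnu):
    """Return which group mean ('U' or 'CNU') the SNU mean lies closer to."""
    return "U" if abs(snu - u) < abs(snu - cnu) else "CNU"

def scale_role(before, after):
    """Label a scale from (U, SNU, CNU) mean triples at an early and a late testing."""
    early = nearer_group(before[1], before[0], before[2])
    late = nearer_group(after[1], after[0], after[2])
    if early == "U" and late == "U":
        # SNU resembled users even before reporting any use.
        return "candidate predictor"
    if early == "CNU" and late == "U":
        # SNU came to resemble users only after switching.
        return "post hoc discriminator"
    return "neither"

# (U, SNU, CNU) means: fall testing, then second spring testing.
social_presence = scale_role((37.4, 37.3, 33.4), (39.0, 38.0, 36.1))
flexibility = scale_role((12.7, 11.1, 9.8), (13.8, 12.9, 10.7))
print("Social Presence:", social_presence)
print("Flexibility:", flexibility)
```

With these values, Social Presence comes out as a candidate predictor and Flexibility as a scale that separates the groups only after use has begun, which matches the distinction drawn in the text.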

Several studies have reported findings that 
support the conclusion that marijuana users 
are extraverts. Hogan et al. (1970) described 
the marijuana user as high in social poise 
(though this was offset by an “assertive nonconformity”).
According to Brill and Christie
(1974), marijuana users reported themselves
as higher in the tendency to seek stimulation.
Simon et al. (1974) reported users to be low in
deference and order. Graham and Cross (1975)
found that users value feelings and experience 
over planning and logic, an interpersonal 
responsiveness that seems to characterize our 
switched nonusers. 

On the other hand, SNU subjects shifted 
from similarity with CNU subjects to similarity 
with U subjects on the Flexibility scale of the 
CPI, and on the Socialization through Con- 
formity, Lability, Autonomy, and Change 
scales of the ACL. This second cluster of 
personality traits involving nonconformity, 
lack of responsibility, and change is reflected 
in most of the literature on drug users (e.g.,
Cunningham et al., 1974; Grossman et al.,
1974; Hogan et al., 1970; Simon et al., 1974;
Zinberg & Weil, 1970). However, these studies
only compared users with nonusers. Thus, the
characteristics that SNU subjects shared with
U subjects after they started using marijuana
appear to be excellent post hoc discriminators;
they are not useful in discriminating nonusers
who later become users from nonusers who 
continue not to use marijuana. 

Two final points relate to the personality 
tests. First, contrary to the claims of some 
critics of the CPI, this longitudinal study 
found little evidence of shifts in personality 
results in conjunction with changes in drug use.
Our SNU subjects changed on only one CPI 
scale—Flexibility. This supports our earlier 
contention that a certain personality type may 
be recruitable to marijuana use rather than the 
typical interpretation that marijuana use 
results in personality changes. 

The second point is that in contrast to the 
consistency of the CPI scores, some ACL 
scores did shift among SNU subjects. On the 
ACL, SNU subjects demonstrated a trend over 
time toward similarity to users. As the ACL 
was developed to assess self-concept, this 
difference may imply that SNU subjects are 
experiencing and reporting changes in their
self-concepts. Following their participation in
marijuana use, the SNU subjects developed a
self-concept similar to that of marijuana
users. This change is not reflected in personality
trait modifications as measured by the CPI. 
Perhaps such situational influences as drug use 
and associated environmental conditions have 
impact on ACL-type self-concept without 
resulting in changed perceptions of personality 
characteristics measured by the CPI.


The Child Behavior Profile: I. Boys Aged 6-11


Thomas M. Achenbach 
National Institute of Mental Health, Bethesda, Maryland 


Profiles that are standardized separately for children of each sex at ages 4-5, 
6-11, and 12-16. The profiles are scored from the Child Behavior Checklist 
(CBCL), which was designed to obtain parents’ reports of their children’s com- 


One of the greatest handicaps to research 
and communication on child psychopathology 
has been the lack of a standardized, objective, 
and reliable way of describing and classifying 
behavior disorders. Not until the 1968 edition 
of the American Psychiatric Association’s 
Diagnostic and Statistical Manual (DSM; APA,
1968) was the need for a differentiated classification
of children’s disorders even recognized
in the official nomenclature. Prior to the 1968
edition, the only childhood disorders recognized 


The author is indebted to the staffs of the many 
clinical agencies that have contributed to this work, as


In the public domain.

by the DSM were adjustment reaction of
childhood and childhood schizophrenia.
Although provisional guides to classification
such as the DSM may be needed during the
early stages of a field’s development, the
inadequacy of the DSM has inspired numerous
attempts to evolve alternative methods of
classification for childhood disorders. The most
common approach has been to factor analyze
checklists of behavior problems. However, the
diversity of checklists, subject samples, sources
of data, and methods of analysis has led to an
almost equally great diversity of findings
(Achenbach & Edelbrock, in press). One of the
recurrent differences among studies is whether
they yield a small number of broad-band
factors or a larger number of narrow-band
factors. The number and breadth of the factors
has been determined largely by the specific
items used and the methods of analysis. At one
extreme, Quay and Peterson’s (1967) 55-item
Behavior Problem Checklist typically yields
two broad-band factors labeled Conduct
Problem and Personality Problem, although
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{wo smaller factors labeled Inadequacy-Im- 
| maturity and Socialized Delinquency have also 
fen found (cf. Quay, 1972). At the opposite 
lastreme, Baker and Dreger (1973) have 
derived 30 factors from a checklist of 274 items. 
Findings of a few broad factors and numer- 
ws narrower factors are not necessarily 
ontradictory, as second-order analyses have 
town narrow-band factors to be subsumed by 
road groupings like the Conduct Problem and 
Personality Problem factors (Achenbach, 1966; 
Miller, 1967). Other analyses by Achenbach 
(1966) have shown that the two broad-band 
groupings—which he labeled Externalizing and 
Inernalizing—were replicated in child psy- 
“hiatric samples differing in age, sex, and 
‘ocioeconomic status, but that the narrow- 
band factors differed among various subgroups. 
It thus appears that the broad-band factors 
tepresent general behavior patterns but may 
ask syndromes that vary with such charac- 
(etistics as sex and developmental level. 
The value of any classification system 
depends on the function it is to serve. In our 
present state of ignorance about etiology, 
Prognosis, and appropriate treatment, the 
North of a system for describing and classifying 
thild psychopathology can be measured in 
lms of the following criteria: (a) It should 
Plovide a description of behavior in a standard- 
ed format that is useful to clinicians and 
ttsearchers alike. (b) It should be differentiated 
tough to include narrow-band syndromes 
oe to particular subgroups. (c) It should 
k: rest on clinical inferences by professionals, 
Stew children in need of help receive adequate 
ip osional attention. (d) It should reflect 
as positive adaptive competencies as 
k: T their maladaptive characteristics. (e) 
A ould enable us to group children for 
ip of research on etiology, epidemiology, 
a treatment effectiveness. (f) It should 
f itate quantitative assessment of behavorial 
oe in order to evaluate prognosis under 
Ous conditions. 
a his article reports efforts to develop a 
. criptive classification system that will fulfill 
a six criteria just enumerated. Separate 
ae are performed for children divided 
ee to age and sex in order to detect 
a ems that may be peculiar to particular 

groups. Data are obtained with the Child 
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Behavior Checklist (CBCL), which comprises 
not only a diverse array of behavior problems 
but also items reflecting adaptive competencies, 
including participation in various activities, 
social relationships, and school success. These 
items form three social competence scales on 
which children are scored in relation to norms 
for their age and sex. The CBCL, which takes 
about 17 minutes to complete, is designed to 
be filled out by parents or parent surrogates, 
because they typically have a more compre- 
hensive picture of their children’s problems and 
competencies than do any other possible 
informants. Furthermore, parents’ views and 
biases are pivotal in determining whether 
clinical services are obtained and which treat- 
ment options are implemented; they are also 
important in determining the long-term 
prognoses. 

Data obtained with the CBCL are entered on 
the Child Behavior Profile, which displays the 
items reported by parents as well as the child’s 
standing on narrow- and broad-band syn- 
dromes. Using either a computerized or a hand- 
scored version of the profile, the clinician or 
researcher can obtain an overview of the 
specific behavior reported by the parent, how 
the child’s problems and competencies cluster, 
and how the child compares with normal 
children of similar age and sex. The profile 
approach preserves more information than does 
classification into mutually exclusive categories 
according to individual syndromes, and the
profiles themselves can be used as a basis for 
multidimensional classification. The profile 
described here was constructed from data on 
normal and clinical samples of boys 6-11 years 
old. Subsequent articles will report the profiles 
for boys 4-5 and 12-16, and girls 4-5, 6-11,
and 12-16.


Method 


CBCL Behavior Problem Items 


Development of the CBCL began with the behavior
problem checklist that was constructed by Achenbach
(1966) from a survey of existing literature and case
histories of 1,000 child psychiatric patients. The original
checklist was designed to be filled out from case history
data by raters using present-absent response alternatives.
It was adapted for parents’ use by simplifying
the wording, expanding the present-absent alternatives
to a 0-1-2 scale, and adding new items in consultation
with clinicians. Pilot editions were further revised on
the basis of item analyses and feedback from parents, 
clinicians, and paraprofessionals. 

The current edition comprises 118 behavior problem 
items to which the parent responds by circling a 0, 1, 
or 2 according to the following instructions: 


Below is a list of items that describe children. For each 
item that describes your child now or within the past 
12 months, please circle the 2 if the item is very true or 
often true of your child. Circle the 1 if the item is some-
what or sometimes true of your child. If the item is not 
true of your child, circle the 0. 


The items are intended to provide broad but non- 
redundant coverage of behavioral problems that can 
be rated with a minimum of inference. The parent is 
requested to write in descriptions of behaviors for items 
that might otherwise be ambiguous. For example, 
Item 28 is: Eats or drinks things that are not food 
(describe). Parents’ descriptions make it possible
to discriminate between those who are concerned about 
their child’s consumption of junk foods and those whose 
child is eating dirt, paint, and so on. Only nonfood 
substances such as the latter are scored on the Child 
Behavior Profile. In addition to the 118 items, spaces 
are provided for parents to write in unlisted physical 
problems having no known medical cause and any other 
problems that are not listed. 


CBCL Social Competence Items 


Following a survey of the meager existing literature 
on social competence indices for children, descriptions 
of positive behavioral characteristics were piloted in 
various formats with parents. It was found that items 
paralleling the behavior problems but describing 
positive characteristics inevitably sounded like lists of
“boy scout virtues” with a strong social desirability
component. Most parents endorsed all such items as
describing their child. On the other hand, items of the
type used on the Vineland Social Maturity Scale (Doll,
1965) failed to discriminate among children of normal
intelligence. The items found to be most successful and
ultimately selected for the CBCL comprise scales of
involvement and attainment in the three areas described
below.


Activities scale. The Activities scale consists of scores
for (a) the amount and quality of the child’s participation
in sports; (b) nonsports hobbies [...]; and (c) jobs and
chores. The parent is first asked to [...] the parent is to
list the child’s jobs or chores (up to three). Beside each
entry the parent is to check boxes indicating how well the
child carries it out, compared to other children of the same
age. The response alternatives are like those of the
preceding items.

After trying other response formats, such as request-
ing parents to report the actual frequency of each
activity, the present format was chosen for three
reasons: (a) The significance of the frequency of
participation varies greatly with the particular activity
and environment (e.g., opportunities for riding
bicycles are generally more frequent than for skiing,
but both depend on the season of the year and the
locality). (b) To make the parent’s task as easy as
possible, we wished to use a format of maximum
simplicity and generality. (c) Recognizing that we are
obtaining the parent’s perception of the child, we
wished to maximize the power of the CBCL to dis-
criminate children for whom parents could report at
least some evidence of social competence from those
for whom parents could report nothing positive.

The scoring system for the Activities scale allocates 
0-2 points for number of sports, with 0 being assigned
if one or no sport is listed, 1 point for two sports, and 
2 points for three sports; 0-2 points for the mean score 
obtained for sports participation and skill, with 0 
assigned for each response of below average, 1 for each 
response of average, 2 for each response of above average, 
and don’t know responses omitted from scoring; 0-2 
points for number of activities; 0-2 points for average 
amount and quality of participation in activities; 0-2 
points for number of jobs and chores; and 0-2 points
for average quality of job performance. The latter four 
scores follow the same scoring principles as outlined 
for sports. 

The reason for assigning a score of 0 to a report of
either zero or one sport, activity, or job is that so few
parents of children in clinical samples reported no
sports, other activities, or jobs that the difference
between none and one did not appear worth recognizing
in the scoring. However, a child who has none can
receive only a 0 for participation and skill in that area,
whereas a child who has one can receive up to 2 points
for participation and skill for that one entry. The six
scores for the Activities scale are added together to
provide a summary score that can range from 0 to 12,
and this is entered with its T score on the Child Be-
havior Profile, as explained later.
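
The Activities scoring rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the scoring program used for the profile: the function names and the data layout (a list of entries, each holding that entry's rating strings) are hypothetical, and pooling all ratings within an area before averaging is one reading of "mean score."

```python
from statistics import mean

# Rating points as described in the text; "don't know" is omitted from scoring.
RATING_POINTS = {"below average": 0, "average": 1, "above average": 2}


def count_score(n_listed):
    """0 points for zero or one entry, 1 point for two, 2 points for three."""
    return max(0, min(n_listed, 3) - 1)


def mean_rating_score(ratings):
    """Mean of the 0-2 rating points; a child with no scorable ratings gets 0."""
    points = [RATING_POINTS[r] for r in ratings if r in RATING_POINTS]
    return mean(points) if points else 0.0


def activities_scale(sports, activities, jobs):
    """Sum of six 0-2 scores: a count score and a mean rating score per area.

    Each argument is a list of entries (hypothetical layout); each entry is a
    list of rating strings given by the parent for that entry.
    """
    total = 0.0
    for entries in (sports, activities, jobs):
        pooled = [rating for entry in entries for rating in entry]
        total += count_score(len(entries)) + mean_rating_score(pooled)
    return total
```

For example, a child with two sports, no other activities, and one job rated above average would receive 1 point for number of sports, the mean of the sports ratings, and 2 points for job quality, but 0 for number of jobs, which mirrors the none-versus-one rule explained above.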

Social scale. The Social scale consists of scores for
(a) the child’s membership and participation in
organizations; (b) number of friends and contacts with
them; and (c) behavior with others and alone. On the
first item, the parent is asked to list (up to three)
organizations, clubs, teams, or groups the child belongs
to and to indicate how active the child is in each, com-
pared to other children of the same age. The number of
organizations and amount of participation in each are
scored 0-2 in the same fashion as items on the Activities
scale. On the next item, the parent is to indicate how
many close friends the child has, with the response
alternatives being none, 1, 2 or 3, and 4 or more. None
and 1 are both scored 0, 2 or 3 is scored 1, and 4 or more
is scored 2. The parent is also asked to indicate how
many times a week the child does things with his/her
friends. The responses (less than 1, 1 or 2, and 3 or
more) are scored 0, 1, and 2, respectively.

The third item of the Social scale asks: “Compared
to other children of his/her age, how well does your
child: Get along with his/her brothers and sisters?
Get along with other children? Behave with his/her
parents? Play and work by himself/herself?” The
response alternatives are worse, about the same, and
better, and they are scored 0, 1, and 2, respectively.
Responses to the first three questions are averaged to
provide a score for behavior with others, whereas the
response to the last question provides a score for
independent behavior. The Social scale score is the
sum of the six scores just described, each of which can
range from 0 to 2, for a possible total of 12.

School scale. The School scale consists of scores for
(a) the average of the child’s performance in academic
subjects; (b) placement in a regular or special class;
(c) being promoted regularly or held back; and (d) the
presence or absence of school problems. For academic
performance, the alternatives are failing, below average,
average, and above average for reading, writing, arith-
metic, spelling, and/or other subjects. The response
alternatives are scored 0, 1, 2, and 3, respectively, and
are averaged to provide a score ranging from 0 to 3.
The parent is next asked to indicate whether the
child is in a special class, and, if so, what kind; whether
the child has ever repeated a grade, and, if so, the grade
and reason; and to describe any academic or other
problems the child has had in school. Negative answers
to each of these items are scored 1, and answers indica-
tive of school difficulties are scored 0. These items thus
provide three 0-1 scores, which, when added to the
mean score for academic performance, yield a 0-6 score
for the School scale.
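
As a rough illustration of how the Social and School scores described above combine, the following Python sketch may help. The function names, argument names, and input formats are assumptions made for the example; only the point values come from the text.

```python
from statistics import mean

# Point values taken from the text; everything else here is illustrative.
ACADEMIC_POINTS = {"failing": 0, "below average": 1, "average": 2, "above average": 3}
COMPARISON_POINTS = {"worse": 0, "about the same": 1, "better": 2}


def friends_score(n_close_friends):
    """None or 1 friend scores 0, 2 or 3 scores 1, and 4 or more scores 2."""
    if n_close_friends <= 1:
        return 0
    return 1 if n_close_friends <= 3 else 2


def contacts_score(times_per_week):
    """Less than 1 scores 0, 1 or 2 scores 1, and 3 or more scores 2."""
    if times_per_week < 1:
        return 0
    return 1 if times_per_week <= 2 else 2


def social_scale(org_count_pts, org_participation_pts, n_close_friends,
                 times_per_week, with_others, alone):
    """Sum of six 0-2 scores; `with_others` holds the three averaged answers."""
    others = mean(COMPARISON_POINTS[a] for a in with_others)
    return (org_count_pts + org_participation_pts
            + friends_score(n_close_friends) + contacts_score(times_per_week)
            + others + COMPARISON_POINTS[alone])


def school_scale(subject_ratings, in_special_class, repeated_grade, has_problems):
    """Mean academic rating (0-3) plus one point per absent school difficulty."""
    academic = mean(ACADEMIC_POINTS[r] for r in subject_ratings)
    return academic + sum([not in_special_class, not repeated_grade, not has_problems])
```

The organization count and participation points are passed in already scored, since the text says they follow the same 0-2 rules as the Activities items.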


Construction of the Child Behavior Profile


The behavior problem scales of the profile were
derived through factor analysis of CBCLs filled out by
parents of 450 boys being evaluated in 20 East Coast
mental health settings. These included guidance clinics,
[...] organizations, and private prac-
tices. Racial composition was 79.7% white, 18.7%
[...], and 1.6% other. Mean socioeconomic status
(SES) was 4.4 (SD = 1.8) based on Hollingshead’s
[...] scale for breadwinner’s occupation. To
ensure that the younger and older boys contributed equally
to the sample, the sample contained equal numbers of
6- to 8-year-olds and 9- to 11-year-olds, with approxi-
mately equal numbers at each year.

Norms for the profile were computed from CBCLs
of 300 normal boys, 50 at each age from 6 to 11. These
were obtained by interviewers who went to
randomly selected homes in the greater Washington,
D.C., area, as described elsewhere (Achenbach & Edel-
brock, Note 2). The normative sample contained no
boys who had received mental health services in the
[...]. Racial composition was 79.4% white,
[...], and 2.3% other. Mean SES was 4.1 [...]


Behavior Problem Scales


The frequency with which parents endorsed 
each item (i.e., scored it 1 or 2) was first 
tabulated to identify items that were too rare 
or common to add to the discriminative power 
of factor-based scales. With a lower cutoff of 
5% and an upper cutoff of 95%, four items 
were found to be too rare and none was too 
common for inclusion in the factor analysis. 
The four low frequency items were: Item 73.
Sexual problems (describe); Item 78.
Smears or plays with bowel movements; Item
105. Uses alcohol or drugs (describe);
and Item 110. Wishes to be of opposite sex.
The low frequency for Items 73 and 110 does
not mean that no sexual items remained for 
analysis (e.g., Item 5. Behaves like opposite 
sex; Item 59. Plays with own sex parts in 
public; Item 60. Plays with own sex parts too 
much; and Item 96. Thinks about sex too 
much were all reported for at least 5% of the 
cases). 
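
The frequency screen described above amounts to a simple filter on endorsement rates. The Python sketch below is one way to express it; the function names and the dictionary layout (item number mapped to a list of 0-1-2 scores, one per case) are assumptions for illustration, and treating exactly 5% as "not too rare" follows the "at least 5%" wording in the text.

```python
def endorsement_rate(scores):
    """Proportion of cases in which the parent scored the item 1 or 2."""
    return sum(1 for s in scores if s > 0) / len(scores)


def screen_items(item_scores, lower=0.05, upper=0.95):
    """Split items into kept, too-rare, and too-common lists.

    `item_scores` maps an item number to its list of 0/1/2 scores,
    one score per case (a hypothetical layout for this sketch).
    """
    kept, rare, common = [], [], []
    for item, scores in item_scores.items():
        rate = endorsement_rate(scores)
        if rate < lower:
            rare.append(item)
        elif rate > upper:
            common.append(item)
        else:
            kept.append(item)
    return kept, rare, common
```

Only the items in the kept list would go on to the factor analysis; with the cutoffs reported here, that list contained 114 of the 118 items.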

Narrow-band scales. A principal components 
analysis was performed on the 114 items meet- 
ing the 5% criterion for the 450 subjects. 
Because there is no unique criterion for rotation 
to simple structure, orthogonal (varimax) and 
oblique (direct quartimin) rotations were both 
performed on varying numbers of factors to 
identify the most robust. When more than 13 
factors were rotated, factors that had con- 
sistently appeared in smaller rotations began 
to break down into factors having only two or 
three large loadings. When less than 11 factors 
were rotated, substantial groupings that oc- 
curred in the 11-, 12-, and 13-factor rotations 
were combined into very large factors. The 
12-factor varimax rotation was selected as 
containing the best representation of the 
factors that appeared most consistently in 
the various rotations. However, only the
largest 9 of the 12 factors were retained for the
profile, as the smallest 3 had only 3-5 items
with loadings ≥ .30. Each of the 9 factors had
at least 8 items with loadings ≥ .30. Because
the largest rotated factor (Aggressive) had 33
items with loadings > .30, and many of the
items with loadings between .30 and .40 also
had substantial loadings on other factors, only
the items with loadings > .40 on this factor
were retained for the scale constructed from
this factor. Each of the scales constructed from
the other 8 factors consisted of the items hav-
ing loadings ≥ .30 on those factors. The items,
their varimax loadings, eigenvalues, and de-
scriptive labels for each scale are presented
in Table 1.


Table 1
First-Order Varimax Loadings on Behavior Problem Scales

Internalizing scales

I. Schizoid
 40. Auditory hallucination
 70. Visual hallucination
 29. Fears
 30. Fears school                  .41
 11. Clings to adults              .37
 50. Anxious                       .36
 47. Nightmares                    .31
 59. Public masturbation           .30
 75. Shy, timid                    .30
 Eigenvalue                       2.53

II. Depressed
 35. Feels worthless
 52. Feels guilty
 32. Needs to be perfect
 33. Feels unloved
112. Worrying
103. Sad
 31. Fears own impulses
 91. Suicidal talk
 12. Lonely
 14. Cries much
 50. Anxious
 71. Self-conscious
 34. Feels persecuted
 88. Sulks
 45. Nervous
 89. Suspicious
 18. Harms self
 Eigenvalue                       4.94

III. Uncommunicative
 65. Won't talk                    .61
 69. Secretive                     .50
 75. Shy, timid                    .42
103. Sad                           .36
 80. Stares blankly                .33
 71. Self-conscious                .32
 13. Confused                      .32
 86. Stubborn                      .30
 Eigenvalue                       2.97

IV. Obsessive-Compulsive
 85. Strange ideas                 .52
100. Can't sleep                   .52
 76. Sleeps little                 .45
 84. Strange behavior              .43
  9. Obsessions                    .42
 92. Walks, talks in sleep         .40
 80. Stares blankly                .40
 17. Daydreams                     .38
 46. Twitches                      .37
 83. Hoarding                      .37
 66. Compulsions                   .36
 54. Overtired                     .36
 13. Confused                      .35
 93. Excess talk                   .34
 47. Nightmares                    .33
 50. Anxious                       .33
 Eigenvalue                       4.03

V. Somatic Complaints
56f. Stomach problems              .64
56a. Pains
56b. Headaches                     .58
56c. Nausea                        .56
56g. Vomits                        .44
 49. Constipated                   .41
 51. Dizziness                     .39
 77. Sleeps much                   .32
 54. Overtired                     .31
 Eigenvalue                       3.08

Mixed scale

VI. Social Withdrawal
 48. Unliked                       .59
 25. Poor peer relations           .59
111. Withdrawn                     .56
 42. Likes to be alone             .46
 38. Is teased                     .36
 64. Prefers younger children      .33
 34. Feels persecuted              .32
102. Slow moving                   .31
 Eigenvalue                       3.05

Externalizing scales

VII. Hyperactive
  8. Can't concentrate             .65
  1. Acts too young                .58
 61. Poor school work              .56
 62. Clumsy                        .48
 13. Confused                      .45
 17. Daydreams                     .43
 41. Impulsive                     .40
 64. Prefers younger children      .40
 10. Hyperactive                   .36
 79. Speech problem                .31
 20. Destroys own things           .30
 Eigenvalue                       3.75

VIII. Aggressive
  3. Argues                        .71
 22. Disobedient at home           .66
 95. Temper tantrums               .64
 86. Stubborn                      .63
 37. Fighting                      .61
 16. Cruel to others               .60
 97. Threatens people              .57
 94. Teases                        .56
 74. Shows off                     .55
104. Loud                          .51
 23. Disobedient at school         .51
 57. Attacks people                .50
 68. Screams
 90. Swears
 25. Poor peer relations
 88. Sulks
  7. Brags
 43. Lies, cheats
 27. Jealous
 87. Moody
 19. Demands attention
 93. Excess talk
 48. Unliked
 Eigenvalue

IX. Delinquent
 82. Steals outside home
 81. Steals at home
 21. Destroys things belonging to others
106. Vandalism
 72. Sets fires
101. Truant
 67. Runs away
 39. Bad friends
 43. Lies, cheats
 20. Destroys own things
 90. Swears
 23. Disobedient at school
 Eigenvalue

Other Problems
  2. Allergy
  4. Asthma
  5. Acts like opposite sex
  6. Encopresis
 15. Cruel to animals
 24. Doesn't eat well
 26. Lacks guilt
 28. Eats nonfood
 36. Accident prone
 44. Bites nails
 53. Overeats
 55. Overweight
56d. Eye problems
56e. Rashes
 58. Picking
 60. Excess masturbation
 63. Prefers older children
 73. Sex problems
 78. Smears feces
 96. Sex preoccupation
 98. Thumb sucking
 99. Too neat
105. Alcohol, drugs
107. Wets self
108. Wets bed
109. Whining
110. Wishes to be opposite sex

Note. Items are designated with the numbers they bear on the Child
Behavior Checklist (CBCL) and summary labels for their content. For
actual wording of items, see the CBCL. Other Problems items were
excluded from scales because of low frequency or low loadings.

Scoring of scales. Because equal weights for
items are likely to provide greater robustness
in a linear discrimination system than are
weights based on factor loadings (Wainer,
1976), the unweighted raw scores (0, 1, or 2)
on all items of a scale were summed to obtain
a subject’s total score on the scale. The raw
scores obtained by the 300 normal boys were
used to compute normalized T scores for each
of the nine scales. A T score of 80 was assigned
to the highest raw score in the normal sample
on each scale, excluding outliers. Because a
substantial number of clinical subjects ob-
tained higher raw scores than any normal
subject on each scale, T scores up to 90 were
added to the T scores based on the normal
sample. This was done by dividing the T scores
from 80 to 90 into as many intervals as there
were raw scores between the highest score
obtained by a normal subject and the highest
raw score obtainable on the scale. These
fractional T scores were then assigned to the
raw scores from the highest score in the normal
sample to the highest possible score and were
rounded to the nearest whole T score.
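
The extension of T scores from 80 to 90 can be read as a linear interpolation over the raw scores above the normal range. The Python sketch below shows that reading; it is not the authors' procedure verbatim. The function name is hypothetical, the number of intervals is assumed to equal the difference between the two raw scores, and halves are assumed to round up, since the text does not say how ties were rounded.

```python
def extended_t_scores(highest_normal_raw, highest_possible_raw, t_low=80, t_high=90):
    """Map raw scores above the normal range onto T scores from t_low to t_high.

    Assumes equal-width intervals, one per raw-score step, with the highest
    normal raw score at t_low and the highest possible raw score at t_high.
    """
    steps = highest_possible_raw - highest_normal_raw
    if steps <= 0:
        raise ValueError("highest possible raw score must exceed highest normal raw score")
    table = {}
    for raw in range(highest_normal_raw, highest_possible_raw + 1):
        fractional = t_low + (t_high - t_low) * (raw - highest_normal_raw) / steps
        table[raw] = int(fractional + 0.5)  # round to nearest whole T score (halves up)
    return table
```

For instance, if the highest normal raw score on a scale were 10 and the highest obtainable score were 15, each additional raw point would add 2 T-score points between 80 and 90.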
Second-order analysis. The normalized T
scores obtained by the clinical sample on the
nine scales were intercorrelated and subjected
to a principal components analysis, with
varimax and direct quartimin rotations of the
[...] having eigenvalues > 1.00. The
results of the two rotations were very similar
[...] Scales I-V (Table 1) all had loadings
≥ .63 on the first second-order factor, whereas
[...] ordering of the scales in that
[...] had a slightly higher loading than
[...] the varimax rotation, whereas this
was reversed in the quartimin rotation.
[...] moderate loadings on both factors
in both rotations; they were .38 and .44 on
the two factors of the quartimin rotation and
[...] on the two factors of the varimax
rotation. Because the scoring of some items on
more than one scale might inflate the correla-
tions among scales, the second-order analysis
was repeated on first-order scales from which
all redundantly scored items were deleted. This
yielded the same two second-order factors.

The ordering of the scales in Table 1 follows 
the order of their loadings on the second-order 
varimax factors, with Scale I having the highest 
loading on the first second-order factor, Scale 
VI having moderate loadings on both second- 
order factors, and Scales VII-IX having 
progressively higher loadings on the second 
second-order factor. The items of Scales I-V 
clearly form a broad-band grouping like the 
Personality Problem and Internalizing group- 
ings found previously, whereas the items on 
Scales VII-IX form a broad-band grouping 
like the Conduct Problem and Externalizing 
groupings found previously (Achenbach, 1966; 
Miller, 1967; Quay & Peterson, 1967). 

To obtain normalized T scores for Internal- 
izing and Externalizing, raw scores were com- 
puted for each normal subject by summing 
his scores on all items of the five Internalizing 
scales and all items of the three Externalizing 
scales. Items that were included on more than 
one scale were scored only once to obtain the 
Internalizing or Externalizing score, but the 
three items that appeared on at least one 
Internalizing and one Externalizing scale were 
counted once each toward both the Internal- 
izing and Externalizing scores. Normalized T 
scores were derived from the distributions of 
raw scores for Internalizing and Externalizing 
in the same way as for the nine first-order 
scales, with the following modification: Be- 
cause the range of possible Internalizing and 
Externalizing scores extended far above the 
highest score obtained by any of the 450 
clinical subjects, the highest raw score actually 
obtained by any clinical subject was assigned 
a T score of 89 and all higher possible raw 
scores were assigned a T score of 90. The T 
scores from 80 to 89 were assigned by dividing 
these T scores into as many intervals as there 
were raw scores from the highest normal 
(excluding outliers) to the highest clinical 
subject. These fractional T scores were 
assigned in sequence to the raw scores ranging 
from the highest score obtained by a normal 
subject to the highest score obtained by a
clinical subject. The fractional T scores were
then rounded to the nearest whole T score.
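
The rule that an item appearing on more than one scale is scored only once in a broad-band total can be illustrated with a set union. In the Python sketch below, the function name and the example scores are hypothetical; the item numbers for the two scales are taken from Table 1.

```python
def broad_band_raw_score(item_scores, scales):
    """Sum the 0-1-2 scores over the union of the scales' items.

    `item_scores` maps item number to the parent's score; `scales` is a list
    of sets of item numbers. An item on several scales is counted once.
    """
    items = set().union(*scales)
    return sum(item_scores.get(item, 0) for item in items)


# Item numbers from Table 1 (Item 75, Shy/timid, appears on both scales).
SCHIZOID = {40, 70, 29, 30, 11, 50, 47, 59, 75}
UNCOMMUNICATIVE = {65, 69, 75, 103, 80, 71, 13, 86}

# Made-up scores for one child, purely for illustration.
example_scores = {75: 2, 50: 1, 13: 1, 86: 2, 3: 2}
combined = broad_band_raw_score(example_scores, [SCHIZOID, UNCOMMUNICATIVE])
naive = (sum(example_scores.get(i, 0) for i in SCHIZOID)
         + sum(example_scores.get(i, 0) for i in UNCOMMUNICATIVE))
```

Here `combined` counts Item 75 once, whereas `naive` (simply adding the two scale totals) counts it twice. Calling the same function separately with the Internalizing scales and with the Externalizing scales lets an item that appears in both groupings count once toward each total, as the text describes.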


Social Competence Scales 


The scores obtained on the social competence
scales by the 300 normal subjects were used to
obtain normalized T scores. A T score of 20
was assigned to the lowest raw score obtained
by a normal subject, excluding outliers. Be-
cause some clinical subjects obtained lower
scores than any normal subjects on these
scales, T scores between 10 and 20 were
assigned by dividing them into as many
intervals as there were raw scores between the
score assigned a T score of 20 and the lowest
raw score obtainable on the scale.


Age, SES, and Clinical Versus Normal 
Comparisons 


To assess differences in scores related to age, 
SES, and clinical status, unweighted-means 
analyses of variance (ANOVAs) were per- 
formed on the 300 normal subjects and 300 of 
the clinical subjects, 50 at each age. SES was 
divided into three levels, comprising Hollings-
head Occupational Categories 1 and 2, 3 and 4,
and 5-7. Age was divided into two levels,
years 6-8 and 9-11.

Behavior problem scales. A 3 (SES) X 2
(age) X 2 (clinical vs. normal) X 9 (repeated
measures on behavior problem scales) ANOVA
showed significantly higher scores for clinical
than normal subjects, F(1, 588) = 517.73,
p < .001. SES was also significant, with upper-
SES subjects obtaining the lowest scores and
lower-SES subjects obtaining the highest,
F(2, 588) = 3.37, p < .05. Age showed no
effect (F = .09), and there were no significant
interactions among SES, age, and clinical
status. The repeated measures effect of scale
was significant, F(8, 4704) [...],
as were the interactions of scale with age,
F(8, 4704) = 2.39, p < .05, [...] p < .05.

To elucidate these effects, a 3 (SES) X 2
(age) X 2 (clinical status) ANOVA was per-
formed on each scale, [...] Somatic Complaints,
F(2, 588) = 3.88, p < .05; Hyperactive, F(2,
588) = 4.44, p < .05; Aggressive, F(2, 588)
= 3.45, p < .05; and Delinquent, F(2, 588)
= 5.43, p < .01. In all significant comparisons,
lower-SES subjects had the highest scores and
upper-SES the lowest. Modified least signifi-
cant difference contrasts (Winer, 1971) showed
significantly higher scores for lower- than
upper-SES subjects on all four scales and
significantly higher scores for lower- than
middle-SES subjects on the Delinquency scale.

The ANOVA for the Schizoid scale showed
significantly higher scores for younger than
older boys, F(1, 588) = 6.33, p < .05, but the
lack of significant main effects for age in the
other eight ANOVAs and the very small F
of .09 for age in the repeated measures ANOVA
indicate that age differences were minimal.
Significant interactions between age and
clinical status in the ANOVAs for the De-
pressed and Social Withdrawal scales both
reflected higher scores for older clinical subjects
than younger clinical subjects and lower scores
for older normals than younger normals. How-
ever, on both scales, clinical subjects of both
age groups scored significantly higher than
normals. Table 2 presents all the mean scores
collapsed over age to save space.

Internalizing, Externalizing, and total score.
Separate 3 (SES) X 2 (age) X 2 (clinical
status) ANOVAs on Internalizing, External-
izing, and total raw score all showed signifi-
cantly higher scores for clinical than normal
subjects, with F(1, 588) values ranging from
362.47 to 479.76, ps < .001. The SES effect
was also significant for Externalizing, F(2, 588)
= 3.57, p < .05, and for total score, F(2, 588)
= 4.63, p < .01. In both cases, contrasts
showed significantly higher scores for lower-
SES than upper-SES boys. No other effects
were significant in any of the ANOVAs.

Social competence scales. In a 3 (SES) X 2
(age) X 2 (clinical status) X 3 (repeated mea-
sures on social competence scales) ANOVA,
clinical subjects had significantly lower scores
than normals, F(1, 528) = 254.78, p < .001.
(The total number of cases was less than in
[...] data to be scored on the School scale.) SES was
significant, with lower-SES subjects having
the lowest scores and upper-SES subjects the
highest scores, F(2, 528) = 14.61, p < .001.
Age differences were not significant (F = 1.68),
nor were any of the interactions among non-
repeated measures dimensions. The repeated
measures dimension was significant, F(2, 1056)
[...], p < .001, as were the interactions of
scale with age, F(2, 1056) = 3.09, p < .05,
and clinical status, F(2, 1056) = 27.83, [...].

SES X Age X Clinical Status ANOVAs on
each of the three social competence scales
showed significantly lower scores for clinical
subjects than normal subjects on all three
scales, with Fs(1, 528) ranging from 42.36 to
[...], all ps < .001. SES differences were also
significant on all three, with Fs(2, 528) ranging
from [...] to 9.71, all ps < .01. Contrasts
showed significantly higher scores for upper-
than lower-SES subjects on all three
scales and higher scores for upper-SES than
middle-SES subjects on the Social scale.
Middle-SES subjects scored significantly higher
than lower-SES subjects on the Activities and
School scales. The only other significant effect
reflected higher scores by older than younger
subjects on the Social scale, F(1, 528) = 4.73,
p < .05.


Table 2

                            Upper SES         Middle SES         Lower SES
                         Clinical Normal   Clinical Normal   Clinical Normal
Scale                      (63)    (67)     (141)    (97)      (96)   (136)

Social Competence
  Activities               48.2    53.4     45.6    52.9      43.3    49.1
  Social                   42.1    55.4     39.1    51.9      40.0    50.0
  School                   43.0    60.1     41.2    57.8      38.2    52.3

Behavior Problems
  Schizoid                 62.4    55.7     64.3    53.5      63.3    54.0
  Depressed                63.8    52.9     66.1    52.4      67.4    52.5
  Uncommunicative          65.0    52.6     67.0    53.1      68.2    53.5
  Obsessive-Compulsive     62.5    52.3     64.9    52.4      67.0    53.2
  Somatic Complaints       61.0    56.4     62.3    55.9      64      57.0
  Social Withdrawal        64.5    54.2     65.7    53.7      66.9    53.7
  Hyperactive              66.0    50.5     67.4    52.8      68.7    53.5
  Aggressive               64.3    51.9     68.2    50.6      70.6    51.5
  Delinquent               66.1    54.9     68.1    54.5      69.4    56.6
  Internalizing            64.6    51.1     67.2    50.7      68.8    51.4
  Externalizing            66.7    50.8     69.3    50.4      71.4    51.6
  Total raw score          54.0    20.4     61.8    20.9      65.6    22.8

Note. Numbers in parentheses are ns. SES = socioeconomic status. All
scores except total raw score are T scores. Differences between clinical
and normal samples are all significant at p < .001. See text for tests
of SES differences.


Format of the Child Behavior Profile 


On the computer-scored version of the pro- 
file, face sheets describe the nature and purpose 
of the profile and provide a listing of items on 
each behavior problem scale, plus items not 
appearing on any scale. The printout for the 
behavior problem scales presents a graphic 
display in which raw scores for the scales are 
listed in nine columns, percentiles are listed to 
the left, and T scores are listed to the right. 
An asterisk designates the child’s raw score in 
each column of the display, and the asterisks 
can be connected by pencil to provide a visual 
profile. Below the graphic display are printed 
abbreviations of the items reported by the
parent on each scale, together with the score
(1 or 2) given each item by the parent, and the 
child’s raw score and T score for each scale. 
To the right of the nine scales are printed any 
items scored as present but not belonging to a 
scale. Also printed are the total number of 
items scored as present and the sum of 1s and 
2s for all items, for the Internalizing items, and 
for the Externalizing items, plus the T scores 
for Internalizing and Externalizing. The social 
competence scales are presented in similar 
fashion on another page. The hand-scored 
version of the profile is like the computerized 
version, except that all items are printed and 
the scorer enters and sums the scores obtained. 


Test-Retest Reliability 


An interviewer obtained CBCLs from 
mothers of 12 normal boys on two occasions at 
a mean interval of 8 days (range = 7-12 days). 
Pearson correlation coefficients on the 12 
scales, Internalizing, Externalizing, and total 
behavior problem scores ranged from .72 for 
the Activities scale to .97 for total score (all 
ps < .01), with a mean of .89. (All means of
correlations were computed by z transforma-
tion.) Dependent t tests for which p values
were multiplied by 15 to correct for the number
of comparisons (Winer, 1971) showed one
significant difference from Time 1 to Time 2,
a drop in total behavior problem score,
t(11) = 3.81, p < .05. Even though this was
the only significant change in means, it should 
be noted that 14 of the 15 means decreased, 
whereas only the mean for Delinquent Be- 
havior increased from Time 1 to Time 2 
(p < .01 for the proportion of decreases/ 
increases by sign test). There thus appears to 
be a general tendency to report fewer behavior 
problems and fewer items indicative of com- 
petence on the second occasion, although the 
differences in most scores were slight.
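
Two of the computations in this paragraph are easy to check by hand: multiplying p values by the number of comparisons, and the sign test on 14 decreases versus 1 increase. The Python sketch below is illustrative only; the function names are hypothetical, and a two-sided exact binomial test is assumed because the text does not say which form of the sign test was used.

```python
from math import comb


def corrected_p(p, n_comparisons):
    """Multiply a p value by the number of comparisons, capped at 1."""
    return min(1.0, p * n_comparisons)


def sign_test_two_sided(n_decreases, n_increases):
    """Exact two-sided sign test under a null probability of .5 per change."""
    n = n_decreases + n_increases
    k = max(n_decreases, n_increases)
    upper_tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * upper_tail)


p_sign = sign_test_two_sided(14, 1)
```

With 14 decreases and 1 increase, the upper tail is 16/32,768, so the two-sided p is about .001, which agrees with the reported p < .01.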

To assess the short-term stability of profile
shapes, a Q correlation was computed between
each boy’s Time 1 profile and his own Time 2
profile [...]
is significant at p < .001 whether it is treated
as a correlation on 12 observations or as 12
subjects X 12 scores = 144 observations. To
determine whether the correlation could be an
artifact of the Q approach, a baseline correla-
tion was obtained by pairing each boy’s Time 1
profile with every other boy’s Time 2 profile
except his own. The mean of these 66 correla-
tions was -.04, which indicates that the mean
correlation of .86 between each boy’s Time 1
profile and his own Time 2 profile was not an
artifact of the Q approach. For workers inter-
ested in using the nine behavior problem scales
alone, it may be useful to know that the mean
Q correlation between Time 1 and Time 2
9-scale profiles was .84, whereas between ran-
dom pairs it was .02. As a measure more
sensitive to the similarity between profile
elevations, the mean of Cattell’s (1949) r_p was
.78 between Time 1 and Time 2 12-scale
profiles and .74 between the 9-scale profiles.
For randomly paired Time 1 and Time 2
profiles, the 12- and 9-scale means were .01
and .02, respectively.
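
The Q-correlation comparison can be sketched as follows, assuming each profile is a list of scale scores already on a common metric (e.g., T scores). The function names and data layout are hypothetical. Baseline pairs are formed as unordered pairs of different children, which is an assumption chosen because it yields the 66 pairs reported for 12 boys. Means are taken through Fisher's z transformation, as the text describes.

```python
from itertools import combinations
from math import atanh, sqrt, tanh


def pearson(x, y):
    """Pearson correlation between two equal-length score lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mean_correlation(rs):
    """Average correlations via Fisher's z (undefined for r of exactly +1 or -1)."""
    return tanh(sum(atanh(r) for r in rs) / len(rs))


def q_correlations(time1, time2):
    """Return (own-pair, cross-pair) Q correlations.

    `time1` and `time2` map a child identifier to that child's profile.
    Cross pairs match one child's Time 1 profile with a different child's
    Time 2 profile, one correlation per unordered pair of children.
    """
    own = [pearson(time1[c], time2[c]) for c in time1]
    cross = [pearson(time1[a], time2[b]) for a, b in combinations(time1, 2)]
    return own, cross
```

Comparing `mean_correlation(own)` with `mean_correlation(cross)` gives the kind of contrast reported here, where own-profile correlations are high and the random-pair baseline is near zero.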


Interparent Agreement


Mothers and fathers of 37 clinic boys inde-
pendently filled out the CBCL. Pearson corre-
lations between scores obtained from mothers’
and fathers’ CBCLs on the 12 profile scales,
Internalizing, Externalizing, and total behavior
problem score ranged from .58 for the Activities
scale to .87 for the School scale (all ps < .001),
with a mean of .74. Dependent t tests for which
p values were multiplied by 15 to correct for
chance showed a significant interparent differ-
ence only on the School scale, where fathers
gave higher scores than mothers, t(36) = 3.38,
p < .05. Across all 15 comparisons, the
fathers’ mean scores were higher on six and
the mothers’ on nine (p > .40 by sign test).

After standardization of scores on each scale
within the sample of 37, the mean Q correlation
for the 37 pairs of 12-scale profiles was .69. For
the nine behavior problem scales, the average
Q correlation was .74. By comparison, the
means of the 666 12- and 9-scale correlations
for randomly paired mothers and fathers were
-.02 and .04, respectively. The mean r_p for
the wife-husband 12-scale profile pairs was [...]
and for the 9-scale profile pairs was .69, com-
pared to .03 and .04, respectively, for the
random pairs.


Stability of Behavior Problem Scores


As part of a follow-up study, 46 parents who
filled out the behavior problem portion of
the CBCL when applying to child guidance
clinics were asked to complete it again at a
mean interval of 14.8 months (range = 9-27
months). The families had received a mean of
[...] clinical interviews (range = 0-50), but
had terminated with the clinics before the
follow-up was begun. To avoid overlap between
initial and follow-up data, the parents were
asked to report only behavior problems occur-
ring within the 6 months prior to follow-up,
rather than within the previous 12 months, as
requested on the initial CBCL. Because of this
[...] baseline period, possible regression of
scores toward the mean, and “hello/good-bye”
effects, changes in scores should not necessarily
be interpreted as indicating improvement.

Correlations for the nine scales,
Internalizing, Externalizing, and total score
ranged from .26 for Somatic Complaints to .79
for Delinquent Behavior, with a mean of .63,
all significant except Somatic Complaints.
However, the follow-up CBCLs showed de-
creases on all nine behavior problem scales, as
well as on total behavior problem score and T
scores for Externalizing and Internalizing.
With p values multiplied by 12 to correct for
chance, the decreases were significant by
dependent t tests for all scores except Schizoid,
Somatic Complaints, and Social Withdrawal.
The general decrease in reported problems was
reflected in the mean r_p of .43 between
initial and follow-up profiles, compared with
a mean r_p of .03 between the 1,035 randomly
paired intake and follow-up profiles. Despite
the decreases in reported problems, the mean
Q correlation between initial and follow-up T
scores was .64, indicating considerable long-
term stability in profile shape, as compared
with a mean Q correlation of .02 for the 1,035
random pairs.


Discussion 


Although no other studies have taken
exactly the same approach, comparison of
the present findings with the most similar
previous studies, those of Achenbach (1966) 
and Miller (1967), reveals considerable simi- 
larity along with some differences that are 
worth noting. (A more extensive survey of 
previous findings for both sexes, various age 
groups, and various sources of data is pre- 
sented by Achenbach & Edelbrock, in press.) 
Despite the differences in behavior checklists 
and the fact that Achenbach (1966) used case 
history data, six of the present narrow-band 
factors are similar to factors that he obtained 
for boys. These six are the Schizoid, Obsessive- 
Compulsive, Somatic Complaints, Hyper- 
active, Aggressive, and Delinquent factors. The 
last four of these six are also similar to narrow- 
band factors found by Miller (1967), which he 
named Anxiety, Hyperactivity, Infantile 
Aggression, and Antisocial. In addition, the 
Social Withdrawal factor found in the present 
study is similar to Miller’s factor of the same 
name. Miller’s failure to find factors like the 
Schizoid and Obsessive-Compulsive factors is 
probably due, as Miller pointed out, to the 
lack of severely disturbed children in his 
sample. 

The remaining two narrow-band factors in 
the present study, those labeled Depressed and 
Uncommunicative, have no direct counterparts 
in the previous factor analyses of boys’ be- 
havior problems. However, the Depressed 
factor is quite similar to the Depressive Symp- 
toms factor that Achenbach (1966) obtained 
for girls. The emergence of such a factor for 
boys suggests that cultural changes may be 
leading either to a greater incidence of depres- 
sion in boys or to a greater willingness to 
acknowledge such feelings in boys. Whatever 
the reason, it appears that the recent spurt of 
interest in childhood depression is well justified 
(e.g., Lewis & Lewis, Note 3). The other 
narrow-band factor, labeled Uncommunicative, 
has no clear counterpart in either of the 
previous studies. Although the value of each 
scale lies less in the interpretation of its meaning 
than in its ability to add discriminative 
power to the profile as a whole, this scale is 
suggestive of a constriction in self-expression 
that might accompany depression in some 
children. 

The second-order Externalizing and Internalizing 
factors are quite similar to the groupings 
given these names by Achenbach (1966) 
and to the second-order Aggression and Social 
Inhibition factors found by Miller (1967). 
Miller’s remaining second-order factor, entitled 
Learning Disabilities, was not likely to appear 
in the present data because of the different 
approach taken to the scoring of school 
performance. Miller’s inclusion of several 
similar items reflecting poor school performance 
(e.g., reads poorly; spells poorly; writes 
poorly) made a factor comprising these items 
almost inevitable. To avoid factors resulting 
from redundancy in items, the CBCL includes 
only the general item, poor school work, as a 
behavior problem. However, to provide a 
differentiated picture of school performance, 
the CBCL assigns scores for all academic 
subjects, which are then averaged and com- 
bined with scores for special versus regular 
class status, repeating grades, and other school 
problems to yield a score for the School scale. 

The major objective of this research does 
not, of course, end with the creation of scales 
for behavioral problems and competencies. 
More important is the value of the profile for 
describing children’s behavior in an economical 
but comprehensive and meaningful fashion, 
the power of the profile to discriminate among 
children who may benefit from different kinds 
of help, and the sensitivity of the profile to 
changes as well as stabilities in children’s 
behavior. Because they preserve a maximum 
of information about children’s behavior, the 
profile patterns may provide a much better 
basis for classifying children than do traditional 
diagnostic categories or scores on 
individual scales. The short- and long-term 
test-retest correlations indicate stability in 
patterning, and the interparent correlations 
reflect agreement between parents’ perceptions 
of patterning in their children’s behavior. 
Highly significant differences between normal 
and clinical subjects on all scales also demon- 
strate discriminative validity. Studies are now 
under way to determine whether profile patterns 
can be identified that significantly differentiate 
children with respect to long-term 
prognosis and other clinically relevant 
characteristics. 


Increasing the Interpersonal Problem-Solving 
Skills of an Alcoholic Population 


James C. Intagliata 
Division of Community Psychiatry, State University of New York at Buffalo 


Sixty-four male alcoholics admitted into the Alcoholism Treatment Program at 
the Veterans Administration Hospital in Buffalo, New York, were assigned to 
either the control or treatment group. Control subjects participated in all 
standard treatment aspects of the program. Treatment subjects, however, 
participated in an additional 10 sessions of group therapy structured specifically to 
improve interpersonal problem-solving thinking skills. Comparisons conducted at 
the point of discharge (generally 6 weeks after admission) demonstrated that 
treatment subjects had made significantly greater improvement on a measure of 
problem-solving thinking than had controls. Further, a comparison of subjects’ 
responses in a structured discharge interview demonstrated that treatment 
subjects were significantly more likely to anticipate and plan ahead for 
postdischarge problems than were control subjects. Analysis of the data in the study 
also revealed that the means-ends problem-solving procedure can reliably 
discriminate individuals within an adult alcoholic population who differ in their 
levels of social competence and in the quality of their planning for coping with 
postdischarge problems. Finally, follow-up at the 1-month postdischarge point 
indicated that the majority of treatment subjects contacted had made practical 
use of the problem-solving principles that were taught in the group sessions. 


This article is based on the author's doctoral dissertation, 
supervised by Murray Levine and submitted to 
the Department of Psychology at the State University 
of New York at Buffalo. 

The author thanks the staff of the Alcoholism Treatment Unit 4-D of 
the Buffalo, New York, Veterans Administration Hospital 
for their assistance in conducting this study, 
and Louis English, who made valuable suggestions 
with respect to the analysis of study data. 

Requests for reprints should be sent to James C. 
Intagliata, Division of Community Psychiatry, 462 
[...] Street, Buffalo, New York 14215. 


It has been suggested that the capacity to 
problem solve in real-life situations is one 
criterion for defining positive mental health 
(Jahoda, 1958). This suggestion has received 
increasing empirical support from a number of 
studies examining the relationship between 
cognitive interpersonal problem-solving skills 
and psychological adjustment. These studies 
have used the means-ends problem solving 
(MEPS) procedure (Platt & Spivack, 1975) 
to examine interpersonal problem-solving cognition 
in a variety of groups including preschool 
age children (Shure, Spivack, & Jaeger, 1971; 
Shure, Newman, & Silver, Note 1), young 
adolescents (Platt, Spivack, Altman, Altman, & 
Peizer, 1974; Spivack & Levine, 1963), and 
adults (Platt, Scura, & Hannon, 1973; Platt & 
Spivack, 1972a; Platt, Spivack, & Siegel, 1975). 
The results have demonstrated that for each of 
the age categories, problem-solving cognition is 
an adaptive thinking ability that successfully 
discriminates between groups that clearly 
differ in their level of demonstrated adjustment 
(e.g., impulsive adolescents at a residential 
school vs. normal high school controls; 
adult psychiatric patients vs. normal adult 
controls). While these studies have demonstrated 
that problem-solving cognition can 
discriminate between groups that differ grossly 
in their level of adjustment, there is evidence 
that this variable can also discriminate among 
persons in a homogeneous group who differ 
only in the degree of their social competence 
or effectiveness (Platt & Spivack, 1972b; 
Ziegler & Phillips, 1962). Thus, empirical 
research has built a case for the assertion 
that interpersonal problem-solving cognition 
plays an important role in healthy human 
functioning. 

The evidence that interpersonal problem- 
solving skill is intimately related to behavioral 
adjustment has quite naturally stimulated the 
development of intervention strategies de- 
signed to increase the problem-solving effec- 
tiveness of groups deficient in this ability. One 
of the most ambitious and successful strategies 
used has been Spivack and Shure’s (1973) 12- 
week problem-solving skills training program 
for primary-grade children. The goal of this 
program was to teach children a style of 
problem solving that could guide them to cope 
more successfully with everyday problems. 
Results demonstrated that children who re- 
ceived the training, when compared with 
controls, showed generation of more alterna- 
tives to problems, better consequential 
thought, and a decided shift away from 
aggressive solutions to problems. Behaviorally, 
the trained children showed greater concern 
for the feelings of other children, were better 
liked by their peers (i.e., were sought out 
more), showed greater initiative in the class- 
room, and had greater autonomy (ability to 
complete activities and overcome obstacles 
without adult assistance). 

Following these results, several attempts 
were made to modify this training approach so 
that it could be extended for use with adult 
populations that were deficient in interpersonal 
problem-solving skills. For example, Siegel and 
Spivack (1973) designed a program to train 
chronic schizophrenics in interpersonal 
problem-solving processes. Although they only 
piloted the program with 11 subjects, the 
authors were encouraged by the patients’ re- 
sponses and concluded that the approach 
seemed generally feasible. Platt (Note 2) re- 
ported on the use of a problem-solving skills 
training program with incarcerated heroin 
addicts. Though subjects were not randomly 
assigned to groups, the authors felt that the 
findings, which included a significant relation- 
ship between graduating from the cognitive 
problem-solving skills training program and 
successful parole performance, suggested both 
the feasibility and potential effectiveness of 
their intervention approach. 
The present article reports the results of an 
extensive two-part study designed to extend 
training in cognitive interpersonal problem 
solving to an adult alcoholic population. The 
first study explores the interrelationships in 
this population between scores on the MEPS 
procedure and (a) the Ziegler—Phillips (1962) 
social competence scores, (b) scaled vocabulary 
subtest scores from the Wechsler Adult Intel- 
ligence Scale (WAIS), and (c) scores on a 
discharge interview structured to assess the 
patient’s anticipation of and planning for post- 
discharge problems. The second study assesses 
the impact of a 10-session cognitive problem- 
solving skills training program on subjects’ 
problem-solving thinking and behavior. 

A major reason for selecting an alcoholic 
population with which to work was the rela- 
tively successful results of the training program 
that Platt (Note 2) conducted with heroin 
addicts. His work suggested the possible 
relevance of such training programs for addict 
populations in general. It also seemed reasonable 
that alcoholics might benefit from a 
program that emphasized learning the habit 
of generating a variety of constructive coping 
behaviors in stressful real-life situations. 


Study 1 
Method 


Subjects. The subjects were 63 male patients admitted 
to the alcoholism treatment unit (4-D) at the 
Veterans Administration Hospital in Buffalo, New 
York, during a 14-week period. The treatment program 
on this unit involves approximately 6 weeks of inpatient 
care. Only veterans are eligible for admission. The first 
32 consecutive admissions were assigned to the control 
group. The assignment of 32 consecutive admissions to 
the treatment group, however, did not take place until 
3-4 weeks later. This design was used so that treatment 
subjects would not begin their problem-solving skills 
training until all control subjects had been discharged 
from the program. If control subjects had been allowed 
to be in daily contact with treatment subjects for several 
weeks, their usefulness as a control group would have 
been diminished substantially. Although 32 subjects 
were assigned to the treatment group, one patient left 
the hospital shortly after attending the first training 
session and was dropped from the study sample. Thus, 
the treatment group was comprised of 31 subjects. 

The subjects ranged in age from 23 to 66. The 
average age was 45. With respect to educational 
background, 27% had completed eight grades or less, [...] 
had attended high school, 36% had completed high 
school, [...] gone to college, and 5% had received 
[...] degree. Subjects’ scaled scores on the 
WAIS Vocabulary subtest ranged from 5 to 16. The 
[...] score was 10. Of the 63 subjects in the 
[...] were single and had never been married, 
[...] but had been married previously, and 
[...] currently married. In the 3 years preceding 
admission to the hospital, 51% had held one 
[...] 1% had held and lost several jobs, and 
[...] worked at all. 

The control and treatment groups were compared 
[...] to age, educational background, scores on 
the Vocabulary subtest of the WAIS (which correlates 
[...] Full Scale WAIS score), marital status, and 
employment history. Results revealed that these two 
groups did not differ significantly on any of these 
variables. 

Instruments and procedure. Two weeks after their 
admission, subjects were administered the MEPS 
procedure (Platt & Spivack, 1975). 


The procedure makes use of story stems portraying 
[...] in which a need is aroused in the protagonist 
at the beginning of the story and is resolved by 
[...] at the end. The respondent is required to complete 
the story by filling in those events which might 
have occurred between the arousal and satisfaction 
of the hero’s need. (Platt & Spivack, 1975, p. 16) 


Stories were scored for the number of discrete 
means generated that effectively enabled story protagonists 
to reach the resolution points of the stories. 
The first administration of the procedure involved the 
[...] selection of 5 stories from the 10-story MEPS 
[...] At the 6-week postadmission point, all subjects 
were administered whichever 5 stories they had 
not previously been given. Thus, subjects were administered 
the MEPS procedure both shortly after recovering 
from alcoholic detoxification and immediately prior 
to discharge. Between the two administrations of 
the MEPS, treatment subjects completed 10 group 
training sessions in problem-solving thinking skills. 
They participated in these sessions in addition to all 
standard components of the 6-week treatment program. 
Control subjects participated in all aspects of the 
treatment program with the exception, of course, of the 
problem-solving training sessions. The nature of the 
training sessions will be explained in considerably more 
detail in Study 2. 
On the basis of information gathered at the time of 
admission, each subject was also assigned a Ziegler-Phillips 
social competency score. Following the procedure 
outlined by Ziegler and Phillips (1962), the social 
competence measure is an additive score based on a 
subject’s age, IQ, educational background, employment 
history, marital history, and occupational level. Subjects 
received a score of 0, 1, or 2 in each of these six 
areas. These six scores were totaled for each subject as 
an index of his social competence. 
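
The additive social competence index described above can be sketched in a few lines. This is only an illustration under stated assumptions: the area labels are hypothetical names chosen here, and the rules for assigning 0, 1, or 2 within each area (given by Ziegler and Phillips, 1962) are not reproduced.

```python
# Illustrative sketch of an additive index: six areas, each scored 0, 1, or 2.
# The area labels below are hypothetical names, not the original manual's terms.
AREAS = (
    "age", "iq", "education",
    "employment_history", "marital_history", "occupational_level",
)


def social_competence_score(area_scores):
    """Return the sum of the six area scores (possible range 0 to 12)."""
    missing = [area for area in AREAS if area not in area_scores]
    if missing:
        raise ValueError(f"missing areas: {missing}")
    for area in AREAS:
        if area_scores[area] not in (0, 1, 2):
            raise ValueError(f"{area} must be scored 0, 1, or 2")
    return sum(area_scores[area] for area in AREAS)


# Made-up subject, for illustration only.
example = {
    "age": 1, "iq": 2, "education": 1,
    "employment_history": 0, "marital_history": 1, "occupational_level": 1,
}
print(social_competence_score(example))
```

Because every area contributes equally, the index treats the six areas as interchangeable; a subject's total says nothing about which areas produced it.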
Finally, each subject received a score based on his 
responses to a series of standardized probes in a structured 
discharge interview administered to each patient 
by his group therapist shortly before discharge. The 
interview was structured to assess the comprehensiveness 
of subjects’ discharge planning with respect to 
three key life areas: (a) employment, (b) living situa- 
tion, and (c) free time. Four basic questions were used 
to explore the discharge planning that each patient had 
done in each area. These questions assessed (a) the 
specific plans a subject had made for dealing with this 
area of his life, (b) any obstacles that the subject 
anticipated might interfere with the plans he had made, 
(c) any alternative plans that the subject had considered 
before deciding on his chosen plan of action, and (d) 
any evidence that the subject could present of having 
already taken action to develop or implement his 
discharge plans. So that subjects’ responses to the 
discharge interview could be scored, all discharge 
interviews were tape recorded. 

Scoring. Scoring procedures for the MEPS testing 
were developed by Platt and Spivack (1975), whereas 
procedures for scoring the discharge interview were 
developed by me and involved only slight modifications 
of the MEPS scoring system. The scoring for both the 
MEPS instrument and the discharge interview was 
carried out by two senior psychology students who 
were unaware of the group membership of any indi- 
vidual protocol. 

The scoring procedure for the MEPS testing involved 
noting (a) the relevancy, the number and kinds of 
means generated by a subject, (b) the number of 
elaborations or added details that a subject provided 
to explain each means, (c) the number of obstacles 
mentioned that might get in the way of the hero reach- 
ing the goal in the story, and (d) any mention of time 
elapsing before the hero reaches the resolution point 
of the story. A relevant means was defined as an 
instrumental act that enabled the hero to move toward 
or reach the goal in the story. 

The scoring for the discharge interviews involved 
noting (a) the number of discrete, relevant means 
(instrumental acts) that a subject described as his plan 
for dealing with problems at work, home, or occupying 
his free time; (b) the number of obstacles mentioned 
that the subject felt might get in the way of his carrying 
out his plans; (c) the number of alternative plans that 
a subject had considered but decided not to put into 
action; and (d) the number of discrete acts in which a 
subject had engaged to formulate or implement his 
discharge plans. 

The two raters were trained in the MEPS scoring 
procedures over a period of 1 month. As a first step, 
they familiarized themselves with the description of 
the scoring procedures provided by Platt and Spivack 
(1975). The final step, [...] 
practice with the system, [...] 
hypothetical stories to [...] 
coefficient computed [...] 
means assigned by the two raters to each story [...] 
(Spearman-Brown prediction formula). Since the 
scoring procedures for the discharge interview were 
fundamentally the same as those for the MEPS scoring, 
no separate reliability coefficient was calculated for the 
interview scoring. The clear-cut scoring procedures for 
the Ziegler-Phillips competency measure have already 
been described in the preceding section and are outlined 
in greater detail by Ziegler and Phillips (1962). 
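
For readers unfamiliar with the Spearman-Brown prediction formula named above, a minimal sketch follows. It assumes the usual form of the formula, in which a correlation r between single measurements is stepped up to the reliability of a composite of k such measurements; treating k = 2 as "two raters" is an assumption here, and the input value is made up because the coefficient itself is not legible in this copy.

```python
def spearman_brown(r, k=2.0):
    """Predicted reliability of a composite of k parallel measurements,
    given the correlation r between single measurements."""
    denominator = 1.0 + (k - 1.0) * r
    if denominator == 0:
        raise ValueError("formula is undefined for this r and k")
    return k * r / denominator


# Made-up single-rater correlation, for illustration only.
print(round(spearman_brown(0.80), 3))
```

With k = 1 the formula returns r unchanged, which is a quick sanity check on any implementation.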
Results 


This study was conducted to explore the 
content and concurrent validity of the MEPS 
measure by examining the set of interrelation- 
ships between it and measures of subjects’ IQ, 
social competence, and discharge planning. 
For the purpose of examining these relation- 
ships, the following specific scores were used: 
(a) The total number of relevant means served 
as a subject’s MEPS score; (b) the subject’s 
scaled WAIS Vocabulary score served as an 
index of IQ; (c) the subject’s total score from 
the six life areas designated by Ziegler and 
Phillips (1962) served as the social competence 
measure; and (d) the subject’s total points 
from the areas of work, living situation, and 
free time planning served as the discharge 
interview score (1 point for each discrete plan, 
obstacle, alternative, and action step 
mentioned). 
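
The discharge interview score defined in (d) is a simple tally, sketched below under stated assumptions: the dictionary layout and key names are hypothetical, and only the rule of 1 point per discrete plan, obstacle, alternative, and action step in each of the three areas comes from the text.

```python
# Hypothetical labels for the three life areas and four scored elements.
LIFE_AREAS = ("employment", "living_situation", "free_time")
ELEMENTS = ("plans", "obstacles", "alternatives", "action_steps")


def discharge_interview_score(counts):
    """Total points: 1 per discrete element mentioned, summed over all areas.

    `counts` maps each life area to a dict of element counts; anything
    omitted is treated as zero.
    """
    total = 0
    for area in LIFE_AREAS:
        area_counts = counts.get(area, {})
        for element in ELEMENTS:
            n = area_counts.get(element, 0)
            if n < 0:
                raise ValueError("counts cannot be negative")
            total += n
    return total


# Made-up interview protocol, for illustration only.
protocol = {
    "employment": {"plans": 2, "obstacles": 1, "action_steps": 1},
    "living_situation": {"plans": 1, "alternatives": 1},
    "free_time": {"plans": 2},
}
print(discharge_interview_score(protocol))
```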

When the interrelationships between scores 
on the first administration of the MEPS, the 
scaled Vocabulary score from the WAIS, and 
the Ziegler—Phillips social competency measure 
were explored, it was demonstrated that scores 
on the MEPS instrument were significantly 
related both to the social competency score 
(r = .20, n = 58, p < .05) and to the scaled 
Vocabulary subtest score (r = .38, n = 62, 
p < .001). However, when the effect of the 
Vocabulary score was partialed out of the rela- 
tionship between the MEPS score and social 
competency score, the strength of the relation- 
ship decreased greatly and was no longer 
significant (r = .06). This result is unlike that 
reported by Platt and Spivack (1972a). They 
found that when general intelligence factors 
were removed from the social competency 
score, the relationship between the MEPS 
measure of interpersonal problem-solving cog- 
nition and the social competency score 
increased. 
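
The partialing step reported above can be illustrated with the standard first-order partial correlation formula. This is a sketch under stated assumptions, not the author's computation: the correlation between Vocabulary and social competence is not reported in the text, so the value 0.39 below is a hypothetical figure chosen only to show how a zero-order r of .20 can shrink to roughly .06.

```python
import math


def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation between x and y with z held constant (first-order partial)."""
    denominator = math.sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))
    if denominator == 0:
        raise ValueError("undefined when x or y is perfectly correlated with z")
    return (r_xy - r_xz * r_yz) / denominator


r_meps_competence = 0.20   # reported in the text
r_meps_vocabulary = 0.38   # reported in the text
r_competence_vocab = 0.39  # hypothetical: not reported in the text

print(round(partial_correlation(
    r_meps_competence, r_meps_vocabulary, r_competence_vocab), 2))
```

The formula makes the result intuitive: when the product of the two correlations with Vocabulary is close to the zero-order correlation itself, little is left once Vocabulary is removed.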

A second set of interrelationships that was 
explored examined the relationship between 
the scores on the final administration of the 
MEPS and both the discharge interview score 
and the scaled Vocabulary subtest score from 
the WAIS. Results revealed that the final 
MEPS score was related quite significantly 
both to the discharge interview score (r = .38, 
n = 56, p < .002) and to the original vocabulary 
measure (r = .26, n = 55, p < .02). When 
the effect of the vocabulary score was partialed 
out of the relationship between the final MEPS 
score and the discharge interview score, the 
relationship between the two remained strong 
and significant (r = .33). 

When the relationship between the WAIS 
vocabulary measure and the MEPS scores on 
both the initial and final MEPS testing was 
examined for the entire sample, there was a 
significant correlation between WAIS and 
MEPS scores at both points (r = .38 for initial, 
r = .26 for final). However, when relationships 
were examined separately for the control and 
treatment groups, it appeared that the prob- 
lem-solving training sessions, to which only the 
treatment subjects were exposed, may have 
had some impact on the way in which the 
MEPS and vocabulary measures relate. For 
the control group, the correlation between the 
two measures changed from .43 (initial MEPS 
and vocabulary) to .32 (final MEPS and 
vocabulary). Both of these correlations were 
significant (p < .007 and p < .04, respectively). 
For the treatment group, on the other 
hand, the correlation coefficient changed from 
.36 (initial MEPS and vocabulary) to -.05 
(final MEPS and vocabulary). Although the 
relationship between the two measures was 
significant for the treatment group at the 
initial MEPS testing (p < .02), the two 
measures were relatively unrelated at the time 
of the final MEPS administration (p < .30). 
Further, not only did the strength of the 
relationship decrease for the treatment group, 
but the valence shifted from positive to 
negative. 


Study 2 
Method 


Subjects. The subjects are the same as described 
in Study 1. 

Instruments and procedure. The treatment group of 
31 subjects participated in a series of 10 60-minute 
sessions of interpersonal problem-solving group therapy. 
These sessions comprised a systematic [...] intended 
to increase participants’ problem-solving skills. 
The 10 sessions combined and integrated [...] 
materials prepared by Platt, Spivack, and Swift (19[...]) 
in a structured program of interpersonal problem-solving 
group therapy for adults. The 10 sessions were 
organized to teach a four-step approach to [...] 
problem solving. These steps were to (a) recognize that 
a problem exists; (b) define the problem; (c) generate 
a number of alternative solutions; and (d) select the 
best alternative after having looked ahead to imagine 
the consequences of each. The treatment group 
was divided into four subgroups (7, 6, 8, [...]) 
which began the 10-session program 
[...] 1 week apart. The 10 sessions took place 
[...] of 4 weeks (2-3 sessions per week). All 
[...] had spent at least 2 weeks in the treatment 
[...] before problem-solving sessions began. 

As described in Study 1, all subjects were administered 
the MEPS procedure 2 weeks following their 
admission to the hospital and again 4 weeks later. This 
[...] provided a means of assessing the impact 
of the problem-solving sessions on the participants’ 
problem-solving thinking skills. The timing was set so 
that all members of the treatment group had completed 
the 10-session program prior to the final MEPS 
administration. 

In addition to the problem-solving sessions, one other 
independent variable was manipulated in this study. 
[...] a preparation set [...] of subjects who 
were about to be administered the structured interview 
[...] the quality and comprehensiveness of their 
discharge planning. One half of the subjects in both 
the control and treatment groups were randomly 
[...] in which they were told that 
their group therapist would interview them about the 
planning they had done for their discharge. They were 
informed about an hour before they were to be interviewed. 
[...] subjects in each group 
[...] in which they were not 
[...] until it actually 
[...] their discharge 
planning and would give them a chance to demonstrate 
to their group leader that they had given a lot of 
thought to the problems that they would soon be facing. 

This manipulation was designed to serve two purposes. 
First, it provided a means of testing the efficiency 
of the problem-solving skills training as a means of 
increasing the effectiveness and thoroughness of subjects’ 
planning for the real-life problems posed by their 
discharge. If the mere suggestion to control subjects 
that they ought to give a little thought to the problems 
that they would face following discharge should result 
in their performing as well as treatment subjects who 
have completed the problem-solving sessions, one 
would have to conclude that the sessions were a rather 
inefficient means of encouraging subjects to anticipate 
problems and plan ahead. Second, this manipulation 
provided a means of testing the degree to which the 
generalization of problem-solving skills taught in the 
training sessions was dependent on specific reminder 
cues. 

Thus, the second part of the study provided two 
measures to assess the impact of the problem-solving 
training sessions. The first was the paper-and-pencil 
MEPS measure used to assess changes in cognitive 
problem-solving behavior. The second was the structured 
discharge interview to assess the quality of 
subjects’ plans for dealing with impending real-life 
problems. 


Table 1 
Pre-Post Comparisons of Story Elements 

        Pretest                  Posttest              Pre-post change 
Control Treatment   t    Control Treatment   t    Control Treatment   t 
 (32)     (31)    score   (27)     (29)    score   (27)     (29)    score 

5.96 5.77 5.85 9.27 .29 
3.33 2.61 26 3.62 3.29 =3170"*2 3.51 
69 .16 1.18 see =2,10% .06 
DARE T19 .26 416 27 
2.25 1.96 68 3.13 4.24 —.89 Ne 
2.92 2.41 4.02 4.73 43 
so as by 63 o7 4 - 1.50 et 
A2 +92 26 1.12 : 
18 .03 1.74 .07 41 —1.60 ries 
AT 18 26, 1.02 “ 
1.01 
06 

Note. Numbers in parentheses are ns. 
** p = .01. 


Results 


MEPS measure. It was hypothesized that 
the treatment group would show significantly 
Table 2 
Interview Analysis of Variance 

Source                         SS        df   MS      F 
Main effects                   188.4     2    94.2    5.7** 
  Control vs. treatment (A)    183.7     1    183.7   11.1*** 
  Interview preparation (B)    S         1    5.7     ha 
A × B                          .4        1    .4      .02* 
Residual                       971.5     59   16.4 
Total                          1,160.4   62   18.7 

* p < .99. 
** p < .006. 
*** p < .002. 


more improvement on the MEPS measure 
than would the control group. This was ex- 
pected, since the MEPS instrument is designed 
to measure interpersonal problem-solving skills 
and only treatment subjects participated in 
group sessions specifically designed to increase 
these skills. Table 1 presents the means and 
standard deviations for the various criteria on 
which the subjects’ stories were scored. First, 
the two groups were compared with respect to 
the number of relevant means that they 
generated for the five MEPS stories that were 
administered as pretests and posttests. An 
examination of the difference scores (difference 
= number of posttest means minus number of 
[...] control group, 
[...] -3.32, p < .002. Even though the 
two groups performed approximately the same 
on the pretest (control [...] 
the scores obtained by 
the treatment and control groups on the individual 
MEPS stories for the posttest demonstrated 
that the significant overall differences 
[...] 
was greater than that of the control group for 
each of the 10 stories that comprise the MEPS 
procedure. 

Mean relevancy scores for the treatment and 
control groups are also presented in Table 1. 
The relevancy score was obtained by dividing 
the number of relevant means generated by 
the total of all the means generated for each 
subject. The relevancy score is an index of the 
extent to which subjects in the two groups 
responded with relevant and effective solutions 
to the MEPS stories administered. Although 
the treatment group showed more positive 
change in the relevancy score from pretesting 
to posttesting than did the control group, this 
difference did not reach statistical significance, 
t(54) = .24, p < .24. That the treatment 
group increased their relevancy score from .16 
to .91, whereas controls increased merely from 
.69 to .75, however, suggests that there is a 
strong trend that favors the treatment subjects 
in the predicted direction. 
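
The relevancy score defined above is a simple proportion, sketched below. The function name and the handling of a subject who generated no means at all are assumptions made here for illustration; the text specifies only the ratio of relevant means to all means generated.

```python
def relevancy_score(relevant_means, total_means):
    """Proportion of a subject's generated means that were scored as relevant."""
    if total_means <= 0:
        raise ValueError("relevancy is undefined when no means were generated")
    if not 0 <= relevant_means <= total_means:
        raise ValueError("relevant means must lie between 0 and the total")
    return relevant_means / total_means


# Made-up counts for one subject, for illustration only.
print(relevancy_score(6, 8))
```

Because the score is a proportion, it is bounded between 0 and 1, and a subject can raise it either by producing more relevant means or by producing fewer irrelevant ones.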

In addition to the comparisons between the 
two groups on the basis of number and relevancy 
of means generated, Table 1 also 
presents a comparison of the groups with 
respect to other story elements including 
enumerations (elaborations) of means, obstacles, 
and the acknowledgment of passed 
time. The control and treatment groups did 
not differ significantly on any of these measures. 
However, the treatment group tended 
to perform better on each of these dimensions 
than did the control group on the posttest 
protocols. 

Discharge interview. Performance in the 
interview was analyzed by means of a 2 × 2 
analysis of variance (control/treatment vs. 
prepared for interview/unprepared). The results 
of this analysis are presented in Table 2. 
These results demonstrated a highly significant 
main effect due to group membership, F(1, 59) 
= 10.2, p < .003, no main effect resulting from 
differences in preparation for the interview, 
F(1, 59) = .002, p < .99, and no significant 
interaction between group membership and 
interview preparation, F(1, 59) = 1.0, p < .99. 
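
As a reading aid for the analysis of variance, the sketch below shows how an F ratio is formed from a sum of squares and its degrees of freedom. It uses the group-membership and residual entries as they are printed in Table 2; this is an arithmetic illustration only, and a ratio recomputed from rounded table entries need not reproduce the F values quoted in the text.

```python
def f_ratio(ss_effect, df_effect, ss_residual, df_residual):
    """F = mean square for the effect divided by the residual mean square."""
    if df_effect <= 0 or df_residual <= 0:
        raise ValueError("degrees of freedom must be positive")
    ms_effect = ss_effect / df_effect
    ms_residual = ss_residual / df_residual
    return ms_effect / ms_residual


# Entries as printed in Table 2: control vs. treatment (A) and Residual.
f_group = f_ratio(ss_effect=183.7, df_effect=1, ss_residual=971.5, df_residual=59)
print(round(f_group, 2))
```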

Table 3 presents the actual means and 
standard deviations of interview scores for 
both the treatment and control groups. This 
table demonstrates that the significant differ- 
ence between the two groups in the interview 
was not a result of one question or a specific 
subset of questions. Rather, the treatment 
group performed better on all dimensions of 
the interview. 
Table 3 
Comparison of Discharge Interview Performance 

Variable and group                          M      SD     t      df   p < 
Total interview score 
  Control                                   7.6    3.1 
  Treatment                                 11.0   4.7    -3.4   51   .002a 
Total plans score 
  Control                                   4.6    1.7 
  Treatment                                 5.9    2.2    -2.5   61   .01 
Total obstacles score 
  Control                                   .8     .8 
  Treatment                                 1.4    1.2    -2.5   51   .018 
Total plans rejected score 
  Control                                   .4     .6 
  Treatment                                 .9     1.3    -1.7   41   .09 
Total action-steps taken score 
  Control                                   1.8    1.3 
  Treatment                                 2.8    1.6    -21    61   .008 
Total score for employment planning 
  Control                                   2.3    1.1 
  Treatment                                 3.9    1.7    Re     51   00 
Total score for living situation planning 
  Control                                   1.7    1.1 
  Treatment                                 2.7    1.6    -2.8   53   .007a 
Total score for free time planning 
  Control                                   3.4    1.9 
  Treatment                                 4.7    2.4    mae    61   ron 

Note. Numbers in parentheses are degrees of freedom. 
a Adjusted for nonhomogeneity of variance. 
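
To make the "adjusted for nonhomogeneity of variance" note concrete, the sketch below computes a t statistic and adjusted degrees of freedom from group means and standard deviations using the Welch-Satterthwaite approach. That this is the adjustment the author used is an assumption, as are the group sizes of 32 controls and 31 treatment subjects (taken from the sample description); the means and standard deviations are the total interview scores printed in Table 3.

```python
import math


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Unequal-variance t statistic and Welch-Satterthwaite degrees of freedom."""
    v1 = sd1 ** 2 / n1
    v2 = sd2 ** 2 / n2
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Total interview score: control 7.6 (SD 3.1), treatment 11.0 (SD 4.7).
# Group sizes of 32 and 31 are assumed from the sample description.
t_value, df_value = welch_t(7.6, 3.1, 32, 11.0, 4.7, 31)
print(round(t_value, 2), round(df_value, 1))
```

When the two standard deviations differ, the adjusted degrees of freedom fall below the pooled value of n1 + n2 - 2, which is why some rows of the table list fewer than 61.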
Though the interview was highly structured,
it was important to monitor whether the
therapists administered it to both control
and treatment subjects in a substantially
similar manner. To check this, four of each
therapist’s discharge interviews were randomly
selected for analysis (two interviews of control
subjects and two of treatment subjects). Since
there were four therapists, this meant that a
total of eight interviews were selected from
both the control and treatment groups. These
samples of control and treatment interviews
were compared on each of the following variables
for both interviewers and interviewees:
(a) number of responses, (b) length of responses,
and (c) total time taken for the
interview. There were no significant differences
between interviews of control and treatment
subjects for any of the indices examined.


Discussion


The results of Study 1 provide further
evidence of the content and concurrent validity
of the MEPS measure. First, MEPS scores
were found to be significantly related to the
social competency of the alcoholic subjects in
the study sample. Further, the MEPS scores
were also found to be significantly related to
subjects’ performance in an interview designed
to assess the quality of their planning for
dealing with postdischarge problems.


As in previous work reported by Platt and 


that the MEPS is not merely another IQ Ke 


out of the correlation between 
social competency scores strengthened the 
relationship, the same partialing procedure in 
this study signifi 
ship between the two measures. One possible 
explanation for this discrepancy r 

Platt and Spivack (1972b) study, the index 
of IQ was a comprehensive full scale score. In 
the present study, a scaled vocabulary score
alone was the index of subjects’ IQ. Thus, the
subjects’ verbal skills, and not their general 
intellectual ability, was the factor that was 
partialed out. Since the MEPS is a measure of 
cognitive ability heavily dependent on a 
subject’s ability to verbalize problem-solving 
thoughts, it is not surprising that the partialing 
out procedure had a more significant impact in 
this study. 
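The partialing procedure discussed above can be illustrated with the standard first-order partial correlation formula, which removes the linear contribution of a third variable (here, the IQ index) from the correlation between two others. This is a minimal sketch: the correlation values below are hypothetical and are not taken from the study.

```python
import math

def partial_corr(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y with z held constant."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Hypothetical values: x = MEPS score, y = social competency, z = vocabulary score
r_meps_competency = 0.40
r_meps_vocab = 0.60
r_competency_vocab = 0.50

print(round(partial_corr(r_meps_competency, r_meps_vocab, r_competency_vocab), 3))
```

When the partialed variable correlates substantially with both measures, as in these made-up numbers, the partial correlation is noticeably smaller than the zero-order one, which is the kind of effect the paragraph attributes to using a vocabulary score as the IQ index.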

Finally, Study 1 explored, for the first time, 
the impact of a problem-solving skills training 
program on the relationship between the
MEPS measure and an index of intellectual
ability. The results suggest that the extent to
which a subject’s score on the MEPS procedure 
is dependent on verbal skills decreases following 
problem-solving training. Though training did 
not increase the participants’ verbal pro- 
ficiency, it apparently changed the way in 
which they used whatever verbal skills they 
already possessed. Thus, they were able to 
conceptualize qualitatively better solutions to 
the real-life problems presented to them in the 
final MEPS administration than they had prior 
to training. This is an exciting discovery be- 
cause it suggests that individuals with even 
limited verbal skills can learn a more effective
cognitive strategy for approaching everyday
problems.

Study 2 demonstrated that 10 structured 
group sessions focusing on the processes of
effective interpersonal problem solving were 
sufficient to result in the treatment group show- 
ing significantly more improvement, relative to 
controls, on the MEPS measure of inter- 
personal problem-solving thought. A closer 
look at the specific nature of this greater im- 
provement revealed that the thinking of treat- 
ment subjects improved qualitatively as well 


control group, 

Treatment subjects, however, not only performed
better on a paper-and-pencil measure
of problem-solving thought, but they also
utilized their improved problem-solving thinking
to produce qualitatively better plans for
dealing with postdischarge problems than did
controls. In the structured discharge interview,
treatment subjects demonstrated that the
positive gains they had made following
problem-solving training generalized successfully
at least to an in-hospital situation in
which they were asked to confront real-life
problems.

The population of alcoholic subjects in this
study demonstrated a significant deficiency in
interpersonal problem-solving skills when compared
to the MEPS norms for a normal adult
population provided by Platt and Spivack
(1975). As discussed previously, Platt (Note 2)
identified a population of heroin addicts as
similarly deficient (relative to normals) in
these same cognitive skills. Together these
findings suggest that drug-addict populations
may, in general, be very appropriate targets
for problem-solving skills training. More importantly,
the generally positive results of such
training reported by Platt (Note 2) and in the
present studies provide encouragement that
such problem-solving skill deficiencies may be
amenable to positive change. Future research
in this area, however, must do more than
explore the feasibility and potential effectiveness
of problem-solving training with addict
populations. Studies must attempt to demonstrate
that this intervention approach can lead
to significant positive behavioral change after
subjects have returned to the community.
Designs that include plans for moderate and
long-term follow-up of subjects are essential.

The design of the present study included
only a modest follow-up component. Treatment
subjects were contacted 1 month following
their discharge to determine how well they
remembered the problem-solving principles
that they were taught and whether or not they
had made use of these principles in dealing
with real-life problems. Although the majority
of those contacted (14/22) reported having
made practical use of the principles, it was
evident that subjects had already forgotten
significant portions of the training material.
Thus, if more rigorous follow-up efforts are to
be conducted in the future, it would seem
reasonable to also explore the utility of occasional
“refresher” sessions for the subjects
receiving problem-solving training. In addition,
to increase the likelihood of long-term
impact, it would seem essential to integrate
interpersonal problem-solving principles and
concepts into all the components of a comprehensive
treatment program rather than to
artificially restrict problem-solving training to
10 special sessions as was done in the present
study.

Certainly the use of problem-solving skills 
training ought not to be restricted to addict 
populations. Siegel and Spivack (1973), for 
example, reported encouraging results with 
pilot efforts to use such training with chronic 
schizophrenics. Platt (Note 3) reported that 
he was aware of some encouraging pilot work 
using cognitive problem-solving skills training 
with the mentally retarded. In addition, 
D'Zurilla and Goldfried (1971) reported that 
informal clinical experimentation and pilot 
work with college freshmen suggested that 
teaching a general cognitive approach to 
solving real-life problems seemed to be a 
promising conceptual strategy in aiding these
individuals in coping with the transitions to
college living.

In selecting populations and designing
programs to improve cognitive interpersonal
problem-solving skills, however, a number of
factors that can affect the outcome of these
interventions should be considered. One such
factor may be the verbal skills of the participants.
It seems reasonable to assert that a
certain minimal level of verbal skill is essential
for the cognitive training approach to have a
significant impact. This assertion is supported,
in part, by the results of this study. Although
the majority of those contacted in the follow-up
reported having made use of the training in
real-life problem situations, it was the more
verbally skilled subjects who reported having
made greatest use of the training principles
after leaving the hospital. Thus, although
training may have had a significant immediate
impact on the treatment group as a whole, the
retention and use of the principles taught may
be dependent on the level of the subjects’
verbal skills. Other important factors that
should be considered include the participant’s
level of social competence; use of the training
intervention as an adjunct therapy (as in this
study) versus its use as a solo intervention;
and the number, length, and sequencing of
training sessions.

Spivack (1973) proposed that healthy psy- 
chological functioning is, in large part, de- 
pendent on the ability to cope effectively with 
the demands of problematic life situations. To 
deal effectively with such demands, an indi- 
vidual must be able to respond in a competent 
manner both cognitively and behaviorally. 
That is, a person must be able to decide what 
is the best way to cope with the problem as 
well as be able to actually perform the chosen 
behavior. The treatment intervention used in 
this study attempted to provide participants 
with an opportunity to learn and practice both
cognitive and overt behavioral problem-solving 
skills. The focus of the training sessions, how- 
ever, was clearly to teach participants a 
cognitive strategy for dealing constructively 
with all sorts of problems regardless of their 
specific content. Thus, general cognitive style 
rather than specific behavioral skills was 
emphasized. Though this particular emphasis 
led to desirable consequences for the subjects 
in this study, other populations may receive 
more benefits from training programs that 
focus more on teaching specific effective
behaviors for specific types of real-life prob- 
lems. For example, Argyle, Trower, and 
Bryant (1974) reported that they have taken 
the more specific behavioral approach in their 
work with neurotic psychiatric patients. Further
research must be done to determine which
syntheses of cognitive and behavioral
training are most effective for specific problem
populations.


Summary 


These studies demonstrated three key
points: (a) Problem-solving thinking skill is
related significantly to social competency and
to “planning ahead for problems” behavior in
an alcoholic population, (b) problem-solving
thinking skills (as measured by MEPS) of
adult alcoholics can be improved significantly
through the use of structured training sessions,
and (c) improved problem-solving thinking is
generalized by subjects from the training
sessions into real-life problem situations both
within the hospital and after their discharge.


Anxiety during the stressful medical procedure of endoscopy was studied as a
function of the number of prior viewings of an explicit preparation videotape
and of repression-sensitization coping style. Sixty naive patients viewed a
videotaped endoscopy either zero, one, or three times. Dependent measures
included heart rate, behavioral ratings, tranquilizer required, and self-report.
On each dependent measure, three viewings generally resulted in the least
distress; one, more distress; and zero, the most distress. Most comparisons
reached statistical significance. These results are interpreted as resulting from
extinction and/or habituation of anxiety. The repression-sensitization factor
interacted with heart rate change. Sensitizers showed a monotonic decrease in
heart rate as a function of number of tape exposures. Repressors showed an
inverted-U-shaped function, with one viewing producing the highest heart rate;
this is interpreted as resulting from a disruption of repressing defenses by one
tape exposure followed by extinction of fear by three exposures.


Preparation-for-stress messages have generally
proved effective in reducing the emotional
trauma of stressful real-life experiences (e.g.,
Cassell, 1965; Janis, 1958; Johnson & Leventhal,
1974; Vernon & Bailey, 1974). The content,
emphasis, and medium of the preparation
messages have varied greatly, but most are
based on the modeling and/or accurate
information theories.

Modeling of approach responses has been
effective in reducing phobic fear and avoidance
(Bandura & Menlove, 1968). Viewing of
models who are initially fearful and who then 
overcome their fear (“coping model”) has been 
found to be more effective than the viewing of 
models who are fearless throughout the stress- 
ful event (Kazdin, 1974; Meichenbaum, 1971; 
Vernon, 1974). Melamed and Siegel (1975) 
found that children who viewed a coping 
model were subsequently less anxious before 
and after surgery than children who viewed 
an unrelated control film. 

A second theory of preparation effects 
focuses on the formation of accurate cognitive 
expectancies. Emotional distress during a 
stressful event is thought to be caused by a 
discrepancy between current experience and
prior expectation (Hebb, 1946; Johnson, 1973;
McClelland, 1951). Preparation-for-stress messages
based on this theory attempt to provide
the information necessary for the subject
to form accurate expectancies regarding
the stressor and their reaction to it (e.g.,
Johnson & Leventhal, 1974).

In the public domain.

The present study used a stressful medical
examination to investigate a third variable:
the amount of prior nonreinforced exposure to 
the stimuli surrounding a stressful event. The 
histories of most adults include many painful 
experiences within a medical setting. Stimuli 
that precede these painful experiences may 
become conditioned stimuli capable of eliciting 
anxiety. One of the most reliable formulations 
coming out of the experimental laboratory is 
Pavlov’s original finding that presentation of
the conditioned stimulus in the absence of the 
unconditioned stimulus leads to extinction of 
the learned response. Extinction of conditioned 
fear has been shown to be a positive function of 
the number and duration of exposures to the 
conditioned stimulus (Shipley, 1974). Fear can 
also be generated by exposure to novel stimuli, 
and it is likely that novel medical procedures 
elicit this unconditioned fear. As with condi- 
tioned fear, the rate of habituation to novel 
stimuli has generally been found to be a func- 
tion of the number and duration of stimulus 
exposures (Graham, 1973).

Extrapolating from this research to the 
preparation-for-stress area, it was hypothesized 
that the more exposures a person receives to a 
preparation message, the less fear will be
experienced during the actual stressful event. 
The amount of stimulus preexposure is not 
viewed as an alternative to the provision of 
accurate information or to modeling of coping 
responses. Instead, it represents another factor 
that may contribute to the observed beneficial 
effects of preparation routines and that may 
be used to design more effective preparation 
messages. 

In the present study, amount of stimulus 
preexposure to a real-life threatening medical 
examination was manipulated. Patients scheduled
to receive an upper-gastrointestinal
endoscopy viewed a videotape of an actual


coping responses was 
model who appeared 
endoscopy, 

The endoscopy examination seems to be well
suited to the study of real-life stress (Johnson
& Leventhal, 1974; Johnson, Morrissey, &
Leventhal, 1973). It involves the insertion of
a flexible fiberoptic endoscope, 12 mm in
diameter, through the mouth and into the
gastrointestinal tract. Air is pumped into the
gut to make its interior more visible, and the
endoscope is manipulated for approximately
15-30 minutes while the physician views the
lining of the gastrointestinal tract. Patients
judged to be particularly anxious are sedated
prior to the examination, but they are not
anesthetized, since their active cooperation
contributes to the success and safety of the
examination. Patients view endoscopy as
stressful, and it is easier to study than stressful
procedures such as surgery because endoscopy
is more time limited. Since patients remain
awake during the examination, they are able
to demonstrate and assess their distress. The
fact that the patient must lie relatively still
allows reliable measurement of physiological
arousal.

Prior research has suggested that the effect
of a preparation message may depend on the
individual’s characteristic method of coping
with stress (Andrew, 1970; Delong, 1971). The
present study included a questionnaire measure
of repression-sensitization coping style.
Repression-sensitization is a unidimensional
categorization, with repressors at one end of the
continuum and sensitizers at the other. Sensitizers
are generally described as handling stress
by being vigilant, overtly anxious, alert to
threatening cues, and by using intellectualization
as a defense. They actively seek information
about a stressor as a means of preparing
to experience it. Sensitizers were expected to be
initially high in anxiety and to show a monotonic
decrease in anxiety as a function of the
number of exposures to the preparation videotape.
Repressors, on the other hand, are generally
described as being overtly nonanxious and
as dealing with the threat of impending stress
by not thinking about it, repressing it, or
denying its potential stressfulness.

It was hypothesized that repressors would
be initially low in anxiety but that one exposure
to an explicit videotape of the endoscopy
examination would reduce their repressing
defenses, resulting in increased distress.
Repressors were expected to respond to additional
exposures to the preparation tape with a
decrease in anxiety similar to that predicted
for sensitizers. Thus, repressors who viewed the
preparation tape once were expected to
demonstrate greater anxiety than those viewing
an unrelated control tape. Repressors who
viewed the tape three times were expected to
show less anxiety than those who received only
one exposure.

The subjects were 60 hospitalized patient volunteers 
who had received no prior endoscopy examinations and


the study if they could not read or 
to be disoriented or physically unable to complete the
experimental procedures. 


Measures of Anxiety


Anxiety is considered to be a multidimensional construct
that may be reflected in physiological response,
observable behavior, and self-report (Lang, 1968).
Measures were selected to assess each of these response
modes.

Physiological measures. During the endoscopy, heart
rate was monitored with a Hewlett Packard Model
[illegible] polygraph using three fluid column electrodes.
Respiration and skin conductance were also monitored.
Respiration was not analyzed. Skin conductance
data are not reported due to questionable validity
stemming from inconsistent electrical grounding of the
endoscopes used and to attenuation of electrodermal
responding by the atropine administered to all
subjects (Lader & Montagu, 1962).

Behavioral measures. A physician-nurse anxiety
rating scale, completed after the endoscopy examination
by the patient’s endoscopist and nurse, was
used to assess the patient’s fear during three time
periods: (a) prior to insertion of the endoscope (before
scoping), (b) while the endoscope was in the gastrointestinal
tract (during scoping), and (c) after removal
of the endoscope (after scoping). The scale consists of
eight items (e.g., “Patient appears anxious before the
tube is inserted: gritting teeth, tight muscles, sweating,
etc.”), each rated on a 5-point scale ranging from “not
at all” to “very much.” The item ratings made by the
physician and nurse were combined, since Pearson
product-moment correlations between the independent
ratings were moderate and significant, r(58) = .44-.55,
p < .001. Three summary anxiety ratings were then
obtained by averaging the items within the before,
during, and after scoping time periods. The internal
consistency of these three subtests ranged from .80
to .87, using Cronbach’s (1951) alpha.
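For readers unfamiliar with the internal-consistency index mentioned above, the following is a minimal sketch of Cronbach's alpha computed from a patients-by-items table of ratings. The ratings shown are invented for illustration only; the function name and data layout are assumptions of this example, not part of the original analysis.

```python
from statistics import variance

def cronbach_alpha(rows):
    """Cronbach's alpha for a list of rows, one row of item ratings per patient.

    alpha = k / (k - 1) * (1 - sum of item variances / variance of row totals)
    """
    k = len(rows[0])  # number of items in the subtest
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Hypothetical 5-point ratings for four patients on a three-item subtest
ratings = [[1, 2, 1], [3, 3, 2], [4, 5, 4], [2, 2, 3]]
print(round(cronbach_alpha(ratings), 2))
```

Alpha approaches 1 when the items rise and fall together across patients and drops toward 0 when they vary independently.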

A second behavioral indicator of anxiety was whether 
the patient received diazepam (Valium) intravenously 
prior to an attempt to pass the endoscope. Only those 
patients judged by the physician to be highly anxious 
were given diazepam. 

Self-report measures. Two questionnaires were used 
to obtain self reports of anxiety: the Spielberger State- 
Trait Anxiety Inventory (STAI) and the Post- 
Endoscopy Interview Schedule. The STAI (Spielberger, 
Gorsuch, & Lushene, 1970) consists of two separate 
20-item scales. The Trait Anxiety (A-Trait) scale 
assesses anxiety proneness, and the State Anxiety 
(A-State) scale measures momentary or situational 
anxiety. The A-State scale was expected to reflect 
changes in anxiety level over time and was administered 
prior to the experimental manipulations as well as before 
and after the endoscopy. Scores on the A-Trait scale 
were not expected to differ between groups or to change 
over time (Spielberger, Auerbach, Wadsworth, Dunn, 
& Taulbee, 1973). It was administered after the endos- 
copy primarily for descriptive purposes and to allow 
for calculation of correlations between measures of trait 
anxiety, state anxiety, and repression-sensitization. 

The Post-Endoscopy Interview Schedule was used 
in a structured inquiry into the patient’s reaction to 
the examination. Consisting of both open-ended 
questions and items to be rated on a 5-point scale, it was 
used to obtain a retrospective self-assessment of anxiety 
during the preparation videotape, before the endoscopy 
examination, and during the examination. Information 
was also gathered on the degree of annoyance and 
physical discomfort experienced during the examination
and prior hospitalizations.


Measure of Coping Disposition 


A Modified Repression-Sensitization (R-S) Scale, 
developed by Epstein and Fenz (1967), was used to 
classify patients as to their characteristic way of dealing 
with anxiety-provoking stimuli. The Modified R-S scale 
was developed in an attempt to eliminate the high 
(around .90) correlation between Byrne’s Minnesota 
Multiphasic Personality Inventory derived R-S scale 
and measures of anxiety (Byrne, 1964). 


Procedure 
Each patient in the study, regardless of treatment 
received information about the 
e separate occasions—from the 
and the experimenter. The patient 
was told of the need for the examination, 
that would be followed, 
possible but improbal 
the endoscopy, 
research 


the patient was asked to par- 
study and to sign a consent 
scale and the Modified 
and, depending on prior 
random the control 
videotape once, (Group E0), 


once (Group E1),

(Group E3). Subjects were instructed to watch each
tape in its entirety. (Monitoring of time spent looking
directly at the tape confirmed that subjects actually
received the amount of exposure intended.)

Figure 1. Mean physician-nurse anxiety rating before,
during, and after scoping for each of the treatment
conditions. (E0, E1, and E3 refer to the groups that
viewed the control tape once, the experimental tape
once, or the experimental tape three times, respectively.
Vertical axis: mean anxiety item rating; horizontal
axis: before, during, and after scoping.)

The 18-minute experimental videotape showed a 
35-year-old white male patient actually receiving an 
endoscopy. The patient in the tape showed an “average” 
amount of distress during the examination, gagging 
several times and requiring a moderate amount of 
“calming talk” by the nurse. The 26-minute control 
videotape was titled Because I was too Young: The 
Memoirs of a Distinguished Indiana Physician. 

On the morning of the endoscopy, each patient 
received an intramuscular injection of .6 mg of atropine 
to reduce oral and stomach secretions, and the A-State 
scale was completed for the second time. Between 5 
and 30 minutes later, the patient was taken into the 
endoscopy area, and electrodes were attached for the 
physiological measurements. After the physiological 
recordings stabilized, a 2-minute baseline was recorded. 

Two physicians and a nurse then joined the experi- 


were recorded for at least 1 minute prior to first placing 
the endoscope inside the patient’s mouth and for 10
after the examination, the Post-Endoscopy Interview
Schedule was used to question the patient, and the
STAI was completed by the patient.


Results 


Premanipulation Comparisons 


Analysis of variance of pretreatment variables
revealed that subjects in the three treatment
conditions were comparable in age, male-female
ratio, reported days of previous hospitalization,
and scores on both the Modified
R-S scale and the Spielberger A-State scale.


Analyses 


Two separate analyses of the data were
performed. In the primary set of analyses, the
repression-sensitization dimension was ignored
and three one-tailed planned comparisons (i.e.,
E0 vs. E1, E1 vs. E3, and E0 vs. E3) were
made for each dependent variable. These were
based on the prediction that E3 would evidence
the least distress, E1 more distress, and E0 the
most distress. Unless otherwise noted, comparisons
were made with t tests using the
appropriate error term from a one-way analysis
of variance or, where three repeated measurements
were obtained, from a 3 × 3 analysis
(Treatment Condition × Time Period). The
degrees of freedom for the pooled error term,
used where repeated measures were obtained,
were determined by Satterthwaite’s formula
(Winer, 1971). A second set of analyses was
performed to determine the relationship
between coping style and the dependent
variables.


Behavioral Measures 


Physician-nurse anxiety rating scale. Figure
1 shows the mean anxiety rating by time
periods for each of the treatment conditions.
The standard deviations for these data ranged

1 Because I was too Young: The Memoirs of a Distinguished
Indiana Physician can be obtained from the
Instructional Media Resource Center, Indiana University
School of Medicine, Indianapolis, Indiana 46202.


Tranquilizer required. 
quired by a significantly lower proportion of 
the E3 patients 
Patients (45%) or the E1 patients 
p> 1.09, p < .05). 


Heart Rate

Heart rate during the final minute of the


rate during the minute immediately prior to
insertion of the endoscope, (b) mean heart rate
during the first 5 minutes following insertion
of the endoscope, and (c) mean heart rate
during the second 5 minutes following insertion
of the endoscope. The groups did not differ
significantly during the minute just prior to
insertion of the endoscope. During the first
5 minutes of scoping, Group E3 had significantly
less heart rate increase (M = 18.53,
SD = 9.39) than both Group E1 (M = 21.92,
SD = 15.84), t(101) = 2.64, p < .01, and
Group E0 (M = 24.78, SD = 13.58), t(101)
= [illegible], p < .05. During the second 5 minutes
of scoping, the groups did not differ significantly
(Ms = 16.66, 19.00, and 14.04 for
Groups E0, E1, and E3, respectively).


Self-report Measures


STAI. The groups did not differ significantly
from each other in state anxiety just
prior to the endoscopy. Group means for E0,
E1, and E3 were 43.30, 37.80, and 42.25,
respectively. Following the endoscopy, the
groups were ordered in the predicted direction
(Ms for E0, E1, and E3 were
39.05, 33.75, and 29.20, respectively), with
significance reached for the E0 versus E1
comparison, t(109) = 1.74, p < .05, and the
E0 versus E3 comparison, t(109) = 3.02,
p < .005. As expected, the groups did not differ
significantly in trait anxiety after the endoscopy
(two-tailed tests). Group means for E0,
E1, and E3 were 43.3, 36.0, and 39.5, respectively.

Post-endoscopy interview schedule. The con- 
trol tape was rated as comparable to the 
experimental tape in both interest value 
(M = 3.7 for E0, 3.9 for E1, and 4.0 for E3)
and the generation of emotional upset 
(M = 2.0 for E0, 2.3 for E1, and 2.0 for E3). 
Consistent with the habituation/extinction 
hypothesis, Group E3 rated the third viewing 
of the experimental tape as less upsetting
(M = 1.5) than the first viewing of the tape 
(M = 2.0), #(19) for correlated means = 2.78, 
p < .005. The groups were comparable in 
subject response to the question, “Do you feel 
that you got too little or too much information 
about the coming endoscopy?” (Ms = pho 
3.2). No subject indicated that too little 
information was received. 

There were reliable group differences in
amount of reported annoyance. To the open-ended
question, “Very often some parts of an
endoscopy examination cause annoyance. What
things annoyed you?” one or more complaints
were voiced by nine E0 subjects, eight E1 subjects,
and only two E3 subjects. Proportion
tests indicated that the number of E3 subjects
voicing complaints was significantly lower than
the number in either E0 or E1 (zs ≥ 2.19,
p < .025). Asked to rate how annoyed they
felt during the endoscopy, E0 subjects indicated
higher annoyance (M = 1.9) than either
E1 or E3 subjects (M = 1.4 for both groups),
ts(57) ≥ 1.70, p < .05. The groups did not
differ in ratings of anxiety experienced during
the hour before the endoscopy or in ratings of
anxiety and physical discomfort experienced
during the endoscopy.
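The proportion tests on the complaint counts can be sketched as a pooled two-proportion z test. The counts (nine, eight, and two subjects voicing complaints) come from the text; the group size of 20 is an assumption for this example, based on 60 patients divided among three conditions, and the pooled-variance form of the test is likewise assumed rather than confirmed by the source.

```python
import math

def two_proportion_z(x1, n1, x2, n2):
    """Pooled two-proportion z statistic and one-tailed p value."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_one_tailed = 0.5 * math.erfc(z / math.sqrt(2))  # upper-tail normal area
    return z, p_one_tailed

N_PER_GROUP = 20  # assumed: 60 patients split evenly across E0, E1, and E3

z_e1_e3, p_e1_e3 = two_proportion_z(8, N_PER_GROUP, 2, N_PER_GROUP)
z_e0_e3, p_e0_e3 = two_proportion_z(9, N_PER_GROUP, 2, N_PER_GROUP)
print(round(z_e1_e3, 2), round(z_e0_e3, 2))  # 2.19 2.48
```

Under the assumed group size, the smaller of the two statistics is 2.19 with a one-tailed p below .025.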


Repression-Sensitization 
The dependent variables were also analyzed
for differences as a function of subjects’
repression-sensitization score. The 10 subjects
whose R-S score fell at the median (12) were
eliminated from these analyses, leaving 25
repressors (with scores between 5 and 11) and
25 sensitizers (scores = 13-21). Planned comparisons
were used to test the predictions that
sensitizers would show a monotonic decrease
in anxiety as a function of the number of times
they viewed the preparation tape (E0 > E1
> E3) and that repressors would show an
inverted-U-shaped function, with E1 evidencing
the greatest anxiety (E0 < E1 > E3).
Each comparison was made with a one-tailed
t test using the appropriate error term.

2 One subject was dropped from the heart rate
analyses because of failure to obtain a valid basal heart
rate. He had the highest basal rate of any patient
(111 bpm) and was the only subject to exhibit a drop
in heart rate during the stressful endoscopy, demonstrating
that a true basal heart rate measurement
had not been obtained.

Figure 2. Mean heart rate increase during the first 5
minutes of scoping for repressors and sensitizers in each
of the treatment conditions. (E0, E1, and E3 refer to
the groups that viewed the control tape once, the
experimental tape once, or the experimental tape three
times, respectively. Vertical axis: mean heart rate
increase in beats/minute; horizontal axis: E0, E1, E3;
separate lines for sensitizers and repressors.)
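The median split used to form the repressor and sensitizer groups can be sketched as follows. The cutoff of 12 is the median reported in the text; the scores in the example are made up, and the function name is hypothetical.

```python
def median_split(scores, median):
    """Classify R-S scores; scores exactly at the median are excluded."""
    repressors = [s for s in scores if s < median]
    sensitizers = [s for s in scores if s > median]
    dropped = [s for s in scores if s == median]
    return repressors, sensitizers, dropped

# Hypothetical Modified R-S scores for eight patients
scores = [5, 9, 11, 12, 12, 13, 17, 21]
repressors, sensitizers, dropped = median_split(scores, median=12)
print(len(repressors), len(sensitizers), len(dropped))  # 3 3 2
```

Dropping the cases at the median, as the study did, keeps the two groups clearly separated at the cost of a smaller sample for these analyses.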

Physician-nurse anxiety rating scale. Sensitizers
in Group E1 were rated as significantly
less anxious than those in Group E0 before,
during, and after scoping, ts(67) > 1.75,
p < .05. No other comparisons for sensitizers
or repressors reached significance.

Tranquilizer required. The percentage of
patients in each group requiring diazepam followed
the predicted pattern for repressors
(% tranquilized = 42, 67, and 9 for E0, E1,
and E3, respectively), with the E1 versus E3
comparison reaching significance (z = 2.58,
p < .005). For sensitizers none of the comparisons
were significant (% tranquilized = 56,
67, and 43 for E0, E1, and E3, respectively).

Heart rate. There were no significant differences
in heart rate change scores during the
minute immediately prior to insertion of the
endoscope. Figure 2 provides mean heart rate
increases during the first 5 minutes of scoping
for repressors and sensitizers in each experimental
condition. Sensitizers showed the predicted
monotonic decline in heart rate increase,
with the E0 versus E3 comparison reaching
significance, t(76) = 2.43, p < .01. Support
was also received for the prediction that E1
repressors would have higher heart rate increases
than repressors in either E0 or E3,
ts(76) > 2.46, p < .01.

A similar pattern was observed during the
second 5 minutes of scoping. Sensitizers in E3
showed less heart rate increase (M = 8.43)
than sensitizers in E0 (M = 21.04), t(76)
= 2.27, p < .025. Heart rate increase for
sensitizers in E1 (M = 16.52) fell between E0
and E3 but did not differ significantly from
either of these groups. For repressors, no
differences reached significance, but the pattern
of results was similar to that observed in
the first 5 minutes of scoping. Means for E0,
E1, and E3 repressors were 13.31, 21.65, and
16.27, respectively.

State anxiety. There were no differences 
between groups in state anxiety reported before 
the endoscopy. After the endoscopy, E1 and E3
sensitizers reported significantly less anxiety
than E0 sensitizers, ts(79) > 2.17, p < .05
(Ms = 44.78, 33.22, and 29.00 for Groups E0,
E1, and E3, respectively). The comparisons on
repressors for this time period were not
significant.

Post-endoscopy interview schedule. None of
the comparisons reached significance for any 
of the postendoscopy interview items. 


Intercorrelations Between Selected Dependent 
Variables 


Table 1 provides Pearson product-moment
intercorrelations between the various measures
of anxiety and scores on the Modified
Repression-Sensitization scale. Intercorrelations
between physiological, behavioral, and self-report
measures of anxiety were low but frequently
significant. Intercorrelations between different
self-report measures were moderate. The low
nonsignificant correlation between amount of
Valium and mean heart rate shows that Valium
did not confound the heart rate data.


Table 1
Intercorrelations Between Selected Measures of Anxiety
and Repression-Sensitization

[Correlation matrix for 10 variables; the entries are not legible in
the source. Legible row labels include anxiety at baseline, before
endoscopy, and after endoscopy, and subject-rated anxiety before and
during endoscopy. Significance levels are two-tailed.]


Discussion

The results support the hypothesis that fear
of a stressful procedure is reduced as a
function of the number of prior viewings of a
preparation videotape. The ordering of the
treatment conditions on the dependent measures
was impressively consistent, with E3
generally evidencing the least anxiety, E1
intermediate anxiety, and E0 the most anxiety.

The results are consistent with an extinction/
habituation hypothesis. According to this
formulation, three exposures to endoscopy-related
stimuli resulted in greater habituation
and/or extinction of emotion than did one
exposure. Likewise, one exposure to the tape
produced more habituation and/or extinction
than no exposure. This interpretation is supported
by the fact that subjects in the E3
condition rated their third viewing of the tape
as less upsetting than their first viewing.
At first glance these results appear similar to
the findings on the vicarious extinction of
fear and avoidance. Bandura, Blanchard,
and Ritter (1969) and Blanchard (1970) found
that fear and behavioral avoidance decreased
after subjects observed models handling a snake.
However, in these studies the feared and avoided
situation was actually harmless, and the models
were portrayed as appropriately calm.
By contrast, in the present study, the
feared situation (endoscopy) was actually
aversive and unavoidable. The filmed model
was shown as experiencing discomfort (e.g.,
gagging, tight muscles, distressed look on face).
Since calm behavior was not modeled, it is unlikely
that subjects learned to be calmer during
the endoscopy through matching their response
to that of the model.

The model used in the present study might 
be labeled a flooding model or a realistically 
anxious model, since he showed distress 
throughout the endoscopy. Several investi- 
gators (Kazdin, 1974; Meichenbaum, 1971) 
have found the viewing of coping models (i.e.,
those showing initial distress followed by calm 
contact with the feared object) to be more 
effective in reducing fear and avoidance than 
viewing mastery models (i.e., those appearing 
calm throughout). Theoretically, the sequence 
of viewing a fearful model become calm 
facilitates subject identification and imitation. 
However, the positive findings of the present
study using a flooding model suggest that the
superiority of results with a coping model may
result from the viewing of a fearful model
per se rather than from the sequence of viewing
a fearful model followed by viewing a calm
model. [...] identification with a fearful
model [...] to vicarious arousal. When this
[...] maintained long enough [...]
would result in gr[...]
a coping model, [...]
tested by comparing the effectiveness of
various amounts of exposure to mastery, coping,
and flooding models in producing reductions
in arousal.

Another possible explanation for the present 
findings is provided by the accurate expectancy 
theory. According to this formulation, the E3
subjects who repeatedly viewed the preparation 
tape learned more about the endoscopy exami- 
nation and formed a more accurate expectancy 
of what was in store for them. This is possible 
but unlikely, since all groups received detailed 
verbal information about the examination on 
at least three separate occasions. The endos- 
copy examination is relatively simple to under- 
stand, and it seems likely that near maximal 
information was acquired by subjects in all 
three treatment conditions. In fact, following 
the endoscopy, the groups were comparable in 
rated amount of information received, and all 
subjects indicated they had received enough 
information. Nevertheless, the experimental 
design could have been strengthened by 
administering a test of knowledge about endos- 
copy just prior to the examination. Andrew 
(1970) used such a procedure and found no 
correlation between information learned from 
an audio preparation-for-surgery tape and 
her dependent variables.

Another possible explanation for the present 
results is that the operative variable was 
simply time spent viewing a videotape, regard- 
less of the content of the tape. However, despite 
the fact that the control tape was 8 minutes 
longer than the experimental tape, the E1 
subjects evidenced less anxiety than E0 subjects
on most of the dependent measures. The
results cannot, therefore, be attributed solely 
to length of viewing time. 

Regardless of the theoretical explanation of 
the results, the finding that anxiety during a 
stressful medical examination was reduced as 
a function of the number of prior viewings of a 
videotape of the examination is of potential 
practical importance. It suggests that explicit
repetitive preparation for stress may be
beneficial.

An area of both practical and theoretical 
concern is the interaction of subject character- 
istics with the preparation message. In the 
present study, sensitizers showed the predicted
monotonic decrease in anxiety as a function of
the number of videotape viewings. Repressors
showed the predicted inverted-U-shaped function
for the heart rate and diazepam measures,
with E1 subjects evidencing the greatest
arousal. These findings are consistent with the
hypothesis that repressors maintain low arousal
in the face of threat by not thinking about it
and not seeking information. Theoretically, one
exposure to the explicit stimuli of the preparation
tape weakened their repressing defenses
and left them in an aroused state similar to
that of unprepared sensitizers. In E3,
additional viewings of the tape diminished or
extinguished this arousal.

The finding that one viewing of the preparation
tape produced increased anxiety in
repressors is consistent with the findings of
Andrew (1970) and Delong (1971). Delong
(1971) found that sensitizers who heard a
brief preparation message prior to surgery had
fewer complications and were discharged sooner
than sensitizers hearing a control tape.
Repressors who heard the preparation tape did
not show this positive outcome, and, in fact,
they expressed more complaints postoperatively
than repressors hearing the control tape.
Andrew (1970) also found negative effects for
repressors who heard a short audiotape. They
required more pain and sleeping medication
after surgery.

These results suggest that repressors and
sensitizers might benefit from different preparation
strategies, with sensitizers prepared
extensively and repressors left alone or at least
left with their defenses. Repressors might be
exposed to a preparation message that supports
their defenses by minimizing danger or
encouraging avoidance through selective attention.
Though they did not differentiate their
patient subjects on the repression-sensitization
dimension, Langer, Janis, and Wolfer (1975)
found beneficial effects from a 20-minute
preparation-for-surgery message emphasizing
cognitive defensive strategies such as calming
self-talk and selective attention. Perhaps this
type of preparation message would be
especially effective with repressors.

Taken as a whole, the present study demonstrates
that reasonably well-controlled research
on preparation for stress can be conducted in
real-life stressful situations. This is of particular
interest at a time when increased concern for
subjects' welfare and the need for comprehensive
informed-consent procedures make the
use of analogue populations and stressors less
viable. In addition, findings obtained in a
real-life stressful situation seem more likely to
generalize to other real-life situations than
results obtained in analogue studies that must
use relatively low-intensity stressors. Nevertheless,
the present findings need to be extended
to other settings. This is particularly
true of the repression-sensitization findings.
As Averill, Olbrich, and Lazarus (1972) noted,
"findings of relationships between personality
variables and stress reactions have tended to
disappear when tested in a slightly different
setting" (p. 29).


Addict Descriptions of Therapeutic Community, Multimodality,
and Methadone Maintenance Treatment Clients and Staff


Patricia B. Sutker, Albert N. Allain, and Charles J. Smith
Gary H. Cohen
Tulane University School of Medicine


Adjective Check List (ACL) descriptions of 88 addicts in treatment toward 
methadone maintenance, multimodality, and therapeutic community clients and 
program staff within and across rating groups representing the three types of 
drug treatment conditions were compared. Data analysis procedures included 
single-groups analyses of variance; combined-groups analyses of covariance with 
sex, age, race, months addicted, months in treatment, and scores on the Raven 
Progressive Matrices treated as covariates; and principal-components analysis. 
Addicts as a group were characterized by high elevations on ACL scales Aggres- 
sion and Succorance. Client descriptions varied significantly as a function of 
category rated, with program staff described more positively than client groups, 
therapeutic community residents described more favorably than other client 
groups, and methadone clients rated with marked negativity. Between-category 
differences were most succinctly summarized by factor score comparisons on 
General Adjustment, one of four factors identified by principal-components
analysis and differentially associated with all four categories rated. Results suggest
that addict opinions represent a valuable source of ideas for evaluating current
treatment approaches and identifying self- and staff perceptions of therapeutic
significance.


Treatment successes have been limited
among therapeutic approaches to opiate addiction,
and pessimism has characterized reports
of treatment outcome studies evaluating the
effectiveness of strategies such as methadone
maintenance, multimodality programming, and
the therapeutic community. Predicting the
long-range fate of addict clients treated in
traditional fashion at the U.S. Public Health
Service Hospital at Lexington, Vaillant ([...])
estimated that 2% of addicts at risk become
permanently abstinent each year. Similarly,
methadone maintenance programs have been
regarded as only partially successful in meeting
earlier claims (Dole & Nyswander, 1976), and
although Williams and Lee (1975) showed
positive behavioral and attitudinal changes
among addicts remaining in methadone treatment
longer than 3 months, they described a
high dropout rate. Other investigators reported
that methadone clients continue to use illicit
drugs (Chambers & Taylor, 1972) and view
methadone as an undesirable solution to their
problems (Sutker, Allain, & Moan, [...]).
Evaluations of therapeutic communities, however,
have been somewhat more positive, and
among treatment conditions studied over extended
time frames the therapeutic community
appears to have achieved greatest success
([...] Hijazi, 1974; Sugarman, 1974).
Additionally, short-term personality changes
have been associated with residence in therapeutic
communities, hospitals, and even prisons
(Sutker, Allain, & Cohen, 1974; Zuckerman,
Sola, Masterson, & Angelone, 1975).


This research was supported by General Research
Support Grant 22620-GR26 from the Medical University
of South Carolina to the first author and was
presented at the annual meeting of the Southeastern
[...] made access to clien[...]
Sciences, Medical University of South Carolina,
[...] Ashley Avenue, Charleston, South Carolina 29403.
Although several studies have explored
changes in personality, attitudes, or behavior
associated with therapeutic intervention, there
have been no reports of attempts to collect
data in comparable fashion across treatment
conditions. Among other difficulties, variations
in situational and temporal factors and
in client populations from diverse treatment
programs have posed formidable problems for
research and have restricted investigators in
conducting meaningful comparative research.
Additionally, studies attempting to evaluate
treatment effectiveness have focused on personality
or performance variables, and assessment
of the personal attitudes of addict clients
themselves toward available treatment approaches,
clients, or staff has rarely been an
experimental goal. Attitudinal studies have
been limited to perceptions of single-treatment
strategies (Brown, Bass, Gauvey, & Kozel,
1972; Crowther & Pantleo, 1971), and the
consumers of drug treatment services, or addict
clients, have been largely overlooked as a
source of ideas for systematic evaluation of
treatment options.

Assuming that potential or participating
clients are information-processing beings whose
perceptions of situations provide one key to
understanding how they will respond within
specific milieus, attitudinal sets may influence
program selection and degree of participation
within given programs as well as provide
valuable information regarding client-perceived
usefulness of treatment possibilities. Therefore,
the purpose of this investigation was (a)
to describe attitudes of addict clients in therapeutic
community, multimodality, and methadone
maintenance programs toward clients in
these conditions and their own program staff
and (b) to make comparisons of addict attitudes
toward treatment conditions within and
across client category groups. It was reasoned
that attitudes toward treatment conditions
would best be reflected in measures that required
client-rating descriptions of the most
obvious products of a treatment program:
program clients and staff.


Method 
Subject Selection 


To sample attitudes toward drug treatment strategies 
across treatment categories, three programs in New 
Orleans representing the therapeutic community, 
methadone maintenance, and multimodality approaches 
were contacted to request access to program clients. 
Subsequent to explanation of study purposes and 
procedures, agreement to participate was obtained 
from (a) Odyssey House Louisiana, a highly structured 
therapeutic community with approximately 50 resi- 
dents, which requires long-term commitment and drug 
abstinence (client-staff ratio of 6:1); (b) the Drug 
Research Clinic, a methadone maintenance program 
with an approximate census of 160, which makes rela- 
tively limited demands for individual or group thera- 
peutic involvement (client-staff ratio of 16:1); and 
(c) the Narcotic Addict Rehabilitation Act Program
(NARA), administered by the Department of Psy- 
chiatry and Neurology of Tulane University School of 
Medicine, which offers inpatient treatment and out- 
patient counseling including methadone maintenance, 
chemotherapy, and drug-free regimes to approximately 
60 clients (client-staff ratio of 6:1). 

Each program allowed initial access to clients through 
individual or group discussions for explanation of the 
research project. Roughly half of NARA and Drug 
Research clients were approached following regular 
appointments, and Odyssey clients were seen in a large 
group assembled by program staff. Selection was 
voluntary but with personal solicitation, program 
encouragement, and promise of $5 payment. Participa- 
tion rates in terms of initial willingness varied across 
programs and ranged from 99% of Odyssey, 95% of 
NARA, and 70% of Drug Research clients contacted. 
Of 163 volunteers expressing interest, 75 were eliminated:
[...] to read sufficiently well; 61 were
[...] procedures, and 3 had
[...] enrolled at least 3 months in their specific
treatment condition. [...]
for NARA and Drug Research programs [...]
existent among Odyssey residents.

The final sample of 88 treatment addicts included 28
NARA clients, 36 therapeutic community
or Odyssey residents, and 24 methadone maintenance
clients. Racial and sexual [...]
treatment programs is presented in [...]
distribution of women, χ²(2) = 26.92, [...]
outlined in Table 2, and it can [...]
differed significantly in age, [...]
months addicted to opiates, and months in treatment,
which necessitated careful statistical handling of
between-group comparisons. All groups shared a long
history of opiate addiction, and even the more youthful
Odyssey residents were characterized by a mean length
of addiction greater than 3 years. Subjects across treatment
groups showed a mean of 28 months in drug
treatment programs.


Table 1
Percentage Breakdown by Race and Sex Among Multimodality,
Therapeutic Community, and Methadone Maintenance Clients

Treatment group                        White   Black   Male   Female
Multimodality—NARA                       64      36     93       7
Therapeutic community—Odyssey            83      17     75      25
Methadone maintenance—Drug Research      17      83     62      38

Note. n = 88. NARA = Narcotic Addict Rehabilitation Act Program.


Materials and Procedure


Subjects were administered a battery of instruments,
which included the Raven Progressive Matrices and
four versions of the Adjective Check List (ACL;
Gough & Heilbrun, 1965). Used among socially deviant
groups in the past (Sutker & Moan, 1973), the Raven
was selected to reflect general intellectual or problem-solving
ability without undue bias of academic sophistication.
The ACL, a self-administered inventory of 300
adjectives yielding scores on 24 need scales, was chosen
because it has also been used successfully among addict
samples (Brown et al., 1972) and provides a profile of
personality attributes. The ACL format was presented
four times to subjects who were asked to indicate those
adjectives that best described addict clients in a multimodality
program, in a methadone maintenance
program, in a therapeutic community, and their own
program staff. It was reasoned that ratings of clients in
the various treatment programs would reflect attitudinal
sets and opinions toward the treatment conditions
themselves.


Data Analysis


Although treatment groups shared a history of
chronic opiate use, criminal activity, and familiarity
with drug treatment (over 68%, evenly distributed
across sample groups, reported experience with at least
two treatment conditions), they differed on several
personal characteristic variables as seen in Table 2. In
part, differences reflect the complexities of attempting
to execute comparative research over varied clinical
settings, but sampling to avoid such differences may
have rendered samples poorly representative of the
groups from which they were drawn. Recognizing the
limitations of study implementation, three approaches
were pursued in data analysis. Preliminary procedures
compared attitudes toward treatment client categories
within rating groups by one-way analyses of variance
with repeated measures for each ACL scale, and Tukey's
tests were performed to evaluate differences for all
possible comparisons in which F ratios for categories
rated were significant. It was reasoned that extraneous
or unplanned between-group differences would in no
way contaminate interpretation of these results.
Second, attitudes toward treatment client categories
were compared within and between rating groups in
each ACL scale using Groups × Categories repeated
measures analyses of covariance with sex, race, age,
months addicted, months in treatment, and Raven
scores included as covariates. The number of independent
variables was within suggested limits for sample
size (Draper & Smith, 1966). Tukey's tests were
performed to examine all possible paired comparisons
of significant main effects, and F tests for simple
effects and Tukey's tests were used for significant
Groups × Category interactions. Third, a principal-components
analysis with orthogonal rotation was
performed to identify independent constellations of
variables accounting for the major portion of total
variance and to provide a concise basis for differentiation
of groups in relation to ACL variables.


Table 2
Group Means and F Values on Subject Characteristic Variables for Multimodality, Therapeutic
Community, and Methadone Maintenance Groups

Characteristic         Multimodality   Therapeutic community   Methadone maintenance   F
Age                        27.50              23.39                   29.79            6.08
Grade completed            10.36              11.33                   10.13            1.87
Raven score                37.07              44.33                   26.46           18.98
Months addicted            71.68              37.39                  110.92           [...]
No. convictions             2.21               1.44                    1.00           [...]
Months incarcerated        30.11              10.97                   19.00            2.59
Months in treatment        30.64              14.56                   43.79           12.08*

Note. n = 88.
*p < .01.


[Figure 1 plots: ACL profiles in standard scores, one panel per rating
group (e.g., Descriptions by Multimodality Clients, n = 28), with
separate lines for Program Staff, Therapeutic Community Residents,
Multimodality Clients, and Methadone Maintenance Clients.]

Figure 1. Mean Adjective Check List profile patterns produced by multimodality, therapeutic
community, and methadone maintenance client groups for four categories rated. (No. Ckd = Number of
Adjectives Checked; Df = Defensiveness; Fav = Number of Favorable Adjectives Checked; Unfav
= Number of Unfavorable Adjectives Checked; S-Cfd = Self-confidence; S-Cn = Self-control; Lab
= Lability; Per Adj = Personal Adjustment; Ach = Achievement; Dom = Dominance; End = Endurance;
Ord = Order; Int = Intraception; Nur = Nurturance; Aff = Affiliation; Het = Heterosexuality;
Exh = Exhibition; Aut = Autonomy; Agg = Aggression; Cha = Change; Suc = Succorance;
Aba = Abasement; Def = Deference; Crs = Counseling Readiness.)


Results 


Within-groups comparisons of client responses
across staff and client categories
showed that descriptions varied depending on
the category rated, and a total of 46 differences
were significant at the .05 level of probability.
Similar distributions of ACL profile configurations
were produced within rating groups, with
staff regarded most positively and methadone
clients most unfavorably. These relationships
are illustrated in Figure 1, which presents mean
ACL profile patterns for each category rated
within the three treatment groups sampled.
Results of within-groups analyses are not reported
in greater detail, because mean ACL
values differed at most by only tenths of a
point from adjusted means derived from combined
groups analyses (see Table 3).

Inclusion of sex, race, age, months addicted,
months in treatment, and Raven scores as
covariates diminished the range of scores by
adjusting methadone client ratings in the more
positive direction and multimodality and
therapeutic community ratings more negatively.
There were, however, differences between
and within groups that exceeded the .05
confidence level for statistical significance.
Combined groups of treatment clients distinguished
among ratings categories in assigning
descriptors with significant differences on
22 of 24 ACL scales as seen in Table 3, and
inspection of Figure 2 reveals distinct group
separation in distribution of ACL configurations.
Program staff were described more
positively than other groups, followed in order
by therapeutic community, multimodality, and
methadone clients. Adjectives used to describe
methadone clients were most critical and
differed from those attributed to therapeutic
community residents, multimodality clients,
and program staff on 22, 18, and 20 scales,
respectively. Ratings of multimodality clients
were less favorable than those reported for
therapeutic community residents on 17 scales
and program staff on 19 scales, and ratings of
program staff and therapeutic community
residents differed on 13 scales, with staff
regarded more favorably.


Figure 2. Mean Adjective Check List profile patterns produced by combined groups of treatment clients
for four categories rated. (No. Ckd = Number of Adjectives Checked; Df = Defensiveness; Fav = Number
of Favorable Adjectives Checked; Unfav = Number of Unfavorable Adjectives Checked; S-Cfd
= Self-confidence; S-Cn = Self-control; Lab = Lability; Per Adj = Personal Adjustment; Ach
= Achievement; Dom = Dominance; End = Endurance; Ord = Order; Int = Intraception;
Nur = Nurturance; Aff = Affiliation; Het = Heterosexuality; Exh = Exhibition; Aut = Autonomy;
Agg = Aggression; Cha = Change; Suc = Succorance; Aba = Abasement; Def = Deference;
Crs = Counseling Readiness.)


Treatment groups produced different descriptions
of client categories, and Groups ×
Categories interaction terms were significant
for 21 scales (see Table 3). Multimodality
clients described themselves and therapeutic
community residents similarly but pictured
methadone clients more negatively than themselves
and therapeutic community residents
on 16 scales, including Number of Favorable
Adjectives Checked, Number of Unfavorable
Adjectives Checked, Self-control, Personal
Adjustment, Achievement, Nurturance, and
Succorance (see Figure 1). In contrast, they
described staff more favorably than themselves
on Number of Favorable Adjectives Checked,
Self-control, Endurance, Intraception, Affiliation,
and Aggression. Methadone clients produced
a relatively homogeneous, negative
assortment of responses with two significant
differences. They described themselves as more
succorant and less self-controlled than program
staff. Inspection of Figure 1 shows an uncomplimentary
choice of descriptors for all
categories, with highest elevations on Number
of Unfavorable Adjectives Checked, Aggression,
Succorance, and Exhibition and lowest
scores on Nurturance and Intraception. Therapeutic
community descriptions were highly
variable, with significant differences on 20
scales, and residents were positive in assessment
of their own staff and clients but negative
toward methadone and multimodality clients.
Program staff, characterized by highest elevations
on Self-confidence, Dominance, Achievement,
and Endurance and lowest scores on
Abasement, Succorance, Change, and Deference,
was rated more positively than multimodality
and methadone clients on 18 scales
each. Therapeutic community residents described
themselves less positively than staff on
Number of Favorable Adjectives Checked,
Self-confidence, Self-control, Personal Adjustment,
Achievement, Dominance, Endurance,
Order, Intraception, Change, and Succorance
scales, but for the most part self-descriptions
were in the positive range and followed the
pattern attributed to staff with the exception
of succorance ratings. Residents rated themselves
more positively than multimodality and
methadone clients, with significant differences
on 20 scales, and they described methadone
clients more negatively than multimodality
clients on Achievement, Defensiveness, Dominance,
Endurance, and Succorance. The latter
treatment groups were characterized by prominent
elevations on Number of Unfavorable
Adjectives Checked, Aggression, and Succorance,
with low scores on Number of Favorable
Adjectives Checked, Nurturance, Personal
Adjustment, Intraception, Affiliation, and
Achievement.

Between-groups comparisons of category
descriptions showed that multimodality clients
were viewed similarly by themselves and
methadone clients, but therapeutic community
residents attributed more negative adjectives
to multimodality clients than did multimodality
and methadone groups on 16 scales. Methadone
clients were described by therapeutic
community and multimodality clients more
negatively than they rated themselves on 19
and 13 scales, whereas descriptions of therapeutic
community residents tended to be consistent
and positive. There were no differences
among rating groups in descriptions of therapeutic
community residents, with the exception
that multimodality clients described residents
as higher on Abasement than they viewed
themselves. Finally, program staff ratings
differed significantly depending on the treatment
group. Therapeutic community residents
described staff more favorably on Self-confidence,
Achievement, Dominance, and Abasement
scales than multimodality clients and
more favorably than methadone clients described
their staff on 8 scales including Self-confidence,
Achievement, Dominance, Endurance,
and Succorance. Multimodality clients
assigned more favorable ratings to staff than
did methadone clients on Number of Unfavorable
Adjectives Checked, Endurance, and
Nurturance.

Principal-components analysis generated
four factors with eigenvalues greater than 1.0
as summarized in Table 4. Factor 1, accounting
for 51% of the total variance and labeled
General Adjustment, was characterized by
significant loadings (.50 or greater) on 17 of
24 ACL scales including positive loadings for
Endurance, Personal Adjustment, Intraception,
Nurturance, Defensiveness, Self-control,
Affiliation, Number of Favorable Adjectives
Checked, and Achievement, and negative loadings
for Aggression, Number of Unfavorable
Adjectives Checked, and Succorance. Remaining
factors accounted for small portions of the
variance but are described as follows: Factor 2,
defined by high positive loadings for Autonomy,
Exhibition, Aggression, Self-confidence,
and Change, appeared to represent Assertiveness;
Factor 3, with highest loadings for
Abasement, Succorance, and Counseling Readiness,
was labeled Dependence; and Factor 4,
labeled Change, was described by highest loadings
for Number of Adjectives Checked,
Lability, and Change.
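
The retention rule described above (keep components whose eigenvalues exceed 1.0) and the link between an eigenvalue and the percentage of variance it explains can be sketched as follows. The eigenvalues below are hypothetical and were chosen only so that they sum to 24 (the number of ACL scales) and the first accounts for 51% of the variance. The study's actual eigenvalues are not given in the text, and `kaiser_retain` is an illustrative name.

```python
def kaiser_retain(eigenvalues, threshold=1.0):
    """Return (rank, eigenvalue, share of total variance) for each
    component whose eigenvalue exceeds the threshold."""
    total = sum(eigenvalues)
    retained = []
    ranked = sorted(eigenvalues, reverse=True)
    for rank, value in enumerate(ranked, start=1):
        if value > threshold:
            retained.append((rank, value, value / total))
    return retained


# Hypothetical eigenvalues of a 24 x 24 correlation matrix (sum = 24).
eigenvalues = [12.24, 3.0, 2.0, 1.26] + [0.275] * 20

for rank, value, share in kaiser_retain(eigenvalues):
    print(f"Factor {rank}: eigenvalue = {value:.2f}, variance = {share:.0%}")
```

For a correlation matrix of 24 standardized scales the eigenvalues sum to 24, so an eigenvalue of about 12.24 corresponds to the 51% figure reported for Factor 1.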

Table 4
Orthogonal Rotated Factor Pattern Matrix for
Adjective Check List Scales

                        Factor
Scale         1        2        3        4
No. Ckd   [illegible]
Df        [illegible]  -.03     -.14   [illegible]
Fav       [illegible]
Unfav       -.58      .50      .34      .02
S-Cfd        .65      .52     -.30      .05
S-Cn      [illegible]
Lab       [illegible]
Per Adj      .92     -.18     -.10      .06
Ach          .86      .21     -.27      .10
Dom          .80      .32     -.41      .07
End          .92      .03     -.20     -.05
Ord          .86      .02     -.15     -.01
Int          .90     -.06     -.09      .10
Nur          .90     -.23     -.05      .05
Aff          .88     -.13     -.01      .27
Het          .17     -.02     -.02      .37
Exh          .12      .83     -.09   [illegible]
Aut       [illegible]
Agg       [illegible]
Cha       [illegible]
Suc       [illegible]
Aba       [illegible]
Def       [illegible]
Crs       [illegible]

Note. No. Ckd = Number of Adjectives Checked;
Df = Defensiveness; Fav = Number of Favorable
Adjectives Checked; Unfav = Number of Unfavorable
Adjectives Checked; S-Cfd = Self-confidence;
S-Cn = Self-control; Lab = Lability; Per Adj
= Personal Adjustment; Ach = Achievement; Dom
= Dominance; End = Endurance; Ord = Order;
Int = Intraception; Nur = Nurturance; Aff = Affiliation;
Het = Heterosexuality; Exh = Exhibition;
Aut = Autonomy; Agg = Aggression; Cha =
Change; Suc = Succorance; Aba = Abasement;
Def = Deference; Crs = Counseling Readiness.

To determine if independent clusters of
personality needs represented by the four
factors were differentially associated with
client rating groups or categories rated,
standardized factor scores generated by 
principal-components analysis were subjected 
to univariate analyses of variance and Tukey’s 
tests. Although there were significant differ- 
ences between and within groups, only differ- 
ences associated with the categories effect are 
described in detail, Fs(3, 255) = 178.95, 35.23, 
and 6.29 for Factors 1, 3, and 4 (ps < .01).
Constellations of personality needs defined by 
Factors 1, 3, and 4 were differentially related 
(ps < .01) to the four categories rated. Staff 
were characterized as higher than client groups 
on General Adjustment, with therapeutic 
community residents higher than multimodal- 
ity and methadone clients and multimodality 
clients higher than methadone clients. Thera- 
peutic community residents and staff were not 
described differently on Dependence, but rat- 
ings of both groups were significantly less 
associated with this factor than those for 
multimodality or methadone clients. Cate- 
gories rated were also differentially associated 
with Change, with therapeutic community 
residents characterized as higher on this 
dimension than staff and methadone clients. 
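
The F ratios above come from univariate analyses of variance on the factor scores. A minimal sketch of how such a ratio is formed is given below for the simplest case, a one-way between-groups layout. This is a deliberate simplification and not the design the authors analyzed; the function name and the scores are hypothetical, and the Tukey follow-up comparisons are not shown.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way between-groups ANOVA."""
    k = len(groups)
    if k < 2 or any(len(g) == 0 for g in groups):
        raise ValueError("need at least two non-empty groups")
    n_total = sum(len(g) for g in groups)
    if n_total <= k:
        raise ValueError("need more scores than groups")

    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Variability of group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # Variability of individual scores around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)
    if ss_within == 0:
        raise ValueError("no within-group variability; F is undefined")

    df_between = k - 1
    df_within = n_total - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Hypothetical standardized factor scores for four rated categories (illustrative only).
staff = [1.2, 0.9, 1.4, 1.1]
therapeutic_community = [0.4, 0.6, 0.2, 0.5]
multimodality = [-0.3, -0.1, -0.4, 0.0]
methadone = [-1.0, -1.3, -0.8, -1.1]

f_ratio, df_b, df_w = one_way_anova_f([staff, therapeutic_community, multimodality, methadone])
print(f"F({df_b}, {df_w}) = {f_ratio:.2f}")
```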


Discussion 


Although treatment addicts shared similar 
response sets in rating client and staff cate- 
gories, descriptions varied significantly over 
categories even with the effects of such poten-
tially powerful variables as age, sex, race,
months addicted, months in treatment, and 
Raven scores partialed out. Differences among 
category descriptions were most succinctly 
summarized by comparisons of factor scores 
on General Adjustment, which was differen- 
tially associated with all four categories rated. 
Staff were described more positively than client 
groups and were seen as more dominant, better 
adjusted, less changeable, and less succorant.
If staff elicited the most positive descriptors,
methadone clients provided the rating stimulus
most negatively assessed. Client groups seemed
to share a negative set of attitudes toward
methadone clients, and by logical extension,
methadone programs, and to evidence greater
enthusiasm for multimodality and therapeutic
community approaches.

¹ Mean factor scores and F ratios for categories rated
within treatment client groups are available on request.

Results are consistent with earlier reports of 
client pessimism regarding methadone mainte- 
nance (Brown et al., 1972; Dole & Nyswander, 
1976; Sutker, Allain, & Moan, 1974). Metha- 
done clients were described as more aggressive, 
more dissatisfied, more hostile and critical, less 
well-adjusted, less self-confident, and less 
likely to be dominant, independent, or achieve- 
ment-oriented than other client groups. Al- 
though variations in client composition of 
treatment programs may have influenced such 
results, descriptions were relatively uniform 
across treatment conditions, suggesting that 
methadone clients were rated without regard 
for specific programs or client personal charac- 
teristics. Addict attitudes toward methadone 
clients and programs, as well as data from other 
studies summarized by Cohen, Howard, Klein, 
and Newfield (1976), do not allow resolution 
of contradictory viewpoints toward this type 
of treatment. However, among possible ex- 
planations for the nearly unanimous client 
negativity are perceptions of dubious long- 
range goals, slow progress toward behavioral 
change, and less client/staff contact. Whereas 
other treatment strategies use structure, super- 
vision, and discipline to facilitate development 
of productive activity, personal independence, 
and achievement orientation, methadone pro- 
grams can be seen primarily as mechanisms to 
sustain dependence on a drug identified with 
sedative effects and to minimize encounters 
with law enforcement. The combination of drug 
dependence and certain daily doses is also one 
that negates much of the purpose, activity, and 
excitement previously associated with drug
use and client life-style (Preble & Casey, 
1969). These speculations suggest the need to 
determine if methadone programs facilitate 
development of negative attitudes toward self 
and others even more than nontreatment, 
drug-taking conditions. 

Clients across treatment categories seemed
to hold therapeutic community residents, and
perhaps programs, in relatively high esteem.
Therapeutic community residents were particularly
enthusiastic about themselves and
their staff, although they viewed multimodality
clients somewhat negatively and methadone
clients with marked disdain. Their variability
in descriptor assignments, or response polarization,
may be an outgrowth of the totality of
program commitment required for continued
participation, as suggested by the cognitive
dissonance model (Festinger, 1957), or evidence
that clients strongly endorse program goals and
methods. Interestingly, therapeutic community
self-descriptions were congruent with the pattern
of adjectives assigned to staff, suggesting
a high degree of resident/staff identification.
Perhaps because of constant interpersonal contact,
or association of reinforcement with staff
authority, therapeutic community residents
aspire more than other groups to mimic their
treatment models. The reasons that prompt
treatment clients in other conditions to rate
therapeutic community residents relatively
positively are as yet unknown; however, in
view of the extent of personal control relinquished
in residential treatment, it is interesting
that multimodality clients perceived therapeutic
community residents as more inclined
toward abasement than they saw themselves.

Despite group differences, clients described
themselves in relatively unfavorable terms.
Rating patterns across groups showed prominent
elevations on Number of Unfavorable
Adjectives Checked, Aggression, and Succorance.
The results are consistent with reports
by Reith, Crockett, and Craig (1975), which
point to a constellation of features among
addict self-descriptions including exaggerated
dependency needs, difficulties in appropriate
expression of hostility, excessive demands for
interpersonal support and attention, and
limitations in responding in kind. Mutual
exaggeration of such features provides an important
target for therapeutic intervention and
assessment of program impact. Although
present data do not support the contention
that aggressiveness and excessive demands for
attention precede initial opiate use, their early
identification may signal individuals at
risk. Research exploring personality precursors
of drug use and dependence must also take
into account interactions with situational as
well as more stable environmental, social, and
motivational variables.

In summary, addict attitudes represent an
interesting and largely untapped source of
ideas that may be assessed systematically
to evaluate available treatment approaches and
identify self- and staff perceptions of therapeutic
significance. The extent to which
positive opinions of staff or fellow clients are
related to program participation and ultimate
treatment outcome should be explored concomitantly
with performance and personality
variables across program types in outcome
evaluation. Research might also include attitudinal
measurement as one methodology to
specify treatment parameters perceived as
operating within the therapeutic community
framework, or conversely, associated with
methadone maintenance programs that elicit
positive or negative assessments across treatment
and nontreatment addict groups and to
predict individual performances within a given
treatment setting.
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Treatment of General Tension: Subjective and 
Physiological Effects of Progressive Relaxation 


T. D. Borkovec, J. B. Grayson, and K. M. Cooper 


University of Iowa 


Experiment 1 found virtually no effects of type of no-treatment condition or 
demand/suggestion on the self-monitoring of daily tension percentage and sever- 
ity among 43 overly tense college students during a 4-week baseline period. 
Subjects given four subsequent sessions of progressive relaxation did report 
significant reductions in tension, which were maintained at a 7-month follow-up. 
Experiment 2 included 36 overly tense college students and compared a no- 
treatment condition to groups given nine sessions of relaxation with versus 
without muscle tension release. Counterdemand instructions were in effect for 
the first seven sessions. Relaxation with tension release produced reductions in 
daily tension percentage significantly superior to no-treatment during the coun- 
terdemand period, whereas relaxation without tension release did not differ from 
either group. Treatment effects maintained at a 5-month follow-up. No treat- 
ment effects were found on several during-session physiological measures, al- 
though Session 1 physiological reduction predicted improvement in tension per- 
centage. Presence or absence of tension release did significantly influence the 
number of relaxation cycles necessary to produce reports of deep relaxation, fre- 
quency of practice, and successfulness of eliminating daily tension at follow-up. 


Tension and anxiety represent pervasive 
adjustment problems. Although these com- 
plaints commonly present themselves in out- 
patient and inpatient centers, the widespread 
occurrence of tension-related difficulties is 
aptly exemplified by the 77 million prescrip- 
tions for benzodiazepine antianxiety agents 
(two thirds for diazepam) filled by retail 
pharmacies in 1972 (Greenblatt & Shader, 
1974). Given the cost and potentially hazard- 
ous effects of drug intervention, a clear need 
exists for an effective, nonpharmacological 
alternative. The present article reports the
initial studies in a planned series of investigations
focusing on behavioral intervention
strategies for general tension.

Experiment 1 is based on a senior honors thesis conducted
by the third author. Both experiments were
supported by Grant MH-27484 awarded to the first
author from the National Institute of Mental Health
and were presented, in part, at the meeting of the
[illegible] Psychological Association, Chicago, May [illegible].

Requests for reprints should be sent to T. D.
Borkovec, Department of Psychology, University of
Iowa, Iowa City, Iowa 52242.

Progressive relaxation training (Jacobson,
1938) has an extensive history of clinical
application both as a primary treatment
strategy for numerous disorders and as a component
in the systematic desensitization of
phobias (Wolpe, 1958). Even though well-controlled
studies in the latter area are common,
internally valid designs in the former are
infrequent. A review of the literature reveals
some cause-and-effect evidence for the progressive
relaxation treatment of insomnia
(e.g., Nicassio & Bootzin, 1974), childhood
asthma (Alexander, 1972; Alexander, Miklich,
& Hershkoff, 1972), hypertension (Deabler,
Fidel, Dillenkoffer, & Elder, 1973; Shoemaker
& Tasto, 1975), tension headaches (Cox,
Freundlich, & Meyer, 1975), and phobias
(Mathews & Gelder, 1969). Encouraging outcomes
from these tension-related problems suggest
that relaxation training may be an effective
procedure for less severe, but more pervasive,
daily tension problems in the general
population.


Copyright 1978 by the American Psychological Association, Inc. 0022-006X/78/4603-0518$00.75 
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RELAXATION TREATMENT OF GENERAL TENSION 


Normative Data 


A questionnaire administered to college
students at the university in the fall and spring
of 1975-1976 revealed that 21.2% typically
felt tense 50% or more of each day. For subjects
from the spring semester (N = 479),
percentage of daily tension ratings correlated
with tension severity (r = .35), the social
items of Geer’s (1965) Fear Survey Schedule
(r = .28), and Mandler, Mandler, and Uviller’s
(1958) Autonomic Perception Questionnaire
(r = .27). Scores on another problem (sleep
disturbance) correlated with neither the tension
items nor the other measures (rs < .07),
suggesting that more than response set to
report problems contributed to the relationship
between tension ratings and other anxiety-related
problems.
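
The coefficients reported above are product-moment correlations between questionnaire scores. The sketch below shows the computation in plain Python, under the assumption that each subject contributes one score per measure; the function name and the example ratings are hypothetical and are not the study's data.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least two scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares around the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable has no variance")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical ratings: percentage of the day tense and 5-point severity.
percent_tense = [50, 60, 55, 80, 70, 65]
severity = [3, 3, 2, 4, 3, 4]
print(f"r = {pearson_r(percent_tense, severity):.2f}")
```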

Among students for whom tension represents
an adjustment problem, specific stressors are
readily identified as major contributors to
the daily tension. In the first study reported
below, subjects filled out daily questionnaires
asking them to attribute the day’s tension to
one or more of six situational factors or to a
seventh, “unattributable source,” category.
Over a 4-week period, the majority of specific
tension was attributed to social situations
([illegible]%), followed by class work (14.5%),
tests (14.5%), occupation (4.3%), and public
speaking (1.4%); 23.5% was attributed to
other sources, whereas the source of
tension was unidentifiable for only 16.7%.

A sizable population of individuals thus
exists for whom tension is a daily problem
related to their student occupation. Prior
to devoting resources to large-scale outcome
studies, we investigated two methodological
matters relevant to the use of this target
behavior.


Experiment 1 


The confounding influence of demand characteristics
and suggestions of improvement on
measures of behavioral change has become a
critical problem in outcome research (Borkovec,
[illegible] 1977). Unless evidence is
obtained that outcome data do not reflect
demand/suggestion effects, the internal validity
of the study is seriously undermined. The
problem is particularly critical for self-report
measures so susceptible to distortion and unreliability.
Use of a placebo condition,
traditionally regarded as a control for nonspecific
effects, has been considered a sufficient method- 
ological device for eliminating the demand 
confounding. Yet, recent empirical studies 
indicate that placebos may not establish a level 
of demand equivalent to that inherent in 
therapy conditions (cf. Kazdin & Wilcoxon, 
1976). By implication, type of no-treatment 
condition may have a systematic influence on 
target behavior reports in a similar way. Usual 
waiting-list conditions may contain subtle 
demands for improvement during the waiting 
period or may provide sufficient attention and 
positive expectancies that actual improvement 
takes place (Goldstein, 1960). Subjects ex- 
plicitly informed of their control group status 
with no mention of future treatment may not 
be exposed to the same demands or suggestions. 
In an early attempt to address this issue, Paul
(1966) found greater anxiety reduction at 
follow-up assessment among waiting-list sub- 
jects relative to a no-contact control condition. 
Experiment 1 assessed the effects of demand 
for improvement and type of no-treatment 
condition on daily tension reports and provided 
a pilot evaluation of the effects of relaxation 
on this problem. 


Method 


Subjects. Subjects were selected from introductory 
psychology classes on the basis of two questions on a
group testing questionnaire: (a) a 21-point rating scale 
(0% to 100%) of the percentage of the day they 
typically felt tense and (b) a 5-point scale (very mild 
to severely tense) of the severity of that tension. Sub- 
jects indicating at least moderate tension for 50% or 
more of the day were interviewed, given a packet of 
daily questionnaires, and randomly assigned to four 
conditions: waiting list or informed control under 
demand or no-demand instructions for improvement. 
Of the 53 subjects obtained, 10 were excluded in the 
final analysis due to voluntary termination, incomplete 
data, or low baseline severity. Thus, 43 subjects (9 in 
each waiting-list condition; 11 in the informed-control, 
demand condition; 14 in informed-control, no-demand 
condition) completed the study and received course
credit for their participation.

Procedure. Subjects filled out a daily tension
questionnaire each night before retiring throughout the
6 weeks of the study. The items [illegible]
percent tense and severity [illegible]. Waiting-list subjects
were told that [illegible]
first needed. Informed-control subjects were told that
they were in a control group and that their 6 weeks of
data would be compared to a therapy group. Within 
each of these no-treatment groups, demand subjects 
were informed that research had shown self-monitoring 
of a problem to result in a reduction of the problem and 
that they could expect such a reduction to occur in 
their tension level and severity. Subjects in the no- 
demand condition were not informed of any potential 
effects of the self-monitoring. 

Waiting-list subjects began treatment after the 4 
baseline weeks. Sessions were held either in groups of 
five or less or individually, depending on scheduling 
possibilities. Progressive relaxation followed procedures 
by Bernstein and Borkovec (1973), except that training 
was presented by a single tape, providing only two 
tension-release cycles for each muscle group with no 
provision of additional cycles if tension remained. The 
first two sessions dealt with 14 muscle groups, and the 
last two sessions involved a four-muscle-group com- 
bination procedure. The tapes were made by the first 
author. Relaxation was described to the subjects as a 
skill involving (a) systematic tension release of gross 
muscle groups to reduce tension and autonomic arousal 
and (b) learning to identify and relax away extant 
tension. The importance of daily practice and frequent 
application was emphasized. After the first session 
subjects were instructed to practice twice daily (once 
during a particularly tense time) and to apply the 
procedure as frequently throughout the day as tension 
was identified. 

Two undergraduate assistants served as training 
leaders, providing a demonstration of tension-release 
procedures, answering questions, and starting the 
taped procedures.


Results and Discussion 


Subjects’ scores on each of the two questionnaire
items (percent tense and severity rating)
were averaged for each of the 4 baseline weeks
and the 2 therapy weeks. Table 1 presents the
resulting means for each condition.

Baseline phase. A three-way repeated measures
analysis of variance (Demand X No
Treatment X Weeks) on the percent tense
measure during the 4-week baseline revealed
no effects due to demand or no-treatment
factors. The main effect of weeks, F(3, 117)
= 4.49, p < .01, indicated that the percent
tense ratings declined during the first 3 weeks
(M over weeks = 44.33, 40.05, 36.07, and
37.08). Analysis on the severity data revealed a
decline over weeks, F(3, 117) = 5.01, p < .01
(Ms = 2.92, 2.67, 2.62, and 2.59), and a main
effect of demand, F(1, 39) = 5.49, p < .05.
Subjects given no demand for improvement
reported less tension severity (M = 2.55) than
did demand subjects (M = 2.86).

With one exception, the results indicate that
whether or not a subject expects future treatment
and whether or not he/she is told to
expect improvement have little effect on
several weeks of tension ratings. In the one
case of a demand effect, the results were
opposite to those commonly found in investigations
of phobic behavior (Bernstein, 1973).
Both percent tense and severity scores declined
over weeks. Since spring vacation occurred
during the 3rd and 4th weeks, it is unclear
whether self-monitoring or a less tension-producing
environment was primarily responsible
for this decrease.


Table 1


Mean Self-Report Tension Scores for Waiting-List and Informed-Control Groups Under Demand
and No-Demand Conditions During Baseline, Treatment Weeks, and Follow-up: Experiment 1

                                                   Baseline week                    Treatment week
Measure    Group              Condition        1        2        3        4          5        6       Follow-up
% tense    Waiting list       Demand         51.4     39.8     40.1     40.9       36.0     30.0
                              No demand      40.2     35.5     29.8     37.4       34.9     26.0
                              M              45.8     37.7     34.9     39.2       35.4     28.0     [illegible]
           Informed control   Demand         45.2     43.2     39.7     39.5       52.8     54.0
                              No demand    [illegible]
                              M            [illegible]
Severity   Waiting list       Demand       [illegible for Weeks 1-5]                         2.5
                              No demand    [illegible]
                              M            [illegible]
           Informed control   Demand       [illegible]  2.9      2.9      2.8        3.1      3.0
                              No demand       2.9      2.6      2.6      2.2        2.8      2.6
                              M            [illegible]
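
The weekly entries in Table 1 are averages of nightly ratings. A minimal Python sketch of that aggregation step follows, assuming seven consecutive nightly ratings per week and simply skipping missed nights; the function name, the handling of missing nights, and the example ratings are assumptions, since the article does not describe how incomplete weeks were treated.

```python
def weekly_means(daily_ratings, days_per_week=7):
    """Average consecutive nightly ratings into weekly means.

    Missed nights are given as None and skipped; a week with no ratings
    at all yields None.
    """
    means = []
    for start in range(0, len(daily_ratings), days_per_week):
        week = [r for r in daily_ratings[start:start + days_per_week] if r is not None]
        means.append(sum(week) / len(week) if week else None)
    return means


# Hypothetical percent tense ratings for one subject over 2 weeks (one missed night).
ratings = [55, 60, 50, None, 45, 50, 40,
           40, 35, 45, 40, 30, 35, 35]
print(weekly_means(ratings))
```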


Treatment phase. During treatment, the
waiting-list group received relaxation training,
and the informed-control group continued as
a no-treatment group. With one exception,
Treatment X Demand analyses of variance
revealed that the groups did not differ on
either dependent measure at the 4th week, the
week prior to relaxation training ([illegible]).
The exception was that no-demand
subjects reported less severity than demand
subjects, F(1, 39) = 5.53, p < .05. Three-way
repeated measures analyses of variance (Treatment
X Demand X Weeks) on the 4th, 5th,
and 6th weeks found significant Treatment
X Weeks interactions for both the percent
tense data, F(2, 78) = 11.61, p < .001, and
the severity data, F(2, 78) = 8.92, p < .001.
On both measures, waiting-list subjects receiving
relaxation training continued to decline,
whereas informed-control subjects returned to
their early baseline level, suggesting that the
earlier decline for the total group was due to
spring vacation and not to self-monitoring.
Subjects given no demand for improvement
during baseline continued to report lower levels
of tension severity during the therapy period
(M = 2.49) than demand subjects (M = 2.82),
F(1, 39) = 4.67, p < .04.¹

After 7 months, subjects contacted by phone
were asked to estimate current, typical percent
tense and severity. Although differences were
not significant due to large variability, the
data indicated maintenance of improvement
among the 14 subjects contacted in the treated,
waiting-list condition (M percent tense = 35.8,
SD = 25.3; M severity = 2.57, SD = .65)
and maintained baseline levels among the 19
subjects contacted in the informed-control
condition (M percent tense = 45.8, SD = 24.3;
M severity = 2.95, SD = .91). The earlier
demand condition had no effect on the follow-up
reports.

The results from the treatment phase suggest
that specific or nonspecific ingredients in
relaxation training may provide an effective
form of brief intervention for daily tension
problems, whereas follow-up indicated maintained
improvement for most treated subjects.
This conclusion is tentative, since presence or
absence of relaxation training was confounded
with type of no-treatment condition.



Experiment 2 


The second experiment was designed to 
replicate the relaxation effect on reported 
tension under more controlled conditions and 
to initiate efforts to determine the active 
ingredient (s) within the progressive relaxation 
procedure. 

Although Experiment 1 suggested minimal 
impact of demand/suggestion on daily tension 
reports, placebo and Demand X Treatment 
interaction effects remained viable hypotheses 
to explain the outcome improvement. In addi- 
tion to problems mentioned earlier about the 
ability of placebo conditions in therapy re- 
search to control for client expectancy and 
demand, the extended use of placebos with 
suffering individuals raises ethical issues, and 
alternative control procedures should be ex- 
plored (O'Leary & Borkovec, in press). One 
such method, developed in our sleep dis- 
turbance program, involves counterdemand 
instructions: Subjects are told not to expect 
improvement until after a certain number of 
training sessions and weeks of practice, since 
past research has shown that this degree of 
training and application practice is required 
before noticeable effects can occur. Statistical 
comparisons among conditions are then made 
prior to the end of this counterdemand period. 
Expectation and demand are hypothetically 
held neutral during this period, allowing 
detection of active treatment effects relative 
to control conditions. The sleep studies have 
validated this function of the counterdemand 
procedure (cf. Borkovec & O’Brien, 1976). In
Experiment 2, therefore, the counterdemand
strategy was used instead of the traditional
placebo condition to provide self-report data
uninfluenced by main effects of expectancy and
demand or their interaction with treatment.

Progressive relaxation consists of two 
principal procedural components: tension re- 
lease of gross muscle groups and focused 
attention on the resulting sensations of tension 
and relaxation (Bernstein & Borkovec, 1973; 
Paul, 1966). A 2 X 2 design involving the
presence or absence of each component thus
defines the critical comparison conditions for
isolation of the active ingredients of the
procedure. Over the past 4 years, we have
compared progressive relaxation to each component
control condition in the treatment of
sleep onset disturbance. This set of studies has
allowed specific conclusions regarding both
the active ingredients of relaxation treatment
as well as the maintaining factors of the
disturbance itself. Our goal in the present
series is to replicate these component comparisons
on the general tension problem. Experiment
2 involved the initial comparison
between progressive relaxation and relaxation
without muscle tension release.

¹ Both dependent measures (percent tense and
severity) were also analyzed by Treatment X Demand
X Weeks analyses of variance with all 6 weeks included.
Significant effects identical to those reported above
emerged, and no additional effect was found.

Physiological data were collected during 
treatment for two reasons: (a) Such data would 
be useful for identifying the effective mecha- 
nism of relaxation treatment of tension, and 
(b) the study provided an opportunity to carefully
assess the physiological effects of progressive
relaxation. A review of the existing
literature on the physiological effects of relaxa- 
tion revealed equivocal findings (Borkovec & 
Sides, in press). Fifteen studies have found 
relaxation to be superior to control conditions, 
whereas 10 studies have revealed no differences.
The two sets of studies differed significantly on 
two critical procedural details. Studies demon- 
strating relaxation superiority involved a 
greater number of sessions and more frequently 
used “live” rather than standardized taped 
instructions. Even though taped training removes
a confounding therapist variable, subject
control over training progress, an essential
aspect of clinical use of the technique, is pre- 
cluded and may be the reason for the superi- 
ority of live training. 

In comparing progressive relaxation to re- 
laxation without tension release, therefore, 
Experiment 2 included nine training sessions 
(essentially the full course of training recom- 
mended by Bernstein & Borkovec, 1973, for 
clinical use). Second, taped training was used, 
but by means of two cassettes and subject con- 
trol over alternation between tapes, subjects 
controlled their own progress in treatment. 
Thus, the sessions were identical to the pro- 
cedure followed in live therapy, but the poten- 
tially confounding factors of therapist charac- 
teristics and therapist bias were removed. 
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Finally, typical clinical use of progressive 
relaxation training involves a progression over 
sessions from some number of muscle groups 
to combinations of those muscle groups and 
ultimately to a recall phase without muscle 
tension release (Bernstein & Borkovec, 1973).
In the latter procedure, the subject presumably
learns to quickly and efficiently relax by
recalling how the muscle groups felt previously
when tensed and released. The clinical assumption
of generalization of relaxation over such
a progression of increasingly efficient procedures
has never been tested. Physiological
recordings for progressive relaxation subjects,
therefore, occurred during the first training
session involving the 14-muscle-group tension-release
procedure, during the seventh session
involving the 4-group tension release, and
during the ninth session involving the 4-group
recall procedure. A within-group test of transfer
to the recall procedure was thus provided.
Second, the inclusion of the relaxation without 
tension release condition and similar physio- 
logical assessments allowed comparisons at 
Session 7 between tension release and no- 
tension-release relaxation, and at Session 9, 
providing a between-groups test of the role of 
tension release pretraining on relaxation by the 
recall method. Since the same tape was used 
for both groups during Sessions 8 and 9, the 
groups differed only in terms of the presence 
or absence of tension release training during 
the initial seven sessions. 


Method 


Subjects. A new group of 36 undergraduates from
introductory psychology were selected on the basis of
their responses to three tension items on the group
testing questionnaire (50% or greater daily tension,
moderately severe or greater, and desire to receive
treatment for tension problems). Subjects were randomly
assigned within blocks of percent tense levels to
three treatment conditions: progressive relaxation with
tension release (TR), relaxation without tension release
(NTR), and no treatment (NT). Subjects received
course credit for participating, but they were not informed
of the credit until they made a firm commitment
to volunteer for the sake of therapeutic benefit.
Although no assessment was made of the degree of
awareness of relaxation techniques, it was assumed that
the subjects were relatively naïve, since treatment was
concluded prior to their introduction to psychotherapy
and behavior therapy in their introductory course.
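
Random assignment within blocks of percent tense levels can be sketched as follows. The sketch assumes that subjects are ranked by their screening percent tense score, grouped into blocks of three adjacent ranks, and that the three conditions are shuffled within each block. The block size, the function name, the seed, and the example scores are all assumptions made for illustration; the article does not report how the blocks were formed.

```python
import random

CONDITIONS = ("TR", "NTR", "NT")


def assign_within_blocks(percent_tense_by_subject, seed=0):
    """Assign subjects to conditions at random within blocks of similar percent tense.

    Subjects are ranked from highest to lowest score and taken three at a time;
    each block receives the three conditions in a freshly shuffled order.
    """
    rng = random.Random(seed)  # seeded only so the illustration is reproducible
    ranked = sorted(percent_tense_by_subject, key=percent_tense_by_subject.get, reverse=True)
    assignment = {}
    for start in range(0, len(ranked), len(CONDITIONS)):
        block = ranked[start:start + len(CONDITIONS)]
        order = list(CONDITIONS)
        rng.shuffle(order)
        for subject, condition in zip(block, order):
            assignment[subject] = condition
    return assignment


# Hypothetical screening scores for 36 subjects (50% or more of the day tense).
scores = {f"S{i:02d}": 50 + 5 * (i % 10) for i in range(36)}
groups = assign_within_blocks(scores, seed=1)
for condition in CONDITIONS:
    print(condition, sum(1 for c in groups.values() if c == condition))
```

Blocking on the screening score keeps the three conditions comparable on initial tension level while leaving the choice of condition within each block to chance.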

Experimenters. Three graduate students in clinical
psychology served as the session leaders and were
counterbalanced across treatment conditions. Two
undergraduate assistants served as polygraph technicians
and session leaders for make-up sessions.

Daily measures. All subjects filled out, just
prior to retiring at night, a daily questionnaire asking
for percent tense ratings, severity ratings, and a self-obtained
60-sec pulse rate. Daily ratings began at
least 1 week prior to the first session and ended 1 week
after the last session.

Therapy conditions. Subjects in the TR and NTR
conditions received nine taped training sessions in their
respective procedures over a 5-week period. The NT
subjects came to the laboratory for “physiological
assessment” sessions temporally corresponding to
Sessions 1, 7, and 9 for treated subjects. Subjects in
each condition were run in pairs during these recording
sessions, whereas treated subjects were trained in groups
of six during Sessions 2, 3, 4, 5, 6, and 8. During all
training sessions, the group progressed through the
procedures at the rate of the slowest subject.


1. Progressive relaxation. Subjects in TR received
training in progressive relaxation as described by
Bernstein and Borkovec (1973) with two modifications:
(a) Tensing foot muscles was excluded, and (b) there
were only 9 sessions instead of 10. Sessions 1-3 involved
[...] muscle groups; Sessions 4-5, a 7-group combination;
Sessions 6-7, a 4-group combination; and Sessions 8-9,
a [...] group recall procedure.

2. Relaxation without tension release. Relaxation
training for NTR subjects was the same as that for TR,
except subjects were instructed to identify any tension
in each muscle group and to allow those muscles to
relax. No actual tensing of muscles occurred, although
identical indirect suggestions of relaxation in TR
were provided during each muscle group cycle. Muscle
groups and their progressive combinations over sessions
were identical to those in TR. During Sessions 8 and 9
(recall procedure for TR subjects), training tapes were
identical for both therapy conditions.

3. No treatment. The NT subjects were told by
[...] that they would not be able to begin treatment
until later in the semester but that they should continue
to complete daily questionnaires.


Demand manipulation. Treated subjects were
told in the initial contact, after the first session, and
in a written statement on their questionnaires
that research had shown that they would not
experience treatment effects until after the seventh
session. At the eighth session, the session leader told
subjects that if they had been practicing conscientiously,
improvement should become noticeable.

Physiological recording sessions. During Sessions 1,
7, and 9, physiological recordings were obtained on a
Beckman polygraph situated in an adjoining
room. The recordings included heart rate, respiration,
and electromyogram (EMG) from the frontalis muscles.
In Session 1, an introductory tape describing the
[...] of the procedure was played, the session
leader [...] the muscle groups involved, and
electrodes and strain gauges were attached. Two
silver-silver chloride electrodes were attached to the
right and left sides of the chest for heart rate recording.
Mercury strain gauges were taped across both the sub- 
ject’s chest and abdomen for respiration. Finally, three 
silver-silver chloride electrodes were attached to the 
subject’s head, one each to the right and left frontal 
group for EMG and one in the center of the forehead 
as a ground. Electrode placement was followed by a 
10-min adaptation period, during which the subjects 
were alone in the room and were instructed to lie 
quietly in recliner chairs with their eyes open. At the 
end of adaptation, the session leader reentered the 
room, and the training session began for TR or NTR 
subjects. NT subjects were asked to relax themselves 
for another 60 min with the experimenter absent from 
the room. Relaxation instructions for all nine sessions 
were presented to TR and NTR subjects via two 
cassette recorders: one containing instructions for two 
cycles of each muscle group and one containing repeated 
cycles, with both recorded by the first author. The 
session leader turned on the repeated cycles tape when- 
ever a subject signaled remaining tension after two 
initial cycles, up to a total of four cycles as required.
For NTR and TR subjects, 15-sec physiological samples 
were taken during (a) the last minute of adaptation, 
(b) the second relaxation cycle of the left biceps, the 
neck, and the right calf, and (c) for 1 min during the 
last min of the 2-min quiet period after training was 
completed. For NT subjects, samples were taken during 
the last min of adaptation and after the 15th, 30th, and 
45th min of the 60-min period and during the last min 


The procedure for Sessions 7 and 9 for TR and NTR
subjects was the same, except that four-muscle group
training required only about 10 min, and
physiological samples were taken during the [...]
relaxation cycle of each of the four-muscle combinations
in addition to the adaptation and quiet period samples.
For NT subjects the procedure was also similar to their 
first session, except that they were to relax themselves 
for only 10 min after the adaptation period, and 15-sec 
physiological samples were taken at the end of adapta- 
tion and at the 2nd, 4th, 6th, 8th, and 10th min of the 


At the beginning and end of each recording session,
subjects completed Husek and Alexander's (1968) 
Anxiety Differential. At the conclusion of Session 1, 
treated subjects completed Borkovec and Nau’s (1972) 
therapy credibility questionnaire. 


Results 


Self-report outcome measures. Each subject’s
scores on each of the three daily questionnaire
items (percent tense, severity rating, and
pulse rate) were averaged for the pretherapy
week (baseline), the week after the seventh session
(counterdemand period), and the week after
the ninth session (positive demand period).
Table 2 presents the resulting means for the
three treatment conditions on each self-report
measure. One-way analyses of variance found
Table 2
Mean Self-report Tension Scores for Tension Release Relaxation (TR), Relaxation Without Tension
Release (NTR), and No Treatment (NT) During Baseline, Counterdemand, and Positive Demand
Weeks: Experiment 2

Measure      Condition   Baseline   Last counterdemand   Positive demand
% tense      TR          52.6       40.0                 35.8
             NTR         42.1       33.4                 25.4
             NT          44.5       48.5                 45.4
Severity     TR           3.0        2.8                  2.4
             NTR          2.7        2.5                  2.0
             NT           2.9        3.0                  3.0
Pulse rate   TR          76.7       75.9                 74.4
             NTR         75.5       72.1                 72.9
             NT          67.7       68.7                 70.1


no differences among the groups during the 
baseline week on the percent tense and severity 
data. A significant treatment effect was found 
on the pulse rate measure, F(2, 31) = 4.37, 
p < .05. Scheffé post hoc comparisons (Hays, 
1963) indicated that neither TR nor NTR 
differed significantly from NT, although NT 
was significantly lower than the treated groups 
combined.
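As a reading aid, the changes from baseline implied by the Table 2 means can be tabulated with a short sketch. The values are transcribed from the table; the NTR severity baseline is read as 2.7 (an assumption, since the scan shows "27"), and the variable and function names are illustrative only. The sketch reproduces simple differences of means, not the analyses of variance reported in the text.

```python
# Means transcribed from Table 2: (baseline, counterdemand, positive demand).
# Assumption: the NTR severity baseline is 2.7 (the scan shows "27").
table2 = {
    "percent_tense": {"TR": (52.6, 40.0, 35.8),
                      "NTR": (42.1, 33.4, 25.4),
                      "NT": (44.5, 48.5, 45.4)},
    "severity": {"TR": (3.0, 2.8, 2.4),
                 "NTR": (2.7, 2.5, 2.0),
                 "NT": (2.9, 3.0, 3.0)},
    "pulse_rate": {"TR": (76.7, 75.9, 74.4),
                   "NTR": (75.5, 72.1, 72.9),
                   "NT": (67.7, 68.7, 70.1)},
}


def change_from_baseline(means):
    """Return (counterdemand - baseline, positive demand - baseline)."""
    baseline, counterdemand, positive = means
    return round(counterdemand - baseline, 1), round(positive - baseline, 1)


# Negative values indicate a reduction relative to the baseline week.
changes = {
    measure: {cond: change_from_baseline(vals) for cond, vals in conds.items()}
    for measure, conds in table2.items()
}
```

On the percent tense measure the two treated groups show reductions in both periods, whereas the NT means rise slightly, which is the pattern the interaction tests below evaluate.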


1. Counterdemand period. Two-way repeated
measures analyses of variance (Treatment
× Weeks) were conducted on each
questionnaire item from baseline to the seventh
session. The Treatment × Weeks interaction
was significant only for the percent tense
measure, F(2, 31) = 5.12, p < .02. Scheffé post
hoc comparisons revealed that the combination
of TR and NTR showed significantly greater
reductions in tension than NT and that TR
alone was significantly superior to NT; NTR
did not differ from TR or NT. Covariance
analysis applied to the pulse rate scores
indicated no treatment effect.

2. Positive demand period. The same analyses
applied to each measure from the baseline
week to the week following the ninth session
found significant Treatment × Weeks
interactions on all three measures: percent tense,
F(2, 31) = 6.69, p < .004; severity, F(2, 31)
= 15.22, p < .001; pulse rate, F(2, 31) = 3.56,
p < .05. Covariance analysis of the pulse rate
data revealed no treatment effect. Scheffé post
hoc comparisons on the percent tense and
severity measures indicated (a) no differences
between TR and NTR and (b) that the two
treated groups, separately and combined,
displayed significantly greater reductions than
NT.


3. Frequency of practice, frequency of
application, and treatment credibility. Three
additional daily questionnaire items were
completed by treated subjects once treatment was
initiated: (a) frequency of formal relaxation
practice during the day, (b) frequency of
identifying daily tension and of attempting to
apply relaxation to eliminate that tension, and
(c) frequency of successful elimination of
identified tension. Weekly averages of each
measure and of the ratio of successes to
attempted applications were submitted to a
Treatment × Weeks repeated measures analysis
of variance. The main effect of treatment
was significant on frequency of practicing,
F(1, 20) = 8.87, p < .008. The practice
frequency of TR subjects was close to the
recommended twice-a-day schedule (M = 1.78),
whereas that of NTR subjects was lower
(M = 1.43). No effect emerged from analysis
of the frequency of application, whereas the
frequency of successful application increased
significantly over weeks among the treated
subjects, F(6, 120) = 3.79, p < .002. The ratio
of successes to attempts also increased over
weeks, F(6, 120) = 13.89, p < .001, with
success incrementing from 44.7% after the [...]
therapy session to 81.8% after the [...]
session.


Credibility questionnaires (Borkovec & Nau,
1972) administered after the first session and
after the final week of the study indicated that
the TR and NTR groups did not differ on
[...] confidence in their respective procedures.
Correlations between credibility scores
and improvement on the percent tense and
severity measures during both counterdemand
and positive demand periods were all
nonsignificant.

Subjective process measures. Two-way repeated
measures analyses of variance (Treatment
× Session) were applied to the Anxiety
Differential scores obtained at the beginning
of the recording sessions (1, 7, and 9) and
to the presession-postsession difference in
Anxiety Differential scores. Only the main
effect of sessions was significant on the presession
scores, F(2, 62) = 4.94, p < .02. Subjects
reported less anxiety at the beginning of the
seventh and ninth sessions than at the beginning
of the first session (Ms = 71.1, 64.9, and
[...]). On the postsession measure, the main
effect of treatment was significant, F(2, 31)
= [...], p < .02. The no-treatment group
reported the highest anxiety level at the end
of the sessions, followed by TR and NTR
(Ms = [...], 59.2, and 52.3, respectively).
[...] Scheffé pairwise comparisons were
[...]. Finally, the main effect of sessions
was significant on the presession-postsession
difference scores, F(2, 62) = 3.26, p < .05.
The greatest decrements in anxiety occurred
from the beginning to the end of the first
session (Ms = −11.9, −7.3, and −6.5 for
Sessions 1, 7, and 9, respectively).
During treatment sessions, subjects in TR
and NTR received two relaxation cycles on
each muscle group before proceeding to the
next group. Additional cycles were administered
to any subject who signaled remaining
tension. Over all nine sessions, the total number
of additional cycles required to achieve reports
of complete relaxation was 8.5 for the TR
condition and 17.8 for the NTR condition,
[...] = 2.10, p < .05.

Physiological process measures. Physiological
data were amplified on the Beckman
polygraph, fed into a Hewlett-Packard FM
tape recorder, and then reduced by a PDP-12
computer. Five respiration measures were
obtained from each sample: sec/cycle,
sec/inspiration, sec/expiration, sec/inspiration to
sec/expiration, and inspiration amplitude/
expiration amplitude. Two cardiac measures
were obtained: heart rate in beats/min and
heart rate variability. Too few EMG responses
were recorded to allow meaningful analysis.
One-way analyses of variance conducted on 
the Session 1 adaptation period revealed no 
differences on any measure among the three 
treatment conditions prior to therapy. Three- 
way repeated measures analyses of variance 
were then performed. In addition to the treat- 
ment and session (1, 7, and 9) factors, these 
analyses included a phase factor with five 
levels (an adaptation sample, three muscle 
group samples, and a quiet period sample). 
The three muscle group samples had been ob- 
tained in Session 1 at the left biceps, neck, and 
right calf. Sessions 7 and 9, however, involved 
four muscle group combinations with samples 
taken after each group, (a) hands, forearms, 
and biceps; (b) face and neck; (c) chest and 
abdomen; and (d) thighs and calves. Conse- 
quently, the average of the latter two major 
muscle groups was taken for Sessions 7 and 9 
to provide the third muscle group sample. 
The only significant effect involving the 
treatment factor to emerge from analysis of the 
five respiration measures was a Treatment
× Phase interaction on the inspiration/expiration
amplitude ratio, F(8, 120) = 2.57, p < .05.
Inspection of Table 3 means indicates that 
inspiration amplitude, relative to expiration 
amplitude, increased for TR and NTR condi- 
tions from baseline to the first or second 
muscle group samples and then declined to 
near-adaptation levels. The NT condition 


Table 3
Inspiration/Expiration Amplitude Means for
Treatment Groups During the Five Phases of
the Sessions

                              Group
Phase                     TR      NTR     NT
End of adaptation         [...]   1.02    1.03
Muscle group sample 1     [...]   [...]    .91
Muscle group sample 2     1.06    1.05    1.08
Muscle group sample 3     [...]   1.03    1.02
Quiet period              [...]   [...]   [...]

Note. TR = tension release; NTR = relaxation
without tension release; NT = no treatment.
showed a marked reduction in the ratio at the
time corresponding to the first muscle group 
sample for treated subjects and a marked 
increase at the second sample before declining 
again below adaptation level. 

No effects involving the treatment factor 
emerged from analysis of the heart rate data. 
Heart rate did decline from adaptation to 
quiet period for the total group (Ms = 67.4 to 
65.0, respectively), F(4, 108) = 4.29, p < .01. 
Heart rate variability declined initially and 
then rose during the five phases (Ms = 4.9, 
4.0, 4.7, 4.7, and 5.4), F(4, 64) = 2.59,
p < .05.

Correlations between physiological change and 
subjective improvement. To further assess the 
relationship between physiological activity and 
subjective tension, correlations for the total 
group of subjects were computed between 
physiological change during Sessions 1 and 7 
and improvement in percent tense and severity 
scores during the counterdemand period (Ses- 
sions 1-7). Physiological change scores for each 
of the seven measures were obtained from the 
difference between the quiet period sample and 
the adaptation sample. Counterdemand
improvement on percent tense was significantly
related to four of the seven Session 1 change
scores (sec/cycle, r = −.57; sec/expiration,
r = −.51; sec/inspiration, r = −.48; heart
rate variability, r = −.36). Thus, reduction in
respiratory functioning and increased heart
rate variability during Session 1 predicted
outcome improvement on the percent tense
measure. Only one correlation was significant
between percent tense change and physiological
change during Session 7 (sec/inspiration to
sec/expiration, r = .39). Counterdemand
improvement on the severity measure correlated
significantly only with one Session 1
physiological measure (sec/cycle, r = −.33) and was
not related to Session 7 measures.
Physiological relationships with improvement
during the positive demand period were
of little interest, since demand and expectancy
effects may have contributed to outcome
reports at that time. The fact that only one of
the seven physiological measures (Session 1
heart rate variability) correlated significantly
with improvement in percent tense (r = −.39)
and severity (r = −.42) suggests that those
positive demand outcome measures were
indeed a function of nonphysiological variables.
Finally, although the self-reported pulse rate
measure failed to reflect treatment effects due
to pretherapy differences among the treatment
groups, its potential usefulness in future
research was supported by the finding that
pretherapy pulse rate reports correlated
significantly with continuously monitored heart rate
during the adaptation period of Session 1,
r(26) = .46, p < .02.
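The significance level attached to r(26) = .46 can be checked with the standard conversion of a Pearson correlation to a t statistic, t = r·sqrt(df / (1 − r²)). The sketch below is illustrative only; the function name is an assumption, and the critical value (2.479 for df = 26, two-tailed α = .02) is taken from a standard t table rather than from the article.

```python
import math


def t_from_r(r, df):
    """t statistic for testing a Pearson r against zero, with df = n - 2."""
    return r * math.sqrt(df / (1.0 - r * r))


# Correlation between self-reported pulse rate and monitored heart rate.
t_value = t_from_r(0.46, 26)

# Two-tailed critical t for df = 26 at alpha = .02 (standard t table value).
T_CRIT_02_DF26 = 2.479
significant_at_02 = abs(t_value) > T_CRIT_02_DF26
```

The resulting t is roughly 2.64, which exceeds the tabled critical value and so agrees with the reported p < .02.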

Follow-up contact. Ten TR and 7 NTR
subjects were located 5 months later and
phoned by a research assistant “blind” to their
condition status. Subjects estimated current
daily tension percentage, rated the severity of
current tension and the effectiveness of their
relaxation skill in reducing daily tension on
5-point scales, and indicated how often they
used their relaxation skills during the 5-month
period. Average percent tense scores were
24.0 for the TR condition and 20.3 for the
NTR condition, indicating maintenance of the
gains reported immediately at the end of
treatment. Average severity ratings fell
between “mild” and “moderate” levels of tension
(2.70 for TR and 2.29 for NTR), representing
slight though nonsignificant increases in
severity since the end of treatment. The two treated
groups did not differ significantly on either
measure. On the third item, TR subjects
reported being “quite” successful (M = 4.10,
SD = .74) in applying the technique to
eliminate tension, whereas the average success
rating of NTR subjects fell between “very
little” and “somewhat” (M = 2.71, SD =
1.98). Although the mean difference only
approached significance (Mann-Whitney z
= 1.60, p < .12), the variances of the two
groups were significantly different, F(6, 9)
= 7.13. Finally, TR subjects reported still
practicing relaxation 2.11 times per [...]
(SD = 2.01), whereas NTR subjects practiced
only .55 times (SD = .72). The variance ratio
was significant, F(9, 6) = 7.77, p < [...],


1 Recordings from NT subjects during the first session
were obtained for 60 minutes to guarantee that
self-relaxation occurred at least as long as the TR and NTR
sessions. Average duration of the first session for the
two latter conditions was 36.0 minutes. All
physiological measures were therefore reanalyzed,
comparing TR and NTR to NT separately during the
30-minute, 45-minute, and 60-minute samples. No
significant treatment effects were found for any measure at
any of the three NT time samples.



and a Mann-Whitney test indicated that
the means also differed significantly
([...] p < .02).²
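The two variance ratios in the follow-up data can be approximately recovered from the reported standard deviations, since each F is the larger group variance divided by the smaller. The sketch below assumes nothing beyond the SDs printed in the text; the function and variable names are illustrative.

```python
def variance_ratio(sd_larger, sd_smaller):
    """F ratio of two sample variances, larger variance in the numerator."""
    return (sd_larger ** 2) / (sd_smaller ** 2)


# Success ratings: NTR (SD = 1.98, n = 7) over TR (SD = .74, n = 10).
success_ratio = variance_ratio(1.98, 0.74)

# Practice frequency: TR (SD = 2.01, n = 10) over NTR (SD = .72, n = 7).
practice_ratio = variance_ratio(2.01, 0.72)
```

The computed ratios (about 7.16 and 7.79) differ slightly from the reported 7.13 and 7.77, which is consistent with the SDs having been rounded to two decimals before printing. The degrees of freedom follow from the group sizes: 6 and 9 for the first ratio, 9 and 6 for the second.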


The results of Experiment 2 lead more firmly
to the conclusion that progressive relaxation
produces significant reductions in the daily
tension experienced by overly tense individuals.
Percent tense improvement was greater
for treated subjects than for untreated controls.
The fact that this difference was obtained
during the counterdemand period suggests that
some active ingredient within progressive
relaxation procedures contributed to the
subjective improvement that occurred,
independent of demand, expectancy, and/or their
interaction with treatment. Follow-up reports
[...] indicated that these gains were
maintained over a fairly long interval
subsequent to treatment.
Although progressive relaxation differed
significantly from no treatment during
counterdemand, relaxation without tension release fell
between the two latter conditions and differed
significantly from neither. The critical
procedural component in progressive relaxation
was thus not unambiguously identified.
Apparently, frequent attempts to relax while
focusing on internal sensations are sufficient
to produce tension reduction. Additional
results, however, suggest a potentially important
role for muscle tension release. Subjects in
progressive relaxation required significantly
fewer training cycles to produce subjective
reports of complete relaxation. This result
suggests that tension release relaxation may
be a more efficient procedure and perhaps
[...] the absence of differences between the
two treatment conditions on the during-session
Anxiety Differential. Furthermore,
progressive relaxation subjects practiced their
procedure significantly more often both during
treatment and over the succeeding 5 months.
Although it is unclear which aspect of tension
release procedures accounts for these results,
frequent practice of an adaptive skill is
obviously desirable. Finally, perhaps due to the
greater practice, progressive relaxation
subjects tended to report greater success, with
significantly less variability, in eliminating
daily tension 5 months after treatment. All of
these differences, due to the presence or absence 
of muscle tension release, occurred between 
conditions that did not differ in credibility or 
the amount of improvement expected. 

Our review of the literature had suggested 
that the present study maximized the proba- 
bility of obtaining significant physiological 
reduction effects: several sessions, subject con- 
trolled treatment, and a presenting problem 
with some significance for daily adjustment. 
The absence of treatment effects on physio- 
logical process leads to two separate observa- 
tions. First, the main difference between our 
procedures and those of previous studies finding
significant relaxation effects (Borkovec &
Sides, in press) was the presence or absence of 
a therapist. We are forced to conclude that the 
therapist factor may be a critical variable in 
promoting physiological reduction during re- 
laxation training. Second, the fact that some 
relationships were found between physiological 
process and subjective improvement suggests 
that the subject’s ability to reduce physio- 
logical activity by any procedure, including 
self-relaxation, does contribute to reductions in 
subjective tension. However, the treatment 
effects on the subjective outcome measures 
indicate that such improvement will occur only 
if the subject realizes that relaxation is a skill 
to be practiced and applied to his/her daily life. 

Hopefully, future experiments involving 
comparisons of progressive relaxation with 
other component control conditions will further 
elaborate the mechanisms by which relaxation 
promotes reductions in daily tension. With 
such specifications, it should be possible to 
develop increasingly efficient and efficacious 
procedures for ameliorating a pervasive adjust- 
ment problem. 


2 After the conclusion of the treatment period of the 
study, NT subjects were given similar progressive 
relaxation training with three exceptions: (a) No 
physiological recording was conducted, (b) no counter- 
demand instructions were administered, and (c) sub- 
jects were not required to complete daily questionnaires 
during their treatment period. With these differences 
in mind, 5-month follow-up information obtained on 
5 NT subjects indicated mean values generally similar 
to those of the TR subjects (percent tense = 27.5, 
severity = 2.80; effectiveness of application = 3.25; 
and frequency of practice = 1.45). 
Behavioral Contingencies and Self-mutilation
in Lesch-Nyhan Disease


Lowell Anderson, Joseph Dancis, and Murray Alpert
New York University School of Medicine


Lesch-Nyhan syndrome is a rare, sex-linked, recessive disease that is
accompanied by severe self-mutilation, especially finger biting. Evidence is presented
suggesting that parental response patterns may contribute to the genesis of the
self-injurious behavior (SIB). The therapeutic effectiveness of punishment,
positive reinforcement of either SIB or non-SIB, and time-out learning paradigms
was evaluated. Electric skin shock failed to suppress the behavior. Positive
reinforcement of non-self-injury and time-out from social reinforcement were
consistently and rapidly effective, indicating a complex interaction of genetic
and environmental factors in the production of SIB. Elimination of or major
reductions in the incidence of SIB were maintained during follow-up periods of 2 years.


Lesch-Nyhan disease, which is characterized
by hyperuricemia, stunting of growth,
choreoathetosis, generalized spasticity, and compulsive
self-mutilation, is a rare sex-linked genetic
disease (Berman, Balis, & Dancis, 1969). The
genetic mutation is responsible for a deficiency
of hypoxanthine phosphoribosyltransferase
activity, an enzyme involved in purine metabolism
(Rubin, Dancis, Yip, Nowinski, & Balis,
1971; Seegmiller, Rosenbloom, & Kelley,
1967). Although mental retardation was originally
thought to be a necessary concomitant of
the disease (Nyhan, 1968), another
report described a child with the disease who
had normal receptive intelligence (Scherzer &
Ilson, 1969). However, the children are severely
impaired in motor function and are unable to
stand or sit without support. Speech is
dysarthric and difficult to understand.
The diagnosis is suggested by the
symptomatology associated with excessive urinary
excretion of uric acid and is confirmed by


demonstrating the specific enzyme defect in 
red or white blood cells or skin fibroblasts. 
Family patterns of inheritance and mosaicism 
in tissues of the obligatory heterozygote 
(Migeon, Der Kaloustian, & Nyhan, 1968; 
Silver, Cox, Balis, & Dancis, 1972) indicate an 
X-linked recessive disease. Management strategies
are supportive of kidney function by
preventing excessive production of uric acid
with allopurinol, thereby greatly increasing life
expectancy. Questions concerning procedures
for managing the behavioral consequences of
the illness and for optimizing the life adjustment
of these children thus become relevant.
One of the most striking and disabling 
characteristics of the disease is a pattern of 
self-injurious behavior (SIB) that affects 
virtually all Lesch-Nyhan children (Lesch & 
Nyhan, 1964). Near the end of the second year 
of life, they begin to bite their fingers, lips, or 
arms with sufficient force to cause lacerations, 
expose tendons, and even amputate fingertips. 
If not prevented, SIB occurs at a very high
rate, accompanied by general agitation and
crying. Reliance on tubular arm splint
restraints to prevent finger biting and extraction
of teeth to prevent lip biting have constituted
the only therapeutic approach to SIB in
Lesch-Nyhan children.

Although no direct studies have been
reported that demonstrate normal pain
sensitivity, observers agree that pain thresholds do
not appear to be elevated in Lesch-Nyhan 
children and that the children are manifestly 
distressed by the tissue damage. In fact, it is 
commonly observed that the children appear 
placid and passive while in restraints but be- 
come agitated and distressed if the restraints 
are removed. It also seems clear that the pat- 
tern of SIB in these children differs from the 
pattern of occasional and accidental injury 
sometimes reported in individuals with con- 
genital lack of pain sensitivity. 

In our preliminary investigations, using 
naturalistic observations and careful interviews 
with parents, a pattern of possible environ- 
mental support for the behavior began to 
emerge. The frequency of self-injury varied 
considerably and predictably with variations 
in the environment. For example, we recorded 
attempts at SIB in one child while he was cared 
for by his grandfather, his mother, and his 
grandmother. During a 15-minute period when 
the grandfather cared for the child, there were 
110 attempts to self-injure on the first day of 
observation and 104 on another day. With the 
mother, the corresponding figures were 74 
and 77, whereas with the grandmother there 
were “only” 10 and 12 attempts at self-injury.
The chronological order of these observations 
was counterbalanced to exclude an artifact of 
a time trend. 
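To make the caregiver comparison concrete, the counts above can be converted to attempts per minute. The sketch assumes that each count refers to a 15-minute observation period, which the text states explicitly only for the grandfather; the dictionary layout and names are illustrative.

```python
# Assumption: every count comes from a 15-minute observation period.
OBSERVATION_MINUTES = 15

# SIB attempts on the two observation days, as reported in the text.
attempts = {
    "grandfather": (110, 104),
    "mother": (74, 77),
    "grandmother": (10, 12),
}

# Attempts per minute on each day, rounded to two decimals.
rates = {
    caregiver: tuple(round(n / OBSERVATION_MINUTES, 2) for n in counts)
    for caregiver, counts in attempts.items()
}
```

Under that assumption the rate is about 7 attempts per minute with the grandfather, about 5 with the mother, and below 1 with the grandmother, and each caregiver's two days agree closely.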

The consistency of different response rates 
in the presence of different family members and 
the wide range of response rates among family 
members suggest a behavior that is under 
specific discrimination. Thus, although it is 
clear that the SIB is intimately related to the 
genetic defect, important environmental con- 
tributions may also be presumed. A series of 
studies were therefore initiated to identify the 
contingencies that supported SIB and to 
design an intervention to reduce the use of 
restraints. 


Method 
Subjects 


Five males, C.S., P.B., C.W., R.K., and J.A., 3, 5,
[...], 12, and 13 years old, respectively, were studied
while they were inpatients in the clinical research unit 
of New York University School of Medicine. All cases 
had been previously studied for extended periods, and 
the diagnosis had been confirmed by demonstration of 
the enzymatic defect. The oldest patient was
institutionalized, whereas the four younger ones lived at home.
They demonstrated the characteristic motor retardation
and spasticity and could not sit or stand without
support. Prior to treatment the children wore arm restraints
and bore scars on fingers and lips. Upon removal of
restraints, they would immediately become agitated and
begin finger biting with an intensity that would lead to
severe injury in a matter of minutes. In addition to
finger biting, other SIB such as lip biting, head banging,
or neck snapping were present in all children.

Despite the fact that all were severely impaired in
verbal expression and motor control, it was our
impression that the receptive intelligence of these five
children ranged from mildly retarded through normal.
Because of their extremely poor verbal and motor
abilities, attempts to administer intelligence tests were
either not undertaken or produced scores that seemed
far too low to be a valid indication of IQ. However, the
children are responsive to social situations and to
speech. Three of the children can count, tell time, and
have memorized the television schedule. At times, they
seem to speak in sentences, although none can [...].
The unusual nature of the disease makes it difficult to
accurately evaluate their intelligence, but they are not
severely retarded.

Written informed consent was obtained from both
parents of all children, and our procedures were
approved by the Medical Center’s Human Research
Committee. Parents and therapists experienced the
shock levels delivered to the children.


Procedure 


Four learning paradigms were investigated in the
children: punishment, positive reinforcement of
non-SIB by noncontingent attention, positive reinforcement
of SIB by contingent attention, and time-out contingent
on self-injury. These paradigms were compared to a
response prevention baseline. Overriding clinical
considerations prevented us from adhering to a [...]
research design. Number of learning trials, sessions, and
order of investigation of learning contingencies varied
slightly from child to child. The specific information is
provided in Figure 1. Throughout the study, two
independent raters recorded the number of attempted
SIBs. Observer reliability was found to be high
(Pearson r = .94).

Response prevention. The child was seated in a
wheelchair, and his arms were released from the
restraint. An assistant, seated in front of the child,
intercepted the child’s hand as it approached his mouth
and placed it back in the child’s lap without speaking
or otherwise interacting with the child.

Punishment. Subjects C.S., P.B., C.W., and R.K.
were given electric finger shock contingent on
hand-to-mouth contact. The fifth child, J.A., was given
shock contingent on attempts at head banging. Shock
was delivered by a Scientific Prototype [...]
source (102K) through electrodes attached to [...]
fingers of the hand that was most frequently [...]. In
all cases the shock intensity was 3 mA, [...]
described as quite painful but well below
[...]. The response prevention
[...] concurrently with finger shock.
Positive reinforcement of non-SIB by noncontingent
attention. When the child was not agitated or attempting
SIB, the therapist would interact with him. The
interaction consisted of verbal reinforcement
(“[...] doing fine,” etc.) and stroking the
child’s arm, head, and so forth. The first session
began with the removal of one arm restraint. The
therapist used the response prevention procedure to
prevent injury and would hold and stroke the arm and
give attention only during periods of calm.
Initially, attention was given for very brief periods
of calm and gradually developed into longer
periods. When biting had been eliminated for intervals
of [...] minutes, the second arm restraint was
removed.

Positive reinforcement of SIB by contingent attention.
The purpose of this approach was to evaluate, under
controlled conditions, the effect of the management
commonly used by parents in the home. The therapist
responded to each attempted SIB with a quick
intervention to prevent injury while making reassuring
Figure 1. Self-injurious behavior (SIB) during therapeutic intervention and at follow-up. (Left side of
chart records [...] of SIB during 30-minute treatment sessions administered by therapist. Right
side records time spent in restraints at home and in school, at follow-up. Pos. Reinf. = positive
reinforcement. CS, PB, CW, RK, and JA are the initials of the subjects.)


statements and stroking the child. Non-SIBs were 
ignored. 

Time-out contingent on SIB. Time-out consisted of 
the withdrawal of all attention to the child following 
each SIB attempt. As the hand began moving toward 
the mouth, the therapist simply turned away from the 
child for about 5 sec. The response prevention procedure 
of intercepting hand-to-mouth contacts was
discontinued.


Results 


Figure 1 presents the rate of SIB attempts 
by each child for each of the treatment modal- 
ities in which that child participated. Each bar 
represents the data from three ½-hour sessions.
Usually three sessions were conducted each 
day. Note the relatively constant level of SIB 
under baseline conditions. In all cases punish- 
ment failed to suppress the frequency of SIB 
and, in fact, it appeared to act as a facilitator.


Figure 2. The effect of noncontingent attention, contingent attention, and time-out on frequency of self-injurious behavior (SIB) in 12-year-old Lesch-Nyhan child R.K. (Each point represents the mean hand-to-mouth contact per minute in a 5-minute observation period.)


The rate of self-injury ranged from 5 to 14 per 
minute, making the total number of learning 
trials for each 30-minute “punishment” session 
substantial. 

Positive reinforcement for non-SIB produced 
a decrease in the attempts for Subjects C.S. 
and C.W. but not for R.K. The procedure was 
not tried with Subjects P.B. and J.A. Time-out 
was associated with a decrease in SIB for 
Subjects P.B., C.W., R.K., and J.A. but was 
not tried with C.S. With the exception of 
Subject R.K., for whom the procedure was not 
used, the combination of time-out with positive 
reinforcement of non-SIB eliminated self- 
injury in all cases. 


The results of positive reinforcement of SIB
by contingent attention are presented in
Figure 2. This procedure was used with only
the 12-year-old child (Subject R.K.) and was
conducted prior to therapeutic training. On
Day 1, the child was given 1 hour of non-
contingent attention (reinforcement of non-
SIB), 40 minutes of attention contingent on
self-injury (reinforcement of SIB), and 1 hour
of time-out. On Day 2, the order of the time-
out and noncontingent attention procedures
was reversed. For each of the 2 days, the ses-
sions were continuous with no pause between
manipulations. Each data point represents the
mean number of finger- or lip-biting responses
per minute for blocks of 5-minute observations
averaged over the 2 test days. Both non-
contingent attention (reinforcement of other
behaviors) and time-out contingent on self-
injury resulted in virtual elimination of
self-injury. Positive reinforcement of SIB by
giving attention contingent on self-injury,
however, produced a rapid and dramatic in-
crease in the rate of finger and lip biting.

An extinction curve can be observed that
extends into the early minutes of the time-out
period (following termination of the 40-minute
period of contingent attention). The reinforce-
ment by contingent attention period, originally
planned for 1 hour, could not be completed
because the SIB occurred with increased [...]
and force and the therapist was unable to
respond quickly enough to prevent damage.


Figure 3. Effects on self-injurious behavior for Subject J.A. (Conditions shown across observation periods: response prevention, shock plus response prevention, shock, and time-out.)


Figure 3 contains a summary of the results
during procedures intended to differentiate the
influence of positive reinforcement by con-
tingent attention intrinsic to response preven-
tion from the effect of shock. Prior to thera-
peutic training, response prevention alone,
response prevention plus finger shock (3 mA),
shock alone, and time-out were investi-
gated in Subject J.A. Four 15-minute observa-
tions were made each day for 6 consecutive
days. Each manipulation was studied during
three of these observation periods. Response
prevention was used as a baseline against which
the effects of the other procedures were
viewed. Thus, the response prevention base-
line procedure both preceded and followed
[...] of the other manipulations. This child
was selected for this investigation because his
teeth had been extracted and it was possible
to eliminate the response prevention procedure
without fear of serious damage. Each data
point represents the rate of SIB per 5-minute
period.

The response prevention procedure was
evaluated four times during the 6-day test. On
each occasion there was a consistent and rela-
tively stable rate of between four and seven
attempts at self-injury per minute. When
shock was administered without interrupting
[...] (response prevention discontinued),
the self-injury rate remained at about
the level of the response prevention “baseline.”
When shock and response prevention were
combined, the rate increased to almost twice
that of either procedure alone. Time-out
produced a rapid decrease in SIB to less than
one per minute.


Generalization Training


Following elimination of SIB by time-out
and the reinforcement of non-SIB be[...],
additional therapists were introduced
in sequence. In all cases, by the fifth therapist,
the generalization process was complete and
[...] eliminated in the presence of all
[...]. Parents were then brought
to the hospital and trained in the use of time-
out and positive reinforcement of non-SIB
[...]. The parents received 1-3 days of
hospital training. The training was then
transferred to the home, and, depending on
the ease of training, the therapist visited the
home for 3-7 days.


Follow-up 


The follow-up period ranged from 22 to 24 
months. During this time, a therapist was 
available, on request, for home visits to 
analyze difficulties and refine the therapeutic 
skills of the parents. Rarely, however, did the 
therapist directly intervene in a behavioral 
problem, limiting herself to advice and in- 
struction. Most requests for additional assist- 
ance were not for recurrences of SIBs but for 
problems such as temper tantrums, breath 
holding, spitting, cursing, or other antisocial 
or noncompliant behaviors that had not been 
treated in the hospital. 

For a 1-week period at the end of follow-up, 
each family was asked to record the daily fre- 
quency of SIBs during mealtimes, bedtime, 
and while bathing or changing diapers. The 
parents also recorded the total amount of time 
their child spent in and out of restraints and 
the conditions surrounding the use of restraints. 

The youngest child (C.S.) was without arm 
restraints for 18 months following discharge, 
at which time the parents decided to use them 
occasionally during high-risk periods (defined 
as when the child was “cranky,” tired, or 
during automobile rides). It appeared that 
restraints were easier for the parents than the 
constant vigil required to maintain appropriate 
contingencies. Most recurrences of self-injury 
could be attributed to a specific event such as 
an untrained grandmother moving into the 
house or the illness of the mother necessitating 
a radical change in the daily routines.

SIB has been almost completely absent in 
the school setting. The rating form revealed 
that on the average, the child was in restraints
5% of the day. No incidents of self-injury were 
noted during the rating period, although the 
mother used the time-out procedure approxi- 
mately three times per day to interrupt 
threatened finger biting. At bedtime the child 
was placed in restraints.

The 5-year-old subject (PB) remained free 
of arm restraints in school. At other times 
restraints were used under specific circum- 
stances such as the bus ride to and from school. 


The behavior was under a high degree of
stimulus control. With certain adults or specific
settings, SIB did not occur. During the 1-week 
period when the rating form was in use, the 
child was in restraints 12% of the day, and 
self-injury in the form of biting the inside of 
the mouth occurred, on the average, twice per 
day. At bedtime the child was placed in 
restraints. 

The 11-year-old subject (CW) has remained 
free of restraints while at school and at bed- 
time. At home he remained free of restraints 
for 6 months following treatment, at which 
time he was placed again in arm braces by his 
parents. The failure to maintain the gains in 
the home coincided with the development of 
a chaotic situation following the hospitalization 
of the mother and the inability of the father to 
care adequately for the child. One finger-biting 
event was recorded in the final week of follow- 
up, but there were numerous instances of head 
snapping. He was in restraints approximately 
39% of the day. 

The 12-year-old subject (RK) remained 
completely free of restraints during the day 
but was placed in elbow braces at bedtime. 
This child has never engaged in SIB during 
the night, and the restraints seemed more 
related to parental habit than necessity. No 
SIBs were recorded during the 1-week period 
when the rating form was in use. 

The oldest boy (JA), who is institutionalized 
at a state school, has been without restraints 
for the entire follow-up period. His teacher 
and a nurse’s aide with primary-care responsi- 
bilities for the child have responded inde- 
pendently to the questionnaire and to personal 
interview. Neither has reported the need of
either day or night restraints and neither is
aware of any recent attempts at self-injury. A
physical examination of the child did not reveal 
any signs of recent self-injury. 

In summary, there has been a considerable 
reduction in the dependence on restraints for 
all of the children. With two children, arm 
restraints have been discontinued entirely 
during the day. For another two, the restraints 
are applied at infrequent intervals, usually in 
response to the parent’s wish to temporarily 
end the need for maintaining appropriate 
contingencies rather than to treat SIB. The 
fifth child, after 6 months free of restraints, is 
now in arm splints almost half of the day. This
is undoubtedly related to a turbulent home
situation, with both parents being invalids and
with the child having unruly siblings. Most
important to educational and social develop-
ment, the children are usually free of arm
restraints in school. At night three of the five 
children are placed in restraints. 


Discussion 


To conduct an operant conditioning analysis 
without permitting self-injury, a response 
prevention procedure was instituted. Following 
removal of the restraints, a therapist seated in 
front of the child interrupted movement of 
finger to mouth just before injury could be 
inflicted. The frequency of such movements 
provided the baseline against which to measure 
the effectiveness of a variety of interventions.

Because of the reported effectiveness of the 
punishment paradigm in the alleviation of SIB 
associated with other diseases (Lovaas & Sim- 
mons, 1969), electric shock administered to the 
fingers was the first conditioning modality that 
was tried. In most instances, there was an 
increase in finger-to-mouth movements (Figure 
1). The increased rate of SIB attempts was 
attributed to the effect of shock superimposed 
on the response prevention procedure. Separate 
analysis of these two components was possible 
in J.A., whose teeth had been previously re-
moved to prevent finger biting. Shock and
response prevention were roughly equivalent
in their ability to maintain the rate of SIB
(Figure 3). The unexpected ineffectiveness of
aversive stimulation contrasted with the rapid
and consistent success in eliminating self-
mutilation using time-out and positive rein-
forcement of non-SIB. There did not appear
to be any replacement of SIB with other un-
desirable behaviors. Spitting and cursing were
common, but these behaviors usually ante-
dated treatment.

In other diagnostic categories, punishment
has been found to be consistently and [...]
effective in reducing the rate of SIB ([...]
& Lovaas, 1968; Corte, Wolfe, & Locke, [...];
Lovaas & Simmons, 1969; Risley, 1968; [...],
1972; Tate & Baroff, 1966; Young & [...],
1974). Although failures to maintain the [...]
produced by contingent electric skin shock
frequently occur (Harris & Romanczyk, [...];
[...], Simmons, & Frankel, 1974; Romanczyk
& Goren, 1975), no report could be located
that indicated a total ineffectiveness of the
procedure. The failure of contingent shock to
reduce the rate of self-injury thus seems to be
peculiar to Lesch-Nyhan disease and may
therefore be a clue to the action of the genetic
defect and how it interacts with environmental
experience to produce the behaviors character-
istic of this disease.

Significant gains, as measured by freedom
from arm restraints, were maintained in all
children during an observation period of 2
years. The opportunity to attend school
without restraints permitted participation in
educational and social activities, making for
[...]ier children and easing tensions at home.
Gains at home were also significant but
more irregular, indicating that either
[...] can more reliably maintain the
appropriate contingencies or perhaps a busy
[...], nurse’s aide, or an unsuspecting
[...]er are less likely to attend to finger
biting, thus creating a discriminative period
for non-self-injury.

The maintenance of long-term improvement
in general is better than that reported for SIB
associated with autism, schizophrenia, or
mental retardation (Harris & Romanczyk,
[...]; Romanczyk & Goren, 1975). This may
be related to the presumed higher intelligence
of the Lesch-Nyhan child and the ability to
participate in socially appropriate activities
when freed of physical restraints. This ability
to interact with their environment when not
restrained, and the loss of that activity con-
tingent on self-injury, may provide the
necessary contingencies to account for the
persistence of treatment effects.

It appears evident from this study that the SIB
of [...] disease is strongly influenced
by environmental factors. Contingent atten-
tion [...] serves as a strong reinforcer (see
[...]). Inexperienced attendants, such as
[...] and grandparents, are likely to react
[...] with contingent attention. On the other
hand, when trained in time-out and appro-
priate [...] techniques, the same
[...] elimination of the un-
[...] must be
[...] with the primary
enzymatic defect of Lesch-Nyhan disease, a
deficiency in hypoxanthine phosphoribosyl-
transferase, it is not an inevitable consequence.
The sequence of events leading to the estab- 
lishment of SIB is not known, but its mainte- 
nance is dependent on environmental support. 

In selecting children for this treatment, a 
careful analysis of the home situation and 
parental motivation is essential. Parents should 
be carefully trained and candidly warned of 
the difficulties they will encounter, and 
initiation of treatment should be postponed 
until the parents are able to make use of 
consultation. In training parents or therapists 
to administer this therapy, there are three 
treatment issues to be emphasized. 

First and most important, in the early stages 
of training, time-out or isolation should be 
used with caution and only under the direction 
of a skilled therapist. Time-out is an extinction 
procedure, and therefore the schedule of 
reinforcement that has maintained the behavior 
is critically related to the rate of behavior 
change. Because of the constant vigilance that 
these children require, most SIB is maintained 
by almost continuous reinforcement. Behavior 
maintained on a high-density reinforcement 
schedule is the most amenable to extinction 
procedures, and it is important that this 
advantage not be lost. Care should be taken 
that self-injury is not reinforced during time- 
out or isolation, thereby increasing its resis- 
tance to extinction and accidentally rendering 
the time-out or isolation procedures ineffective. 
If parents or therapists have already trained 
a child to self-injure in the absence of direct 
visual contact, an abrupt change in the salient 
characteristics of the isolation situation or a 
gradual errorless learning procedure may
prove helpful. It would be unwise to initiate
therapy unless isolation can be made a dis-
criminative period for non-self-injury.

Second, in designing continuing care proce- 
dures, it should be remembered that self-injury 
has proven to be an attention-eliciting be- 
havior. The likelihood of maintaining good 
treatment effects depends heavily on the child’s 
ability to gain attention in socially appropriate 
ways despite the extreme limitations of
physical movement imposed by the spasticity. 
Self-help skills such as eating without assist-
ance, arranging access to television, books, the
out-of-doors, and so on, must be devised. Our
most successful cases had willing and capable
parents as well as access to educational insti- 
tutions in which training in self-help skills was 
emphasized. 

Third, the results of this study suggest that 
Lesch-Nyhan children are either reinforced 
by, do not learn from, or are unable to inhibit 
a response associated with a “punishment” 
contingency. This effect is specific to punish- 
ment procedures in that the children can learn 
to inhibit a response when a time-out con- 
tingency is used. From the parents’ point of 
view, it appears that any attempt to use 
discipline or punishment for any behavior (on 
close scrutiny, the range of inappropriate be- 
havior is endless) is not only ineffective but 
frequently has the effect of increasing the rate 
of behavior. Inasmuch as discipline is a primary 
means of socialization and parental reliance on 
the technique is so pervasive, it is unreasonable 
to expect that the therapist’s instructions can 
be adequately followed on a daily basis. Thus 
the procedures described above should be 
viewed as management techniques for specific 
behaviors (e.g., finger or lip biting) occurring 
in specific situations (e.g., physical therapy or 
other formal educational experiences) and not 
as a general cure for the manifold behavioral 
problems associated with Lesch-Nyhan disease.


On the Decision to Be Assertive


This study examines the applicability of an expectancy/decision model to asser-
tiveness in a nonclinical population. Assertiveness, defined here as refusal to
comply with an unreasonable request, has been researched extensively from the
viewpoint of behavior theory, which prescribes anxiety reduction and skill acqui-
sition for the training of assertive behaviors. However, little has been done to
investigate the reasons why assertive behavior occurs in one situation and not
in another. The results of this study suggest that participants, irrespective of
their scores on standard measures of assertiveness and of anxiety, consider the
consequences of being assertive when making a decision about how to behave.
Moreover, it was found that the difference between participants who chose an
assertive response and those who did not lies in the formers’ assessments of the
probabilities that bad consequences will occur and good consequences will not
rather than in their evaluations of how bad or how good those consequences
would be. These results imply that training programs should take into account
the participant’s perceptions of the risks involved in being assertive and that the
focus should be on changing these perceptions rather than on attempting to
change his or her values or focusing solely on specific assertive behaviors.


Studies of assertiveness and designs for
assertiveness training programs have revolved
[...] less assertive than they
[...]. One hypothesis (Wolpe, 1958, 1969;
Wolpe & Lazarus, 1966) is that nonadaptive
anxiety inhibits the expression of assertiveness.
The second hypothesis (Lazarus, 1971) is
[...] people lack the necessary
[...] assertiveness. In practice, these
hypotheses are not mutually exclusive and may
[...] (Wolpe & Lazarus,
1966) [...] they are presented as such to
[...] the most salient features of assertive
[...] as described in each theory. To
[...] and extend these views, we
propose a third hypothesis: Prior to acting
either assertively or nonassertively, people
weigh the consequences that could be expected 
to result from either behavior and they elect 
the behavior that appears most favorable. 
That is, the decision to act assertively is not a 
general trait. Instead, it varies in any situation 
according to the consequences expected by 
the person involved. Differences between 
persons who tend in general to be assertive and 
those who tend in general to be less so lie 
in differences in their expectations about 
these consequences. 

Assertiveness training programs are designed
to help people who have problems with
interpersonal communication as a result of
overly passive or overly aggressive behaviors.
Using techniques such as instruction, modeling,
rehearsal, and feedback, training attempts to
reduce the anxiety of interpersonal encounters
and to teach specific behaviors such as appro-
priate eye contact, appropriate voice tone,
use of “I” statements, appropriate body
language, and so on. These programs have
been successfully offered to a variety of clinical
populations (e.g., Hersen, Eisler, & Miller,
1973).

Even though there is evidence that as a
result of decreased anxiety and acquired 
assertiveness skills, there are changes in 
both assertive behavior and self-report of 
assertiveness (McFall & Lillisand, 1971; 
Rathus, 1973), there are still theoretical 
difficulties. One difficulty is that if one adopts 
a very extreme interpretation of either of the 
original models, training that lowers anxiety 
or teaches skills should result in the trainee 
being rather consistently assertive in all 
subsequent situations that call for assertive- 
ness. This, however, is patently not the case. 
One problem is that behaviors encouraged by 
training often are not rewarded, or are even 
punished, by others. A woman who is learning 
to be assertive may find that she was more 
highly valued when she was accommodating, 
self-denying, and quiet. Her assertiveness may 
increase her self-respect, but she may be 
unwilling to accept the reactions of others to 
her behavior and therefore cease to use her 
new skills. Even though in actual practice 
assertiveness trainers tacitly recognize this 
difficulty, neither a pure anxiety reduction 
theory nor a pure response acquisition theory 
approach can account for a subject’s own 
evaluation of his or her response. 

The second difficulty lies with the “lack of 
skills” hypothesis: There is evidence that 
unassertive persons do know the appropriate 
assertive responses. Gottman and Schwartz 
(Note 1) found that nonassertive persons did 
not differ from highly assertive persons in 
their ability to construct a written response 
or verbally deliver an assertive response in a 
hypothetical situation. The difference between 
the two groups of persons did not become 
apparent until they were confronted with a 
situation that was highly similar to an actual 
interpersonal confrontation. In short, non- 
assertive people may well know what to do, 
but in stressful or high-risk situations they 
tend not to do it. 

Although both skill acquisition and anxi- 
ety reduction models explain some of the 
consequent behavior change experienced 
during assertiveness training, a cognitive 
approach can add to our understanding of 
this process by exploring the conditions 
under which a person will choose to act in 
an assertive manner and how that person
then evaluates his or her response. We propose,
therefore, that a person’s willingness to be
assertive, defined here as his or her refusal 
to comply with an unreasonable request, 
can be predicted from knowing how he or she 
evaluates the possible consequences of being 
assertive in various kinds of situations. What 
differentiates assertive and nonassertive per- 
sons is less a matter of their “personalities” 
or differences in their repertoire of skills than 
it is of differences in how they evaluate the 
consequences of being assertive in different 
kinds of situations. This hypothesis implies 
that if a person maintains a rather stable, say, 
negative, evaluation of a specific class of 
situations, it would be expected that he or she 
would characteristically be nonassertive in 
such situations but not necessarily in other 
situations. It is also likely to be true that in 
any interaction a person may evaluate the 
consequences of assertion to be negative due 
to the specific characteristics of the situation 
(who is involved, the timing of the response, 
past history of response with that individual, 
etc.). In a similar interaction with different 
topographical attributes, assertion may be 
perceived as the desirable response. 


Decision/Expectancy Theory


Decision-expectancy theory provides a
model for describing choice behavior. The
basic assumptions are that people act in a
fairly rational way and that their behavior is
determined by their expectation that the
behavior will lead to various consequences
and their evaluation of these consequences
(Mitchell & Biglan, 1971). Choosing the
action that will result in the maximum
expected long-range gain requires comparing
the alternatives in terms of the decision
maker’s utility for each of the possible con-
sequences and in terms of the decision maker’s
expectation that the action will result in the
attainment of each of these outcomes; the
action that promises the larger expected
gain (or the least expected loss) is the one to
be selected.

Actually, there are two similar but formally
different models. The first, expectancy theory
as described by Fishbein and Ajzen (1979),
is concerned with the relationships among
[...] its magnitude is assessed by
[...] person’s subjective judgment
[...] or not he or she will perform
[...]. The attitude toward a behavior,
[...] of potential consequences, mea-
sured [...]-point bipolar affective dimension,
[...] measured belief about the
[...] that the consequences will occur.
In the present case, a subject’s evaluation of
the consequences that could result from
[...] would be multiplied by his or her
[...] these will occur as a result of
[...]. The sum of these products is added
[...] component that reflects relevant social
[...] usually obtained from ratings about
significant others, social norms,
[...]. [...] sum is assumed to be monotonic-
ally [...] the magnitude of the person’s
[...] perform (BI) the assertive
[...].

The second model, subjective expected
utility (SEU), comes from the area of decision
[...] (Edwards, 1954). It differs from the
[...] model primarily in its greater
[...] simplicity, but the logic is quite
[...]. [...] the person must decide between
[...] and a nonassertive course of
[...] she must evaluate these actions
[...] the utilities of their positive
[...] ($U$), the utilities of their negative
[...] ($\bar{U}$), and the subjective prob-
[...]

$$
\begin{aligned}
\mathrm{SEU}_{A} &= P_{1A}U_1 + P_{2A}U_2 + \dots + P_{nA}U_n \\
&\quad + (1 - P_{1A})\bar{U}_1 + (1 - P_{2A})\bar{U}_2 + \dots + (1 - P_{nA})\bar{U}_n, \\
\mathrm{SEU}_{\bar{A}} &= P_{1\bar{A}}U_1 + P_{2\bar{A}}U_2 + \dots + P_{n\bar{A}}U_n \\
&\quad + (1 - P_{1\bar{A}})\bar{U}_1 + (1 - P_{2\bar{A}})\bar{U}_2 + \dots + (1 - P_{n\bar{A}})\bar{U}_n,
\end{aligned}
$$

where $A$ and $\bar{A}$ represent the two classes of
[...]. [...] the sum of the utilities of the possible
positive and negative consequences weighted
by their probabilities of occurrence, and the
[...] should select the action with the larger
[...]. [...] social component of the expectancy
model is treated as just another utility in the
SEU model, which is what makes the latter 
a simpler model. Merging expectancy theory 
and the SEU model yields the prediction that 
the greater the SEU for an action, the more 
the person should intend to follow that course 
of action (BI). This model has been applied 
to a number of areas including career decision 
planning (Holmstrom & Beach, 1973; Mitchell 
& Beach, 1977; Muchensky & Fitch, 1975), 
third-grade children’s decisions to attempt 
academic tasks (Gray, 1975), and family- 
planning decisions (Townes, Beach, Campbell, 
& Martin, 1977). 
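
The comparison of SEUs for two courses of action can be illustrated with a minimal sketch. It uses the simplest reading of the model, in which each consequence's utility is weighted by its subjective probability and the products are summed; the function names, the signed utility scale, the percentage probability scale, and all the ratings below are hypothetical choices made for illustration and are not data or code from the study.

```python
from typing import Sequence


def seu(utilities: Sequence[float], probabilities: Sequence[float]) -> float:
    """Sum of utility x probability over the listed consequences.

    utilities: signed ratings (negative = undesirable, 0 = neutral).
    probabilities: ratings given as percentages, rescaled here to 0..1
    (the rescaling is an assumption of this sketch).
    """
    if len(utilities) != len(probabilities):
        raise ValueError("one probability is needed per consequence")
    return sum(u * (p / 100.0) for u, p in zip(utilities, probabilities))


def preferred_action(seu_refuse: float, seu_comply: float) -> str:
    """Return the action with the larger SEU (ties go to 'comply' by assumption)."""
    return "refuse" if seu_refuse > seu_comply else "comply"


# Hypothetical ratings for three consequences (not data from the study).
utilities = [3, -2, 4]
p_if_refuse = [80, 30, 50]
p_if_comply = [20, 60, 10]

seu_refuse = seu(utilities, p_if_refuse)
seu_comply = seu(utilities, p_if_comply)
print(round(seu_refuse, 2), round(seu_comply, 2),
      preferred_action(seu_refuse, seu_comply))
```

With these made-up ratings the SEU for refusal exceeds the SEU for compliance, so the model would predict an intent to refuse.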

This study examines women’s intent to act 
or not act in an assertive manner in light of 
their evaluations of the possible consequences 
of the two behaviors and the amount of risk 
perceived to be involved in performing either 
behavior given the characteristics of the 
situation. The strength of BI is measured 
by the number of times the participant states 
her intent to refuse a series of unreasonable 
requests that are made in videotaped or 
written scenes involving a high or low status 
male or female antagonist. Separate judgments 
of utility and probability permit computation 
of SEUs for assertion and nonassertion;
persons differing in BI should have corre- 
sponding differences in their SEUs. 


Method 


General Strategy 


First, participants were administered an assertiveness
test and an anxiety test. Next they were presented with
nine scenes (via videotape or written scripts) in
which a male or female, authority or peer, made
an unreasonable request. They were then given a list
of 15 possible consequences of assertiveness and nonasser-
tiveness and were asked to rate the utility (desirability)
of each. Then the scenes were presented again, and
after each scene the participants [...] their subjec-
tive probabilities that each of the aforementioned
consequences would eventuate should [...]
in an assertive or nonassertive manner. Finally,
[...] would in [...]
refuse.¹ The utilities and subjective
probabilities were used to compute an SEU for each
scene for each participant, and the comply-refuse
statements were used to compute an overall index of BI
for each participant.

¹ Clearly, it is also possible to choose not to act in a
conflict situation (e.g., to wait for further information),
but for convenience sake we have assumed two general
and mutually exclusive classes of behavior even [...]
this is an oversimplification.


Participants 


Sixty-four women attending undergraduate psy- 
chology classes and 47 women in a professional dental 
hygiene program participated in this study. The women 
were asked to attend one session lasting about 1½
hours and were seen in small groups. Following the 
experiment they were either given credit or were offered 
a short assertiveness training course. 


Procedure 


Participants were told that this study concerned 
decision making in difficult interpersonal situations; 
the term assertiveness was not mentioned until the 
debriefing explanation following the experiment. 
First they were given the Rathus Assertion Inventory 
(Rathus, 1973) and the Trait scale of the State—Trait 
Anxiety Inventory (Spielberger, Gorsuch, & Lushene, 
1970). They were given a list of 15 positive and negative 
consequences (feelings and actions) that might result 
from interpersonal conflict. This list was developed 
by presenting six assertiveness trainers with three 
sample situations and asking them to generate lists 
of positive and negative consequences commonly 
experienced by clients. The resulting core list of 
consequences was similar to those suggested by Alberti 
and Emmons (1970). 

Participants reviewed the nine video or written 
scenes rapidly to give them an idea about what they 
were like and then rated their utilities for the conse- 
quences. They were asked first to decide if the con- 
sequence was positive or negative and then to show 
how positive by marking one end of a scale that ran 
from 1 to 4, or if negative, how negative by marking 
the other end of the scale from —4 to —1. A response of 
0 indicated that the consequence was neutral. 

They then were given a booklet of nine identical 
response sheets. The sheets were divided into two 
sections, the first for compliance and the second for 
refusal. Each was followed by a list of 10 consequences, 
which were generated from the core list used for the 


utility ratings. They were told that for each scene. 


individually, they were to mark on a scale from 1 to 
100 how probable each consequence would be if they 
personally were to comply and then how probable 
each would be if they personally refused to comply. 
(This method has been used successfully by Holmstrom 
& Beach, 1973; and Muchensky & Fitch, 1975.) 
When the probabilities were completed, participants 
were asked to mark whether they thought that they 
actually would comply or refuse the request; the propor- 
tion of refusals over the scenes was taken as the over- 
all BI measure for each person. 
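
The BI measure described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `behavioral_intent` and the example choices are hypothetical, and the sketch assumes one comply/refuse statement per analyzed scene.

```python
def behavioral_intent(choices):
    """Return the proportion of scenes on which the participant chose refusal.

    `choices` holds one string per scene, either "refuse" or "comply".
    """
    if not choices:
        raise ValueError("at least one scene is needed")
    refusals = sum(1 for choice in choices if choice == "refuse")
    return refusals / len(choices)


# Hypothetical participant: stated intent for the eight unreasonable-request scenes.
example_choices = ["refuse", "comply", "refuse", "refuse",
                   "comply", "refuse", "refuse", "refuse"]
example_bi = behavioral_intent(example_choices)
print(f"BI = {example_bi:.2f}")
```

A proportion near 1 corresponds to a participant who says he or she would refuse in nearly every scene.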


The Scenes 


Eight of the nine scenes that were used were from 
among those developed by Nedelman (1977) for a 
study of the generalization of assertive behaviors; all 
eight scenes involved an unreasonable request. A ninth 
scene, involving a reasonable request, was included 
to see if there was a tendency for subjects to refuse to 
comply regardless of the appropriateness of the 
request; results for this scene revealed no such bias, 
and responses to it are not included in the data analysis. 
The scenes and characters were described by a female 
narrator. Then there followed a direct request by the 
antagonist to the participant. Antagonists were either 
males or females, presented as either authorities or 
peers; each participant saw two examples of the four 
male-female/authority-peer combinations. For each 
session the sequences of scenes were presented in a 
scrambled order, with the restriction that each scene 
was presented as the first one in one of the sessions 
to spread the effect of presentation order. 

Half of the participants saw videotape dramatizations 
of the scenes, and half received the same scenes as 
written scripts in booklet form. 


Analysis of SEU 


An SEU was computed for each scene by multiplying 
rated utilities by the rated probabilities of each of the 
20 consequences. The SEUs were computed separately 
for compliance and for refusal and were then combined 
to yield a single score for each scene by subtracting the 
smaller SEU from the larger and assigning a minus 
sign to the difference if the larger SEU was for 
compliance and a plus sign if it was for refusal to comply. 
This was done so that there would be only one SEU 
datum for each scene rather than two. (Two would 
be awkward to analyze.) SEUs for each pair of similar 
scenes were averaged giving each participant four 
SEUs, one for each of the variations—male authority, 
female authority, male peer, and female peer. First, 
participants were divided into three equal-sized 
groups according to their scores on the assertion 
measure (high, medium, and low). An analysis of 
variance for repeated measures was performed to 
compare groups’ SEUs and to examine the effects of 
the sex and status of the antagonist on SEUs. Then 
the analysis was repeated dividing the participants into 
three equal-sized groups according to their scores on 
the anxiety measure (high, medium, and low). Finally, 
participants were again divided into three equal-sized 
groups on the basis of BI (high, medium, and low), 
and the analysis was repeated. Significant interactions 
were further analyzed using Duncan’s multiple-range 
test, and only significant effects will be discussed. 
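
The scoring steps in this section can be sketched as follows. This is a minimal illustration, not the authors' program: the function names are hypothetical, and two points are assumptions of the sketch rather than statements in the text, namely that the utility-by-probability products are summed over the consequences and that the 1 to 100 probability ratings are rescaled to proportions. Subtracting the smaller SEU from the larger and attaching a plus sign for refusal is equivalent to computing the refusal SEU minus the compliance SEU.

```python
def seu(utilities, probabilities_pct):
    """Sum utility x probability over one list of consequences.

    Assumption of this sketch: the 1-100 ratings are divided by 100.
    """
    if len(utilities) != len(probabilities_pct):
        raise ValueError("one probability is needed per consequence")
    return sum(u * (p / 100.0) for u, p in zip(utilities, probabilities_pct))


def scene_score(utilities, p_if_comply, p_if_refuse):
    """Signed SEU for a scene: positive favors refusal, negative favors compliance."""
    return seu(utilities, p_if_refuse) - seu(utilities, p_if_comply)


def pair_mean(score_a, score_b):
    """Average the scores of the two similar scenes of one variation."""
    return (score_a + score_b) / 2.0


def tertile_groups(scores):
    """Split participant ids into three equal-sized groups by score (low to high)."""
    ranked = sorted(scores, key=scores.get)
    n = len(ranked)
    return {
        "low": ranked[: n // 3],
        "medium": ranked[n // 3 : 2 * n // 3],
        "high": ranked[2 * n // 3 :],
    }


# Hypothetical ratings for one scene with only two consequences (illustration only).
example_utilities = [4, -4]
example_score = scene_score(example_utilities,
                            p_if_comply=[50, 50],
                            p_if_refuse=[75, 25])
print(example_score)
```

With real data each scene would carry the full list of 10 consequences, and `tertile_groups` would be applied in turn to the assertion scores, the anxiety scores, and BI.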


Results 


Assertiveness and Anxiety 


The first analyses were to see whether 
measured assertiveness and/or measured 
anxiety were related to BI and to each other. 
Table 1 contains the intercorrelations among 
these three variables for the psychology 

the dental hygiene students 

analyses show the two 
significantly different.) Neither the assertiveness 
test scores nor the anxiety test scores 
were significantly related to BI, and they 
were significantly negatively related to each 
other. 


As previously explained, each participant's 
SEU was computed for each of the 
four kinds of scenes. Because these 
are situation specific, it is inappropriate 
to average them; thus there was no single 
SEU for each of the participants 
to be correlated with their single BI 
score. Therefore, participants were divided 
into psychology and dental hygiene students, 
and each group of students was divided into 
three equal-sized groups on the basis of the 
BI measure: high, medium, and low. Then 
a 3 X 2 X 2 X 2 X 2 repeated measures analysis 
of variance was performed on the SEUs, 
with the independent variables of BI (high, 
medium, and low refusal to comply), kind of 
student (psychology or dental hygiene), 
method of presentation of the scenes (video 
or written), and the variables that defined the 
scenes, the status of the antagonist (authority/ 
peer) and sex (male/female). Similar analyses 
were performed using the assertiveness test 
scores and the anxiety test scores instead of 
BI. These analyses showed these two variables 


Table 1 

Correlations Among the Assertiveness Test Scores, the Anxiety Test Scores, and 
Behavioral Intent (BI) for Two Kinds of Students 

Psychology students 
  Assertiveness and anxiety 
  Assertiveness and BI 
  Anxiety and BI 

Dental hygiene students 
  Assertiveness and anxiety 
  Assertiveness and BI 
  Anxiety and BI 


Table 2 

Interaction Effects of Behavioral Intent (BI), 
Method of Presentation, Sex and Status of 
Antagonist: Mean Subjective Expected Utilities 


                     Status                Sex 
Variable        Authority    Peer     Male    Female 

BI 
  High            11.44     10.54 
  Medium          11.23      7.96 
  Low              6.21      1.97 
Presentation 
  Video                                6.80     5.24 
  Written                             10.61     7.57 
Sex 
  Male            11.15      6.83 
  Female           7.28      5.87 


Note. For Method of Presentation X Sex of 
Antagonist, F(1, 94) = 4.1, p < .05. For Sex X Status 
of Antagonist, F(1, 94) = 6.6, p < .01. For BI 
X Status of Antagonist, F(2, 94) = 5.7, p < .01. 

For the BI analysis all main effects and 
three interactions (BI X Status, Method of 
Presentation X Sex of the Antagonist, and 
Status X Sex of the Antagonist) were 
significant. 

The first main effect is BI, F(2, 94) = 9.6, 
p < .01, with a mean SEU for the high BI 
group of 10.99, for the medium BI group of 
9.60, and for the low BI group of 4.09. Using 
Duncan’s test the difference between the 
means of the high and low BI groups was 
significant at p < .05. The medium BI group 
did not differ from either the low or high BI 
groups. 

For the second main effect, kind of student, 
F(1, 94) = 6.8, p < .05, the psychology 
students had a group mean SEU of 6.67 and 
the dental hygienists’ mean was 9.42. This 
shows that the SEUs for the latter favored 
refusal more than did the SEUs for the 
psychology students. Further, the mean BI 
for the hygienists (1.72) was significantly 
higher, t(104) = 2.80, p < .01, than the mean 
BI for the psychology students (1.44), 
indicating that the hygienists said that they would 
refuse to comply with the requests in the 
scenes more often than the psychology students 
did. 


2 In light of this result for the 
familiar with this scale 
obtained means ae stan 
chology undergraduates, 
for dental hygiene students, M = 10.60, 5 
M = 12.62, S. 
0. SD = 22.41). 

The third main effect, method of presenta- 
tion of the scene, F(1, 94) = 10.0, p < .01, 
had a SEU for the written presentation (9.09) 
that was higher than for video presentation 
(6.02). The corresponding mean BIs were 
1.55 for the written presentation and 1.50 for 
video presentation, t(104) = .5, ns. These 
results suggest that BI is independent of 
the method of presentation but that the 
participants’ expectations for refusal (SEU) 
were more positive for those who saw the 
written scripts, and that those who saw the 
videotapes—a more realistic stimulus—were 
more likely to have less favorable expectations 
for refusal. 

The fourth and fifth main effects were for 
the status of the antagonist, F(1, 94) = 27.3, 
p < .01, with mean SEUs of 9.22 for authorities 
and 6.35 for peers and for the sex of the 
antagonist, F(1, 94) = 45.6, p < .01, with 
mean SEUs of 8.99 for males and 6.58 for 
females. Participants’ expectations for asser- 
tion to peers and to women were more negative 
than for assertion to authorities and males. 
Because these two variables define the scenes 
rather than divide participants into groups, 
as did the student and presentation variables, 
BIs could not be calculated for each of the 
four variants for comparison with the SEUs. 
However, it is possible to calculate the propor- 
tion of times that the participants stated an 
intent to refuse to comply for each level of 
the two variables. For authorities the propor- 
tion of refusal was .85, for peers it was .69, 
for males it was .77, and for females it was 
.76. Because these proportions each involve 
multiple contributions by all of the 
participants, no statistical tests can be performed. 
However, the proportions appear to be 
congruent with the SEU for status and less so 
for sex. 


All three significant two-way interactions 
involved the status and sex variables. These 
are shown in Table 2. 

Within the different kinds of scenes, the 
participants in the high and the low BI 
groups differed significantly in their SEUs 
for refusal to comply in peer scenes, although this 
is not true of their response to the scenes 
involving authority figures. For both scene 
types, as one moves from high to low BI, the 
SEUs for refusal to comply to authority and 
peer demands become increasingly disparate, 
though these differences did not reach 
statistical significance. For those persons who have 
a high BI to refuse, the difference between 
refusal to peer and to authority was only 
.90; for the medium group it was 3.27, and for 
the low group it was 4.24. For the participants 
with a low BI to refuse, unlike the high group, 
it appears much more difficult to refuse 
requests made by peers than by authorities. 
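
The three differences quoted in this paragraph follow directly from the BI by Status cells of Table 2. The short check below reproduces them; the variable and function names are hypothetical, and the cell means are copied from the table.

```python
# Mean SEUs from Table 2 (BI x Status of Antagonist): (authority, peer).
bi_by_status = {
    "high": (11.44, 10.54),
    "medium": (11.23, 7.96),
    "low": (6.21, 1.97),
}


def authority_peer_gap(table):
    """Authority mean minus peer mean for each BI group, rounded to 2 places."""
    return {group: round(authority - peer, 2)
            for group, (authority, peer) in table.items()}


gaps = authority_peer_gap(bi_by_status)
for group, gap in gaps.items():
    print(f"{group:>6} BI: authority - peer = {gap:.2f}")
```

The gap widens as BI falls, which is the pattern the paragraph describes.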

The interaction between method of presenta- 
tion and sex of antagonist shows that the 
SEUs for written presentation were higher 
overall than the SEUs for the video presenta- 
tion for both scene types. SEUs for refusal 
to comply with males were higher than SEUs 
for refusal to comply with females for both 
methods of presentation. 

The interaction between sex and status of 
the antagonists shows that the SEUs for 
refusal were higher for male authorities than 
for female authorities, male peers, and female 
peers. The SEUs for female authorities were 
higher than for female peers, with no differences 
between the remaining categories. The corre- 
sponding proportions of refusal to comply 
statements (BI) for the four kinds of antag- 
onists are .88 for male authorities, .82 for 
female authorities, .67 for male peers, and , 1 
for female peers—moderate agreement, with 
the SEUs. In sum, it appears that the highest 
SEUs for refusal occurred among the dental 
hygiene students, for participants with high 
BI, and when participants are confronted 
with authority figures, particularly males. 


Utility and Probability 

Because SEU is clearly related to BI, it is 
appropriate to see if the differences between 
participants who frequently state that they 
intend to refuse to comply and those who 
less frequently state intention to refuse can 
be attributed to differences in either their 
utilities for the possible outcomes or their 
subjective probabilities that these outcomes 
will occur. To do this the participants were 
divided into groups on the basis of the kind 
of student (psychology, dental hygiene), 
method of presentation (video, written), and 
BI (high, low). A repeated measures analysis 
of variance using utilities as the dependent 
variable yielded no significant effects of any of 
the independent variables. As can be seen in 
Figure 1, students at both levels of BI 
evaluated the utilities in the same way. 

A similar analysis was performed using the 
probabilities as the dependent variable. Here 
all three variables (type of student, method of 
presentation, and BI) had a significant effect 
on the probability estimates, and all 
interactions also were significant. This is illustrated 
for high and low BI participants in Figure 2. 

The graph shows that if they were to comply, 
the low BI participants would expect the 
positive consequences to be more probable, 
t(62) = 6.08, p < .01, than would the high 
BI participants. Similarly, if they were to 
refuse to comply, the low BI participants would 
mark the positive consequences as less probable, 
t(62) = 8.33, p < .01, and the negative 
consequences as more probable, t(62) = 4.29, 
p < .01, than the high BI participants. In 
short, the difference between high and low BI 
participants is not in their utilities for the 
positive and negative consequences but in 
their different perceptions of the probabilities 
of the positive and negative consequences 
occurring should they elect to comply or to 
refuse to comply. 


Discussion 


Schools and Presentations 


The analyses revealed significant effects 
for kind of students and for the method of 
presentation of material. SEUs for the dental 
hygiene group were uniformly higher than 
SEUs for the psychology students. Attitudes 
Beeta a the requests in the scenes 
May have b y their higher intent to refuse 
bat they s mediated by their expectation 

ee d soon be employed in a highly 
Id advice ere profession. The experience 
of those few who had already worked 
as dental assistants contributed to 


Figure 1. Utilities for participants who were high and 
low in behavioral intent. (Vertical axis: mean assessed 
utility; horizontal axis: negative and positive utilities.) 


The effect of presentation was also signif- 
icant. SEUs for the written scenes were higher 
for both sex and status and across schools, 
although the pattern of refusal to scenes did 
not differ. Even though the written scenes 
allowed participants to make greater use of 
their own experience in evaluating the scenes, 
the demand of the antagonist was less imme- 
diate. Participants evaluated the consequences 
for refusal as more negative when the request 
was made via videotape. Videotape has been 
a successful adjunct to treatment in assertive 
training studies for modeling appropriate 
responses (Hersen, Eisler, Alford, & Pinkston, 
1973), and for assessing change (Hersen, 
Eisler, & Miller, 1973). Although it has not 
been systematically tested in the literature, it 
is likely to be a better medium than audiotape 
or written instructions, since the stimulus it 
presents apparently is experienced as “more real.” 


The Scenes 


Although many studies acknowledge the 
variation of task difficulty involved in asser- 


tion, no systematic investigation 


is issue by hierarchically O i 
E R or task difficulty (Alberti & 
Emmons, 1970; Piaget & Lazarus, 1969; 
Wolpe & Lazarus, 1966). Situations involving 
interactions with persons of the same sex, peers, 
or persons who have a distant relationship 
are presented prior to those involving persons 


4 7891012345678 910 
123 N 5 


Consequences 
Figure 2. Probabilities for participants who were high 
and low in behavioral intent. 


of the opposite sex, spouse, or an authority 
figure (McFall & Lillisand, 1971). The 
rationale for this is to provide success in 
situations involving less risk of failure and 
emotional concomitants and to desensitize 
the anxious patient to difficult interactions. 
This pattern has become institutionalized, 
but, as was demonstrated in this study, it 
may not reflect the order of task difficulty 
for all participants. Perhaps since all subjects 
were students and were less likely to be 
confronted with authority figures in the work 
place, it is possible that they feared the 
consequences of assertion to authorities less 
than they feared “negative” consequences in 
relationships with peers. If this is the case, 
a training program designed to teach assertion 
should take into account the participants’ 
own evaluation of the salient perceived risks 
involved in the new behavior and the current 
situation. It is likely that these may differ 
from participant to participant and from 
established training procedures. 


Behavioral Intent 


The measure of intent to refuse (BI) was 
highly related to SEU for the content scenes. 
This attests to the situational specificity of 
assertion. It suggests that assertiveness is not 
a trait that applies equally across situations 
varying in sex and status of the antagonist 
and presumably varying also in the amount 
of risk entailed. 

In this study BI is not to be taken as an 
actual indication of the participant’s 
willingness to assert in a real-life situation. However, 
a number of studies have found a high 
correlation between behavioral intent and actual 
behavior in a variety of situations (reviewed 
in Fishbein & Ajzen, 1975). These authors 
assert that intent is the best indication of 
actual behavior, but they qualify its ability 
to predict behaviors in situations in which 
there is a time interval between the intent 
and the behavior, and when the behavior 
in question depends on the actions of others. 
Additionally, when the stimuli are highly 
artificial, as in this experiment, intent may be 
no more than an indication of the subject’s 
desire to behave in the intended manner. 

The fact that there were no systematic 
differences in SEU for participants scoring 
high and low on the personality measures 
may have been a function of the tests used, 
and it may also reflect the inadequacy 
of assigning a “trait” value to a 
situation-specific task. A major issue in the literature 
has been the specificity of the assertiveness 
response and the expectations that can be 
made for generalization. Hersen, Eisler, and 
Miller (1973) and Wolpe and Lazarus (1966) 
argued that assertion is specific to the social 
context in which it occurs. Others have found 
some degree of transfer of training as a result 
of their intervention (McFall & Lillesand, 
1971; McFall & Marston, 1970; Nedelman, 
1977). 


Utility and Probability Components 


Participants high and low in BI to refuse 
unreasonable requests did not differ in the 
way they valued the utilities of the 
consequences. However, participants differed 
significantly in their perception of the 
probabilities of the consequences occurring as a result 
of their behavior. For refusal, these trends were 
reversed; participants with high BI for refusal 
saw the positive consequences of refusal as 
more likely to occur and the negative 
consequences as less probable. For both events the 
probability endorsements of low BI 
participants were generally more constricted, whereas 
high BI subjects gave more extreme 
probabilities. 

This result is consistent with results reported 
by Mitchell and Knudsen (1973), who 
separated out the effects of utilities and 
probabilities. Comparing college students and business 
students, they found that although the 
students did not differ in their valuation of 
occupational goals, they held significantly 
different probabilities that a career in business 
would allow them to attain those goals. 
Findings support the hypotheses that intent 
to refuse to comply with an unreasonable 
request depends on the attributes of the 
situation. Additionally, subjects did not 
differ in their valuation of the positive and 
negative consequences, but instead they 
differed in their expectations that a particular 
consequence would occur should they elect to 
refuse to comply with unreasonable demands. 


Issues for Training 


Research on assertiveness has had its 
greatest impact on applied methods for clients. 
Past studies (Hersen, Eisler, & Miller, 1973; 
McFall & Marston, 1970) have identified 
behavioral components of assertiveness that 
can be taught to nonassertive persons and 
used in a variety of situations. In the present 
study it appears that participants scoring on 
all levels of an assertion inventory also 
respond to situations differentially according 
to the sex and status of the antagonist and 
presumably according to the degree of risk 
that they perceive to be a consequence of 
their behavior. 

In designing a comprehensive assertiveness 
training program, these results suggest that the 
focus of training should be on changing the 
participant’s cognitive expectations about 
the results of his or her behavior, as well as on 
changing attitudes or specific behaviors. This 
assumes that the average client has 
more knowledge about appropriate 
interpersonal behavior than might be found 
among the hospitalized schizophrenic male 
patients in the Hersen et al. studies. In fact, 
assertiveness training has not been directed 
toward this latter group but rather toward 
women and college students who may have 
the skills but who do not act assertively 
as often as they perhaps should. The Gottman and 
Schwartz study (Note 1) suggests that both 
assertive and nonassertive people know the 
appropriate behaviors. It may be that the 
primary “skill deficit” lacking in nonassertive 
persons is their inability to accurately estimate 
the consequences of their assertion. If this is 
true, greater emphasis must be placed on 
changing the person’s cognitive expectation 
of the consequences. This may involve a 
process of teaching the client a new set of 
expectancies about possible outcomes based 
on the characteristics of differing situations. 
Since information affects beliefs about con- 
sequences, active participation such as role 
playing is likely to be an effective strategy for 
change, because it exposes the person to new 
information and allows for change in belief. 
Additionally, as suggested by Mitchell and 
Biglan (1971), exploring the components of 
the client’s expectancies in therapy might 
change the client’s perception of instrumental 
relations between behavior and outcomes or 
the evaluations of the probabilities of outcomes 
occurring. 

This research also suggests that training 
programs might profit from a broader apprecia- 
tion of the client’s negative as well as positive 
evaluations of the consequences of assertion. 
The weighting of these consequences appears 
to differ from situation to situation and may 
determine the person’s willingness to respond. 
Assertive behavior may be realistically eval- 
uated by the subject as counterindicated in a 
given situation. For example, in some situa- 
tions compliance to a mildly unreasonable 
request may have the secondary positive 
consequences of strengthening a friendship. 
In other situations the effect might be to 
advance a long-range goal. It may be necessary, 
as Hersen, Eisler, and Miller (1973) suggested, 
to train for generalization of assertive behavior, 
in this case to teach clients appropriate 
expectations in a variety of situations. It may 
also be important to teach relevant others 
to change their expectations as the client 
changes his or hers. Since we have only 
treated one type of assertion in this study, 
refusal to comply with an unreasonable request, 
further investigation is needed to see whether 
other assertive responses (complimenting, 
standing up for one’s opinions, etc.) are 
similarly affected by the situational variables 
and by the participant’s expectations of the 
consequences. 

A number of issues have been raised in this 
study that have not been dealt with in the 
literature of assertion and that seem worthy 
of further investigation. These include an 
appreciation of the client’s perception of the 
risk involved. The traditional pattern of 
training female peer situations first because 
they involve less risk than authority situations 
does not take into account the possible 
ascendance of peer and friendship values in a 
female college population. In an analogue 
study of this type, the use of self-report and 
self-ratings is limited in its ability to 
predict actual behavior. It appears, however, 
that the decision/expectancy model has value 
as a research tool in specifying under what 
conditions a participant might choose to act 
in an assertive manner in a given situation. 
Future research might investigate the ability 
of this approach to predict and shape the 
responses of individuals in real-life situations. 


Reference Note 


1. Gottman, J., & Schwartz, R. Toward a task analysis 
of assertive behavior. Unpublished manuscript, 
Indiana University, 1976. 


Use of Paradoxical Intention in a Behavioral Program 
for Sleep Onset Insomnia 


L. Michael Ascher and Jay S. Efran 
Temple University 


Sleep onset insomnia seems often to be based on performance anxiety associated 
with a client’s fears of being unable to fall asleep; in some cases, a therapeutic 
program might actually exacerbate this performance anxiety by focusing on the 
client’s efforts to voluntarily control the sleep onset process. Five cases of sleep 
onset difficulty, unusually resistant to a conventional behavioral program for 
this problem (i.e., deep muscle relaxation and systematic desensitization), were 
exposed to paradoxical intention suggestions requiring that they try to remain 
awake as long as possible, rather than attempt to fall asleep. A rapid reduction 
of sleep onset latency occurred following the shift from the conventional 
program to the paradoxical intention instructions. 


Frankl (1975) has recently provided a more 
detailed and updated description of “paradoxical 
intention,” a psychotherapy technique that 
he developed in the framework of logotherapy. 
He has also begun to articulate some of the 
links between this technique and behavior 
therapy approaches. This article reports the 
use of paradoxical intention as a complement 
to other behavioral procedures in cases of sleep 
onset “insomnia” that proved resistant to 
treatment. 

Deep muscle relaxation as developed by 
Jacobson (1938) and modified by Wolpe (1958, 
1974) has been used with success in sleep 
disturbance due to other than physiological 
causes (Kahn, Baker, & Weiss, 1968; Nicassio 
& Bootzin, 1974). There are, however, a 
significant number of clients with whom relaxation 
alone is not sufficient to produce the desired 
level of improvement. In such cases, 
behavior therapists have commonly used 
systematic desensitization focused either on the 
themes that occupy the client 
prior to sleep onset or toward anxiety 
related to the sleep situation 
(Borkovec, Steinmark, & Nau, 1973; Geer & 
Katkin, 1966). Other behavioral methods, such 
as covert conditioning, thought stopping, and 
operant stimulus control have also been used 
singly and in combination. 

The present article explores the use of 
paradoxical intention as an ancillary treatment 
with individuals for whom the relaxation- 
desensitization program seemed insufficient. It 
is relatively easy to apply and usually produces 
immediate behavior changes without the com- 
plications of medication (Kales, Allen, Schaff, 
& Kales, 1970). Briefly, paradoxical intention 
can be viewed as a behavioral prescription re- 
quiring clients to perform responses that ap- 
pear to be incompatible with the goal for which 
they are seeking therapeutic assistance. Thus, 
in the present context, clients with sleep onset 
disturbance are requested to try to remain 
awake for as long as possible, rather than to 
focus on trying to fall asleep. In other words, 


they are asked to exaggerate the very behavior 
they wish to reduce. In the present 


tment of five clients 
difficulty t yie 
relaxation-desensitization program. 


Method 


Subjects 

All five clients had applied for treatment to the clinic 
of the Behavior Therapy Unit of the Department of 
Psychiatry at Temple University. The first author 
served as the therapist and conducted weekly sessions 
individually with each client. The clients are further 
described in Table 1. 

Association, Inc. 0022-006X/78/4603-0547$00.75 


Table 1 
Characteristics of the Five Clients 

                                                            Average sleep   Length of 
                     Marital                                onset latency   complaint   Primary presenting 
Client  Sex     Age  status   Occupation                      (minutes)      (years)    problem 
A       Male    32   Single   Lawyer                              60            5       Primary erectile dysfunction 
B       Male    27   Married  Graduate student (mathematics)      45           12       Interpersonal difficulties 
C       Female  41   Married  Housewife                           90           21       Sleep onset 
D       Female  23   Married  Social worker                       45            3       Sleep onset 
E       Male    25   Single   Salesman                            75            4       Sleep onset 


Procedure 


Two weeks prior to initiating therapy directed at 
sleep onset difficulty, each client was asked to chart the 
approximate length of sleep onset latency each morning 
in addition to recording other details relevant to the 
sleep situation (e.g., mood when retiring, time of retir- 
ing, number of awakenings, “restfulness” of sleep). 
Beginning with the initial session following this 2-week 
baseline period, and continuing for 10 weeks, all clients 
were instructed in deep muscle relaxation (Wolpe, 
1974) with particular emphasis on the use of these 
exercises prior to retiring. Clients were also advised on 
the modification of their sleeping arrangements to 
produce optimal conditions for rapid sleep onset. With 
successive sessions, additional techniques were in- 
troduced as required (e.g., desensitization and covert 
conditioning). (Although the behavioral program out- 
lined above has been of demonstrated efficacy with sleep 
onset difficulty, it failed to produce the desired improve- 
ment in the five cases reported in the present study. 
This represents about 10%-15% of individuals seen for 
sleep disturbance problems over a 4-year period.) 

During the first session following the 10-week period, 
the therapist suggested that a modification in the pro- 
cedure might enhance the patient’s progress. Paradoxi- 
cal intention was administered by instructing the client 
(with an appropriate rationale) to try to remain awake. 
Three clients (A, B, and E) were told that although the 


suggested that the relaxation component of the be- 
havioral program was not of sufficient duration to 
“produce the level of relaxation requisite for sleep.” 
Therefore, instructions were given to lengthen the 
number of steps (and, consequently, the length of time) 
required to complete the relaxation practice. Clients 
were advised to go through the entire procedure several 
times to achieve a satisfactory level of relaxation. They 
were asked to do this even if it meant resisting the urge 
to sleep. 

The paradoxical intention procedure was continued 
for all five clients for 2 weeks. During the sessions the 
clients were asked how they had progressed with respect 
to the assigned tasks (either obtaining the thoughts 
experienced prior to sleep or increasing the relaxation 
procedure to achieve a “deeper” level of relaxation). 
Typically, they reported that they had not been able to 
accomplish the goal because they had fallen asleep too 
quickly. The therapist briefly expressed interest and 
encouragement on hearing this information but sug- 
gested that they redouble efforts to accomplish the goal, 
trying harder to remain awake. This entire interaction 
took only a short time during the initial portion of the 
three relevant therapy sessions, the remainder of which 
were devoted to continuation of the regular 
desensitization procedures. 

Following the 2-week paradoxical intention period, 
four of the clients (A-D) were given no further sleep 
onset treatment. However, Client E was instructed to 
return to his previous program, which incorporated 
techniques focused on the reduction of 
anxiety-provoking thoughts experienced during the sleep onset 
period (i.e., thought stopping, covert positive 
reinforcement, systematic desensitization). The client 
remained on this program for 3 additional weeks, after 
which he was told that his efforts to reduce disturbing 
thoughts seemed to have been effective and that he 
should reemphasize the relaxation (paradoxical) 
component of the program. In this way, paradoxical 
intention instructions were again administered (this time 


1 These other measures seemed less relevant in this 
context than the latency measure, and, for the sake of 
brevity, they have been omitted. All of the measures 
followed the same pattern. 


Table 2 

Mean Number of Minutes to Sleep Onset for Each Week of the Treatment Program 

Columns: Week; Phase 1; Phase 2; Phase 3; Phase 4; Phase 5 

Note. Phase 1 = baseline; Phase 2 = conventional program; 
Phase 3 = paradoxical intention; Phase 4 = readministration of conventional program; 
Phase 5 = readministration of paradoxical intention. 


in a manner similar to that for Clients C and D) and 
remained in effect for 3 final weeks.²


Results 


Table 2 presents the mean self-report estima- 
tions of sleep onset latency (i.e., the duration 
between “lights out” and sleep onset) for each 
client during the 2-week baseline period (Stage 
1), the 10-week behavioral program administra- 
tion (Stage 2), and the 3-week paradoxical in- 
tention period (Stage 3), which represented the 
terminal stage for four clients (A-D). In 
addition, data are reported for Client E’s 
second behavioral administration (Stage 4) and 
return to paradoxical intention (Stage 5). A 
comparison of the sleep onset latencies during 
the baseline period with the data following 10 
weeks of behavior modification indicates that 
in most cases ‘‘some improvement” was ob- 
tained, even though it was judged insufficient 
by the client.³

The data show a marked reduction in sleep 
onset latency following the administration of 
paradoxical intention instructions to “try to 
remain awake.” In the case of Client E, the 
data for the first three stages of the study are 
congruent with those for the remaining clients. 
Sleep onset latency was somewhat reduced as a 
result of relaxation training and behavioral 
procedures directed at distracting anxiety- 
provoking thoughts. However, presentation of 
paradoxical intention instructions produced a 
marked reduction in sleep onset latency. 
Reinstitution of the previous behavioral pro- 
gram was coincident with an increase in sleep 
onset latency, which decreased again when 
paradoxical intention instructions were read- 
ministered. Informal long-term follow-up (by 
telephone) indicated that each of the clients 
remained satisfied with their sleep behavior
after a period of 1 year.


² The clients appeared to believe the treatment
rationales that were offered, although formal data
on this issue could obviously not be collected in this
clinical setting.

³ Formal statistical treatment is perhaps unwarranted
in this study. However, the reader may wish to note
that t tests between the baseline and behavior therapy,
the baseline and paradoxical intention, and behavior
therapy and paradoxical intention were, respectively,
.978 (ns), 3.86 (p < .02), and 4.58 (p < .02).
Discussion


The present study illustrates the utility of 
paradoxical intention within the context of the 
behavioral treatment of sleep onset difficulties.
A reasonable question would seem to be, why 
did paradoxical intention produce a change 
that the conventional treatment alone could 
not? Paradoxical intention has been shown to 
be effective with a wide variety of psychosoma- 
tic dysfunctions, that is, physiological proces- 
ses, having autonomic nervous system innerva- 
tion, which can be inhibited by anxiety. Such 
dysfunctions can occur, for example, with 
various aspects of sexual activity, elimination, 
and, as in the present case, components of 
sleep behavior. 

Most people experience occasional difficulty 
in falling asleep. This difficulty is usually seen 
by the individual as an isolated event pre- 
cipitated by unusual excitement or tension 
during the day, too much sleep prior to bedtime,
and so forth. Succeeding evenings normally
result in a rapid return of the individual’s
typical sleep pattern. However, a small percentage
of people view instances of sleep onset
difficulty as indicants of a trend toward decreasing
levels of satisfactory functioning. This
latter group considers each successive evening
a test of their ability to fall asleep. The level of
performance anxiety increases as each test approaches.
Anxiety is assumed to have a reciprocally
inhibiting relationship with sleep
onset (as with sexual arousal); that is, it
stimulates the sympathetic nervous system,
a system that alerts the organism and is the
reciprocal of the parasympathetic system,
which has recuperative functions compatible 
with sleep onset. 

Some people who experience performance 
anxiety at bedtime appear to focus on at least 


the following ay, 
For these individuals, concern about successful 
performance and contingencies of failure serve 
to increase anxiety prior to bedtime and maintain
a high level of anxiety in the sleep situation.
“Trying hard” to get to sleep only makes
this cycle more pernicious.


The cycle can apparently be broken by
paradoxical intention, which decreases performance
anxiety by redefining the situation.
The paradoxical suggestion is incompatible
with the “common sense” effort that the individual
has been making to perform the target
response. Thus, because the conventional program
was aimed at helping these individuals
fall asleep more rapidly, it inadvertently supports
the cycle of “performance anxiety - failure
to perform - increased performance
anxiety.” Paradoxical intention removes the
client from this system. In this article, paradoxical
intention was used with difficult clients.
It remains to be seen whether it will be
useful with the wider range of clients complaining
of sleep onset latency.
When schizophrenics are asked to interpret
proverbs, they often respond to the proverb as
a literal statement rather than as a bearer of a
figurative meaning. For example, when asked
to interpret the statement “When the cat’s
away, the mice will play,” even educated and
intelligent schizophrenic patients may explain 
the actions of cats and of mice, rather than of 
people. This article offers a system for scoring 
literalness of proverb interpretation. 

Dozens of writers have discussed literalness 
ae interpretation, but scoring systems 
ies a manifestation of concreteness. 
i, systems score either concreteness 
Fine 4 ion, which is merely a term for ac- 
‘saci Proverb interpretation that is in- 
and with literalness. Both low abstrac- 
ose igh concreteness are usually viewed 
4 FA TIE literalness. However, scoring for 
Brine ERA or abstraction classifies 
e ma at Cag kinds of poor per- 

CEPA ae ittle to do with literalness. 

ore specific error than either 
A Scoring Manual for
Literalness in Proverb Interpretation


Chris A. Hertler, Loren J. Chapman, and Jean P. Chapman
University of Wisconsin—Madison 


A scoring system is offered for literalness of proverb interpretation as an alter- 
native to scoring concreteness. For a group of 115 schizophrenic and normal 
subjects, literalness and Gorham’s concreteness were equivalent on coefficient 
alpha (.85 for literalness and .84 for concreteness). Interrater reliability was
.90 for both scoring systems. Nevertheless, abstraction correlated lower (p < .01)
with literalness than with concreteness. For 77 schizophrenics, Verbal IQ correlated
significantly with concreteness (r = -.52, p < .01) but not with literalness
(r = -.15, ns). Thus, literalness is less affected by intelligence and by
ability to respond abstractly than is Gorham’s concreteness. 


concreteness or lack of abstraction. Literalness 
is an active attempt to interpret the words of 
the proverb as a literal message rather than as 
symbols to be interpreted. 

The most commonly used scoring system for 
proverbs is that of Gorham (1956a, 1956b), 
who scores abstraction and concreteness sepa- 
rately. He offered a detailed scoring system for 
abstraction but evidently regarded concrete re- 
sponses as so obvious that no formal scoring 
system need be offered. Gorham (1956a) 
stated his criteria for concreteness very briefly. 


Concrete answers are usually apparent to a clinical 
observer. They stick closely to the symbols of the 
proverb. In schizophrenics, it is common for patients to 
substitute “That’s right,” “exactly,” “that is not so
because,” or “yes” and “no” for a restatement of the
proverb in concrete form. These answers are considered
to be concrete. (p. 3)


Gorham supplemented this statement with one 
example of a concrete response to each of seven 
proverbs.

Many responses by both normal subjects
with low intelligence and schizophrenics stick 
closely to the symbols of the proverb but yet 
are not literal interpretations of the proverb. If 
a subject is unable to interpret the proverb but 
is verbose, he or she will talk about the sym- 
bols. Subjects who cannot interpret a proverb 
appropriately often simply repeat some of the 
words of the proverb without further elabora- 
tion, give associate responses to the words, 
relate the words to their own experience, or
talk in other discursive ways about the proverb 
that they are asked to interpret. Such talk is 
not evidence that he or she interprets the 
symbols literally, but it would be scored as 
“concrete” by Gorham’s criteria. For example, 
one schizophrenic responded to “The worst 
spoke in the wheel breaks first” with “Wheel 
breaks, brake locks, break off.” Another patient 
responded to ‘‘He who stumbles twice over one 
stone deserves to break his shins” with “I 
don’t stumble, walk straight, never stumble, 
got to control me, can’t see.” Both of these 
responses stick closely to the words of the 
proverb and, hence, would be scored as con- 
crete, but they are not attempts at a statement 
of a literal meaning. Gorham’s concreteness 
seems to reflect in large part dullness and a lack 
of accuracy. In schizophrenia, concreteness is 
heavily affected by a failure to focus on the task 
of interpretation and by other aspects of 
generalized deficit. Literalness should be less 
affected by generalized deficit because it is a 
more specific kind of error. Because literalness 
is less a reflection of generalized deficit than is 
concreteness, a score for literalness should de- 
pend less than does concreteness on both Verbal 
IQ and abstraction. 

The system of scoring literalness that we 
offer here labels all appropriate answers as 
nonliteral and only some incorrect answers as 
literal. This system provides for scoring each
proverb on a 3-point scale. This follows from 
dividing the proverb into two halves, each of 
which could receive a literalness score of 0 or 1. 
Thus the total literalness score for the sum of 
the two halves could be 0, 1, or 2. For example, 
the proverb “Rome was not built in a day” is 
sometimes interpreted as “Rome took a long 
time to build.” In this response, Rome is 
treated literally, and in a day is treated abstractly,
which yields a total literalness score of 1.
The scoring principles offered here could be
applied to the interpretation of any figurative
statement, although we developed them for
proverbs from Gorham’s (1956a, 1956b) clinical
form of the Proverbs Test. Actually, not
all 36 of Gorham’s items are proverbs in the
sense of being figurative statements to be
interpreted. Some items are, instead, aphorisms
to be interpreted literally. Examples
are “The more cost, the more honor” and
“Where there’s a will, there’s a way.” We do
not include such items in our scoring system.
We regard 24 of Gorham’s items as clearly 
figurative statements and, therefore, include 
them. These are Form I, Items 2-9 and Item 
12; Form II, Items 1-7 and Item 10; Form 
III, Items 1, 2, 5, 6, 7, 9, and 11. 

Following the example of Friedes, Grisell, 
Levin, Dobie, and Cohen (Note 1), we designate
certain of the words in each half of the
proverb as symbols that must be generalized 
or interpreted to obtain a correct abstract 
interpretation. For example, in the proverb, 
“A drowning man will clutch at a straw,” 
drowning and straw are symbols that must be 
interpreted, but clutch is not. If drowning or 
straw are repeated in the answer, a score for 
literalness must be considered. However, the 
appearance of man or clutch in the answer need 
not imply literalness. For example, the re- 
sponse “A man who is in trouble will clutch at 
any method to save himself” is an adequate 
abstract interpretation rather than a literal one. 
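
The first screening step described above, checking whether a designated symbol is repeated in the answer, can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `repeated_symbols` is hypothetical, matching is exact and word-based, and a hit only means that a literalness score "must be considered"; the judgment itself remains with the scorer.

```python
import re

# Designated symbols for "A drowning man will clutch at a straw",
# as listed in the text; "man" and "clutch" are deliberately excluded.
SYMBOLS = {"drowning", "straw"}


def repeated_symbols(response, symbols):
    """Return the designated symbols that reappear verbatim in a response.

    A non-empty result only flags the response for closer inspection;
    synonyms, rewordings, and associates still need a human scorer.
    """
    words = set(re.findall(r"[a-z']+", response.lower()))
    return sorted(words & set(symbols))


adequate = "A man who is in trouble will clutch at any method to save himself"
flagged = "A drowning man grabs a straw"
print(repeated_symbols(adequate, SYMBOLS))
print(repeated_symbols(flagged, SYMBOLS))
```

The adequate abstract interpretation quoted in the text contains man and clutch but neither designated symbol, so it is not flagged.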


Scoring Principles for Literalness 


For the sake of brevity, we will illustrate each 
scoring principle with responses to the proverb 
“Rome was not built in a day.” The two halves 
of this proverb are Rome and was not built in a
day. The symbols to be generalized in an abstract
response are Rome, built, and day.

An entire proverb is considered completely 
unscorable for literalness if the entire response 
consists of any of the following: 


1. An “I don’t know,” without further 
elaboration. 

2. A reference to a personal experience of the
subject as a substitute for interpreting the
proverb, for example, “I have never been to
Rome.”

3. A response that has no recognizable relationship
either to the literal meaning of the
proverb or to a possible interpretation of the
proverb. Responses can be judged as falling in
this category even if they contain one or more
of the symbols of the proverb, for example,
“Rome is in Italy.”

4. A repetition of the proverb without
further elaboration, for example, “Rome was
not built in a day.”


tion of only part of the proverb 

ther elaboration, for example, 

day.” 

ntic associate or a clang associate 

e symbols without further elabora- 
xample, “Paris,” or ‘Cathedral 


single word other than yes or no and 
n an equivalent to yes or no such as 
_ An example of the unscorable 


further elaboration, for example, 
vices accentuate carnal lust” or 
vices can’t be learned quickly.” 
response whatever. 


owever, that many of these kinds of 
are scored if the subject adds other 
the response. See examples below. 

+ receives a total literalness score of 


esponse is a reason for the verity of 
tb as literally stated or is an elabora- 
its meaning and the explanation or 
on is based on either physical at- 
of the symbols or associates to the 
J 1 the proverb, for example, ‘‘Rome is 
response is yes or no or an equivalent 


h halves of the proverb receive a 
ess score of 1 by the criteria listed below. 


If the response is scorable, one half is
scored 1 for literalness if

1. The response half includes a repetition of
one or more of the symbols from the proverb half, for
example, “Rome took a long time to complete.”
Rome is a repetition of a symbol. Took a long
time to complete is an appropriate abstract response
to the other proverb half. The total literalness
score is 1.

2. A synonym for a symbol or a rewording
of a symbol from the proverb half is included
in the response, for example, “The capital of
Italy took a long time.” Capital of Italy is a
synonym for the symbol Rome. The total
literalness score is 1.

3. The response half includes physical attributes
of a symbol from the proverb half, for
example, “A big city can’t be built in a day.”
A big city states physical attributes of the
symbol Rome. Built in a day is a repetition of a
symbol. Both halves earn a literalness score
of 1. The total literalness score for the proverb
is 2.

4. The response half is primarily a semantic 
associate to a symbol from the proverb half, 
for example, “It took more than one day to 
build Paris.” Paris is a semantic associate to 
Rome. It took more than one day is a rewording 
of not built in a day. The total literalness score 
is 2. 

A scorable response meeting none of the 
criteria for literalness receives a literalness 
score of 0. Responses that are scored 0 include 
the following: (a) a response that is correct 
(abstract) according to the Gorham scoring 
manual; (b) another proverb that has the same 
meaning as the original proverb; and (c) an 
attempt at an abstract interpretation of the 
proverb, even though incorrect; for example, 
“Big projects require great will power.” 

The total literalness score for a proverb is 
the sum of the scores for the two halves. If one 
half is unscorable, the total score is the score of 
the scorable half. 
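
The summing rule just stated can be written out as a short sketch. The function name `total_literalness` and the use of `None` to mark an unscorable half are assumptions made for illustration; the manual itself prescribes only the arithmetic.

```python
from typing import Optional


def total_literalness(first_half: Optional[int],
                      second_half: Optional[int]) -> Optional[int]:
    """Combine the two half scores of one proverb.

    Each half is 0 (not literal), 1 (literal), or None (unscorable).
    If one half is unscorable, the total is the score of the other half.
    If both halves are unscorable, the proverb has no score (None).
    """
    for score in (first_half, second_half):
        if score not in (None, 0, 1):
            raise ValueError("half scores must be 0, 1, or None")
    scorable = [s for s in (first_half, second_half) if s is not None]
    if not scorable:
        return None
    return sum(scorable)


# "Rome took a long time to build": Rome literal, second half abstract.
print(total_literalness(1, 0))
# "A big city can't be built in a day": both halves literal.
print(total_literalness(1, 1))
```

Returning `None` rather than 0 for a fully unscorable proverb keeps unscorable responses distinct from scorable, nonliteral ones.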

The divisions of each proverb into halves and 
the designated symbols of each proverb are as 
follows: Form I: 2. Rome/was not built in a
day. Symbols: Rome, built, day. 3. When the
cat’s away/the mice will play. Symbols: cat,
mice. 4. Barking dogs/seldom bite. Symbols:
Barking, dogs, bite. 5. A stream/cannot rise
higher than its source. Symbols: stream, source.
6. Don’t swap horses/when crossing a stream.
Symbols: horses, stream. 7. The used key/is
always bright. Symbols: key, bright. 8. Gold
goes in/at any gate except heaven’s. Symbols:
gold, gate. 9. One swallow/doesn’t make a
summer. Symbols: swallow, summer. 12. Don’t
cast pearls/before swine. Symbols: pearls,
swine. Form II: 1. He who stumbles twice over
one stone/deserves to break his shins. Symbols: 
stumbles, stone, break, shins. 2. Don’t judge a 
book/by its cover. Symbols: book, cover. 3. 
The proof of the pudding/is in the eating. 
Symbols: pudding, eating. 4. One may ride a 
free horse/to death. Symbols: ride, horse, 
death. 5. A rolling stone/gathers no moss. 
Symbols: rolling, stone, moss. 6. Strike/while 
the iron is hot. Symbols: strike, iron, hot. 7. 
All is not gold/that glitters. Symbols: gold, 
Table 1
Mean Literalness, Concreteness, and Abstraction Scores on 15 Proverbs

Variable                   Schizophrenics   Normal subjects
Gorham's concreteness(a)        9.68              2.10
Literalness                     7.65              1.34
Gorham's abstraction            6.53             17.51

a The concreteness values for the (0, 1) Gorham
system have been doubled to make them comparable
to the (0, 1, 2) values for literalness.


glitters. 10. Let sleeping dogs/lie. Symbols: 
sleeping, dogs, lie. Form III: 1. The sun/shines 
upon all alike. Symbols: sun, shines. 2. The 
grass is always greener/in the other fellow’s 
yard. Symbols: grass, greener, yard. 5. A 
drowning man/will clutch at a straw. Symbols: 
drowning, straw. 6. Too many cooks/spoil the 
broth. Symbols: cooks, broth. 7. The worst 
spoke/in the wheel breaks first. Symbols: 
spoke, wheel. 9. It never rains/but it pours. 
Symbols: rains, pours. 11. There’s many a slip 
twixt the cup/and the lip. Symbols: cup, lip. 


Use of the Scoring Scheme 
with Clinical Groups 


Forms II and III of the Gorham Proverbs 
Test were administered to 77 schizophrenics 
and 38 firefighters. A brief verbal IQ test con- 
sisting of the Comprehension, Vocabulary, and 
Similarities subtests of the Wechsler Adult 
Intelligence Scale was also given to these 
schizophrenics. The firefighters cannot be 
viewed as control subjects for the schizo- 
phrenics because of the lack of full information 
on their demographic characteristics. The fire- 
fighters’ data do, however, provide some in- 
formation on literalness scores of normal 
subjects. 

Mean age of the schizophrenic sample was 
37.3 years (SD = 10.2), mean years of education
was 11.7 (SD = 3.3), and mean score on
the Hollingshead Index of Social Position was 
47.2 (SD = 15.2). Mean prorated verbal IQ 
on the brief intelligence test was 92.5 
(SD = 18.0). Sixty-two percent of the sample 
was male, 38% was female. Ninety-five per- 
cent was white, 5% was black. Mean score on 
the Phillips Scale of Premorbid Adjustment was 
17.7 (SD = 4.4). Mean length of hospitaliza-
tion was 109.6 months (SD = 109.9). 

All firefighters were white males. Assuming 
that the average firefighter receives a high 
school diploma, their average score on the 
Hollingshead index would be 51.0. No informa- 
tion on the age, IQ, or marital status of the 
firefighters was available. 

The Proverbs Test was administered using 
Gorham’s instructions, and the responses were 
scored for literalness by the first author and for 
concreteness using Gorham’s criteria by a 
graduate student. The scorers were kept blind 
as to whether a protocol was that of a schizo- 
phrenic or a normal subject. To assess inter- 
rater reliability, a third scorer rated 40 
schizophrenics’ protocols according to both 
systems. To assess the relationship of adequacy 
of proverb interpretation to both concreteness 
and literalness, the graduate student also scored 
all protocols for abstraction using Gorham’s
manual. 


Reliability 


The coefficient alpha estimate of reliability 
for the 115 subjects was .85 for literalness, .84
for Gorham’s concreteness, and .92 for
Gorham’s abstraction. The corresponding
values for the 77 schizophrenics were .82 for
literalness, .81 for Gorham’s concreteness, and
.92 for Gorham’s abstraction. The correlation
between concreteness and literalness was .80
for both groups combined and .74 for the 
schizophrenics. Interrater reliability for the 40 
subjects was .90 for both concreteness and 
literalness. 
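
For readers who want to compute a coefficient alpha of the kind reported here, the following is a minimal sketch of the standard formula, alpha = k/(k - 1) × (1 - sum of item variances / variance of total scores). The function name and the toy data are hypothetical and are not taken from the study.

```python
from statistics import variance


def coefficient_alpha(scores):
    """Coefficient alpha for a subjects-by-items score matrix.

    scores: list of per-subject lists, one item score per proverb,
    all of the same length k (k >= 2). Uses sample variances.
    """
    k = len(scores[0])
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    total_var = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Toy data: three subjects, two proverbs scored 0, 1, or 2.
toy = [[0, 0], [1, 1], [2, 2]]
print(coefficient_alpha(toy))
```

With perfectly consistent items, as in the toy data, the estimate reaches its maximum of 1.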


Relation to Clinical Status 


Table 1 gives the mean scores of both groups 
according to both scoring systems. As seen in
Table 1, both groups received lower scores on
literalness than on Gorham’s concreteness.
Schizophrenics were significantly different from
normal subjects on literalness, concreteness,
and abstraction (p < .001, in each case).


Relation to Intelligence and to Abstraction Scores


d 
For the schizophrenics, Verbal IQ corc y 
—.52 (p <.01) with concreteness but 


iteralness. Thus literalness is 
concreteness by sheer intel- 
We interpret. these values to 
ss is less affected than con- 
ed deficit. The relation of 
ction to literalness and to con- 
further support to this in- 
the schizophrenics, abstrac- 
.64 with Gorham concreteness 
literalness. Thus abstraction ac- 
[% of the variance of con- 
t only 23% of the variance of 
h correlations were inflated by 
mereteness and literalness were 
the same responses as abstrac- 
artifact should not affect the cor- 
iteralness any differently than the 
concreteness. The difference be- 
© correlation coefficients was 
74) = 2.61, p < .01, as indicated 
correlations based on dependent 
combined group of normal and 
subjects, abstraction correlated 
concreteness and —.62 with 
hus abstraction accounted for 
le variance of concreteness and 38% 
nce of literalness. The difference 
‘two correlation coefficients was, 
cant, (112) = 2.65, p < .01. 
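
The "percentage of variance" figures above are squared correlations, and the comparison of two correlations that share a variable can be illustrated with Hotelling's t for dependent correlations. This is a sketch under assumptions: the source does not name the exact test it used, and the abstraction-literalness correlation of -.48 is inferred here from the reported 23% of variance, so the result need not reproduce the published t exactly.

```python
import math


def variance_explained(r):
    """Proportion of variance shared by two variables correlated r."""
    return r * r


def hotelling_t(r12, r13, r23, n):
    """Hotelling's t for the difference between r12 and r13.

    Variable 1 is shared (here, abstraction); r23 is the correlation
    between the two other variables. Returns (t, degrees of freedom).
    """
    det = 1 - r12**2 - r13**2 - r23**2 + 2 * r12 * r13 * r23
    t = (r12 - r13) * math.sqrt((n - 3) * (1 + r23) / (2 * det))
    return t, n - 3


# Schizophrenic sample (n = 77): abstraction with concreteness (-.64),
# abstraction with literalness (assumed -.48), concreteness with
# literalness (.74, from the Reliability section).
t, df = hotelling_t(-0.64, -0.48, 0.74, 77)
print(round(variance_explained(-0.64), 2), round(abs(t), 2), df)
```

The degrees of freedom, n - 3, match the t(74) and t(112) reported for samples of 77 and 115.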
Conclusion


Literalness of proverb interpretation is less 
affected by intelligence and by ability to re- 
spond abstractly than is Gorham’s concrete- 
ness. Concreteness depends too much on gen- 
eralized intellectual deficit to be maximally 
useful for describing schizophrenic thought 
disorder. Literalness is an important and more 
specific kind of error than concreteness and is 
less affected than concreteness by generalized 
deficit. The present scoring scheme for literal- 
ness should be useful in many situations in 
which Gorham’s concreteness has been used in 
the past. 


Reference Note 


1. Friedes, D., Grisell, J. L., Levin, S., Dobie, S., & 
Cohen, D. B. Manual for scoring proverb interpretation.
Unpublished manuscript, Lafayette Clinic,
1964. (Available from James Grisell, Lafayette 
Clinic, 951 East Lafayette, Detroit, Michigan 48207) 
This article differentiates between two important classes of behavior that can be
identified in any psychotherapy. One class concerns cohesive behaviors (Type 
C), which bring organisms together, and the other concerns dispersal behaviors 
(Type D), which drive organisms apart. This study examined changes in C and 
D behaviors that occurred during the first 100 hours of the psychoanalytic treat- 
ment of Mrs. C, a woman whose presenting complaint was sexual frigidity. The 
data showed improvement in both types of behaviors. In addition, progress in 
Type D behavior preceded progress in Type C behavior, a relationship that had 
been predicted by the case formulation. Then we identified approximately 350 
complaints made by the patient during the treatment, complaints of the form 
“I can’t (do something)” and “I have to (do something).” These complaints 
also declined in frequency during the treatment. 


Personality theorists (e.g., Horney, 1945; Murray,
1938) have sometimes classified interpersonal
behaviors into three broad categories. This classi- 
fication is generated by considering, first, whether 
the subject is (a) avoiding the other person or 
(b) getting involved with the other person. If the 
latter, the behavior can be further classified into 
(b1) behaviors expressing a positive involvement
and (b2) behaviors expressing a negative involvement.
The resulting three categories can be labeled
(a) avoidance, (b1) positive involvement, and
(b2) negative involvement. Horney (1945) has
called these three classes moving away from, moving
toward, and moving against the other person.
Murray (1938) has written of abience, adience,
and contrience with similar meaning.
Behaviors involving other people, both a 
tively and negatively, are frequently as 
psychotherapy. Those reflecting positive at ihe 
ment occur when a person cooperates, CO ais 
rates, or concurs with another person, comp ie 
and shares thoughts and feelings and is moma 
warm, and loving. Ethologists (Mussen & Ro 
zweig, 1973, chap. 28) have called these Ro 
behaviors cohesive behaviors, since REEE ! 
organisms together. Since many cohesive A 
iors begin with the letter c (coopera : 
comply), we shall call them Type C behaviors. i 
In contrast, behaviors reflecting nee 
volvement produce a psychological differen ae 
from the other person. They occur when a P 
defies another person, disagrees with, distrusts, or
disapproves of the other person, hates, criticizes,
or opposes the other person. Ethologists have
called these behaviors dispersal behaviors, since
they (assertively and aggressively) drive organisms
apart. Since many dispersal behaviors begin
with the letter d (defy, disagree, disapprove), we
shall call them Type D behaviors. As ethologists
have noted, C and D behaviors show a complex
interplay throughout the phylogenetic scale, promoting
the survival of both the individual and the
species.

Frequently during psychotherapy people complain
of having poor control over C and D behaviors.
Either they are unable to express the behavior
comfortably or they are unable to modulate
the behavior. In the case described below,
for example, the woman sometimes wanted to be 
affectionate but found herself provoking. At other 
times she wanted to demur but found herself 
yielding. Her poor control was accompanied by 
psychological distress. 


Impulse Versus Behavior 


To clarify poorly controlled behaviors, let us
begin with a basic postulate of psychoanalytic
(as well as other) theories, namely that an impulse¹
precedes any nonreflexive behavior. The
distinction is analogous to the psycholinguist’s 
distinction between the underlying abstract repre- 
oe of a thought and the corresponding sur- 
ia aoe of verbal behavior; one, an in- 
ae event, precedes the other, an ob- 
es ; surface phenomenon. The impulse, an 
e le representation, becomes decoded through 
; oo that involves optional and obligatory 
ee transformations; the defense mecha- 
FA would thus be viewed as a subset of trans- 
ations that occur during decoding (cf. Sup- 
Pes & Warren, 1975). 
Just as the correspondence between the deep
and the surface structure of language is not necessarily
1:1, the impulse is not necessarily isomorphic
with the behavior; different relationships
can exist between them. Sometimes an impulse
is directly expressed in behavior, at other times
the behavior is simply inhibited, and at still
other times the behavior is partly camouflaged
by behavior derived from another impulse.
If a behavior were to exhibit both a C and
a D component, we would assume that two different
impulses, a C and a D impulse, both existed.
An affectionate pinch, according to the
postulate, would result from simultaneous impulses
to hurt and to be close to the same person.
A psychological “problem” is experienced when
people lack control in translating impulse into
behavior. For example, they might intend to ex- 
press one impulse and yet find themselves ex- 
pressing another coexisting impulse. That is, on 
the one hand, they might find themselves unable 
to express an intended behavior directly and com- 
plain, for example, that they cannot cooperate 
or cannot fight even though they want to. On the 
other hand, they might find themselves express- 
ing a behavior more intensely or more compul- 
sively than they want to, complaining that they 
have to share intimacies or have to defy even 
though they do not want to; such behaviors 
would have an obligatory quality. 

A successful therapy should help people gain 
control over each kind of behavior. They should
acquire the capacity to experience and express 
more directly both C and D behaviors. One goal 
of the following studies is to objectify such im- 
provements and to examine the relationship be- 
tween them. 


Observation 1: Two Concomitant Changes 


Method 


This set of studies was based on a psycho- 
analytic case treated by a psychoanalyst who was 
not familiar with the views expressed here. Every 
session of the analysis was tape-recorded with the 
written consent of the patient. The analyst also 
took process notes during each hour describing 
the content of the hour. As the patient was talk- 
ing, the analyst was writing. His notes, however, 
did not report any commentary or clinical infer- 
ence; they only summarized the patient’s talk and 
his own interventions. 

A group of clinical psychologists and psycho- 
analysts met weekly to discuss the case. Drawing 
only on the process notes of the first 10 hours 
and information of the intake interview, they 
formulated the case and predicted a sequence of 
changes. The following case description sum- 
marizes the main details of the case and the 
group’s formulation and clinical prediction. 

Case description and formulation. The patient, 
Mrs. C, was a prim, married schoolteacher in 
her late 20s who came to treatment complaining 
of sexual frigidity, difficulty experiencing pleasur- 
able feelings, and low self-esteem. Her father
was a professional man, and her mother was a 
housewife. She was the second of four children 


¹ The term impulse is meant to be neutral theoretically
in the way that the term underlying abstract
representation is neutral. Thus, for example, no energic
connotations are intended.
(an older sister, a younger sister, and a much
younger brother). When the treatment began, 
Mrs. C had been married for less than 2 years. 
She considered her marriage successful, though 
she felt that her sexual inadequacy created a 
major marital problem. 

Mrs. C’s parents were described as controlled 
people, undemonstrative of any affection. The 
mother, who was an organized and efficient
woman, ran the house well. She was also very 
controlling, and Mrs. C felt in danger of being 
“owned” by her. On the other hand, the mother 
was not able to defend herself very well. Once, 
for example, the patient hit the mother in the 
stomach, and the mother could not defend her- 
self or correct the patient except by retiring to 
her bedroom in obvious discomfort, leaving the 
patient to feel guilty, helpless, and frightened. 
The patient thus came to feel capable of hurting 
other people and guilty over aggression and as- 
sertiveness. Between the ages of 5 and 8, she 
had recurrent nightmares of something happening 
to her mother. 

The father was also undemonstrative and easily 
embarrassed by other people’s display of affec- 
tion. Although he was generally controlled, he 
sometimes lost control of his anger and had temper
tantrums that revealed murderous rage; at
times Mrs. C felt that he was capable of killing 
her. The father was also upset by crying women 
and became angry over masochistic displays from 
the patient. 

In the period before the analysis began, Mrs.
C was feeling beleaguered and upset. In situations
that called for intimacy, she experienced intense 
ambivalence, which left her feeling confused and 
in turmoil. The ambivalence resulted from nu- 
merous opposing tendencies: If she had an im- 
pulse to be sadistic, for example, she felt poten- 
tially guilty. Then, identifying with her mother, 
she would turn to masochistic feelings (feeling 
hurt, victimized, neglected, unfavored), which 
served to camouflage sadistic impulses. These
feelings, however, were also unsatisfactory in
that she felt that they would upset [...]
as they had upset her father, who sometimes lost 
could not ex- 
impulses com- 
m, using each 
Tesult was tur- 
people, sexual intimacy could be a problem in
that she did not have the means of ending the
closeness when she wanted to. Thus, an impairment
in Type D behaviors could produce a corresponding
impairment in Type C behaviors.

It was therefore hypothesized that during the 
treatment, Mrs. C first had to develop a better 
capacity to defend herself against other people 
(stubbornly resist other people without feeling 
guilty, disagree with other people, etc.) to allow 
herself to get closer to other people. It was hy- 
pothesized that as she acquired a better capacity 
for Type D behavior, she would feel less vulnerable
in expressing Type C behavior and would
therefore express Type C behaviors more freely.²
The following procedure was designed to test
these hypotheses.

Procedure. The first step was to examine Mrs. 
C's ability to express behaviors of Type D comfortably.
A prominent subset of Type D behaviors
contained instances in which she blamed,
criticized, disagreed with, or opposed another 
person (the therapist or someone else). Three 
clinicians independently read the process notes 
of the first 100 hours in unsystematic order to 
avoid bias, looking for all passages in the notes 
that described such behaviors (in the present or 
in the past, toward the therapist or anyone else). 
Then the three clinicians together reviewed all 
of the passages that they had identified and re- 
tained the ones that they agreed were instances 
of blaming, criticizing, disagreeing, or opposing.
Their resulting set contained 190 passages, involving
both direct behaviors (e.g., criticizing
the analyst) and self-reports of such behaviors. 

Then a 4-point rating scale was developed to 
assess the directness of the behavior described 
in each passage. If the blame or criticism was 
only implied or if it was expressed tentatively 
with extreme discomfort, the scale value was 1;
as the behavior became more explicit and direct,
the scale value increased. A rating of 2 meant
that a Type D behavior was expressed but immediately
undone. Ratings of 3 and 4 meant that
a Type D behavior was overtly expressed: 3 by
telling a third party about it, and 4 by directly


2 It might be noted at this point that during the
first 100 hours of treatment (covering a period of
approximately 6 months), Mrs. C did achieve an
increased, though limited, capacity to respond sexually.
She also became more able to free-associate more
easily, to reveal symptoms and preoccupations, and
to think and work more productively. She became
more able to express and tolerate strong feelings
and found herself exercising a better-modulated
discipline over the students in her class.
Figure 1. Mean rating of passages in each block
of hours.


confronting the offending party. Each rating was
also increased by .5 if the event occurred in the
present tense (after the treatment began). Thus,
the possible ratings were 1.0, 1.5, 2.0, . . . , 4.5.
The passages were divided into two subsets,
and the passages of each subset were presented
independently in random order to four clinical
psychologists who were naive about the case.
Explicit scoring rules were developed for rating
the passages, and the judges followed these rules
in rating each behavior.

The next major step was to follow a similar
procedure for identifying and rating the Type C
behaviors. A prominent subset of these included
passages in which the patient complimented
someone, felt affection or compassion for someone,
or wanted to be loved by someone. Two
clinicians read the process notes of the
first 100 hours in unsystematic order looking
for all passages in the notes that described such
behaviors. As with the Type D passages, the
clinicians reviewed the passages that they had
identified and retained the ones that they agreed
were instances of Type C behaviors. The resulting
set contained 106 passages.

A 4-point rating scale was also developed to
assess the directness of these behaviors. If the
behavior was implied or expressed with extreme
[...], lack of clarity, or discomfort, the
scale value was 1. A rating of 2 denoted an expression
of closeness with immediate undoing.




As the behavior became more explicit and direct, 
the scale value increased: 3 indicated that the 
feeling of closeness was expressed to a third 
party, and 4 indicated that it was expressed di- 
rectly to the other person. Each rating was also 
increased by .5 if the event occurred in the 
present tense (after the treatment began). The 
possible ratings thus ranged from 1.0 to 4.5. 
These passages were also presented to a panel 
of four clinical psychologists who were naive 
about the case. Explicit scoring rules were de- 
veloped for rating the passages, and the judges 
followed these rules to rate each behavior. 
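
The rating rule described above can be restated as a short sketch. The Python below is illustrative only: the function name directness_rating and its arguments are hypothetical and do not come from the study, in which human judges applied written scoring rules. It simply encodes the stated scheme (a base value of 1 to 4 for directness, plus .5 when the event occurred after treatment began) and enumerates the ratings that scheme allows.

```python
def directness_rating(base: int, present_tense: bool) -> float:
    """Combine the 4-point directness value with the .5 present-tense bonus.

    base: 1 = implied or tentative, 2 = expressed but undone,
          3 = told to a third party, 4 = expressed directly.
    present_tense: True if the event occurred after treatment began.
    """
    if base not in (1, 2, 3, 4):
        raise ValueError("base directness value must be 1, 2, 3, or 4")
    return base + (0.5 if present_tense else 0.0)


# Enumerate every rating the scheme can produce.
possible_ratings = sorted(
    {directness_rating(b, p) for b in (1, 2, 3, 4) for p in (False, True)}
)
print(possible_ratings)
```

The enumeration runs from 1.0 to 4.5 in steps of .5, which agrees with the range stated in the text.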


Results 


Type D behaviors. To assess the reliability of 
the judges’ ratings, the four ratings for a given 
passage were averaged, and the reliability of the 
four judges’ means was computed for each set 
through an analysis of variance. The reliability 
was .89 for one set, .90 for the other set. 

The 100 sessions were then grouped into 10-session
blocks denoted I, II, III, . . . , X. The
numbers of passages within each block were I =
31; II = 13; III = 18; IV = 9; V = 35; VI =
13; VII = 13; VIII = 10; IX = 29; and X = 19.
The ratings of passages within each block were 
averaged, and the means ranged from 2.62 to 
3.81. These means are reported in Figure 1, which 
shows the development of Type D behavior 
across successive blocks of sessions. 

To examine changes in Type D behaviors more 
closely, all passages rated alike were examined 
as a group. Because of the small frequencies in 
some categories, passages rated 2.0 and 2.5 were 
combined, as were passages rated 4.0 and 4.5. 
Also, to obtain more stable frequencies, the ses- 
sions were grouped into 20-session blocks. 

The relative frequency of each rating was computed
for each block, and these relative frequencies
are shown in Figure 2. The top two
graphs show a monotonic decline for passages
rated 1.0-2.5. Figure 2 also shows a decline in
3.0s (criticizing someone for a past event) but 
an increase in 3.5s (criticizing someone for a 
current event). Direct confrontations (4.0 and 
4.5) also became more frequent throughout the 
treatment. The graphs are largely monotonic 
and characterize major changes that occurred in 
the patient’s behavior during the treatment.³


3 These graphs, of course, are not independent of 
one another. The overall improvement in Figure 1 
requires that the lower ratings generally decline 
over the 100 hours while the higher ratings generally
increase.
Figure 2. Relative frequency of passages in each
rating category for successive blocks of hours
(Type D).

To compare the frequencies of past and present
events, the relative frequencies of 3.0s and 3.5s
were compared. There were 101 passages with
these ratings. For each block of 20 sessions, the
relative frequency of 3.5s (present tense) was
computed. For successive 20-session blocks, the
values were 6/24 (i.e., 6 cases were in the present
tense, 18 were in the past tense) = .25; 8/16 =
.50; 17/26 = .65; 8/11 = .73; and 19/24 = .79.
Thus, the patient increasingly came to criticize
others for events in her current life. It is assumed
that events from the past tense were less
threatening for her and provided a convenient
starting point for the therapy, but as the sessions
progressed, she shifted her focus to her current
life. Thus, part of the increment in Figure 1 is
due to the patient’s shift to present-tense events,
and part is due to the decline in lower ratings
(primarily 1.5 and 2.5).
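
The present-tense proportions reported above can be recomputed from the stated counts. The sketch below is a check on the arithmetic only; the variable names are hypothetical, and the counts are taken directly from the text (number of passages rated 3.5, and number rated 3.0 or 3.5, in each successive 20-session block).

```python
# (present-tense passages rated 3.5, all passages rated 3.0 or 3.5)
# for the five successive 20-session blocks, as reported in the text.
present_and_total = [(6, 24), (8, 16), (17, 26), (8, 11), (19, 24)]

# Proportion of criticisms that concerned current events, per block.
proportions = [round(present / total, 2) for present, total in present_and_total]

# Total number of passages rated 3.0 or 3.5 across all blocks.
total_passages = sum(total for _, total in present_and_total)

print(proportions)
print(total_passages)
```

The block totals sum to the 101 passages mentioned in the text, and the proportions rise in every successive block.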

Type C behaviors. The reliability of the
Type C ratings was also assessed. The four
judges’ ratings for a given passage were averaged,
and the reliability of the four judges’ means
was .83. The 100 sessions were grouped into
10-session blocks, with the following frequencies
within each block: Block I = 11; II = [...];
III = 15; IV = 10; V = 9; VI = 12; VII = [...];
VIII = 9; IX = 5; X = 12. The ratings of the passages
within each block were then averaged, and
the resulting means ranged from [...] to 3.75.
Figure 1 shows the development of the Type C
behaviors across successive blocks of sessions.
To examine the change in Type C behaviors
more closely, passages within each rating category
were examined separately. Since the frequencies
were smaller than those concerning Type D behaviors,
all of the ratings from 1.0 to 3.0 were
pooled. (These were the categories that had shown
declining relative frequencies in Type D behaviors.)
Likewise, all of the ratings from 3.5 to
4.5 were pooled. (These were the categories that
had shown increasing relative frequencies in Type
D behaviors.) Relative frequencies of occurrence
were computed for each block of 20 sessions, as
shown in Figure 3. One graph shows a progressive
decline in the relative frequencies of the
lower ratings, and the other graph shows a progressive
increase in the relative frequencies of
the higher ratings. The two graphs thus resemble
those obtained for the Type D behaviors. Events
in the present tense were also examined, but their
frequencies were too small to
permit any inference.
Thus, it is clear that two types of changes occurred,
but it still needed to be demonstrated
that a change occurred in the patient’s presenting
complaint, sexual frigidity. Therefore, every reference
to the patient’s sexual behavior was noted
throughout the 100 hours. There were 18 such
passages (comprising a subset of the 106 Type
C passages), all occurring between Hours 28 and
[...]; [...] contained the word intercourse
and [...] contained the phrase sexual
[...]. [...] some examples: From Hour 33
[...]: “[...] when she is trying to
[...] have intercourse with Bill, she feels
[...] to hurt him. She just doesn’t
[...] it. She’ll go from feeling very warm
[...] nothing toward him suddenly.” From
Hour [...] (rated 4.5): “This weekend she and
Bill had intercourse, and she was thinking how
[...] it can be when she’s thinking about him
[...] close to him and not all wrapped up
[...].” Seven passages occurred in the first 50 sessions,
and 11 in the last 50 sessions. The rating
assigned to each passage was noted. Of
those 7 in the early block, 6 had ratings of 1.5 to
2.5, and one (in Hour 43) had a rating of 3.5 to
4.5. Of those 11 passages in the later block, 4 had
ratings of 1.5 to 2.5, and 7 had ratings of 3.5
to 4.5. The 7 passages with high ratings were
mainly simple, direct statements that the patient
[...] intercourse. A Fisher exact test
was used to test the significance of this
result; the chance probability of the observed
result or a more extreme one is .022.
Figure 3. Relative frequency of passages in different
rating categories for successive blocks of hours
(Type C).


Observation 2: Relationship Between 
C and D Behaviors 


The capacities to express C and D behaviors 
comfortably seem to be related, since a defect 
in D can produce a corresponding defect in C. 
That is, if a person did not have the capacity to 
disengage from the other person, intimacy would 
be unsafe, since the person would not be able to 
end the closeness and would run the danger of 
feeling oppressed or entrapped. On the other
hand, once the person gained the capacity to express
D behaviors comfortably, closeness would
not be as threatening.

Thus, as Mrs. C acquired the capacity to express
Type D behaviors comfortably, it should
become easier for her to express Type C
behaviors. In any block of therapy sessions in which
significant gains are observed in Type D behaviors,
improvement should subsequently be observed
in Type C behaviors. This hypothesis is
examined below.

Method, Results, and Discussion 

In Figure 1 the C graph resembles the general 
form of the D graph. To examine the relationship
between the graphs more closely, the positions
of greatest increase along each graph
were noted. A “significant improvement” in either
function is defined as an increment from Block
i to Block i+1 that exceeded .25. Significant
improvements in Type D behavior occurred three 
times—from Block II to Block III, from Block 
III to Block IV, and from Block V to Block VI. 
Furthermore, significant improvements in Type C
behavior also occurred three times: from Block
III to Block IV, from Block IV to Block V,
and from Block VI to Block VII. In each case,
a significant improvement in Type C behavior
followed a significant improvement in Type D
behavior. For example, an improvement in D occurred
from II to III, and an improvement in C then occurred
from III to IV. The chance probability that the three
Type C improvements would occur in these particular
three positions is .012.

In addition, a “setback” in either function is 
defined as a decrement from Block i to Block 
i+ 1. A setback in the Type D behavior occurred 
three times—from Block I to Block II, from 
Block IV to Block V, and from Block VI to 
Block VII. A setback also occurred three times 
in the Type C behavior—from Block II to Block 
III, from Block V to Block VI, and from Block
VII to Block VIII. Thus, a setback in Type C 
behavior always followed a setback in Type D 
behavior. In other words, the two graphs took 
very similar courses, with one displaced from the 
other by one block of sessions.
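
The one-block lag and the reported chance probability can be checked with a short sketch. This is an illustration, not the authors' stated computation: it assumes that the .012 figure treats the 9 transitions between the 10 blocks as equally likely positions for the three Type C improvements, so that one particular set of three positions has probability 1 in C(9, 3). Each transition is coded by its starting block (for example, II to III is coded 2); the variable names are hypothetical.

```python
from math import comb

# Transitions coded by starting block number (Block i -> Block i+1).
d_improvements = [2, 3, 5]   # II->III, III->IV, V->VI
c_improvements = [3, 4, 6]   # III->IV, IV->V, VI->VII
d_setbacks = [1, 4, 6]       # I->II, IV->V, VI->VII
c_setbacks = [2, 5, 7]       # II->III, V->VI, VII->VIII


def lags_by_one(d_positions, c_positions):
    """True if every C transition falls exactly one block after a D transition."""
    return all(c == d + 1 for d, c in zip(d_positions, c_positions))


# Assumed model: 10 blocks give 9 transitions; three of them are chosen.
n_transitions = 9
p_chance = 1 / comb(n_transitions, 3)

print(lags_by_one(d_improvements, c_improvements))
print(lags_by_one(d_setbacks, c_setbacks))
print(round(p_chance, 3))
```

Under that assumption the probability is 1/84, which rounds to the .012 given in the text.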

The data therefore suggest that the patient’s 
progress in expressing Type C behaviors followed 
her progress in expressing Type D behaviors. As 
she became progressively able to criticize, op- 
pose, and disagree with other people, she felt pro- 
gressively less vulnerable; then, feeling less vul- 
nerable, she could relax her defenses and permit 
herself to feel close, affectionate, and compas- 
sionate toward other people. If the two graphs 
had simply exhibited a correlation, other factors 
could account for their concomitant rise and fall.
But their displacement in time suggests that an
advance in one type of behavior may have facilitated
an advance in the other.

This inference must be made with reservations
for three reasons. First, the relationship may only


ee Mrs. C’s 
shift from past to present t 

tions of present tales or) e E ETORO 
treatment, while simila; 
were not identical. It js 


F be best 
l ¥ A 
aave by replicating the findings on another 
Second, changes in Type C and Type D behaviors,
as operationalized here, may be trivial.
That is, they may reflect changes that occur in 
any developing human relationship in the way 
that the partners relate to each other (talking 
more directly, less cautiously, less formally) and 
are thus not necessarily to be traced to the therapy
itself. It is possible that whenever Mrs. C
entered a new relationship with someone, she 
would initially qualify with great caution any 
statements that she made so as to present a bal- 
anced view on any subject; such a tendency 
would involve statements that would get lower 
ratings. Then, as she came to know the other 
person better, she might drop this tendency and 
become more direct. If this interpretation were 
correct, though, C and D changes should occur 
simultaneously, rather than one consistently lag- 
ging behind the other. 

Finally, another kind of explanation might ac-
count for the observations in Figure 1. Suppose 
the direct expression of aggression is in some 
sense incompatible with the direct expression of 
intimacy, so that the relative prominence of one 
would imply a relative decline in the other. Then, 
as one graph rose from Block i to Block i+1, 
the other graph would fall. For example, in Fig- 
ure 1, from Block I to Block II, the C graph 
rises while the D graph falls, causing the graphs 
to cross. Then, proceeding to Block III, the C 
graph falls while the D graph rises, producing 
another crossing. Additional crossings occur as 
the graphs proceed to Blocks V, VI, VII, and 
VIII. This characterization of the data has the 
virtue of parsimony, but it does not explain why 
both graphs would show concomitant overall improvement.
It also suggests that the frequency
of Type D behaviors should be strongly and
negatively related to the frequency of Type C
behaviors. The correlation was negative, but it
was not significant (r = -.33, p > .20).

Thus, alternative hypotheses may account for
some aspects of the data, and perhaps may even
accurately account for aspects of the therapeutic
process. However, they do not adequately explain
the lag between graphs or the overall improvement
in each type of behavior. For this reason,
it is tentatively concluded that improvement
in Type D behavior, at least in this patient, permitted
subsequent improvement in Type C behavior.


Observation 3: The Nature of 
Mrs. C’s Complaints 


In the course of 100 hours of treatment, Mrs.
C mentioned a large number of other problems
[...] not directly related to sexual frigidity
[...] the nature of her distress. Many
complaints were expressed in the form
“I can’t (do something)” or “I have to (do
something),” revealing inhibitions and compulsions.
A large subset of these complaints could
be classified according to the C and D categories,
and it was hypothesized that many complaints
would reflect general problems over C and D
behaviors.

[...] people reading the process notes independently
identified 248 complaints involving “can’t”
(e.g., She can’t praise her assistant) and 103
complaints involving “has to” (e.g., She has to
[...] against her husband), making a total of
351 complaints.⁴ Near synonyms of can’t and has
to were also accepted.

The statements were presented to a group of
20 judges (10 graduate students and 10 clinicians).
Each judge was asked to classify each
[...] behavior as to Type C, Type D, or
neither. A statement was considered a Type C
(or Type D) complaint if 14 or more judges so
classified it. Using this 14-or-more criterion, 60
complaints were of Type C and 56 were of
Type D.

The complaints of each type were then examined
further to determine how many were of
the can’t form and how many were of the has to
form. Of the 60 Type C complaints, 51 were of
the can’t form and 9 of the has to form. Of the
56 Type D complaints, the corresponding frequencies
were 24 and 32. The chi-square computed
for this 2 X 2 matrix was 20.7 (p < .001).
Whereas Type C complaints were typically of
the can’t form, Type D complaints were more
evenly divided between the two. The single highest
frequency was for complaints of the form
“can’t C,” a form that corresponds to the presenting
complaint, sexual frigidity.⁵ The other
complaints, involving aggression and assertiveness,
reflected poor control both ways: Sometimes
the patient could not express behaviors that she
wanted to express, but at other times she could
not restrain herself.
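
The reported chi-square can be reproduced from the four cell counts. The sketch below is an illustration under one assumption: that the published value of 20.7 includes Yates's continuity correction, which the text does not state but which matches the arithmetic (the uncorrected statistic comes out higher). The function name and arguments are hypothetical.

```python
def chi_square_2x2(a: int, b: int, c: int, d: int, yates: bool = True) -> float:
    """Chi-square for the 2 x 2 table [[a, b], [c, d]].

    Uses the shortcut formula N * (|ad - bc| - N/2)^2 / (row and column
    totals multiplied together); the N/2 term is the Yates correction
    and is dropped when yates=False.
    """
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        diff = max(diff - n / 2, 0.0)
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return n * diff ** 2 / denominator


# Rows: Type C, Type D. Columns: "can't" form, "has to" form.
corrected = chi_square_2x2(51, 9, 24, 32, yates=True)
uncorrected = chi_square_2x2(51, 9, 24, 32, yates=False)

print(round(corrected, 1))
print(round(uncorrected, 1))
```

With the correction the statistic rounds to 20.7, the value given in the text; either version exceeds 10.83, the critical value for p = .001 with 1 degree of freedom.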


General Discussion 


Sophisticated studies of psychotherapy outcome
have been undertaken in recent years, as
summarized in the recent review of Bergin and
[...] (1975). Most of these studies (e.g., Ber[...],
[...]nar, & Severy, 1975; Sloane et al.,
[...]) have reported data about treatment outcome,
although details of the therapeutic process
are generally unclear. The present set of
observations, in contrast, focuses on the treatment
process per se and assumes that therapeutic outcome
is best evaluated in the light of one patient’s
needs and goals.

The present article has examined several ex- 
plicit propositions about the nature of Mrs. C’s 
psychopathology and therapeutic progress. One 
major result showed that Mrs. C’s difficulty in 
expressing Type C behavior was related to her 
difficulty in expressing Type D behavior; thus, 
the way to solve one specifiable set of problems 
involved the simultaneous treatment of another 
set. Throughout the treatment, progress on one 
set was a prerequisite for concomitant progress 
on the other. 

These results thus point out one feature of a 
therapeutic process that is often overlooked in 
treatments that set more specific behavioral goals, 
namely, that an advance in one behavioral domain 
may be a prerequisite for an advance in another, 
quite different, domain. For a patient like Mrs. 
C, a gain in assertiveness may be necessary for 
a gain in intimacy. Occasional writers have implied
such a relationship (e.g., Smith, 1975), but
no systematic documentation or explanation of
the relationship has previously been offered.

Furthermore, in Mrs. C’s treatment, there were 
really two major therapeutic goals, but only one 
corresponded to her presenting complaint (sexual 
frigidity). It is possible, of course, that Mrs. C
would have been helped more efficiently by a 
combination of assertiveness training and sexual 
therapy, but it is not necessarily the case that 
she perceived herself as needing to become more 
assertive. Indeed, a tabulation of her complaints 
throughout the first 100 hours showed that she 
often found herself too aggressive and opposi- 
tional, more than she wanted to be. Nonetheless, 
in principle, one could imagine a research design 
with patients like Mrs. C, comparing each kind 


4 These various complaints declined in frequency
over the 100 hours. The relative frequencies occurring
in successive 20-session blocks were .26, .24, .17,
.19, .15; χ²(4) = 15.48, p < .01.

5 Very few of these can’t C complaints were spe- 
cifically sexual in content, however. They concerned 
various people—the patient’s husband, therapist, 
pupils, assistants—and they involved various forms 
of closeness—giving unilaterally to other people 
(praising, helping, reassuring, comforting, disclosing 
personal information), as yal Sel Enara e, 
ing to other people, trusting, believing, r! y 
aie ae te or relaxed with). The Type D prob- 
lem behaviors were also heterogeneous, involving as- 
sertiveness (getting her own way, sticking to her 
views, developing her own teaching method, making 
demands on other people, disagreeing with other 
people) and aggression (expressing anger, being bad
to people, criticizing other people, opposing other
people).
of treatment singly and in combination with the
other kind of treatment.

One early theme in Mrs. C’s therapy consisted 
of her criticizing people (e.g., her parents) for 
events of the past. This kind of theme often oc- 
curs early in a treatment as the patient spon- 
taneously produces data from the past. It is pos- 
sible that Mrs. C saw as one demand character- 
istic of therapy that she criticize her parents for 
events of the past. In our view, however, she 
was not only producing personal data but was 
also serving very specific therapeutic ends by be- 
ginning the treatment in this way. Her criticisms 
allowed her to observe the therapist’s reaction to 
one very mild form of aggression and assure her- 
self of the safety of similar undertakings in the 
future. This low-level criticizing can be viewed 
as an early test of a therapist. Other evidence 
of such tests has been presented by Horowitz, 
Sampson, Siegelman, Wolfson, and Weiss (1975). 

Finally, the distinction between C and D be- 
haviors emphasizes the meaning of the behavior 
in addition to its observable form. A given be- 
havior may have multiple meanings for a par- 
ticular person. For example, the very same be- 
havior might be of Type D with respect to one 
person and of Type C with respect to another 
person. A criticism rated 3 in this study would 
be such a case; it was of Type D with respect 
to the criticized person and of Type C with re- 
spect to the therapist (since the patient is con- 
fiding in or confessing to the therapist). This 
form of closeness to the therapist was never tab- 
ulated among the Type C behaviors of this study, 
but it may comprise a significant aspect of the 
therapeutic process: The patient criticizes a
third person, tentatively viewing the therapist as 
an ally; then, when the therapist permits the 
alliance, a closeness is established between them 
that neither party has directly solicited. Such as- 
pects of the therapeutic process need to be ex- 
amined further. 
Research on the locus of control construct
(Rotter, 1966) has shown that internals compared
to externals obtain higher grades (Brown
& Strickland, 1972) and make more accurate
predictions for self-relevant achievement outcomes
(Steger, Simmons, & Lavelle, 1973; Wolfe,
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i ae consistently for younger samples than 
Locus of Control, Prediction, and Performance
on University Examinations 


Timothy M. Gilmor and David W. Reid 
York University, Downsview, Canada 


Fifty-two internal and external locus of control subjects estimated what their outcomes
[...] term exams and their final grade. Internals’ estimates and
actual outcomes tended to be higher than those of externals. Accuracy of predictions
as assessed by difference scores did not differentiate the two groups. However, internals’
estimates from the first to second exam were characterized by more typical
expectancy shifts demonstrating a greater responsiveness to their initial performance


that internals exhibit more typical expectancy 
shifts (increase in expectancy after success, de- 
crease in expectancy after failure), whereas ex- 
ternals exhibit more atypical expectancy shifts 
(increase in expectancy after failure, decrease in 
expectancy after success). This pattern of expec- 
tancy shifts was expected to occur in the present 
context and to provide support for the contention 
that internals are more responsive to the feed- 
back that they receive in making predictions for 
future performance. No specific prediction was 
made concerning the proportion of overestimates 
versus underestimates for each group. 

Subjects were 20 male and 32 female under- 
graduates in a third-year psychology course. Sev- 
eral weeks after the class had begun, the first 
of several questionnaires was administered. In- 
formation sought included: “How many univer- 
sity exams or term tests have you written?” 
“What is the final grade you expect to get in 
this course?” and, “It is early to predict, but 
how well do you think you will do on the first 
exam in this course? /100.” 

One week later the class completed an expanded
version of the Reid and Ware (1974)
multidimensional Locus of Control scale. A me- 
dian split of scores on the Fatalism subscale 
determined the internal (0-6) and external (8- 
18) groups. Subjects with scores at the median 


were eliminated. 

After the first exam results were made known 
to the class, a “follow-up” questionnaire was ad- 
ministered asking subjects to report their exam 
result and to predict what their second exam result 
would be (scheduled for 4 weeks later). At the 
end of the term, their second exam results and 
final grade were obtained for comparison to their
predictions.

An initial analysis revealed no main or interac- 
tion effects due to high versus low exam experi- 
ence (cf. Wolfe, 1972). Overall, the group had 
considerable previous exam writing experience 
(M = 14.46). Consequently, 2 X 2 (Sex x 
Fatalism) analyses of variance were conducted on 
the (a) performance estimates, (b) actual results, 
and (c) difference scores for each exam and the 
final grade. Chi-square analyses were conducted 
on the frequency of overestimates versus under- 
estimates and typical versus atypical expectancy 
shifts. 

For the first exam, the estimates of internals 
(M = 75.9) were marginally higher than the esti- 
mates of externals (M = 72.6), F(1, 48) = 2.56, 
p<.15. The actual first-exam performance of 
internals (M = 76.9) was significantly higher 
than that of externals (M = 64.0), F(1, 48) = 
11.02, p < .01. For the second exam, a significant 
difference was found between internals’ (M = 
77.1) and externals’ (M = 72.0) estimates, F(1,
48) = 6.54, p < .05. However, only a marginal 
difference, F(1, 48) = 2.40, p < .15, was obtained 
for internals’ (M = 66.1) versus externals’ (M = 
60.6) actual performance. 

With respect to the final grade, no differences 
were found between the different groups’ esti- 
mates, F(1, 48) < 1.00, p> .25, although inter- 
nals (M = 4.0) compared to externals (M = 5.1) 
obtained higher final grades, F(1, 48) = 5.00, p 
< .05. The numerical means approximate a grade 
letter of B for internals and C for externals. 
Both groups had initially estimated that they 
would achieve a grade of B+. 

Analyses of the difference scores for each exam 
and the final grade revealed no sex or locus of 
control differences. 

Analysis of the percentage of overestimates 
versus underestimates for both exams combined 
revealed that externals (78.6%) more frequently 
overestimated their upcoming exam result than 
did internals (60.4%), χ²(45) = 4.92, p < .05.

Consistent with the above results and with 
prior research was the finding that the expectancy 
shifts of internals (62.5%) were more frequently 
typical (vs. atypical) than those of externals 
(42.8%), χ²(42) = 3.98, p < .05. That is, internals
more frequently raised their estimates for
the second exam if expectancies for the first 
exam had been surpassed, and they lowered their 
estimates for the second exam if expectancies 




for the first exam had not been met. Externals
did the opposite; they more frequently raised 
their expectancies for the second exam despite 
poorer-than-expected performance on the first 
exam and lowered their expectancies for the 
second exam despite better-than-expected per- 
formance on the first exam. These results support 
the theoretical contention that externals are not 
as responsive as internals to initial feedback in 
making estimates for future performance. 
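
The typical/atypical distinction used above can be expressed as a small classification rule. The sketch below is illustrative only: the function name and arguments are hypothetical, and the handling of ties (an exam result exactly equal to the estimate, or an unchanged second estimate) is an assumption, since the report does not say how such cases were scored.

```python
def classify_shift(estimate_1: float, result_1: float, estimate_2: float) -> str:
    """Classify the change in a student's estimate from exam 1 to exam 2.

    typical:  estimate raised after the first expectancy was surpassed,
              or lowered after it was not met.
    atypical: estimate raised after a poorer-than-expected result,
              or lowered after a better-than-expected result.
    """
    outcome = result_1 - estimate_1   # > 0: expectancy surpassed
    shift = estimate_2 - estimate_1   # > 0: estimate raised
    if outcome == 0 or shift == 0:
        # Assumption: ties are left unclassified.
        return "unclassified"
    return "typical" if (outcome > 0) == (shift > 0) else "atypical"


# Hypothetical students, scores out of 100.
print(classify_shift(70, 80, 75))  # did better than expected, raised estimate
print(classify_shift(75, 60, 80))  # did worse than expected, raised estimate
```

The first hypothetical student shows the pattern reported as more frequent among internals, and the second shows the pattern reported as more frequent among externals.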

Although internals and externals in this sample 
were not differentiated in terms of the accuracy 
of their predictions, their predictions did differ 
qualitatively in terms of the frequency of overestimations
and typical expectancy shifts. Also,
the finding that internals attained higher performance
outcomes suggests that the locus of
control/academic achievement relationship so
often found in studies with children also holds
for adult samples.
Person X Situation Interaction in Personality Prediction:
Some Specifics of


This study attempted to determine [...] trait anxiety
(A-Trait), as measured by the S-R Inventory [...],
[...] to differential personality profiles. Results
of stepwise multiple linear regression analyses,
using the Internal-External Locus of Control Scale,
the California Psychological Inventory, the Fear of
Negative Evaluation Scale, and the Interpersonal
Trust Scale as predictor measures, [...] of distinct
personality profiles for each of the four facets of
A-Trait. These findings attest to the multidimensionality
of the S-R Inventory of General Trait Anxiousness
and to the importance of specifying the situation
in the measurement of anxiety.


Recently, Endler and Okada (1975) discussed
the limitations of many existing omnibus measures
of anxiety proneness (A-Trait). At the
same time these researchers provided data supporting
their claim that the S-R Inventory of
General Trait Anxiousness offers a more appropriate
assessment of A-Trait. This inventory is a
multidimensional measure of anxiety (e.g., Endler
& Magnusson, 1976; Endler, Magnusson, Ekehammar,
& Okada, 1976) that provides indices of
[...] to situations involving interpersonal
relations, physical danger, ambiguity,
and [...] innocuous conditions.

The purpose of this study was to determine
whether the multidimensionality of the Endler-Okada
inventory would be reflected in differential
personality profiles. These profiles were to be derived
from a battery of personality tests administered
with this inventory and were expected to be
logically and theoretically consistent with the four
[...] dimensions of anxiety ostensibly assessed
by the S-R Inventory of General Trait
Anxiousness.

Subjects were 278 senior-year, female nursing
students in training at 13 institutions in the
province of Manitoba, Canada. The subjects were
administered a personality battery that consisted 
of the 18 scales of the California Psychological 
Inventory (CPI; Gough, 1957). The CPI seeks 
to measure interpersonal facets of personality 
such as sociability and self-control. They were 
also given the Internal-External Locus of Control 
Scale (Rotter, 1966); the Interpersonal Trust 
Scale (ITS; Rotter, 1967); the Fear of Negative 
Evaluation scale (FNE), developed by Watson 
and Friend (1969); and the S-R Inventory of 
General Trait Anxiousness (Endler & Okada, 
1975). Obtained data were subjected to separate 
stepwise multiple linear regression analyses with 
the four S-R situation scores used as the criterion 
variables. 

For parsimony, only those variables
that related significantly (p < .05) to the S-R
situation scores and yielded a multiple correlation
change equal to or greater than .01 are discussed.
There were additional variables whose statistical
contributions were significant but whose practical
contribution was negligible (i.e., multiple correlation
change < .01).
With these considerations in mind, it was
found that scores on the interpersonal facet of
A-Trait (i.e., you are in situations involving interactions
with other people) were negatively
related to scores on the CPI measures of intellectual
efficiency, sociability, self-acceptance, and
good impression (R = .55, p < .05). High scores
on the physical danger facet of A-Trait (i.e., you
are in situations where you are about to or may
encounter physical danger) were associated with
low scores on two measures (CPI scale of good
impression and ITS) and with high scores on the
CPI Sociability scale (R = .34, p < .05). For
subjects expressing anxiety on the ambiguous
facet of A-Trait (i.e., you are in a new or
strange situation), there was a negative association
with their scores on the CPI measures of
sociability, capacity for status, and good impression.
These same individuals obtained high scores
on the FNE scale and low scores on the CPI
measure of flexibility (R = .54, p < .05). For the
innocuous facet of A-Trait (i.e., you are involved
in your daily routines), the scores were
negatively related to scores on the CPI measures
of intellectual efficiency and achievement via
conformance (R = .34, p < .05).

1 Complete data summary tables are available from
the first author on request.

Independence among the four scales of the 
S-R inventory can be inferred from the distinc- 
tions among the predictive regression equations 
and also (of course) from the correlations among 
the parts. With respect to the latter, the correla- 
tion coefficients were low and ranged from —.04 
to .21 for physical danger versus innocuous and 
ambiguous versus innocuous, respectively. 

In summary, the obtained results strongly sug- 
gest the existence of separate personality pro- 
files for the four facets of A-Trait. Furthermore, 
it is evident that these profiles are logically con- 
gruent with the type of situation described. For 
example, it is reasonable to expect that individuals 
who express anxiety in an interpersonal situation
should score low on measures of sociability,
self-acceptance, good-impression, and so on. Similarly,
persons who feel anxious in unfamiliar,
ambiguous situations might well indicate a lack of
flexibility and a concern over social evaluation.
In sum, the data tend to form patterns that are
consistent with the claim for the multidimensionality
of the S-R inventory.

Abnormality of Subtest Score Differences on the WISC-R

The abnormality of the difference as a method for evaluating the magnitude of differences
between pairs of Wechsler Intelligence Scale for Children-Revised subtests is
discussed. Generally, abnormal differences at the .05 level of significance range from 
6 to 7 scaled score points and 8 to 10 scaled score points at the .01 level. Abnormal 
Verbal-Performance scale IQ differences are also considered. Such differences averaged 
18 IQ points at the .05 level of significance and 24 IQ points at the .01 level. The 
diagnostic implications of the use of the abnormality of the difference for evaluating 
subtest score differences are discussed. 


The evaluation of differences between subtest scores
is a common problem for the users of tests such
as the Wechsler Intelligence Scale for Children-Revised
(WISC-R). Unfortunately, the magnitude
of subtest score differences has too often
been evaluated on the basis of clinical intuition
or on the recommendations of test pub[...]
without an adequate statistical basis for
[...] recommendations.

When a statistical answer to this problem has
been given, it has generally involved computing
the standard error of measurement of the difference
between subtest pairs. Using this approach,
the WISC-R manual (Wechsler, 1974) provides
a table of significant subtest score differences
based on the average values for all 11 age groups
included in the standardization sample. Using the
rather weak .15 level of significance, differences
of only 2.35-3.45 scaled score points between
subtests are considered significant. Using the
more conventional .05 and .01 levels of significance
and considering age levels separately, it was
found that 3-5 points at the .05 level and 4-6
points at the .01 level were generally required
for significance (Piotrowski & Grubb, 1976; Sat[...]).

Another way of statistically evaluating subtest
score differences, the abnormality of the difference,
was suggested by Silverstein (1973). While
the standard error of measurement of the difference
indicates how large a difference must be
so that it cannot be attributed to measurement
error, this other method reveals how large a
difference must exist for there to be little chance
of it occurring in a normal population (Silverstein,
1973).

The abnormality of the difference ($A_d$) can be
computed using the following formula:

$$A_d = z\sqrt{\sigma_1^2 + \sigma_2^2 - 2 r_{12}\sigma_1\sigma_2}$$

where $\sigma_1$ and $\sigma_2$ are the standard deviations of
the subtests being compared, $r_{12}$ is the intercorrelation
between these two subtests, and $z$ is
the normal deviate corresponding to the probability
level that was used.

The present investigation involved computing
the abnormality of the difference between all
possible pairs of the 12 WISC-R subtests at each
of the 11 age levels included in the standardization
sample. Differences between Verbal and
Performance scale IQs were also examined. The
necessary standard deviations and intercorrelations
were obtained from the WISC-R manual
(Wechsler, 1974).

For a difference between any two WISC-R
subtests to be considered “abnormal,” a 6- to 7-point
difference is generally required at the .05
level of significance, whereas an 8- to 10-point
difference is typically required at the .01 level.
Abnormal Verbal-Performance Scale IQ differences
averaged 18 points at the .05 level of significance
and 24 points at the .01 level. [...] the
abnormality of the difference [...] subtest scaled
score differences [...] points higher at [...] the
.01 level. Abnormal Verbal-Performance [...]
higher at the .05 level and 8-10 points higher at
the .01 level than when the weaker standard error
of measurement of the difference was used.

Requests for reprints and for an extended report
of this study should be sent to Richard J. Pokora,
Allegheny Intermediate Unit #3, Suite 1300, [...]
Allegheny Center, Pittsburgh, Pennsylvania 15212.

In light of the rather large discrepancy in the
size of “differences” generated by these two indices,
the implications of differences of varying
magnitude should be considered. Test score dif- 
ferences falling short of the standard error of 
measurement of the difference may, of course, be 
attributed to measurement error or chance. A 
difference between two scores, which is sig- 
nificant in light of the standard error of mea- 
surement of the difference, suggests that there 
is a difference that can be reliably measured in 
the abilities or skills tapped by the tests being 
compared. Such differences may well have edu- 
cational or program planning significance, since 
they point to “real” differences between the 
abilities measured by the subtests being com- 
pared. However, the fact that an individual’s 
abilities are not uniformly developed should not 
be considered highly unusual. Only when the dif- 
ference between scores reaches the magnitude of 
the abnormality of the difference might the scat- 
ter between an individual’s abilities be considered 
unusual when compared to the amount of scatter 
generally found within the abilities of other in- 
dividuals. 
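
The abnormality of the difference discussed above is the quantity z times the square root of (σ1² + σ2² − 2·r12·σ1·σ2). As a rough illustration only, the sketch below evaluates that expression in Python. The function name, the two-tailed z values, and the example intercorrelation of .40 are assumptions made for this illustration; they are not figures taken from the WISC-R manual. Scaled scores are assumed to have a standard deviation of 3.

```python
import math

# Two-tailed standard normal deviates (an assumption about how the
# probability level is converted to z; not stated in the article).
Z_TWO_TAILED = {0.05: 1.960, 0.01: 2.576}


def abnormality_of_difference(sd1, sd2, r12, alpha=0.05):
    """Return z * sqrt(sd1^2 + sd2^2 - 2 * r12 * sd1 * sd2).

    sd1, sd2: standard deviations of the two subtests being compared
    r12:      intercorrelation between the two subtests
    alpha:    probability level (0.05 or 0.01 in this sketch)
    """
    z = Z_TWO_TAILED[alpha]
    return z * math.sqrt(sd1 ** 2 + sd2 ** 2 - 2 * r12 * sd1 * sd2)


if __name__ == "__main__":
    # Hypothetical subtest pair: SD = 3 for both, assumed r = .40.
    for alpha in (0.05, 0.01):
        value = abnormality_of_difference(3, 3, 0.40, alpha)
        print(f"alpha = {alpha}: abnormal difference = {value:.2f} scaled score points")
```

With these illustrative inputs the result falls inside the 6- to 7-point (.05) and 8- to 10-point (.01) ranges reported in the text, but the exact values depend on the intercorrelation of the particular subtest pair.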

Two recent studies are relevant in regard to 
the size of differences obtained when using the 
WISC-R. Using data computed from the stan- 
dardization sample of the WISC-R, Kaufman 
(1976a) found that rather large differences be- 
tween subtest scores are typical. When all 12 
WISC-R subtests were given, less than 30% of 
the normative population showed a difference of 
6 or less points between their highest and lowest 
subtest scaled scores. At the same time, approxi- 
mately 20% had differences of 9 points or more, 
5% had differences of 11 points or more, and 2% 
had differences of 13 points or more. 

A second study by Kaufman (1976b) revealed 
that rather large Verbal-Performance IQ scale 
differences were also typical of the standardiza- 
tion sample. Verbal—Performance IQ differences 
of 9 points or greater were found in 48% of subjects,
34% had differences of 12 points or more,
and 25% had differences of 15 points or more.
At the .05 level, abnormal Verbal-Performance
differences averaging 18 IQ points were found 
in the present study. Differences of this size or 
greater were found by Kaufman (1976b) in 12% 
of the standardization samples. Similarly, at the 
.01 level, abnormal Verbal-Performance differences
averaged 24 points. Only 4% of the original
sample had differences this large or greater.

In light of Kaufman’s findings, it appears that 
the abnormality of the difference provides a 
statistic that is realistically related to the actual 
performance of children on the WISC-R. This 
close degree of correspondence between the ab- 
normality of the difference and Kaufman’s find- 
ings has important diagnostic implications. In 
regard to the identification of children who are 
expected to show a wide variability between 
their “peaks” and “valleys” (e.g., learning dis- 
abled), differences more in line with the magni- 
tude of the abnormality of the difference or 
greater might be expected. Similarly, the toler- 
ance for the spread of abilities in children seen 
as potential candidates for programs character- 
ized by a “flat” ability profile (e.g., educable 
mentally handicapped, slow learners) might be 
extended to just short of the point generated by 
the abnormality of the difference.

University of Alabama in Birmingham


Ten pairs of brothers with a mean age of 24 years and a mean of 13 years of education
were individually examined with the Wechsler Adult Intelligence Scale some 10
months apart by a highly experienced clinical psychologist who was unaware of the
consanguineous relationship. The obtained correlation of .42 for Full Scale IQ is consistent
with the median correlation of .49 reported by Erlenmeyer-Kimling and Jarvik
in their 1963 review of the world’s literature.


The relationship between consanguinity and 
measured intelligence was the focus of active 
investigative interest between 1910 and 1950, 
with the results reported in the world’s literature 
succinctly summarized in a review by Erlen- 
meyer-Kimling and Jarvik (1963). Inasmuch as 
the Wechsler scales had not yet been developed 
during the period when most of these studies 
were being conducted, we were curious as to 
whether the post-1940 studies on the relationship 
between IQ in siblings had ever used Wechsler’s 
measures. In their review Erlenmeyer-Kimling
and Jarvik reported 35 studies of siblings reared
together and 2 studies of siblings reared apart.
At our recent request, they kindly sent us their
bibliographic references for each of these 37
studies plus a summary list of the intelligence
tests used by each of the investigators.

The list revealed that the 37 sibling studies
had been reported in 31 different publications
between 1912 and 1962 and that a total of 15
different tests had been used in 37 sets of comparisons.
Additionally, 4 of the 37 studies failed
to indicate which test(s) yielded the sibling correlation
reported. Not surprisingly for this early
period, the test most frequently used in studying the
relationship between IQs of siblings was the Stanford-Binet
(13 studies). This test was followed
by the Otis (6 studies), Army Alpha (4 studies),
Terman Group Test (2 studies), and the
National Intelligence Test (2 studies). Finally,
each of the following tests was used in 1 study:
Army Beta, IER Test of Selective and Rational
Thinking, South Africa Group Intelligence Tests,
Vocabulary tests from the University of Minnesota
College Aptitude Test, Haggerty Intelligence
Examination, College Entrance Test, Cattell
scales, Pintner Rapid Survey Intelligence
Test, Test mosaique de Gille, and the CVB
(Swedish abbreviated Wechsler-Bellevue). It is
clear from this that, excluding the Stanford-Binet,
many of these tests are paper-and-pencil instruments
that were designed for self or group rather
than individual administration.

Requests for reprints should be sent to Joseph D.
[...], Department of Medical Psychology, University
of Oregon Health Sciences Center, Portland,
Oregon 97201.

The background for the present article is that 
since 1959 we have been involved in a patrolman 
selection program for the city of Portland. The 
6-hour patrolman examination includes a number 
of standard clinical psychological assessment 
instruments, one of which is the individually ad- 
ministered Wechsler Adult Intelligence Scale 
(WAIS). Noting recently that to date we had 
examined some 1,200 applicants between the ages 
of 21 and 31, we researched these records and 
found that 10 pairs of these applicants were 
brothers. Although relatively few in number, we 
decided to compute the correlation between pairs 
of WAIS scores on these 20 brothers for several 
reasons. First, because of the value of such data. 
As Alstrom (1961) has noted: 


By comparison with the immense quantity of (intelligence)
test investigations which have been carried
out all over the world, there have been extremely
few carried out on representative series of families. . . .
Every such study of a representative sample of
families is therefore [...]


Second, as a test of measured intelligence, the 
WAIS is among the best such measures currently 
available. Third, unlike many studies of measured 
intelligence in families that individually examined 
(often using student examiners) the two family
members in one sitting and thus conceivably
could introduce a bit of examiner halo effect or 
other bias, our “study” was neither planned 
nor were the siblings examined on the same day. 
That is, except for a single pair of fraternal twins 
among our 10 pairs who were examined on the 
same day, the WAIS examiner (ANW) for the 
present study, a board-certified clinical psycholo- 
gist, did not know at the time he examined the 
second applicant that the applicant’s brother also 
had been examined by him on a previous occasion. 

After we identified them, the 10 pairs of 
brothers were divided into Group A (first 
brother) and Group B (second brother). Descrip- 
tive data for the day of examination for Group 
A and Group B yielded the following means and 
ranges (in parentheses), respectively: ages = 24.0 
years (21-32) and 24.6 years (22-29), and edu- 
cation =13.2 years (12-16) and 13.7 years (12- 
17). The interval between testing of Brother A 
and Brother B ranged from 0 to 83 months, with a
mean of 17.4 months and a median of 10.5 
months. Thus our study involved 10 pairs of 24- 
year-old brothers with a mean of 13 years of 
education who were individually examined with a 
median interval of 10 months by a highly ex- 
perienced clinical psychologist who was unaware 
of the consanguineous relationship. 

Followers of Pearson’s studies on sibling correlation
are aware that in his research he computed
a double-entry Pearson correlation, entering
each pair of scores twice: first as XY, then as
YX. His procedure yielded the following values 
for our data: rs = .31, .05, and .48 (p= .02), for 
Full Scale IQ, Verbal IQ, and Performance IQ, 
respectively. 
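
The double-entry procedure described above can be sketched in a few lines of Python. This is only an illustration of the technique: the function names are made up for this example, and the IQ pairs are invented placeholder values, not the scores of the 10 pairs of brothers in the present study.

```python
import math


def pearson(xs, ys):
    """Ordinary Pearson product-moment correlation of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def double_entry_pearson(pairs):
    """Enter each pair twice, first as (X, Y) and then as (Y, X), and correlate."""
    xs = [a for a, b in pairs] + [b for a, b in pairs]
    ys = [b for a, b in pairs] + [a for a, b in pairs]
    return pearson(xs, ys)


if __name__ == "__main__":
    # Hypothetical Full Scale IQs for five sibling pairs (invented values).
    example_pairs = [(110, 118), (105, 112), (121, 109), (126, 120), (108, 115)]
    standard = pearson([a for a, _ in example_pairs], [b for _, b in example_pairs])
    print("standard Pearson r:", round(standard, 3))
    print("double-entry r:    ", round(double_entry_pearson(example_pairs), 3))
```

Because every pair appears in both orders, the double-entry value does not depend on which sibling is arbitrarily labeled A or B, whereas the standard Pearson correlation can change with that labeling.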

We were less interested in such statistical 
sophistication, being more interested here in a 
standard Pearson correlation to satisfy our clini- 
cal curiosity as to what value it would reach for 
the WAIS given to 10 pairs of brothers. The
value of the standard Pearson correlation that we
obtained for the Full Scale IQ was .42. Although
a bit short of statistical significance with our N
of 10 pairs, this is a value not unlike the median 
correlation of .49 yielded in the 37 studies re- 
viewed by Erlenmeyer-Kimling and Jarvik 
(1963) despite the fact that our 20 subjects 
earned IQs only in the top 50% of the range 
(i.e., from a low Full Scale WAIS IQ of 105 to 
a high of 126, with a mean of 112.6 for Group 
A and a mean of 115.2 for Group B). The cor- 
responding correlation for Verbal IQ of our 10 
pairs was .12 and for Performance IQ was .54.
Other than small sample size, we could find no 
explanation in our data, such as one or two devi- 
ant cases, to explain this lower correlation for 
Verbal IQ. 

The data from the 13 studies that consisted of 
individually administered Stanford-Binets were 
probably the most robust among the 37 studies in 
the review by Erlenmeyer-Kimling and Jarvik 
(1963). The correlation of .42 for Full Scale IQ 
obtained by us with the equally robust WAIS in 
these 10 pairs of adult brothers adds one addi- 
tional datum, suggesting that the median correla- 
tion of .49 for siblings in the 37 studies reviewed 
by Erlenmeyer-Kimling and Jarvik is a reliable
value. This latter interpretation is especially im- 
portant inasmuch as our examiner was unaware 
that he was examining a sibling in all but one 
pair, and thus our gnawing concern since we be- 
gan rereading this literature on a potential ex- 
aminer halo effect in some of the earlier studies 
was put to rest by the results of the present 
study.

The differential effectiveness of focused attention on either affective-discomfort or
sensory components of electric-shock-induced pain experiences was examined. In addition,
the impact of models displaying tolerance or intolerance for pain was examined
on reports of these different components of the experience. Attention to sensory components
led to lower pain tolerance, [...] avoidance of shocks at lower current levels, and
characterizations of the shocks as more painful than stronger current intensities accepted
during the discomfort rating task. The strategy of modeling tolerance was
effective in influencing both sensory and affective reports of the experience.

Recent conceptualizations of pain experience
distinguish between sensory-discriminative and
affective-motivational qualities. Experimental and
clinical research indicate that the discrimination
between sensory qualities and affective attributes
is readily accomplished. This study examined
whether a potent form of social influence, exposure
to tolerant or intolerant models, would differentially
influence characterizations of physical
as contrasted with affective qualities of pain
experience.

The social context was expected to be a more
potent determinant of characterizations of discomfort
than physical qualities. This is because
[...] discomfort appears to be more inclusive
[...] experiential components of pain
[...] characterizations of physical qualities, since
[...] the product of sensory, affective, and
[...] factors. In addition, reports of discomfort
appear to have more important social functions
and, consequently, should reflect
[...] sensitivity to the social context.

The investigation also examined the effects of
[...] either discomfort or physical
[...] on pain tolerance. Demonstration
of differential attentional effects could have
[...] on the choice of cognitive strategies
[...] for the self-management of pain.

Subjects were 30 male undergraduate student
volunteers. They participated in sessions with a
male model portraying a naive subject. He did
not receive shocks, and his ratings were based on
subjects’ responses. Subjects were assigned randomly
to one of six groups based on a 3 × 2
factorial. The first variable concerned whether
the model assumed the role of a tolerant or intolerant
companion also subjected to shocks or an
inactive observer. The second variable was based
on the two possible sequences of rating the electric
shocks on physical and discomfort intensity
judgmental tasks.

This investigation was supported by research
[...] from the Canada Council.

Requests for reprints and for an extended report
of this study should be sent to Kenneth D. Craig,
Department of Psychology, University of British
Columbia [...]

Electric shocks of a 1-sec duration were gen- 
erated by a 60-Hz stimulator and delivered 
through concentric electrodes to the volar surface 
of the right forearm. Three ascending series of 
shocks were delivered with each trial increasing 
by .5 mA until subjects reported that they could 
endure no further increases.

Rating scales were described to subjects that 
provided appropriate labels for characterizing the 
shocks in terms of either physical intensity or af- 
fective discomfort. For judgments of physical 
intensity, labels read undetectable, low, moder- 
ate, high, and extremely high. For discomfort rat- 
ings, labels were not uncomfortable, uncomfort- 
able, very uncomfortable, painful, and intolerable. 
It was emphasized that the last switch was to be 
reserved for a current level beyond which they 
could endure no further shocks and that no shocks 
would be given after it was used. All subjects 
undertook both judgmental tasks in counterbalanced
order.

The model enacted the intolerant role by remaining
one label ahead of the subject on the intensity
scale. The tolerant role was enacted by
essentially having the monitor remain one label
behind the subject.

Additional questionnaires required evaluations
of the shocks on intensity and discomfort dimensions
and ratings concerning other aspects of the
study.

Four-way analyses of variance were used to 
evaluate the effects of modeling, rating order, the 
two judgmental tasks, and the repeated shock 
series on pain tolerance. Significant differences 
were observed as a result of the two judgmental 
tasks, F(1, 24) = 4.84, p < .05, and the two
modeling roles, F(2, 24) = 4.75, p < .01. Tukey
(b) tests (α = .01) indicated that the tolerant
group accepted substantially greater shocks (M 
= 10.41 mA) than the intolerant group (M = 
6.68 mA) and the no-model group (M = 5.97 
mA), whereas these latter two groups did not dif- 
fer from each other (p>.10). There were 
neither order nor trial effects. 

Modeling effects were consistent whether judg- 
ments of physical intensity or subjective dis- 
comfort were required, with the biasing effects 
generally consistent across the range of judg- 
mental categories. The tolerant-modeling impact 
was sufficiently substantial for those exposed to 
the tolerant model to not characterize as even 
painful those shocks deemed to be intolerable by 
subjects paired with an intolerant model or no 
model. 

Tolerance levels were higher when attention 
was focused on personal discomfort (M = 8.12 
mA), as contrasted with judgments of
perceived physical intensity (M = 7.26 mA). The 
differential impact of the self-monitoring tasks 
indicated that the two sets of descriptors referred 
to discriminable states. 

Questionnaire analyses of the severity of dis- 
comfort experienced when subjects were involved 
in either the ratings of physical intensity or af- 
fective discomfort indicated that when discom- 
fort was being rated, the most intense shocks 
were described as less painful than when physical 
intensity was rated, F(1, 24) = 7.37, p < .05. 

The study clearly demonstrated the substantial
effects of exposure to a tolerant model and that
the social influence strategy equally affected characterizations
of sensory and affective qualities.
Additionally, instructional sets to attend to discomfort
led to greater pain tolerance than instructions
to describe the sensory quality of the
experience, its physical intensity. Lower pain
tolerance during intensity ratings was also characterized
as more painful than the current intensities
that provoked pain tolerance during the discomfort
rating task.

A number of possible explanations could account
for the superiority of the discomfort-rating
task. Attending to personal discomfort may per- 
mit a more global and realistic appraisal of the 
experience. It also may have enhanced perception 
of the ability to control the experience, because 
subjective experiences tend to be seen as more 
amenable to volitional control than sensory experiences.
Attending to affective components of
experience may also have facilitated preparation 
for subsequent noxious experience. Restricting at- 
tention to the physical intensity scale may have 
left subjects unprepared for the severity of 
shocks at more intense levels. 

Because clinical pain, in most instances, does 
not permit deliberate appraisal of low intensity 
levels, preparatory communications have been 
used effectively to manage stress reactions. Because
of the problems involved in the choice of
language to describe most probable reactions to
noxious events, the present study indicates the
need to instruct individuals to evaluate descriptions
of others’ experiences within the context of
their own responses to noxious stimuli. Attention 
to different components of the pain experience 
itself may differentially affect qualities of the 
experience. Attending to some of the affective 
qualities of the experience would appear to be an 
important component of effective self-regulation.

Comparison of the Similarities and Differences in the
Self-concepts of Male Alcoholics and Addicts

Jerome F. X. Carroll, M. Israel Klein, and Yoav Santo
Eagleville Hospital and Rehabilitation Center, Eagleville, Pennsylvania

This study compared self-concept scores from the Tennessee Self Concept Scale
(TSCS) of 178 alcohol- and 156 drug-dependent male clients in an abstinent, therapeutic
community. A multivariate analysis of the TSCS scores indicated that substance
abuse, when age and race were statistically controlled, yielded only three significant
results (True/False ratio; Psychosis; and Personality Disorder). These data
were interpreted as indicating greater similarity than difference for the alcohol- and
drug-dependent men regarding self-concept. The data also indicated the value of using
a multivariate rather than a bivariate design in examining the relationships between
substance abuse patterns and self-concept.

An increasing number of spokespersons within
the substance-abuse field have advocated combined
treatment, that is, treating alcoholics and
drug addicts together in the same rehabilitation
program (Carroll & Malloy, 1977). This proposal,
however, raises a number of important questions,
one of which is, to what extent are the personality
dynamics of these two groups similar/dissimilar?

Even though research studies have shown that
alcoholic or drug-dependent persons are low in
self-esteem (e.g., Fitts, Arney, & Patton, 1973;
[...]binson, 1973), there has been little data published
comparing the self-concepts of alcoholics
and drug-dependent clients seeking treatment. We
attempted to fill this gap by comparing the Tennessee
Self Concept Scale (TSCS; Fitts, 1965)
[...] of alcoholic and drug-dependent men
[...] statistically controlling for the effects of age
and race.

The present study was undertaken at Eagleville
Hospital and Rehabilitation Center, a residential
therapeutic community whose abstinence
treatment program serves a near-equal number
of alcoholics and drug addicts. For a period of 6
consecutive months, all male clients with a diagnosis
of either alcoholism or drug dependence
who completed the TSCS as part of the standard
[...] test battery at EHRC were included in
the study.

[...] this study should be sent to Jerome F. X.
Carroll, Director Psychological Services, Eagleville Hospital
and Rehabilitation Center, Eagleville, Pennsylvania
[...]
Clients with a dual diagnosis of alcoholism 
and drug dependence were excluded from the 
study. So too were clients with severe reading 
problems, which prevented them from completing 
the TSCS. 

No attempt was made to differentiate among 
alcoholics regarding their preference for beer, 
wine, or liquor, nor did we attempt to differentiate 
among the varieties of drug-dependent clients, 
Most of the drug-dependent clients had been re- 
ferred for treatment due to heroin abuse, al- 
had abused other drugs as 
and minor tranquilizers). 
Classification of clients as being either alcoholic 
or drug dependent was based solely on the 
extant system Eagleville Hospital and 
Rehabilitation Center's Medical Records Depart- 
ment for classifying newly admitted clients. 

The sample consisted of 334 male clients; 178 
had been diagnosed as alcoholics and 156 as ad- 
dicts. With respect to race, 170 of the men were 
black and 164 were white, We divided our sample 
into four age groups (up to 23, 24-33, 34-44, 45 
and older) as had been done in a previous study 
of EHRC clients by Barr, Ottenberg, and Rosen 


well (e.g. 


significan: 
older clients were more 

Similarly, a conti 
tionship between race an 
these variables also were not independent, 
= 11.41, p < .001. Black clients were significantly
more likely to be alcoholics, and whites were
more likely to be addicts. 

These results differ somewhat from reports 
received from the National Institute on Drug 
Abuse (NIDA) and the National Institute of 
Alcohol Abuse and Alcoholism (NIAAA). Re- 
ports from NIAAA indicate more white than 
black alcoholics are receiving treatment in feder- 
ally funded programs, although we believe the 
extent of the alcohol abuse problem among blacks 
has been underestimated by these data. The dis-
tribution of races for drug abuse reported by
NIDA corresponds with the findings we have re-
ported.

Having demonstrated that addiction was not 
independent of either age or race in our sample, 
a three-way analysis of variance was performed 
for each of the 29 TSCS scales to ascertain the 
unique contribution of addiction, age, and race, 
and their interaction effects on TSCS scores. 

Addiction, as a main effect independent of race 
and age, yielded significant main effects for three 
TSCS scales, namely True/False ratio, F(1, 319) 
= 3.51, p < .05; Psychosis, F(1, 319) = 4.07, p 
<.05; and Personality Disorder, F(1, 319) = 
4.07, p < .05. Alcoholics scored higher on the 
T/F ratio, indicating a more acquiescent response 
set, which we interpret to reflect a more passive 
and compliant coping style. The higher scores of 
the alcoholics on the Psychosis and Personality 
Disorder (empirical) scales indicated a greater 
degree of emotional distress, poorer reality con- 
tact, greater depression and emotional lability, 
greater mental confusion, higher levels of suspi- 
cion, and more personality weaknesses and vulner- 
abilities than that observed for addicts. 

Race as a main effect, independent of substance
of abuse and age, yielded two significant main
effects, Total Conflict, F(1, 319) = 5.14, p < .05,
and Personality Integration, F(1, 319) = 4.00, p
< .05. Black males evidenced greater confusion,
contradiction, and general conflict in self-percep-
tion than did white males. Black males also scored
lower than white males on the Personality Inte-
gration scale.

There were no significant main effects due to
age. This finding is in sharp contrast to that re-
ported in a companion study (Carroll, Santo, &
Klein, Note 2) using the Personality Research
Form as a measure of normal personality needs.
Apparently age exercises a greater influence when
pula Personality needs are measured and ex- 


Concerning interaction effects, only two sig-
nificant results were observed. A significant in-
teraction for race and age on the Behavior scale
was obtained, F(3, 319) = 2.98, p < .05, with
white males in the 24-33 age group scoring 
highest, and black males 23 years or younger 
and 45 years or older scoring lowest. We interpret 
the significant interaction effect as indicating that 
the greatest degree of guilt regarding behavior 
was borne by the two extreme age groups of 
blacks. 

A significant interaction effect for race and
addiction was observed on the Personality Inte-
gration scale, F(1, 319) = 4.94, p < .05. In this
instance, black alcoholics scored lowest, and 
white addicts scored highest. 

While not minimizing the differences noted
above, it is nonetheless important to note that
nearly all of the scales that had been significant
in various preliminary bivariate analyses (e.g.,
comparing race, age, or substance abuse and self-
concept) ceased to be significant in the multi-
variate analysis. When race and age are control-
led, therefore, alcohol- and drug-dependent men
appear to be more similar than dissimilar with
respect to their self-concepts.
Twelve drug abuse scores obtained


types of drugs. Multiple 


MMPI profiles of both groups 
less highly elevated than, 
suggesting certain differences 
countered in other settings. 


vious research on the average Minnesota 
sic Personality Inventory (MMPI) pro- 
the common profile types among drug- 
t individuals has focused primarily on 
Otic addicts and has led to disagreement over 
ons of both the existence of personality 
peculiar to addicts and the relative im- 
nce of psychopathic versus neurotic and/or 
Wid components of addict profiles. However, 
abuse is a multidimensional phenomenon, 
the classification of subjects based on mul- 
measurements reflecting types of drugs and 
ers of use would be expected to yield 
accurate descriptions of drug abuse pat- 
MS and correlates than has been possible by 
ing the profiles of extreme groups of in- 
s designated as either addicts or non- 


ight of the foregoing considerations, the 
‘abuse histories of 215 prisoners undergoing 
Presentence evaluation at the California 
ion for Men, Chino, were quantified by 
each of three major classes of drugs 
8, nonopiate “hard” drugs, and cannabis) 


ful acknowledgment is given to Hung Tran 
ta processing assistance and to Kathleen Mc- 
4 clerical support. 
ough conducted under the auspices of the 
tment of Corrections, the opinions expressed 
the views of the author and do not neces- 
Teflect the official position of the California 
nent of Corrections or the Health and Wel- 
ency. 
$ a for reprints should be sent to Terrill R. 
) California Institution for Men, Box 128, 
California 91710. 
from each of 215 prisoners were factor analyzed,
resulting in two factors describing the lifetime degree of use of cannabis versus opiate
discriminant analysis of Minnesota
Inventory (MMPI) profiles versus drug abuse patterns indicated a moderate, unidi-
mensional relationship between these two sets of variables (Rc = .38, p < .05). The
of opiate users were configurally similar to, though 
those identified in previous research with narcotic addicts, 
between the present sample and drug abuse cases en- 


Multiphasic Personality 


for four parameters of use (frequency of use, 
method of administration, age at first exposure, 
and years of use) using a modification of the 
scale described by Gunderson, Russell, and Nail 
(1973). The 12 scores obtained in this fashion 
were standardized and subjected to a principal 
components analysis with varimax rotation, re- 
sulting in two factors with eigenvalues greater 
than one. As shown in Table 1, cannabis parame- 
ters of use load most highly on Factor 1, opiate 
parameters load on Factor 2, and nonopiate hard 
drug parameters load on both factors, indicating 
that a two-dimensional model of drug abuse is 
appropriate for the present subjects and method 
of measurement. Since parameters of use are not 
statistically distinct from each other, and since 
the use of nonopiate hard drugs is not distinct 
from the use of cannabis and opiates, the 12 drug 
scores reduce to two underlying dimensions de- 
scribing the lifetime degree of use of cannabis 
versus opiate types of drugs. 
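
As an arithmetic check on the principal components result, the following sketch (Python, not part of the original report) divides each eigenvalue reported in Table 1 by the number of standardized scores. It assumes the usual convention that 12 standardized variables carry a total variance of 12; the variable names are illustrative only.

```python
# Eigenvalues as reported in Table 1 for the two retained components.
eigenvalues = {"Factor 1": 8.208, "Factor 2": 1.910}

# Assumption: each of the 12 standardized drug scores contributes unit variance.
n_scores = 12

# Proportion of total variance associated with each component.
proportions = {
    name: round(value / n_scores, 3) for name, value in eigenvalues.items()
}

# Share of variance accounted for by both components together.
cumulative = round(sum(eigenvalues.values()) / n_scores, 3)

print(proportions)
print(cumulative)
```

Under that assumption the two components together account for roughly 84% of the variance in the 12 scores, which agrees with the proportions printed in Table 1.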

Calculating factor scores for individuals and 
dividing the distributions of these scores at the me- 
dian resulted in four groups defined by levels and 
combinations of the two drug abuse dimensions. 
The groups and their associated sample sizes and 
MMPI Welsh codes are as follows: low canna- 
bis—low opiate 
KFL/; high cannabis — low © 
49-—856273 1/KFL/; low cannabis — high 
opiate (n= 24) = 428967—53 1/F —KL/; 
and high cannabis — high opiate (n= 62)= 4 
9578—263 1/FKL/. Multiple discriminant 
analysis of these profiles yielded a single sig-
nificant dimension of group difference (R = .38;
Λ = .78), χ²(36) = 51.31, p < .05, the composi-
Table 1
Varimax Rotated Factor Loadings for Drug
Abuse Scores

Score                     Factor 1    Factor 2

Opiate rate                 .241        .934
Opiate method               .250        .935
Opiate age                  .241        .889
Opiate duration             .195        .919
Hard drug rate              .680        .594
Hard drug method            .664        .641
Hard drug age               .718        .513
Hard drug duration          .659        .600
Cannabis rate               .875        .200
Cannabis method             .905        .158
Cannabis age                .864        .156
Cannabis duration           .883        .280

Eigenvalue                 8.208       1.910

Proportion of variance      .684        .159


tion of which may be interpreted on the basis of 
“structure coefficients,” that is, bivariate cor- 
relations between the total discriminant score 
and each of the original variables (Cooley & 
Lohnes, 1971). In this respect, the groups are 
maximally discriminated along a dimension that 
is correlated .30 or greater with Scales F (r =
-.39), D (r = -.32), Pd (r = -.73), Pa (r =
-.30), Pt (r = -.48), and Sc (r = -.36), with se-
lective opiate users manifesting the highest score
in each instance. One-way univariate analyses of
variance resulted in statistically significant F ratios
for D (p < .05), Pd (p < .001), and Pt (p < .05).
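
The reported chi-square can be approximately reproduced from Wilks's lambda. The sketch below (Python, not part of the original report) uses Bartlett's approximation; the report does not say which approximation was used, so this is an assumption. It also assumes all 215 prisoners entered the analysis, four groups, and 12 MMPI scales, the last inferred from the 36 degrees of freedom (12 scales × 3).

```python
import math


def bartlett_chi_square(n_total, n_variables, n_groups, wilks_lambda):
    """Bartlett's chi-square approximation for the overall Wilks's lambda."""
    multiplier = n_total - 1 - (n_variables + n_groups) / 2
    return -multiplier * math.log(wilks_lambda)


# Values taken from the text; n_variables = 12 is inferred, not stated.
n_total, n_variables, n_groups = 215, 12, 4
wilks_lambda = 0.78

chi_square = bartlett_chi_square(n_total, n_variables, n_groups, wilks_lambda)
degrees_of_freedom = n_variables * (n_groups - 1)

print(round(chi_square, 2), degrees_of_freedom)
```

With lambda rounded to two decimals, the result falls slightly below the reported 51.31, a gap that rounding of lambda alone could account for.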

The findings indicate a moderate, unidimen-
sional relationship between pattern of drug abuse
and MMPI performance. To the extent that these
two sets of variables are related to each other,
nondrug users and selective cannabis users exhibit
the least pronounced personality liabilities,
whereas selective and nonselective opiate users
manifest the most conspicuous deficiencies.
Further, although all four group profiles are sug-
gestive, to a greater or lesser degree, of social
nonconformity, the selective opiate group is char-
acterized by mild, though noteworthy, additional
elements of subjective distress and disturbed
thinking.

The configurations of the MMPI profiles of
both of the present groups of opiate users are
similar to the major profile types identified
through cluster analysis by Berzins, Ross, English,
and Haley (1974) among narcotic addicts, thus
providing additional support for the hypothesis
that opiate dependence occurs within more than
one personality context. Nonetheless, the present
profile elevations are considerably lower than
those obtained in both the Berzins et al. study
and in most previous research on the relationship
between drug abuse and MMPI performance.
These differences might be attributed partially to
the profile-suppressing variables of detoxifica-
tion, nonvolunteer status, and incarceration in a
relatively protected environment. However, this
trend is also consistent with the tendency of
presentence cases of the type studied to exhibit
a generally lesser degree of psychopathology on
the MMPI than most other groups of incarcer-
ated offenders (Holland & Holt, 1975). Factors
probably responsible for this phenomenon are the
motivation of the subjects to present themselves
in a favorable light to their evaluators, optimism
resulting from the possibility of returning home
on probation in the foreseeable future, and an
element of selectivity in which many offenders
who are clearly emotionally disturbed are often
incorrectly perceived by the courts as being
dangerous and therefore are given lengthy sen-
tences without first being referred for presentence
evaluation through the state prison system.


References 


Berzins, J. I., Ross, W. F., English, G. E., & Haley,
J. V. Subgroups among opiate addicts: A typo-
logical investigation. Journal of Abnormal Psy-
chology, 1974, 83, 65-73.

Cooley, W. W., & Lohnes, P. R. Multivariate data
analysis. New York: Wiley, 1971.

Gunderson, E. E., Russell, J. W., & Nail, R. L. A
drug involvement scale for classification of drug
abusers. Journal of Community Psychology, 1973, 1,
339-403.

Holland, T. R., & Holt, N. Personality patterns
among short-term prisoners undergoing presen-
tence evaluations. Psychological Reports, 1975, 37,
827-836.


Received June 20, 1977


Journal of Consulting and Clinical Psychology
1978, Vol. 46, No. 3, 579-581


MMPI Evaluation of 5-Year Methadone Treatment Status 


Gennaro Ottomanelli, Peter Wilson, and Richard Whyte 
Department of Psychiatry, Downstate Medical Center 
State University of New York, Brooklyn 


Treatment outcome over a 5-year period was evaluated for 148 first admissions to a 
methadone treatment program. Eleven patients (7%) were successful treatment com- 
pletions, 16 patients (11%) transferred to other methadone programs, 38 patients 
(26%) remained in continuous treatment, and 83 patients (56%) were unsuccessful 
treatment terminations. Discriminant analysis using the Minnesota Multiphasic Per- 
sonality Inventory (MMPI) suggested that the more stable patients at admission had 


The questions formulated for research were
(a) Is the Minnesota Multiphasic Personality
Inventory (MMPI) useful in predicting long-
term treatment outcome for methadone patients?
If so, does the MMPI present different group
characteristics representative of different treat-
ment outcomes? and (b) Given a group of
methadone patients in long-term treatment, does
the MMPI demonstrate change over time on the
personality dimension?
The sample consisted of 148 narcotic addicts
admitted to a methadone clinic at the Kings
County Addictive Disease Hospital, Brooklyn,
New York, during 1971. The sample represented
voluntary first admissions to the methadone
clinic. The criteria for program admission were
Vor more years of admitted drug use, age over 20 
years, and absence of overt psychosis (determined
by psychiatric evaluations). The sample con-
sisted of 110 (74%) males and 38 (26%)
females with a mean age of 27.14 years (SD=
E 73 (49%) were white, 43 (29%) were 
black, and 32 (22%) were Hispanic. The sam-
ple had a mean of 2.19 months (SD = 3.51) of
employment in the 1 year prior to admission, a
mean of 3.11 arrests (SD = 4.02) prior to ad-
mission, and a mean length of drug use of 94.07
months (SD = 55.8 months).
The MMPI and a questionnaire eliciting
demographic information were administered
the best treatment outcome. For the patients in continuous treatment, MMPIs ad-
ministered at 6-week, 6-month, and 5-year intervals indicated that this group of 


patients did not change on the personality dimension. 


within 6 weeks after admission. Another ques- 
tionnaire eliciting employment and arrest in- 
formation, housing stability, current sources of 
income, and methadone-related attitudes sup- 
plemented the MMPI administration at the 5- 
year follow-up. MMPI profiles were classified as 
invalid on the basis of an F score > 22.
Treatment outcome over a 5-year period for 
the group of 148 patients showed that 11 patients 
(7%) completed treatment, 38 patients (26%) 
were in continuous treatment for 5 years, 83 
patients (56%) were unsuccessful treatment 
terminations (patients discharged for an assort- 
ment of administrative and disciplinary reasons), 
and 16 patients (11%) had transferred to other 
methadone programs. Of the 148 patients ad- 
mitted to the study, 38 patients did not co- 
operate with the MMPI testing. This patient 
group consisted of 29 males (76%) and 9 females 
(24%); 14 patients (37%) were white, 15 
(39%) were black, and 9 (24%) were Hispanic. 
Five-year treatment outcome for this group 
showed that 3 patients (8%) were treatment 
completions, 4 patients (10%) were in continu- 
ous treatment for 5 years, 29 patients (76%) 
were unsuccessful treatment terminations, and 2 
patients (5%) transferred to other programs. 
The remaining 110 patients completed the 
MMPI within 6 weeks after admission. Five- 
year treatment outcome for this group showed 
that 8 patients (7%) were treatment comple-
tions, 34 patients (31%) were in continuous
treatment for 5 years, 54 patients (49%) were 
unsuccessful treatment terminations, and 14 
patients (13%) transferred to other programs. 
The demographics of the 110 MMPI patients 
classified according to treatment outcome are as
follows: (a) The 8 patients in the treatment-com- 
pleted group consisted of 7 males (88%) and 1 
female (12%); 3 were white (37%), 3 were 
black (37%), and 2 were Hispanic (25%). (b) 
The in-treatment group consisted of 25 males 
(74%) and 9 females (26%); 20 were white 
(59%), 5 were black (15%), and 9 were His- 
panic (26%). (c) The unsuccessful treatment 
group consisted of 41 males (76%) and 13 fe- 
males (24%); 27 were white (50%), 15 were 
black (28%), and 12 were Hispanic (22%). (d) 
The transferred group contained 14 patients, 7 
males (50%) and 7 females (50%); 9 were white 
(64%) and 5 were black (36%). 
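
The outcome counts reported above for the full sample, the 38 patients who did not cooperate with MMPI testing, and the 110 who completed the MMPI can be cross-checked against one another. The sketch below (Python, not part of the original report) tabulates the counts given in the text and recomputes the full-sample percentages; the dictionary keys are illustrative labels.

```python
# Five-year outcome counts as reported in the text.
# Each entry: (patients without MMPI, patients with MMPI, full sample).
outcomes = {
    "treatment completed": (3, 8, 11),
    "continuous treatment": (4, 34, 38),
    "unsuccessful termination": (29, 54, 83),
    "transferred": (2, 14, 16),
}

no_mmpi_total = sum(counts[0] for counts in outcomes.values())
mmpi_total = sum(counts[1] for counts in outcomes.values())
sample_total = sum(counts[2] for counts in outcomes.values())

# Within each outcome, the two subgroups should add up to the full-sample count.
subgroups_consistent = all(a + b == total for a, b, total in outcomes.values())

# Full-sample percentages, rounded to whole numbers as in the report.
percentages = {
    name: round(100 * total / sample_total)
    for name, (_, _, total) in outcomes.items()
}

print(no_mmpi_total, mmpi_total, sample_total, subgroups_consistent)
print(percentages)
```

The subgroup counts add up to the full-sample figures, and the recomputed full-sample percentages match those reported (7%, 26%, 56%, and 11%).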

Discriminant analysis was used to evaluate 
treatment outcome based on the MMPIs ad- 
ministered in the 6th week after admission. The 
groups were sorted on the basis of 5-year treat- 
ment outcome, that is, treatment completed 
(successful treatment effort), currently in treat- 
ment, and unsuccessful treatment terminations. 
One patient in each of the treatment completed,
active, and unsuccessful treatment groups was
eliminated from the discriminant analysis be-
cause of invalid MMPIs (F > 22). A chi-square
analysis compared the accuracy of the classifica-
tion matrix of the discriminant functions with the
expected frequencies based on chance and was
found to be statistically significant, χ²(4) =
27.83, p < .01.

Comparison of the means of the MMPI scales 
for the patients grouped on the basis of treat- 
ment outcome showed that the patients in the 
successful treatment group (n=7) had their 
highest mean T scores on the Pd (70.00) and Ma 
(66.29) scales, whereas the remaining eight 
MMPI clinical scales ranged from a mean of 64 
(D) to a mean of 50.14 (Si). For the patients 
who remained in treatment (n = 33), the highest 
mean T scores were obtained on the D (71.48) 
and Pd (74.21) scales, and the remaining eight 
MMPI clinical scales ranged from a mean of 
66.03 (Ma) to a mean of 57.36 (Si). For the 
patients in the unsuccessful treatment group (n 
= 53), the highest mean T scores were obtained 
on the Pd (72.34) and Ma (70.85) scales, and 
the remaining eight MMPI clinical scales ranged 
from a mean of 68.07 (Sc) to 54.30 (Si). Gen- 
erally, the scales satisfied clinical expectations on 
two counts: (a) The patients with the best treat-
ment outcome, that is, the treatment completed
group, were the most stable at the time of admis-
sion, since this group had the lowest mean T
score on 8 of 10 clinical scales of the MMPI;
and (b) there was a linear trend (i.e., the rela-
tionship approaches a straight line) for the three
groups on the F, Pa, Pt, and Sc scales that rank
patients on emotional traits, leading to a greater
probability of acting out and concomitant greater
probability of unsuccessful treatment.

Although attempts were made to retest the 34-
patient MMPI in-treatment group, only 24 pa-
tients in this group provided valid 6-week and 5-
year MMPIs; within this subgroup were 13
patients who completed the MMPI at 6-week,
6-month, and 5-year intervals. Although the 24-
patient group showed increased scores on K, Hs,
D, Hy, Pd, Mf, Pa, Pt, and Sc from the 6-week
to the 5-year testing interval, Hotelling’s T² for
a one-sample test-retest design using 12 scales
(K was excluded from the analysis) was not
statistically significant, F(12, 12) = 2.06, p >
.05.

Since the 13-patient subgroup had been tested
on three occasions in the follow-up period, an
additional analysis was conducted on these three
test occasions. The comparison of 12 MMPI
scores (the K scale was excluded) for 6 weeks
and 6 months was not statistically significant;
Hotelling’s T² for one sample, F(12, 1) = 47.82.
Likewise the comparison of the 12 MMPI scores
for the 6-month and 5-year intervals was not
statistically significant; Hotelling’s T² for one
sample, F(12, 1) = 2.24.
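
The degrees of freedom quoted for these tests follow from the standard one-sample Hotelling's T² result, in which T² on p difference scores from n subjects converts to an F statistic with p and n - p degrees of freedom. The sketch below (Python, not part of the original report) shows that conversion; the function names are illustrative.

```python
def hotelling_one_sample_df(n_subjects, n_scales):
    """Return the (numerator, denominator) df for a one-sample Hotelling's T-squared."""
    if n_subjects <= n_scales:
        raise ValueError("The test needs more subjects than scales.")
    return n_scales, n_subjects - n_scales


def t_squared_to_f(t_squared, n_subjects, n_scales):
    """Convert a one-sample T-squared value to its equivalent F statistic."""
    scale = (n_subjects - n_scales) / (n_scales * (n_subjects - 1))
    return scale * t_squared


# 24 retested patients and 13 thrice-tested patients, 12 MMPI scales each.
df_24_patients = hotelling_one_sample_df(24, 12)
df_13_patients = hotelling_one_sample_df(13, 12)

print(df_24_patients, df_13_patients)
```

With only 1 denominator degree of freedom in the 13-patient subgroup, the .05 critical value of F is very large (about 244 in standard tables), which is why even F(12, 1) = 47.82 falls short of significance.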

Aside from the patients who transferred to
other institutions, the MMPIs of the three groups
appear to be in accord with clinical expectations,
that is, the most stable group at admission con-
sisted of those patients who successfully com-
pleted treatment. Those patients who remained
in treatment had their highest MMPI elevations
on the D and Pd scales. For this group the
pharmacology of methadone as a potent analgesic
may have been an influential factor. The unsuc-
cessful treatment group had their highest MMPI
elevations on the Pd, Sc, and Ma scales, with an
accompanying higher probability of acting out
and higher probability of unsuccessful treatment
termination.

The follow-up of the 24-patient group in con-
tinuous treatment for the 5-year period did not
demonstrate substantive gains on the personality
dimensions measured by the MMPI. When one
considers the analysis of the MMPIs of the 24-
patient group and the analysis of the 13-patient
subgroup, it appears that the patients remained
unchanged on the personality dimension at 6-
week, 6-month, and 5-year intervals. Fane 
the results of the present 148-patient stu ea 
subject to the limitations imposed by 4 


3-patient 
remaine! 


with patients cooperating on a 
, the findings of the present study 
methadone stabilization is most 
T the small proportion of patients with 
nd social strengths. Methadone pati- 
e treatment manageable but lacking 
y and social strengths remain in 
preserve marginal social function-
ing, with minimal improvement resulting from
long-term stabilization. The marginal social func-
tioning of this group could deteriorate if treat-
ment efforts are terminated. The largest group of
patients studied were treatment unmanageable
and unsuccessful treatment terminations.
Stimulus-Seeking Behavior in Three Delinquent Personality Types


David A. Shostak 
University of Virginia 


Curtis W. McIntyre 
Southern Methodist University 


This study examined stimulus-seeking behavior in three delinquent personality types 
(psychopathic, neurotic, and socialized) drawn from three populations (juvenile de- 
linquent, young adult offenders, and college). Results from a kinesthetic aftereffect 
task and the Sensation Seeking Scale indicate only limited evidence of pathological 
stimulus seeking by the psychopathic delinquent personality type. We suggest that 
this limited evidence results from the incarceration experiences of the young adult 


offenders. 


Three delinquent personality types (unsocial- 
ized psychopathic, disturbed neurotic, and so- 
cialized delinquent) have been described by Quay 
(1965). Moreover, Quay has suggested that the 
behavior of the unsocialized psychopath results 
from pathological stimulus seeking. 

More recently, individual differences in stimu- 
lus-seeking behavior have been investigated by 
two personality researchers, Petrie (1967) and 
Zuckerman (1971). Petrie has shown that indi- 
viduals differ in their tendency to reduce or to 
augment stimulation received during a kinesthetic 
aftereffect task (KAE). Reducers subjectively 
diminish their sensory input and seek higher levels 
of stimulation. Augmenters subjectively increase 
their sensory input and avoid higher levels of 
stimulation. Zuckerman has shown that indi- 
viduals differ in the optimal level of stimulation 
they seek as measured by his Sensation Seeking 
Scale (SSS). 

In the present study, Quay’s suggestion that 
the unsocialized psychopath is a pathological 
stimulus seeker was tested by administering the 
KAE and SSS to all three delinquent personality 
types. The general expectation was that unsocial- 
ized psychopaths would be reducers on the KAE 
and high scorers on the SSS. 

Three male populations were used: juvenile of-
fenders (M age = 15 years 2 months), young
adult offenders (M age = 20 years 6 months), and
college students (M age = 19 years). Ten sub-
jects of each delinquent personality type were
selected within each population using the Quay
and Peterson Personal Opinion Questionnaire.
Following selection, each subject took the KAE
and SSS.
Separate one-way analyses of variance were ap-
plied to each population to assess stimulus-seek-
ing differences between the three delinquent per-
sonality types. With respect to the KAE, the
only significant difference was found for the
young adult offenders, F(2, 36) = 3.39, p < .05.
On this task the unsocialized psychopaths re-
duced most; that is, they sought higher levels of
sensory input. A general tendency to reduce was
observed for both the juvenile and young adult
offender populations.

With respect to the SSS, no differences be-
tween the three delinquent personality types were
found for any of the three populations tested.
Evidently, sensation seeking (as measured by
the SSS) was uniform across all three diagnostic
categories. In addition, correlations between the
KAE and each subscale of the SSS failed to
reach significance for any of the personality types
and populations. Evidently, the stimulus-seeking
behaviors assessed by these measures were in-
dependent.

In conclusion, only limited evidence of patho-
logical stimulus seeking by Quay’s psychopathic
delinquent personality type was found. More-
over, we suggest that this limited evidence may
not be etiological in nature but may result from
the longer incarceration (or isolation) experi-
ences of the young adult offenders.
Validity Generalization of the WISC-R Factor Structure
with 10½-Year-Old Children


David A. Shiek and John E. Miller
Western Kentucky University 


The purpose of this study was to investigate the robustness of the Wechsler Intelli-
gence Scale for Children-Revised (WISC-R) factor structure. The generalization
sample was composed of 126 10½-year-old children, of whom 62 were male and 64
were female. A principal-components method of factor analysis yielded three reliable
factors. Comparisons of the loadings obtained with the generalization sample and the
10½-year-old national standardization sample suggest a high degree of similarity in
composition, magnitude, and pattern. The findings highly support the robustness of
the WISC-R’s factor structure across divergent 10½-year-old samples.


The purpose of this study was to investigate
the robustness of the factor structure of the
Wechsler Intelligence Scale for Children-Revised
(WISC-R). The factor analysis of the standard-
ization data (Kaufman, 1975) yielded three stable
factors: Verbal Comprehension, Perceptual
Organization, and Freedom from Distractibility.
The present sample was composed of 126 children
with a mean age of 10.6 years from lower to
lower-middle-class homes in the central south-
eastern United States. The sample was composed
of 62 males and 64 females, of whom 87 were
white and 39 were black. A preliminary analysis
indicated two basic differences between this
sample and the standardization sample: The Ver-
bal, Performance, and Full Scale IQs were sig-
nificantly lower, and the variances on the Per-
formance and Full Scale variables were sig-
nificantly restricted.
The primary analysis consisted of a principal-
component method of factor analysis with
squared multiple correlations in the diagonals
and a varimax rotation procedure. This procedure
yielded three reliable factors with eigenvalues
greater than 1.0. The first factor (Verbal Com-
prehension) consisted of Information, Vocabulary,
Similarities, and Comprehension. The second
factor (Perceptual Organization) was composed
of Block Design, Object Assembly, Picture Com-
pletion, Mazes, and Picture Arrangement. The
third factor (Freedom from Distractibility) in-
cluded Coding, Arithmetic, and Digit Span. Com-
parisons of the factor structure obtained with the
generalization sample (this study) and Kaufman’s
(1975) 10½-year-old national standardization
group were made. Visual comparisons of the load-
ings obtained on the two analyses suggested a
high degree of similarity in the loadings. Vector
comparisons yielded an intraclass coefficient of
.89, indicating a high degree of similarity in pat-
tern and magnitude in the factor loadings. Matrix
comparisons indicated that the three factors were
similarly delineated in both analyses.

When compared to Kaufman's (1975) findings
and conclusions, the factor structure obtained
with the generalization sample was highly con-
sistent. Only insignificant variations in the two
solutions were present, and the factor structure
of the WISC-R appeared quite stable across the
two relatively divergent 10½-year-old samples.
The conclusions highly supported the robustness
of the WISC-R’s factor structure and Wechsler’s
original proposition that his scales assess verbal,
performance, and nonintellectual functions. The
existence of such a stable factor structure sug-
gests the possibility of exploration as to the clini-
cal usefulness of factor scores in differential
diagnosis and prediction.


Reference 


Kaufman, A. S. Factor analysis of the WISC-R at
11 age levels between 6½ and 16½ years. Journal of
Consulting and Clinical Psychology, 1975, 43, 135-
147.

Received September 14, 1977


Association, Inc. 0022-006X/78/4603-0583$00.75


Journal of Consulting and Clinical Psychology 
1978, Vol. 46, No. 3, 584-585 


Comparison of Self-administered and Examiner-administered 
Depression Adjective Check Lists 
Self-administered and examiner-administered Depression Adjective Check Lists were
compared. One half of each equivalent form A, B, C, D was administered in the 
standard manner and one half was read by the examiner to 64 male and 64 female 
psychiatric inpatients. The Wechsler Adult Intelligence Scale Vocabulary subtest was 
administered at the same time. A repeated measures analysis of variance revealed no 
significant effects for vocabulary (median split), sex, or method of administration. The 
significant main effect of lists seems best understood as a chance finding. The results 
support the use of the examiner-administered method in cases of functional illiteracy. 


The standard method of administration of a 
number of self-administered personality measures 
used in research and in clinical practice is not 
feasible in the case of persons who are function- 
ally illiterate. The possible effects of substituting 
examiner administration for self-administration 
is an important but relatively unexplored ques-
tion. Earlier work comparing the two methods of
administering the Rotter Incomplete Sentences 
Blank (Flynn, 1974) and the Minnesota Multi- 
phasic Personality Inventory (Reese, Webb, & 
Foulks, 1968) found no significant differences. 
Unfortunately, these instruments do not have 
alternate forms, and the findings might be the 
result of practice effect. In addition, these find- 
ings were established with “trait” measures; they 
might not generalize to “state” measures. 

The Depression Adjective Check Lists (DACL; 
Lubin, 1967) were designed as brief, self-admin- 
istered measures of transient depressive mood. 
Their alternate forms (Lubin, 1967; Lubin, 
Dupre, & Lubin, 1967) make them useful in 
comparing administration methods.

Sixty-four male and 64 female psychiatric in-
patients of a large state hospital were individually
administered Set 1 of the DACL (Forms A, B, C,
and D) in random order. Each form was divided 
so that one column was administered verbally by 
the experimenter and one column was self-ad- 
ministered (written) by the subject. The order of 
columns was counterbalanced. The Vocabulary 
subtest of the Wechsler Adult Intelligence Scale 
was then administered to study the effect of 
verbal fluency on DACL scores. A median split 
of the scaled scores was used to analyze the effect 
of this factor. 

A repeated measures analysis of variance re- 
vealed a significant main effect of lists, F(3, 372) 
= 23.09, p<.001, but no significant effects of 
vocabulary, sex, or method of administration on 
the depression scores. The significant main effect
of lists was not consistent with earlier findings
(Lubin, 1967; Lubin et al., 1967). It seems best
understood as a chance finding and thus uninter-
pretable.

There was no significant difference between
self-administered (M = 5.80, SD = 4.23) and
examiner-administered (M = 5.53, SD = [illegible])
depression scores. In fact, the self-administered
scores averaged only about 1/4 point higher on the
standard DACL (range = 0-34) than the ex-
aminer-administered scores. The trend of F(1, 124)
= 3.42, p < .07, should be interpreted within the
perspective of the high intercorrelations among
lists across administrations (mean r = .85, n =
12).

These results strongly suggest that the two
methods of administering the DACL are similar,
and they support the use of the examiner method
of administration in cases of functional illiteracy.


Early Childhood Autism and Structural Therapy:
Outcome After 3 Years 


Alan J. Ward 
Henry Horner Children’s Center, Chicago, Illinois 


The effect of 3 years of structural therapy on 21 inpatient cases of early childhood 
autism (ECA) is examined. Treatment resulted in the discharge of 12 patients. Details 
of treatment procedure, therapeutic progress, and their effects on diagnostic and prog-
nostic conceptualizations are presented. Comparisons are made among previous reports
of attempted treatment of ECA, as well as the results of two other treatment units 
in the same setting. Results support the hypothesis that the high stimulation, phys-
ically intrusive, gamelike, novelty filled, and developmentally oriented treatment ap-
proach of structural therapy is capable of producing a significant improvement in
cases of ECA.


This article is a preliminary report on the out- 
come of the application of structural therapy to 
the residential treatment of early childhood 
autism. The term early childhood autism (ECA) 
is used to include both the classically defined 
rare cases of early infantile autism (EIA), as 
well as the much more common cases of organic 
autism and the variously handicapped and/or 
disturbed children who are mistakenly labeled as 
suffering from EIA. 

The treatment of EIA is a topic that has 
usually aroused great feelings of futility. 

Examination of outcome data reported by both 
Bettelheim (1967) and by Eisenberg (1956) re- 
vealed strongly contrasting findings. Eisenberg’s 
follow-up evaluation was divided into the three 
categories of poor, fair, or good outcome.

Bettelheim has reported outcome figures on a 
group of 40 “autistic” children and has used the 
categories devised by Eisenberg. A good outcome
was reported for 17 children (42%); a fair out-
come, for 15 children (38%); and a poor out-
come, for only 8 patients (20%). These out-
come figures are in marked contrast to those re-
ported by Eisenberg of 5% for a good outcome,
22% for a fair outcome, and 73% for a poor out-
come.
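
The percentages quoted for Bettelheim's group follow directly from the
counts. A minimal sketch of that arithmetic is given here; the function and
variable names are illustrative only and are not part of the original report.

```python
# Outcome counts for the 40 children, as quoted in the text above.
bettelheim_counts = {"good": 17, "fair": 15, "poor": 8}


def outcome_percentages(counts):
    """Return each category's share of the total as a whole-number percentage."""
    total = sum(counts.values())
    # Python's round() sends exact halves to the nearest even integer
    # (42.5 -> 42, 37.5 -> 38), which happens to match the reported figures.
    return {label: round(100 * n / total) for label, n in counts.items()}


percentages = outcome_percentages(bettelheim_counts)
print(percentages)
```

Because the three categories are exhaustive, the rounded shares should also
sum to 100, which is a quick consistency check on any outcome table of this kind.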

A new therapeutic approach labeled structural 
therapy (Ward, 1970) was used in the develop- 
ment of a treatment program for EIA children. 

Evaluation of the 21 original children in this
program revealed only 4 children who met the
research definition of EIA, which was (a) lack of 
the development of object relations from birth; 
(b) lack of the use of speech for communication; 
(c) maintenance of sameness via stereotypic be- 
havior with a rage or withdrawal reaction on 
interruption; and (d) no major neurological dys-
function. The behavioral characteristic of “lack
of affective response” is one that was evaluated 
and agreed on by both me and the co-director of 
the program, H. Allen Handford. This evaluation 
was based on clinical interview, play observation, 
family interview, and review of clinical referral 
material that included social history, psychologi- 
cal evaluation, psychiatric evaluation, and pedi- 
atric neurological evaluation. The other children 
fell into the diagnostic categories of childhood 
schizophrenia (5), primary retardation (7),
secondary retardation (3), and developmental re- 
tardation associated with diffuse brain damage 
(2). However, all of these children were found to 
display the behavioral characteristics of a “lack 
of affective response,” whereas children from all 
five of the above diagnostic categories were found 
to display the characteristics of “lack of object 
relationships,” “lack of the use of speech for 
communication,” and of having come from an 
“unstimulating mother/infant relationship.” The 
4 EIA cases were distinguished from the other
disturbed children by their combination of (a)
lack of neurological dysfunction and (b)
maintenance of sameness via stereotypic behavior.

The total research unit was organized according
to the precepts of structural therapy. The [program]
emphasized spontaneous physical and [verbal]
stimulation applied to the children in a [playful]
and gamelike fashion. The goal of this approach
was to increase the amount of varied and novel
stimulation received by these children and to use
this increased stimulation to make them more
aware of their external environment and to help
them progress from their positions of early
developmental fixation. The physical stimulation
was used to develop body image and bodily
[...] and to help provide the body ego,
which appears to be necessary for the develop-
ment of higher ego functions. Twelve of the 21
children were seen in individual therapy on a
[...]weekly basis, although often the children
were seen on an informal basis five times a week.
[...] children were seen for sessions in speech
therapy on a twice weekly basis for periods rang-
ing from 3 months to 2 years.

Families were contacted on a weekly basis by
the social workers. Group counseling was pro-
vided on a biweekly basis for all of the parents,
and the majority of them had a weekly day visit,
overnight visit, or weekend visit with their child.

The basic goal of this structural therapy treat-
ment program has not been “cure” but develop-
ment. EIA and ECA are viewed as severe devel-
opmental disorders of the same kind as are often
[identified] with the rubella child, the blind child,
[...]. The basic etiology is considered
to be rooted in a deficit of novel and varied
[stimulation for] multiple reasons.

The treatment program attempted to help
the children to progress to the point where
they achieved the goals of (a) relationships with
people; (b) self-care such as toilet training,
[feeding], and dressing; (c) communication of
needs in a consistent manner; and (d) the
capacity to follow simple directions. The
achievement of these goals revealed a child who
was still functioning below age-appropriate level
in regard to cognitive and affective behavior. The
child was now at a point in development in which
conventional play therapy, speech therapy,
and special education could be used. It was
[felt] that these children should be viewed as being
in the midst of their therapeutic course at the
time of discharge. They were discharged at this
point because the institution was no longer able
to provide the needed therapeutic level of stimu-
lation. Those children who were discharged to
homes in the community seemed appropriate
for placement in Eisenberg’s category of fair out-
come. Each child was used as his or her own
control in this research, but some comparisons
can also be drawn with two other treatment
units in the same setting that have attempted
to work with autistic children (ECA).

The application of 3 years of structural therapy
to the original group of 21 cases of ECA resulted
in the discharge to homes in the community of 
12 of the original 21 children (57%). These 
children were placed in normal nursery schools, 
special classes for the retarded or the emotion- 
ally disturbed in public schools and private 
schools, and sheltered workshops run by the local 
association for retarded children. The families 
were referred to the appropriate agencies for 
continued counseling. 

The experimental unit was Unit A, which had 
a population of boys and girls with a mean age of 
8.9 years and 10.2 years, respectively, at admis- 
sion and whose mean length of prior hospitaliza- 
tion was 1.08 years and 1.92 years. Inspection of 
comparable data on Comparison Units B and C 
revealed little difference as to mean age but a 
great difference as to length of prior hospitaliza- 
tion. These data suggest that the children on 
Unit A were more severely disturbed than those 
on Unit B or Unit C. The boys on Unit B had a 
mean age of 7.2 years on admission and a mean 
length of prior hospitalization of 6.5 months, 
whereas the girls on Unit C had a mean age of 
8.9 years on admission and a mean length of 
prior hospitalization of 5.9 months. 

Between September 1966 and September 1969, 
Unit A discharged 12 cases of ECA to homes in
the community, whereas Unit B discharged 2 
boys and Unit C discharged 5 girls. A compari- 
son of the outcome figures of Eisenberg and 
Bettelheim, vis-à-vis Units A, B, and C, revealed 
that Unit A exceeded both Eisenberg’s and Bettel- 
heim’s results in regard to the percentage of 
children who achieved a fair outcome. Unit C’s
outcome rate of 25% approximated the outcome 
rate reported by Eisenberg (22%), but Unit B’s 
outcome figure of 9% fell markedly below that. 
A classical, psychodynamically oriented psycho- 
therapy and play therapy treatment was used in 
both Units B and C.

The outcome figures of this preliminary study 
appear to support the hypothesis that structural 
therapy is capable of producing significant thera-
peutic change in children classified as having
ECA.


Coital Position and Sex Roles:
Responses to Cross-sex Behavior in Bed


Elizabeth Rice Allgeier
State University of New York, Fredonia

Arthur F. Fogel
Eastern Michigan University


The impact of changes in sex role norms regarding heterosexual interaction was ex- 
plored by varying both the coital position used by a couple in stimulus slides and the 
extent to which observers identified with stereotypic sex role norms. Females were
more negative toward the couple having intercourse in the woman-above position than 
they were toward the couple in the woman-below position. Observers’ degree of sex 
typing was unrelated to their reactions to the woman-above couple, suggesting that 
gender may still be more important than sex typing in determining responses to roles
in the context of heterosexual interaction.


Some physicians have claimed that female 
liberation (and the increased demand for male 
performance) is the primary cause of an increase 
in male impotence (Liddick, 1972). On the basis 
of interviews with 50 men, however, Bry (1975) 
concluded that men prefer sexually aggressive 
women. Hunt (1974) suggested that the norm of 
male dominance and female submissiveness dur- 
ing intercourse has been changing, but to date, we 
have no experimental evidence to indicate the 
average person’s reaction to female assertiveness 
in bed, nor do we know attitudes toward the 
male when he assumes the more submissive posi- 
tion. 

Thus, one purpose of the present research was 
to explore male and female reactions to variations 
in coital positions relevant to sex role norms. 
Aside from giving birth, there is probably no 
other arena in which traditional gender role dif- 
ferences have more importance to us than in our 
sexual interaction. Thus, it was hypothesized that 
subjects would make more negative attributions 
about a couple engaging in woman-above inter-
course than about the same couple engaging in
woman-below intercourse.


This research was part of a doctoral dissertation 
submitted by the first author to Purdue University 
and was supported by National Science Foundation 
Grant SOC 74-15254 to Donn Byrne. 

The authors gratefully acknowledge the helpful 
comments of Rick Allgeier, Donn Byrne, Rick Kim- 
ball, Don Lehr, Dave Przybyla, and Winnie Shepard 
on an earlier draft. 

Requests for reprints and for an extended re-
port of this study should be sent to Elizabeth Rice Allgeier,
Department of Psychology, State University of New
York, Fredonia, New York 14036.


Arthur F. Fogel

The second purpose of the research was to see
if observers’ degree of sex typing would influence
their responses to traditional versus nontradi-
tional coital positions. Bem (1976) has demon-
strated that androgynous persons (those who give
equally high endorsement to masculine and femi-
nine traits as self-descriptive) are more willing to
engage in cross-sex behavior, and are more com-
fortable while doing so, than are sex-typed per-
sons. Accordingly, it was hypothesized that sex-
typed persons would respond more negatively to
the couple in the woman-above position than
would androgynous persons.

Subjects were 119 unmarried introductory
psychology students at Eastern Michigan Uni-
versity who volunteered to participate in a study
entitled Responses to Erotica. After students had
completed a demographic questionnaire and the
Bem (1974) Sex-Role Inventory, six slides of a
couple engaged in intercourse were projected onto
a large movie screen for 1 minute each. All stu-
dents saw three “neutral” slides of the nude
couple lying side by side mutually engaged in
foreplay. In addition, half of the students saw
three slides of the couple having intercourse in
the woman-above coital position, and the other
half of the students saw three slides of the
couple in the woman-below coital position. Stu-
dents were then instructed to complete a
perception task by checking a scale position on
each of nine 7-point bipolar scales that best re-
flected their impressions of the woman (man) in
the slides with respect to the dimensions of ad-
justment, cleanliness, respectability, morality,
femininity (masculinity), goodness, [illegible],
desirability as a wife (husband), and desir-
ability as a mother (father).

Results of the 2 × 2 × 2 (Coital Position ×
Students’ Gender × Sex Typing) unweighted
means analyses of variance indicated consistent
interactions between subjects’ gender and the
coital position on ratings of the woman’s cleanli-
ness, F(1, 111) = 5.68, p < .01; respectability,
F(1, 111) = 7.96, p < .01; morality, F(1, 111)
= 8.79, p < .005; goodness, F(1, 111) = 7.45, p
< .01; desirability as a wife, F(1, 111) = 5.39,
p < .05; and desirability as a mother, F(1, 111)
= 5.92, p < .05. Gender × Coital Position inter-
actions were also obtained on ratings of the man’s
cleanliness, F(1, 111) = 5.61, p < .05; respect-
ability, F(1, 111) = 5.74, p < .05; morality, F(1,
111) = 6.80, p < .01; and masculinity, F(1, 111)
= 7.10, p < .01.

Internal comparisons revealed that females
consistently rated the couple in the woman-above
position significantly more negatively than they
did the couple in the woman-below position.
Specifically, females rated the woman as dirtier,
less respectable, less moral, less good, less desir-
able as a wife, and less desirable as a mother
when she was on top than when she was beneath
the man during intercourse.

Females also rated the man as dirtier, less re-
spectable, less moral, and less masculine when he
was in the woman-above position than when he
was in the woman-below position. Males, on the
other hand, tended to respond more positively to
the couple in the woman-above position than in
the woman-below position; however, the differ-
ences were not significant. Thus, the hypothesis
regarding responses to variations in coital posi-
tion was supported with respect to the responses
of females and rejected with respect to the re-
sponses of males. No differences emerged as a
function of the extent of students’ sex typing,
so the second hypothesis was rejected.

The finding that females (but not males) dis-
criminated against the woman-above couple is
particularly interesting in light of Masters and
Johnson’s (1966) finding that the female coital
[response] develops more rapidly and with greater
[intensity] in the woman-above position than in
[other positions]. The design of the present study does
not allow us to determine why females responded
as they did to the woman-above couple, and
further research should be conducted to determine
the extent to which their bias is a result of the
belief that males find female assertiveness in bed
threatening or unattractive. If this were the case,
and further research replicates the finding that
males do not discriminate in their evaluations of
a couple based on coital position, such informa-
tion might be very useful to the therapist work-
ing with nonorgasmic women who inhibit their
own initiative due to fear that they will displease
or threaten their mates.

The failure of students’ sex typing to interact 
with coital position is in apparent contradiction 
to the findings of Bem (1976) that the extent of 
sex typing influences such diverse behaviors as 
nurturance, independence of judgment, and will- 
ingness to engage in cross-sex behavior. On the 
other hand, none of her designs have studied the 
context of heterosexual activity, and it may be 
that the influence of variations in sex typing 
does not yet extend to relations between the 
sexes. Support for this possibility is provided by 
Zeldow (1976), who found no relationship be- 
tween sex typing and scores on the Attitudes 
Toward Women Scale (Spence & Helmreich, 
1972). A number of the items in this scale deal 
with sexual relations, and this may be the area 
in which women feel most reluctant to abandon 
the feminine norms with which they were raised. 
Further research should be aimed at determining 
if sex typing does influence attitudes and behavior 
during heterosexual interaction in less explicitly 
sexual contexts (e.g., initiation of dates, sharing 
cost of dates, etc.) or if the effect of sex typing 
is negligible in any context involving male- 
female interaction.


Variations in a Construct: Quantitative and
Qualitative Differences in Children’s Locus of Control

Virginia Commonwealth University


The present study examined differences in locus of control scores and factor patterns
among normal and nonnormal groups and male and female normals using the
Nowicki-Strickland Locus of Control Scale for Children. Multidimensionality of the
locus of control construct in children was supported by separate factor analyses of
emotionally disturbed, delinquent, and elementary public school children. Qualitative
differences in factor patterns between normals and nonnormals raised questions about
interpreting inventory scores as reflecting the same construct for differing subject
groups.


Locus of control refers to whether people per-
ceive positive and negative outcomes of events
as being contingent on their own behavior (in-
ternal) or the result of luck, fate, or powerful
others (external). Considerable research has been
conducted to investigate locus of control as a
generalized expectancy (see Lefcourt, 1972), and
factor analytic studies with adults support a
multidimensional conceptualization of the locus of
control construct (Levenson, 1973).

The present studies were designed to examine
locus of control as it operates in normal (N1)
and emotionally disturbed (ED) children and in
juvenile delinquents (JD). Data from a second
and larger group of normals (N2) were collected
to allow an examination of sex differences.

The N1 group (M age = 10.9) contained 64
male and 43 female children. The N2 sample in-
cluded 145 males (M age = 10.3) and 125 fe-
males (M age = 10). Although obtained from
different systems, both normal groups were from
suburban public schools in middle-class neighbor-
hoods. The ED group contained 189 hospitalized
children (M age = 11.1; 151 males, 38 females)
for whom psychiatric screening indicated average
or above intellectual potential and a wide range
of diagnoses. Although chronologically older (M
age = 15.3), the JD group of 185 children (144
males, 41 females; M IQ of 69 children = 83.5)
had a mean reading level (4.5) and mental age
(11.3) comparable to children in the other groups.


[The studies reported in] the present article were conducted
while the first author was affiliated with Virginia
Commonwealth University.


All children completed the Nowicki-Strickland
Locus of Control Scale for Children (Nowicki &
Strickland, 1973) with individual attention pro-
vided when necessary to insure understanding.
This scale was selected over other measures be-
cause it has been found to be unrelated to social
desirability or intelligence test scores and be-
cause it was thought to be the most reliable
measure of generalized locus of control appropri-
ate for children of a variety of ages. Scales were
scored in the external direction.

The respective means and standard deviations
for the N1, ED, and JD groups were 14.63, [illegible];
17.54, 4.41; and 15.60, 5.17.¹ Results of t
comparisons between independent means indicated
that each of these three groups differed from one
another. More external than the N1 group were
both the ED sample, t(294) = 4.65, p < .001, and
the JD group, t(290) = 2.23, p < .05. The ED
group also scored in a more external direction
than the JD group, t(372) = 3.93, p < .01.
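
As a rough check on how such comparisons are formed, the ED versus JD
contrast can be recomputed from the summary statistics alone. The sketch
below assumes a pooled-variance (equal variances) independent-samples t test
and takes the group sizes from the sample description (189 ED, 185 JD); the
report does not say which t formula was used, so this is an illustration
rather than the authors' procedure, and the function name is hypothetical.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Independent-samples t statistic and df from summary statistics.

    Assumes equal population variances (pooled-variance form).
    """
    df = n1 + n2 - 2
    # Pooled variance: each group's variance weighted by its degrees of freedom.
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    # Standard error of the difference between the two means.
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se, df


# ED group: M = 17.54, SD = 4.41, n = 189; JD group: M = 15.60, SD = 5.17, n = 185.
t_ed_jd, df_ed_jd = pooled_t(17.54, 4.41, 189, 15.60, 5.17, 185)
print(f"ED vs. JD: t({df_ed_jd}) = {t_ed_jd:.2f}")
```

The degrees of freedom match the reported 372, and the statistic should land
close to the reported 3.93; a small gap is expected because the published
means and standard deviations are rounded.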


¹ [Although the norms reported] by Nowicki and Strickland (1973) show a [marked]
change from the fifth to sixth grade, the [means of]
the present normal samples, also at these grade [levels],
were reasonably comparable.

Separate factor analyses resulted in eight fac-
tors for the N1 group (accounting for 67% of
the total variance), five for the ED (59.2%),
and six for the JD group (62.1%). The first
factor emerging from the N1 group analysis con-
tained four items that could appropriately be
labeled Generalized Expectancy. A similar factor
did not emerge from the ED or the JD data.
Rather, the first ED factor was labeled Helpless-
ness, and the first JD factor was called Super-
stition. Representative of the other N1 factors
were Intellectual Concern and Effort, and other
ED factors were Persecution, Superstition, and
Futility. The JD factors included Helplessness
at Home, Helplessness with Friends, and Help-
lessness with Parents. Since the three analyses
were rotated to the same varimax criterion,
emergence of a general factor within the normal
children’s responses but not in the nonnormal
groups should be attributable to differential char-
acteristics in the subject groups.

A separate analysis of the amount of variability 
accounted for by each factor was conducted to 
compare the relative potency of factors in each 
group. In the N1 group, the factors did not ac- 
count for different amounts of variance (largest 
z = 1.66). In contrast, the first factor to emerge
in each of the nonnormal analyses (i.e., Helpless-
ness from the ED data and Superstition from the
JD data) accounted for a significantly greater
amount of variance (z = 2.16, p < .05; z = 2.02,
p < .05, respectively) than did the second or re-
maining factors.

Since whether a child is male or female in our
culture is important regarding expectancies, a
second sample of normal children (N2) that
contained a large number of both males and fe-
males was obtained and analyzed for sex differ-
ences. A t test between independent means indi-
cated that the females responded significantly
more externally, t(268) = 1.98, p < .05, than
the males. In addition, both the entire N2 sample and
the N2 females were more external than the N1
group, t(376) = 2.87, p < .01; t(230) = 4.76, p
< [.001], respectively, and the JD group, t(454) =
[illegible], p < .05; t(308) = 3.81, p < .01, respec-
tively, but did not differ from the ED group (both ts < 1).
The N2 males also were significantly more ex-
ternal than the N1 group, t(250) = 3.12, p <
[.01], and the JD group, t(328) = 2.05, p < .05,
but unlike the females, N2 males were signifi-
cantly less external than the ED group, t(332) =
[2].11, p < .05.

When the total N2 group responses were fac-
tor analyzed, the major factor, as in the initial
N1 analysis, was appropriately labeled Gen-
eralized Expectancy. The remaining factors, how-
ever, consisted of only two items and could not
be meaningfully labeled. The separate factor 
analyses for males and females resulted in seven 
and eight factors, respectively. For males, the 
major factor consisted of four items concerning 
Parental Fairness. For females, the major factor 
consisted of only two items (as did six of the 
eight factors) and was inconsistent regarding 
labeling. In regard to relative factor potency, the 
first male and female factors did not account 
for significantly different amounts of variance 
than the remaining factors. 

Results of the initial group comparisons sup-
port the adjustment-locus of control hypothesis
(that internals are better adjusted), since nor-
mals were significantly more internal than the
two nonnormal groups. The N2 group, however, 
was significantly more external than the N1 
group and not different from the ED group. 
Although N2 females were more external than 
the N2 males, one cannot explain the differences 
between the two normal groups on the basis of 
sex, since the percentage of females was similar 
in each group (N1 = 40%, N2 = 46%), and N2 
males as a group were more external than the N1 
group, composed of both males and females. 
Though sampling may account for the findings, 
the most parsimonious explanation may be that 
locus of control reflects relevant differences in 
children’s life situations. That is, children’s scores 
may indicate some effects of the situation on im-
mediate responses to the scale (i.e., a state) as
well as a stable disposition (i.e., a trait). To the
extent that this were true or that other unknown 
factors were operating, one could anticipate that 
locus of control research with seemingly “normal” 
groups of children might yield contradictory
findings.

In general, although the present results require 
replication, factor analyses indicated that locus 
of control in children is a multidimensional con- 
cept and that the patterning of factors differs 
among groups. One clear difference that emerged 
was in the themes that constituted the factors.
For example, the main factor in both normal
groups was a generalized expectancy theme,
whereas the majority of the ED factor labels
reflected feelings (e.g., Helplessness, Persecution,
Futility), and the JD factors emphasized situa-
tions or environments (e.g., at Home, with Peers,
with Parents). 

Based on these findings, operation of the locus
of control construct in different children’s groups
may be said to differ qualitatively. It seems
pertinent then to question whether these qualita-
tive differences could affect quantitative com-
parisons. Or, stated differently, to what extent
is one entitled to interpret scores on a particular 
inventory as reflecting the same construct for 
differing subject groups, especially when one is a 
nonnormal group?


Therapist and Group Contact as Variables in the
Behavioral Treatment of Obesity


Kelly D. Brownell, Carol L. Heckerman, and Robert J. Westlake
Brown University/Butler Hospital


Obese females were randomly assigned to one of three experimental conditions: (a) a
“standard” behavioral treatment (SBT) group emphasizing self-management tech-
niques (subjects attended group therapy meetings weekly for 10 weeks, then monthly
for 6 months, and were given a weight control manual); (b) a group receiving the
weight control manual via mail with little professional contact (MMC); and (c) a
waiting list control condition. Results revealed a superiority of both treatment con-
ditions over the control condition at posttreatment. SBT subjects did significantly
better than MMC subjects at posttreatment but not at the 6-month follow-up. Weight
loss for MMC subjects was minimal. The use of “do-it-yourself” treatment manuals
is challenged.


The publication of do-it-yourself weight reduc-
tion books and manuals has proliferated in recent
years. Unfortunately, the distribution of these
materials has preceded experimental justification
for their use.

Hagen (1974) found a bibliotherapy condition
(manual via mail) to be as effective as a be-
havioral group therapy condition, both of which
were superior to no treatment. However, a short
follow-up and the use of a mildly obese popula-
tion make the results difficult to interpret. Han-
son, Borden, Hall, and Hall (1976) replicated
Hagen’s findings but found no treatment effects
at a 1-year follow-up. The authors noted that
only 4 of 38 subjects attained even 50% of their
desired weight loss.

Fernan (Note 1) found that do-it-yourself
dieters receiving little professional contact did not
differ from subjects receiving no treatment. The
present study was designed to remedy the prob-
lems with earlier investigations and to evaluate
the effectiveness of self-administered diets.

Subjects were 29 females, at least 15 pounds
(6.8 kg) or 15% overweight, who were randomly
assigned to a standard behavioral treatment
group (SBT), a group receiving a manual with
minimal professional contact (MMC), or a no-
treatment group. Subjects averaged 63.6% over-
weight, mean weight was 194.4 pounds (88.4 kg),
and average age was 48.7 years. In the SBT
group subjects met weekly with a trained thera-
pist for 10 weeks, then monthly for 6 months
and were given the Behavioral Weight Control
Manual (Brownell, Heckerman, & Westlake,
Note 2). MMC subjects received the same
manual and met six times to be weighed.


Requests for reprints and for an extended version
of this study should be sent to Kelly D. Brownell,
now at the Department of Psychiatry, Uni-
versity of Pennsylvania, 205 Piersol Building, Phila-
delphia, Pennsylvania 19104.

There were no pretreatment differences among
groups for mean body weight or mean percentage
overweight. At posttreatment, differences among
groups were significant for change in body weight,
F(2, 25) = 10.36, p < .01; change in percentage
overweight, F(2, 25) = 11.62, p < .01; and the
weight reduction quotient (pounds lost/pounds
over × 100), F(2, 25) = 12.49, p < .01. New-
man-Keuls comparisons revealed that SBT and
MMC subjects were superior to control subjects
and that SBT subjects lost more weight than
MMC subjects for all three measures of weight
change (p < .05). At the 6-month follow-up,
there were no differences between groups. Mean
weight loss was 7.42 pounds (3.7 kg) for SBT
subjects and 2.2 pounds (1.0 kg) for MMC sub-
jects.
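
The weight reduction quotient defined above (pounds lost divided by pounds
overweight, times 100) can be sketched as a small function. The subject
values used here are hypothetical and chosen only to make the arithmetic easy
to follow; they are not data from the study, and the function name is an
illustrative assumption.

```python
def weight_reduction_quotient(pounds_lost, pounds_over):
    """Pounds lost as a percentage of the pounds the subject was overweight."""
    if pounds_over <= 0:
        raise ValueError("pounds_over must be positive")
    return pounds_lost / pounds_over * 100


# Hypothetical subject (not from the study): starts at 200 lb against an
# ideal weight of 125 lb, so 75 lb over, and loses 15 lb during treatment.
start_weight = 200.0
ideal_weight = 125.0
pounds_lost = 15.0

wrq = weight_reduction_quotient(pounds_lost, start_weight - ideal_weight)
print(f"Weight reduction quotient: {wrq:.1f}")
```

Expressing loss relative to excess weight makes subjects who start at
different degrees of obesity more comparable than raw pounds lost would.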

Both treatments were plagued by a lack of
long-term effectiveness, and the weight losses
could be considered only temporary. This was
[...] of the minimal contact group,
[...] produce even temporary weight
[...] produce sustained weight loss.

The emotional hazards of unsuccessful dieting
have been clearly documented (Stunkard & Rush,
1974), and there is preliminary evidence which
suggests that periods of weight gain are character- 
ized by substantial increases in blood pressure 
and serum lipid levels (Gordon & Kannel, 1973). 
It is possible that an ineffective diet is more dan- 
gerous than no diet. 

In light of these potential hazards, and con- 
sidering that the do-it-yourself dieter has little 
medical or psychological guidance, diets designed 
to be self-administered should be subject to con- 
trolled clinical investigation prior to distribution, 
and consumers should be educated as to the 
merits and drawbacks of specific programs.


Reference Notes 


1. Fernan, W. S. The role of experimenter contact in 
behavioral bibliotherapy of obesity. Unpublished


Preface


Four of every five manuscripts submitted to this journal in recent years have been 
rejected. Although rejection can occur for many reasons, most of the unsuccessful 
manuscripts were returned because of serious methodological weakness in the research.
Many of these errors of method were fundamental and glaring. Indeed, it is not pos- 
sible to complete an editorial term without inferring reluctantly that current doctoral 
training in research, as reflected by recent submissions to this journal, is deficient to 
the point of being disastrous. 

This unfortunate inference has led to this special issue, which is intended to be 
a resource for behavioral scientists working on problems of clinical psychology. Each 
contribution that follows describes and discusses the major methodological aspects of 
a particular topic area such as smoking or addiction. The topic areas have been se- 
lected because of the frequency with which they have stimulated contemporary 
research investigation.

Readers may note that at some points contributors have raised issues that may
appear elementary. This has been done not out of an insensitivity to the level of
sophistication of the journal’s readers, but because our experience reviewing manuscripts
has made it clear that many investigators are misinformed on some basic
matters.

It is the hope of the Editor and contributors alike that this issue of the journal
will be of some lasting value to those who engage in the difficult and demanding
task of clinical research.

Editor


Research Problems in Clinical Diagnosis


Sol L. Garfield 
Washington University in St. Louis 


This article discusses some of the problems and deficiencies apparent in past
research on clinical diagnosis. Among the important issues discussed are those
pertaining to sampling problems with regard to the clinical subjects studied,
adequacy of control groups, base rates, clinical versus statistical significance,
lack of cross-validation, and problems due to reliance on inadequate schemes
of classification. Some cautions and suggestions in terms of future research are
also discussed.


Although interest among clinical psychologists
in clinical diagnosis and related activities
has appeared to diminish relatively in
recent years, interest in diagnostic problems
remains, and a number of studies on such
problems are conducted yearly. Although over
the years we have had an opportunity to become
better acquainted with methodological
problems concerning research in this area, it
would appear as if rather limited advantage
has been taken of this opportunity. I will
discuss a number of methodological inadequacies
that I have encountered in reviewing
manuscripts for this and other journals, as
well as in the published literature.

The term clinical diagnosis does not appear
to have a precise meaning, and consequently,
this should be acknowledged at the outset.
Although in the past, this designation frequently
has referred to psychiatric diagnosis,
or a nosological label taken from the Diagnostic
and Statistical Manual prepared by
the American Psychiatric Association (1968),
such a delimitation has not been exclusive.
Many psychologists have not limited themselves
in this manner, preferring to use other
schemes, formal or informal. Some have preferred
the term assessment to that of diagnosis.
Others have preferred to rely on
personality descriptions and on the inferred
psychodynamics of the individuals appraised.
I will not debate the issue here of what clinical
diagnosis really is, but I will include in
my discussion references to research that involves
the clinical assessment, diagnosis, comparison,
or categorization of individuals being
appraised or studied for clinical purposes. My
presentation is organized around several topics
that have appeared to be of particular concern.


Requests for reprints should be sent to Sol L.
Garfield, Department of Psychology, Washington
University, St. Louis, Missouri 63130.


Sampling Problems 


One of the most frequent and serious problems
encountered in diagnostic research pertains
to the sample of subjects or patients
used in particular investigations. This actually
is a broad categorization for a variety of
problems. One critical issue pertains to how
far one can generalize from the findings obtained
with a certain sample of subjects purported
to represent a given type of disorder
or diagnosis. Ostensibly, at least, findings secured
from a particular sample representative
of a given category of disorder presumably
have relevance for other comparable samples.
The question then is, how does one define
both the sample studied and the other samples
or populations to which the results are
supposedly applicable?

Unfortunately, there are no true standard
reference measures that can be used to define
a particular clinical group of subjects,
and the diagnostic terms used leave much to
be desired. Consequently, it is exceedingly
important that the sample be selected with
great care and that as much useful descriptive
information as possible be provided concerning
the sample. Many manuscripts, as
well as a significant number of published
articles, fail to provide even such basic data
as the sexual composition of the sample, let
alone other attributes of importance. Such aspects
as age, length of hospitalization, frequency
of hospitalization, marital status, work
history, previous treatments, family resources,
education, intelligence, type of ward the patient
is on, whether he or she is receiving
medication, the type of medication, and the
like, are all potentially important variables
that may influence test performance, treatment
outcome, and similar variables. It should
be apparent that significant variation on some
or many of these variables between studies
limits the reliability of the results secured
and the drawing of conclusions that may have
broad applicability.

The problems referred to above are particularly
visible when samples of modest size are
drawn from institutional settings that vary
widely on a number of dimensions. Patients
in university or private hospitals generally are
quite different from those in state or Veterans
Administration psychiatric hospitals, and generalizations
from one setting do not fit the
other settings, even though the patients may
all carry a diagnosis of schizophrenia. For
example, in a previous study of prognostic
scales used in research on schizophrenic patients,
we were unable to secure an adequate
sample of “reactive” or good-prognosis patients
in the state hospital where we were
carrying on our study and had to secure patients
from a city hospital, which had fewer
[...] (Garfield & Sundland, 1966).

Although sample size is also of some importance,
since small scale studies are more
likely to produce findings limited in reliability,
size alone is not sufficient. The selection
of subjects and sample specifications is clearly
of prime importance. If one is studying or
comparing selected types of disordered behavior,
the criteria used in selecting subjects
should be explicit and the procedures used
should be ones for which the reliability and
validity are known or available, and meet
commonly accepted standards. Particularly
when studies are conducted on groups based
on psychiatric diagnosis, it is essential that
how the diagnoses were derived be clearly
described and that other supporting selec- 
tion criteria be used. Diagnoses based on old 
records and provided by different psychiatrists 
are usually not sound or reliable bases for 
subject selection. 

A related problem concerns the matter of 
randomization of subject selection. Were the 
subjects selected at random from a previously 
selected pool of available subjects or were 
they selected because they were not on drugs, 
were in a special ward, were considered co- 
operative, or in some other manner were really 
not “typical subjects”? Such selectivity can 
obviously bias the results obtained and can 
limit their generalizability. 

Selection of subjects on the basis of a single 
scale may also not be adequate, particularly 
if the subjects are considered to represent a 
particular diagnostic group. For example, not 
all subjects who score 70 or higher on a scale 
of the Minnesota Multiphasic Personality In- 
ventory (MMPI) resemble groups diagnosed 
on other criteria as schizophrenic, depressed, 
and so on. College students who secure such 
scores may or may not be clinically depressed, 
psychotic, and so forth, and comparisons with 
actual clinical populations may therefore not 
be warranted. 

It would seem clearly desirable to use more 
than one procedure or method to establish 
the diagnosis of the subjects to be used in 
any research study in which the diagnosis is
considered to be a significant variable. Psy- 
chiatric or clinical diagnosis should be sup- 
plemented by other criteria such as scores 
on appropriate tests or standardized rating 
scales. In the case of depression, for example, 
scores on the MMPI, the Hamilton Rating 
Scale (Hamilton, 1960), and the Beck Depression
Inventory (Beck, 1972) could be
used in addition to clinical diagnoses. Further- 
more, clinical diagnoses, preferably, should 
be based on diagnoses secured from two or 
more clinicians with a reasonably high indication
of reliability between them. If such
procedures are followed, there is greater as- 
surance of the reliability of diagnosis as well 
as several external or operational reference 
points in defining the samples used. To be 
sure, extra effort may be required to carry 
out such procedures, and, to a certain extent,
such procedures might reduce the size of potential
samples and raise new issues of selectivity
and limited generalizability. However,
the samples would be more clearly defined, 
and the dangers of relying exclusively on 
somewhat haphazard means of classification 
would be lessened. In the long run, at least 
one would hope, the conflicting results ob- 
tained with the use of diffuse, unclear, and 
unreliable diagnostic categorization might 
diminish. 
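The recommendation that diagnoses rest on two or more clinicians with reasonably high agreement can be made concrete with a chance-corrected agreement index. The sketch below uses Cohen's kappa, which is an assumption on my part: the text does not name a specific statistic. The function name and the two rating lists are hypothetical.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Chance-corrected agreement between two raters on categorical labels."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    # Proportion of cases on which the two clinicians agree.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance from each rater's marginal frequencies.
    count_a = Counter(ratings_a)
    count_b = Counter(ratings_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical category
    return (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    # Hypothetical diagnoses of ten patients by two clinicians.
    clinician_a = ["schizophrenic"] * 6 + ["depressed"] * 4
    clinician_b = ["schizophrenic"] * 5 + ["depressed"] * 5
    print(round(cohens_kappa(clinician_a, clinician_b), 2))
```

Raw percentage agreement alone can look high simply because one diagnosis is very common in the setting, which is why a chance-corrected index is used here.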

Another issue that is sometimes apparent 
concerns the selectivity of a patient sample 
in order to match it with a control group. 
If, for example, the control group has an av- 
erage IQ of 98, an average educational attain- 
ment of 10th grade, or is composed primarily 
of ward attendants, a subject sample selected 
to match it on one or more of these variables 
may be a highly selected and unrepresenta- 
tive sample of the patient group that these 
people are supposed to represent. They are 
not typical patients of a particular diagnostic 
group, and broad generalizations to other 
more randomly selected groups are not justi- 
fied. For example, in one study of mine of 
patients diagnosed as schizophrenic, very 
different patterns of performance on the 
Wechsler-Bellevue Scale were secured for pa- 
tients differing in education and IQ (Garfield, 
1949). Furthermore, comparisons of samples 
of schizophrenic patients studied in different 
investigations, which differed noticeably in 
terms of mean IQ, also revealed significant 
differences on test patterns among these sam- 
ples. In other words, there were as many sig- 
nificant differences between the various sam- 
ples of schizophrenic patients as there were 
between a given sample and a normal group 
of control subjects. A recent study of brain-damaged
adults also revealed significant differences
due to education on a battery of
tests (Finlayson, Johnson, & Reitan, 1977). 

The relative influence of social class vari- 
ables on samples of subjects is another matter 
that needs to be carefully appraised in each 
study. As Meehl (1971) has pointed out, the 
tendency among social scientists to view cor- 
relations uncorrected for social class as auto- 
matically spurious is unjustified, unless a clear 
case can be demonstrated that a causal relationship
exists between social class and the
correlated variables in the specific instance.
In some cases, this may be so, but in others,
the so-called nuisance variable may be of
little importance. If there is doubt, the investigator
can provide both corrected and uncorrected
correlations as one means of attempting
to evaluate the possible importance
of social class variables. (See Meehl’s, 1971, 
article for a more extended treatment of this 
issue.) 
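One way of reporting a "corrected" correlation alongside the uncorrected one is the first-order partial correlation, which removes the linear contribution of a third variable such as a social class index. This is offered only as a sketch of one possible correction; the text does not specify the procedure, and the function name and the three input correlations are hypothetical.

```python
import math


def partial_correlation(r_xy: float, r_xz: float, r_yz: float) -> float:
    """Correlation of x and y with z (e.g., social class) partialed out."""
    denominator = math.sqrt((1.0 - r_xz**2) * (1.0 - r_yz**2))
    if denominator == 0.0:
        raise ValueError("z is perfectly correlated with x or y")
    return (r_xy - r_xz * r_yz) / denominator


if __name__ == "__main__":
    # Hypothetical values: test score (x), diagnosis (y), social class (z).
    uncorrected = 0.40
    corrected = partial_correlation(uncorrected, r_xz=0.30, r_yz=0.50)
    print(f"uncorrected r = {uncorrected:.2f}, corrected r = {corrected:.2f}")
```

Reporting both figures lets the reader judge how much the third variable matters, without presuming that the uncorrected value is spurious.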

The selection and specification of subjects 
used in research on clinical diagnosis is thus 
a matter of primary importance, and would-be 
investigators should give careful attention to 
the issues discussed in the preceding para- 
graphs. In the final analysis, the results can 
be no better than the type and representative- 
ness of the subjects used. 

Some of the problems pertaining to how 
clinical diagnoses are secured will also be 
mentioned briefly here, although they could 
be accorded a separate section. Most clinical 
psychologists are quite aware of the problem 
of diagnostic reliability, and this issue need 
not be reviewed in any depth here (Garfield, 
1974). However, because of its relevance to 
much research on clinical diagnosis, some ref- 
erence to this matter is pertinent here. 

In an earlier section of this article, some 
mention was made of the importance of fully 
describing the sample studied and how it
was selected. One aspect of sample description
concerns the matter of diagnosis. How
were the patients diagnosed, by whom, and
when? These are all pertinent questions that
influence sample specification and, ultimately,
the results secured and the conclusions drawn.
If the diagnosis is made by one clinician, how
representative are his or her diagnoses? In
one study comparing three psychiatrists working
in the same hospital with comparable
groups of patients, it was found that one of
the psychiatrists classified two thirds of the
patients as schizophrenic, whereas the other
two psychiatrists classified 22% and 29% of
their patients as schizophrenics (Pasamanick,
Dinitz, & Lefton, 1959). Thus, some attention
has to be paid to the reliability of diagnosis
and comparability of diagnoses among
different studies.

Diagnoses made at the time of initial
admission to the hospital may also differ
from those made at a later time when more opportunity
for observation and study is available.
Similarly, diagnoses reached at a formal staff
conference or with the help of teaching consultants
may be different from those made by
a single clinician. Diagnoses based on psychological
tests may also be different from
those based on other data sources. However,
little or no information on such relevant matters
is provided in a large number of studies.
The usual statement is something like this:
“[...]y-four patients diagnosed as schizophrenic
were selected for study.” This certainly
does not provide adequate information
about how the diagnoses were made or by
whom they were made. Consequently, specification
of how diagnoses were reached is of
importance in estimating the confidence
one can place in the diagnoses secured and
also in evaluating the comparability of research
studies. Since the behavior of patients
also changes over time, there is also a question
about the suitability of diagnoses that
were reached some time prior to the current
investigation. There are, thus, a number of
apparently minor but nevertheless important
considerations that pertain to the matter of
diagnosis.


Proper Control Groups 


Another problem frequently encountered in
research reports concerned with clinical diagnosis
pertains to the appropriateness of the
control groups used. Although we are relatively
more sophisticated about such matters
today than we were in the past, problems of
appropriate controls are still evident. Obviously,
a group of patients that has been
hospitalized for some time should not usually
be compared with a group of apparently normal
[...] focus of a particular investigation is on
diagnosis or on the specification of
a pattern of performance of a specific category
of patients or subjects. If the results of the
study are to have any clinical significance in
a practical sense, then the experimental
group must be compared to other clinical
groups with which they would normally be
compared in the actual clinical situation.
Comparisons of test performance or other 
measures of a sample of patients diagnosed as 
schizophrenic should be made with other clini- 
cal groups normally seen in that clinical situa- 
tion and in approximately the usual propor- 
tions. In a clinical or hospital setting in which 
a clinical diagnosis or evaluation is sought, 
the problem is rarely one of comparing the 
given patient to a normal population. Rather, 
the issue is one of differential diagnosis and 
appraisal. Is there any suggestion of psychotic 
disturbance or of possible brain damage? How 
serious is the thought disturbance or depres- 
sion manifested by the patient? In trying to 
reach answers to such questions, the clinical 
psychologist is considering patterns of various 
types of psychopathology and is not compar- 
ing the patient’s performance primarily to the 
performance of normally adjusted individuals. 
Thus, in studying a diagnostic pattern for 
possible utility in clinical diagnosis, the in- 
vestigator should compare the results of the 
particular clinical group of interest with those 
of other diagnostic groups that are usually 
encountered in practice and from whom the 
aforementioned group is to be compared or 
differentiated. For purposes of clinical diag- 
nosis, it would seem efficacious to have a 
control group made up of the proper mix of 
the other diagnostic groups that are normally 
encountered in the particular setting or of 
several groups representing the types of dis- 
orders that are most frequently confused with 
the group under study. 

Similar to the issue mentioned earlier with 
regard to the selection of the sample of sub- 
jects for investigation, the investigator should 
provide adequate information on all groups 
of subjects, such as how they were selected 
and from which subpopulation they were 
drawn. For example, if a control group of 30 
patients was selected for purposes of compari- 
son with the group under study, why and 
how were these subjects selected and from 
what number of comparable subjects were 
they drawn? If they were the only ones who 
had certain data or test scores available, what 
were the reasons for this particular state of 
affairs? How selective is the group, and how 
representative are they of supposedly similar 
subjects? Selective bias may greatly impair
the kinds of conclusions that may be drawn
and the extent to which generalization to 
other samples and populations is possible. 

It should be reasonably obvious also that 
whatever control group or groups are used 
should be comparable on the variables of im- 
portance for a particular investigation. When 
cognitive tests are used, the groups should
show some comparability in terms of level of 
ability, education, and age, since these may 
all affect performance. However, if one’s par- 
ticular focus is the diagnosis of mental re- 
tardation, such comparability may not be 
necessary or meaningful. Length of institu- 
tionalization, drugs, type of ward, degree of 
cooperation, and other such attributes may 
also be variables of importance. 

In essence, therefore, considerable attention 
should be given to the selection of adequate 
and appropriate control groups in research 
on clinical diagnosis. Although this seems particularly
relevant to matters of differential
diagnosis and related problems, the issue of 
appropriate control groups also applies to 
more experimental or theoretical studies of 
psychopathology—at least as far as I am 
concerned. Numerous studies of various psy- 
chological functions in schizophrenic subjects 
have compared the latter mainly with normal 
controls. Although this may have some value 
at a certain stage of investigation, it is ulti- 
mately of limited value if one is interested 
in demonstrating particular patterns of re- 
sponse or thought in a given clinical disorder. 
What the investigator is actually trying to 
discover or demonstrate are response patterns 
that characterize a pathological group. 
Whether these patterns are distinctive of the 
particular type of pathology in question can 
only be demonstrated if the performance of 
the pathological group is compared with other 
pathological groups of comparable severity. 
In this particular instance, other psychotic 
groups would appear to be the most adequate 
control groups. 


Clinical Versus Statistical Significance 


Surprisingly, a number of basic statistical 
and clinical considerations appear to get slight 
consideration from would-be authors and in- 
vestigators. Those that have been noted with 
some frequency will be discussed.

In at least a noticeable number of manuscripts,
significant correlations or differences
at the .05 level of significance are reported, 
with little attention paid to the number of 
significance tests performed. Although it 
should be apparent that the interpretation of 
the significant findings secured is directly related
to the number of statistical tests performed,
this stricture is not always observed.
If 35 comparisons or correlations are per- 
formed and 2 are found to be at the magic 
.05 level of significance, it seems clear that 
these results are so close to a chance occur- 
rence that little should be made of them. 
However, some investigators generally play 
down this aspect of their results and empha- 
size their potential significance. In a similar 
fashion, post hoc analyses are sometimes 
treated as if explicit hypotheses were being 
tested. Obviously, different considerations en- 
ter into such analyses than when a specific 
hypothesis, stated in advance of the investi- 
gation, is being tested. Although these matters 
are considered to be rather basic ones in in- 
struction on experimental design and statis- 
tical analysis, somehow the lessons on these 
topics are either not learned well or are 
quickly forgotten. When investigators empha- 
size significant findings that they have not 
predicted in advance but that are noted after 
the investigation has been completed, they 
are essentially capitalizing on chance occur- 
rences. For the results to be taken seriously, 
they should be replicated on a new sample 
of subjects. 
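The arithmetic behind the example of 35 comparisons can be sketched as follows. The calculation assumes that the 35 tests are independent and that every null hypothesis is true, which is a simplification introduced here for illustration; the function names are hypothetical.

```python
from math import comb


def expected_chance_hits(n_tests: int, alpha: float) -> float:
    """Expected number of 'significant' results when all nulls are true."""
    return n_tests * alpha


def prob_at_least(k: int, n_tests: int, alpha: float) -> float:
    """Binomial probability of k or more chance 'significant' results.

    Assumes independent tests, each with false-positive rate alpha.
    """
    p_fewer = sum(
        comb(n_tests, i) * alpha**i * (1 - alpha) ** (n_tests - i)
        for i in range(k)
    )
    return 1.0 - p_fewer


if __name__ == "__main__":
    print(expected_chance_hits(35, 0.05))
    print(round(prob_at_least(2, 35, 0.05), 2))
```

Under these assumptions, between one and two significant results are expected by chance alone, and obtaining two or more is roughly as likely as not, which is why little should be made of such findings without replication.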

A related issue concerns the practical or
clinical significance of findings that are clearly
significant statistically at the .05 or even at
the .001 level of significance. It would appear
from numerous reports of all kinds of
studies that researchers have been led to
worship at the shrine of statistical significance.
Perhaps this is a result of the emphasis
placed on securing “positive” results
and the dread of not being able to reject the
null hypothesis, particularly with reference to
doctoral dissertations. Happy is the person
who secures “positive results” at the .05 level
of significance, regardless of whether the results
are of some potential practical or social
value. Of all the limitations that I have encountered
in reviewing manuscripts for several
journals over a period of years, the emphasis
on statistical significance and the disregard
of practical psychological significance is probably
the most frequently encountered.


1 These matters along with some procedures for
estimating the probability of securing such findings
are discussed by Hays (1963, pp. 488-5[...]).

There are several aspects of this problem 
that should be emphasized here even though 
most of them may be considered to be obvious 
or elementary. The fact that they occur with 
some frequency makes such an emphasis per- 
missible and warranted. 
Clearly, a researcher must examine his or
her data to see if the results obtained are due
to chance. A statistical test of the differences
secured between clinical groups or of correlations
obtained on certain measures and
designated criteria for a specific diagnostic
category or categories is a necessary procedure
to estimate the influence of chance on the results
secured. No criticism of this procedure
is implied here. However, one should recognize
that this represents only one of the necessary
procedures for appraising the results.
Such a statistical test informs us of the probability
that our results may be explained by
chance occurrences. If our results are significant
at the .05 level of significance, our usual
interpretation should be that there are only
5 chances in 100 that the obtained results
may be attributed to chance. The findings, of
course, could be due to chance, but the odds
would appear to be against it. However, all
one can reasonably conclude is that the findings
do not appear to be due to chance and
that if the study were repeated, we might
expect the findings to be comparable to those
obtained initially. Whether the results obtained
have any clinical usefulness cannot be
determined by the statistical tests alone. Other
appraisals must be made for this purpose.

Before going on to discuss the matter of
the clinical significance of research data, it
warrants reiterating some other elementary
considerations pertaining to statistical manipulations
and their meaning. Statistical tests
can be very much influenced by the size
of the samples used and by their variability.
Very large samples generally would be
less influenced by selective and chance variables
than would very small samples, and,
thus, findings that are small in actual mag- 
nitude may be statistically significant in the 
former instance, whereas they would fail to 
attain significance with smaller samples. For 
example, with samples of about 30 subjects, 
correlations have to be in the neighborhood 
of .35 or so in order to reach significance. In 
contrast to this, with a sample of several hun- 
dred subjects, a much lower correlation, in 
the neighborhood of .10, may be statistically 
significant, although the amount of predicted 
variance in the scores is of no practical sig- 
nificance. Thus, besides the level of signifi- 
cance, we must also consider the size of the 
sample and the actual amount of variance 
accounted for by the correlation coefficient if 
we are to interpret the findings in terms of 
their broader significance or utility. Moder- 
ately high correlations that are not statis- 
tically significant would appear to be of little 
value, and highly reliable but low correla- 
tions would also appear to be of rather limited 
value. In a similar fashion, low variability 
within groups of subjects generally increases 
the probability of securing statistically sig- 
nificant results, and although small standard 
deviations offer the possibility of useful dis- 
crimination between different clinical groups, 
the actual utility of the measures or compari- 
sons used requires further analysis of the 
actual data secured.
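The dependence of the significance threshold on sample size can be illustrated numerically. The sketch below approximates the smallest correlation reaching two-tailed significance by means of the Fisher z transformation; this approximation, the function name, and the sample size of 400 chosen to stand for "several hundred" are assumptions made for illustration and are not taken from the text.

```python
import math
from statistics import NormalDist


def approx_critical_r(n: int, alpha: float = 0.05) -> float:
    """Approximate smallest |r| significant at a two-tailed alpha level.

    Uses the Fisher z approximation, with standard error 1 / sqrt(n - 3).
    """
    if n <= 3:
        raise ValueError("n must exceed 3")
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return math.tanh(z_crit / math.sqrt(n - 3))


if __name__ == "__main__":
    for n in (30, 400):
        r = approx_critical_r(n)
        # r squared is the proportion of variance accounted for.
        print(f"n = {n}: critical r is about {r:.2f}, r squared about {r * r:.3f}")
```

With about 30 subjects the threshold falls in the neighborhood of .35, while with several hundred it falls near .10, a correlation that accounts for only about 1% of the variance.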
Even though there are occasional discus- 
sions of the difference between clinical and 
statistical significance in the published litera- 
ture (Lick, 1973), the importance of this 
topic either does not get enough attention 
during the graduate training of future clini- 
cal researchers or the occasional references to 
it have a relatively weak impact on those to 
whom it is directed. Many investigators ap- 
pear content to rest on their statistical laurels 
and not to worry overly about the actual 
practical value of their results.
Clearly, as indicated earlier, an investiga- 
tion can be conducted to test a theoretical 
proposition or hypothesis when practical con- 
siderations are not of primary importance. 
However, if the investigator in his or her 
discussion of the results obtained implies some 
potential clinical significance for the data, 
then he or she should provide some additional
support for the inferences that go beyond the
fact that statistically significant results were 
obtained. The kind of analyses that should 
be done will be described briefly in the next 
few paragraphs. 

There are several aspects of the research 
data that should be examined in terms of 
their potential clinical significance. For ex- 
ample, regardless of the mathematical proce- 
dures used to test hypotheses within a sta- 
tistical model, it is also important to know 
how much of the variance is accounted for in 
terms of the particular variables studied. For 
practical purposes, as well as theoretical ones, 
this is a very important consideration, but it 
is very frequently omitted in the discussion 
of results. In the case of correlational data, 
it is relatively simple to square the coefficient 
of correlation and secure an estimate of the 
amount of variance accounted for by the par- 
ticular set of correlates. With other methods 
of analysis, the implications may not be as 
readily apparent, but it is equally important 
that an estimate of the variance accounted 
for by the experimental manipulation be 
provided. 
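Both kinds of estimate mentioned above can be sketched briefly. Squaring the correlation follows directly from the text; for an F test, the sketch uses eta squared computed from the F ratio and its degrees of freedom, which is one common estimate and an assumption here, since the text does not name a particular index. The function names and the example values are hypothetical.

```python
def variance_from_r(r: float) -> float:
    """Proportion of variance accounted for by a correlation coefficient."""
    return r * r


def eta_squared_from_f(f_ratio: float, df_between: int, df_within: int) -> float:
    """Eta squared for a one-way design, recovered from F and its df.

    Equals SS_between / (SS_between + SS_within); it tends to overestimate
    the population value in small samples.
    """
    if f_ratio < 0 or df_between <= 0 or df_within <= 0:
        raise ValueError("F must be non-negative and df must be positive")
    return (f_ratio * df_between) / (f_ratio * df_between + df_within)


if __name__ == "__main__":
    print(round(variance_from_r(0.10), 3))
    # Hypothetical result: F(1, 98) = 4.0, significant near the .05 level.
    print(round(eta_squared_from_f(4.0, 1, 98), 3))
```

In the hypothetical case shown, a difference that reaches the .05 level accounts for only about 4% of the variance, which illustrates why the level of significance alone says little about practical importance.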

The importance of such analyses can be il- 
lustrated from some recently published re- 
ports, as well as from reviews of other un- 
published material. Particularly when large 
samples are used, the author should feel ob- 
ligated to stress the implications of his/her 
findings in terms of the variance accounted 
for by the variables under study. For example, 
in the published report of the current activities
and preferences of a sample of 855 clinical
psychologists (Garfield & Kurtz, 1976),
numerous findings at the .01 level of signifi- 
cance or better were secured. Some may have 
been the result of the numerous comparisons 
made, but the sample size appeared to be of 
some consequence in this regard. Correlations
of .10 that were highly significant statistically
were consequently considered to be
of little practical importance, since they ac- 
counted for only 1% of the variance. In an- 
other published study of over 1,000 clients 
in a number of community mental health 
centers, several correlations of around .10 
were reported as highly significant, even 
though they were obviously of little signifi- 
cance clinically or socially (Sue, McKinney, 
& Allen, 1976). For example, the correlation
between diagnosis and premature termination
was .10, and this was significant at the
.001 level of confidence. However, by itself,
such a significant finding accounts for a negli- 
gible amount of the variance. 

In another study, a mean difference of 4 
on one subtest of the Wechsler Adult Intelligence
Scale (WAIS) was obtained between
two different administrations of the test to 
a sample of schizophrenic patients. This dif- 
ference was reported as being significant at 
the .05 level of confidence, and the author 
went on to offer some detailed conclusions 
pertaining to this “significant” finding. Re- 
sults of such magnitude, however, would not 
appear to be of much practical value. 

Apart from going beyond the obtained 
levels of statistical significance to report the 
amount of variance accounted for by the 
variables studied, other analyses may be re- 
quired if the data are presumed to have some 
value for clinical diagnosis. The means and 
standard deviations for the clinical samples 
studied should be reported, and the extent 
of overlap of the distributions should also be 
clearly stated. Of value also for clinical diag- 
nosis are the number of subjects who would 
be correctly diagnosed or classified by the 
diagnostic procedures evaluated and those 
who would be misclassified—that is, data on 
false positives and false negatives. These are 
certainly important data for any purportedly 
effective diagnostic procedures, but they are 
not always secured or provided in reports of 
research in this area. It should be apparent 
that a particular diagnostic technique may 
differentiate two or more clinical groups at 
the .05 level of significance but yet produce 
so many false positives or negatives that it 
has very limited clinical utility. It would ap- 
pear incumbent on investigators to analyze 
their data in terms of such considerations and 
to present their analyses in a clear fashion. 
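
A minimal sketch of the overlap and misclassification figures recommended above follows. It assumes two normally distributed groups with equal standard deviations and a cutoff at the midpoint between the means; the means and standard deviation are hypothetical.

```python
from statistics import NormalDist


def classification_summary(mean_a: float, mean_b: float, sd: float) -> dict:
    """Overlap and error rates for two equal-variance normal groups,
    classifying each case at the midpoint between the two means."""
    low, high = sorted((mean_a, mean_b))
    group_low, group_high = NormalDist(low, sd), NormalDist(high, sd)
    cutoff = (low + high) / 2
    return {
        # shared area under the two distributions (1.0 = identical groups)
        "overlap": group_low.overlap(group_high),
        # members of the lower group who score above the cutoff
        "false_high": 1 - group_low.cdf(cutoff),
        # members of the higher group who score below the cutoff
        "false_low": group_high.cdf(cutoff),
    }


# Hypothetical groups: means of 50 and 54 with a common SD of 10.
summary = classification_summary(50, 54, 10)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

In this hypothetical case the distributions overlap by roughly 84%, and about 42% of each group falls on the wrong side of the cutoff, whatever the significance level of the mean difference in a large sample.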


Base Rates 


Another problem in some types of research 
on clinical diagnosis pertains to the old is-
sue of base rates. Even though this impor-
tant matter was raised some years ago and
is not an unfamiliar topic in either the areas
of diagnostic assessment or prediction
(Gathercole, 1968; Meehl & Rosen, 1955),
at least a certain number of studies appear
to disregard it even when it is clearly rele-
vant. Consequently, although the matter of
base rates is not a new issue nor even a 
complex one, I will devote some space to it 
and provide a few examples of what is in- 
volved. 

I can start this brief discussion by refer-
ring to an earlier experience of my own.
Working as a clinical psychologist in a Vet- 
erans Administration hospital 30 years ago, 
I collected some Rorschach test data on a 
moderate sample of patients who had been 
diagnosed as schizophrenic by the clinical 
staff. I compared my diagnostic impressions
based on the test data with the clinical diag-
noses on these patients and, among other
things, noted that my diagnoses agreed with
the staff diagnoses in about 67% of the cases.
I prepared a paper on this study and sub-
mitted it to the branch chief psychologist,
who at that time was David Shakow. He re- 
turned it to me with his comments. Among 
the suggestions he made was that I secure 
the base rates for diagnoses of schizophrenia 
in my hospital. I was rather reluctant to 
comply with his suggestion but did a quick 
survey of admissions for a limited time and 
discovered that the number of cases diag- 
nosed as schizophrenic was just about 67%. 
In other words, my diagnostic work did not 
exceed the base rate for diagnoses of schizo- 
phrenia, and diagnosing every admitted pa- 
tient as a schizophrenic would have been as 
accurate as diagnoses derived from my Ror- 
schach examinations. 
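
The comparison described in this anecdote can be expressed as a short sketch. The function names are hypothetical; the only figures used are the 67% agreement and the 67% base rate reported above.

```python
def majority_rule_accuracy(base_rate: float) -> float:
    """Accuracy obtained by always assigning the more common diagnosis."""
    return max(base_rate, 1 - base_rate)


def gain_over_base_rate(hit_rate: float, base_rate: float) -> float:
    """How far a procedure's hit rate exceeds the majority-rule accuracy."""
    return hit_rate - majority_rule_accuracy(base_rate)


# Figures from the anecdote: 67% agreement against a 67% base rate.
print(f"gain over base rate: {gain_over_base_rate(0.67, 0.67):.2f}")
```

A gain of zero means that the procedure adds nothing to what the base rate alone would provide.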

The matter of base rates, thus, is of some
importance, and it is certainly a factor
that needs to be appraised in evaluating certain
kinds of research on clinical diagnosis. Apart
from the fact that diagnostic procedures to
be effective must clearly exceed the base rates
for the disorder, attention to such
rates along with attention to false positives
and negatives indicates the potential difficul-
ties in using diagnostic procedures for dis-
orders in which the incidence is very low,
as is the case for suicide. In the latter in-
stance, any diagnostic or predictive measure
must be highly effective in discriminating
the specific type of cases being evaluated if
it is to be clinically useful.

Thus, for specific kinds of problems per-
taining to differential diagnosis or prediction,
it appears essential for the researcher to pro- 
vide data on the base rates for the disorders 
of interest and to clearly show the advantages 
as well as the disadvantages for the proce- 
dures being evaluated. Listing the percentage 
of correct hits or diagnoses obtained by a 
particular diagnostic technique is not suffi- 
cient. Along with other criteria, attention 
must be paid to the matter of base rates. 
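
Why the percentage of correct hits is not sufficient when incidence is low can be illustrated with Bayes' rule. The sensitivity, specificity, and base rate below are hypothetical values chosen for illustration, not estimates for any actual procedure.

```python
def positive_predictive_value(sensitivity: float,
                              specificity: float,
                              base_rate: float) -> float:
    """Probability that a case flagged by the procedure truly has the
    disorder, computed with Bayes' rule."""
    true_positives = sensitivity * base_rate
    false_positives = (1 - specificity) * (1 - base_rate)
    return true_positives / (true_positives + false_positives)


# Hypothetical procedure: 90% sensitivity and 90% specificity,
# applied to a disorder with a 1% base rate.
ppv = positive_predictive_value(0.90, 0.90, 0.01)
print(f"positive predictive value: {ppv:.3f}")
```

Under these assumptions only about 1 flagged case in 12 would be a true positive, even though the procedure classifies 90% of each group correctly.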


Inadequate Data Presentation 


The present designation includes a variety 
of frequently small but, nevertheless, impor- 
tant oversights that are apparent in at least 
some manuscripts. Among these are such mat- 
ters as not providing basic information on 
the measures or scales used, necessary infor- 
mation on important variables, how diagnoses 
were secured, how the subjects were selected, 
and similar aspects. The examples to be dis- 
cussed will not be exhaustive but rather will 
illustrate the kinds of problems encountered 
when such information is not included in the 
report. 

In the current era in which a large number 
of patients, both inpatients and outpatients, 
are receiving medication of various kinds, it 
is extremely important that this information 
be provided in the report of the research. If 
the drugs or medications used have any po- 
tency, they are bound to have some influence 
on the behavior and mental functioning of 
the subject studied. Not only must the taking 
of medication be clearly mentioned, but the 
medication used, the dosages, and in many 
cases, the duration of medication should also 
be specified. If part of a group of subjects is
on medication and a part is not on medica-
tion, there are clearly problems in mixing the
results of these two subgroups and treating 
them as one relatively homogeneous group. 
In a related fashion, comparing a clinical 
group of subjects receiving medication with 
a control group that is not must influence the 
kinds of conclusions that can be drawn from 
such comparisons. If one is attempting to 
compare the effects of drugs on two groups 
of comparable patients, then, of course, the 
previous comparison would be feasible, pro- 
viding a placebo was used with the control 
group. However, if the aim were
to compare the mental functioning, behav-
ior, or personality characteristics of a given
diagnostic group with some other group, then 
such a comparison would provide results con- 
taminated by the influence of the medication. 
Although this appears quite obvious, it is sur- 
prising how frequently this kind of a problem 
is ignored or glossed over in manuscripts 
submitted for publication. 

Another problem concerns the lack of ade- 
quate information presented on the particu- 
lar techniques or methods of appraisal used 
in the research study. This is of particular 
consequence when such procedures are not 
standardized ones, are not very well-known, 
or have been constructed by the investigators 
for their specific research, often without ade- 
quate reliability or validity studies. It is the 
responsibility of investigators to describe 
clearly the procedures and techniques that 
they have used in their study so that others 
can fully understand what has taken place 
and be able to attempt possible replications 
of the study if they so desire. 

With the pressures for publication evident 
today and the corresponding desire to use 
journal space as efficaciously as possible, it 
is understandable that editors want manu- 
scripts to be as brief and concise as possible. 
Unnecessary verbiage should, of course, be 
deleted from all manuscripts. Nevertheless, 
this does not mean that significant information 
about a research project should be omitted.
When relatively unknown tests or rating 
scales are used, for example, the investigator 
should describe them in sufficient detail so 
that the reader has a clear understanding of 
the techniques used and can appraise their 
suitability for the sample used and the prob- 
lem under investigation. The investigator 
should also provide pertinent data concerning 
the reliability and validity of the procedures 
used in these instances as well. Since the value 
of the results secured in any research study 
is dependent on the types of measures used, 
such information would appear to be essential.


The Lack of Cross-validation 


One of the apparent causes of concern about 
much past and current research, not limited 
to clinical diagnosis alone, is the large num-
ber of conflicting findings in the literature
and the apparent difficulty encountered in the 
replication of published findings. This has 
become such a regular feature of research in 
clinical psychology that many review articles 
tend to sum up the number of positive and 
negative studies on a given topic. If over 
half of the studies appear favorable to a given 
proposition, then that view is judged to have 
some support, even when the studies vary 
greatly in quality. This is a somewhat risky 
kind of approach to drawing conclusions about 
some technique or finding. If a particular 
treatment, for example, is found helpful in 
eight investigations and harmful in four, 
should it be concluded that the overall im- 
pression is that the treatment is helpful? The 
fact that there are such conflicting findings 
should make us suspect some possible limita- 
tions in the research reported and withhold 
judgment until we are able to explain the 
confused findings. 

Most likely, the discrepancies among re- 
search reports are due to subject variables, 
small samples, and variations in diagnostic 
procedures, as well as to chance variables. Be- 
cause of the kinds of problems already men- 
tioned concerning variations in assigning clini- 
cal diagnoses, in subject samples, in the kinds 
of settings used, and sometimes also in the 
procedures used, the findings from any single 
investigation can only be viewed as sugges- 
tive. Although attempted replications by other 
investigators in different settings appear to 
be essential in appraising the value of any 
investigation, there is one procedure that most 
researchers could use to improve the reliabil- 
ity of their findings. This is simply to secure 
enough subjects in their experimental and 
control groups so that all groups could be 
randomly divided into two groups. The study 
is then conducted with one set of groups func- 
tioning as the initial group, and the second 
set can then be used for cross-validation. This 
is a relatively straightforward procedure that 
has been used in some studies, but relatively 
infrequently. 
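
The split described above can be sketched as follows. The function name, the seed, and the subject identifiers are assumptions made for illustration; in practice the split would be made separately within each experimental and control group.

```python
import random


def split_for_cross_validation(subjects, seed=0):
    """Randomly divide one group of subjects into an initial half and a
    cross-validation half. The fixed seed keeps the split reproducible."""
    shuffled = list(subjects)
    random.Random(seed).shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Hypothetical subject identifiers for one group of 40 patients.
patient_ids = [f"P{i:02d}" for i in range(40)]
initial, holdout = split_for_cross_validation(patient_ids)
print(len(initial), len(holdout))
```

Findings derived from the initial half are then tested, unchanged, on the cross-validation half.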

Examples of studies that contained such 
attempts at cross-validation and that also 
illustrate their value are Garfield and Wolpin 
(1963); Lorr, Katz, and Rubinstein (1958); 
and Sullivan, Miller, and Smelser (1958). In
the study by Sullivan et al., two cross-valida-
tions were carried out on MMPI profiles of
patients who terminated prematurely from
psychotherapy, and the results of the cross-
validation attempts were clearly important in
the final conclusions drawn. In this study,
significant differences between premature ter-
minators and those who continued in therapy
were found for several MMPI scales for each


the groups under study showed a consistent
pattern. For each separate appraisal different
scales were found to be significant in their
differentiation. Consequently, the findings se-
cured in the first appraisal were not sup-
ported in the subsequent cross-validations, and
the final conclusions reached were different
than they would have been if the cross-valida-
tions had not been attempted. These authors,
at the conclusion of their article, emphasized
the necessity of cross-validation, and I
strongly concur with their conclusion.

Although attempted replications or cross-
validations by other investigators would also
be required to fully evaluate the significance
and utility of findings reported in a single
investigation, the procedure suggested here is
a useful one for increasing the poten-
tial value of single studies. Most likely it
might require larger initial samples of subjects
so that the samples can be divided into two
groups for purposes of cross-validation. How-
ever, such extra effort and cost would appear
to be justified by the possible increase in the
significance of the findings secured, and per-
haps some premature generalizations and con-
clusions might be negated.

Because of the complexities and potential
problems in conducting research in the area
of clinical diagnosis, investigators in a major-
ity of instances should attempt some sort of
cross-validation before reaching conclusions
and before submitting their research reports
for publication. Their findings would tend to
have greater reliability and validity, and,
hopefully, the parade of contradictory findings
that appear in our journals would be de-
creased. Initial findings of a single investiga-
tion that have not been cross-validated but
that appear to be of possible importance
should be submitted for publication only as
brief reports. As already mentioned, such find-
ings that have not been cross-validated can 
only be viewed as suggestive. 


Miscellaneous Issues 


There are also several other issues that can 
be mentioned somewhat briefly. All investiga- 
tors can be presumed to be aware of these 
matters, but at times there appear to be some 
lapses in attention to them. 

One basic consideration that is sometimes 
neglected concerns the careful checking of re- 
sults and computations before submitting a 
manuscript for publication. This seems so ob- 
vious as to not require any mention here, but 
errors in computation do occur. In the ap- 
parent haste to get the manuscript ready for 
submission for publication, an author may 
fail to detect some rather obvious, and at 
times important, errors in his or her manu- 
script. In some cases, the conclusion drawn 
from the study may have to be greatly modi- 
fied, which is a decided embarrassment to the 
author involved. 

Related to the above are occasional mis- 
uses of statistical procedures. In the case of 
chi-square, inflated Ns, based on the number 
of observations instead of the number of sub- 
jects, can provide spurious results. The use 
of parametric statistics for nonparametric data 
is another example. One of the more common 
errors is treating highly related measures or 
ratings as independent measures. Because a 
number of us are deficient or not up-to-date 
on our knowledge of statistical procedures, it 
is a wise procedure to seek consultation from 
other more knowledgeable individuals when 
we are planning studies, when we are ready 
to analyze our data, and when we are inter- 
preting and writing our research results. 
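
The effect of inflated Ns on chi-square can be shown with a small sketch. The cell counts are hypothetical: 40 subjects in a 2 x 2 table, and then the same table with five observations counted per subject as though each were an independent case.

```python
def chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square (no continuity correction) for the 2 x 2 table
    [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Hypothetical table based on 40 subjects.
by_subject = chi_square_2x2(12, 8, 8, 12)
# The same proportions with five observations per subject treated as
# independent cases (N inflated to 200).
by_observation = chi_square_2x2(60, 40, 40, 60)
print(f"subjects as N: {by_subject:.1f}; observations as N: {by_observation:.1f}")
```

With 1 degree of freedom the .05 critical value is 3.84, so the same proportions move from nonsignificant to significant solely because N was multiplied by the number of observations per subject.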

Another matter that is occasionally noted 
pertains to the interpretation of findings se- 
cured by means of correlations. Coefficients 
of correlation are useful indices of the rela- 
tionship between two variables. However, the 
relationship between two variables cannot be 
used as a basis for judging causality. For ex- 
ample, one may find a significantly high cor- 
relation between marital status and length of 
hospitalization. One might be able to say that 
marital status is a favorable prognostic in-
dicator for future discharge from the hospital,
but one could not say that getting married 
causes a short stay in the hospital. It is con- 
ceivable that better integrated individuals or 
less asocial ones are more likely to get mar- 
ried and that these variables or others like 
them may possibly be related to outcome. 

Another practice that I have occasionally 
noted is that of drawing major conclusions or 
inferences from very scanty or minor bits of 
data. One should not let his or her enthusiasm 
exceed the quantity and quality of the ac- 
tual data secured. In one of the studies men- 
tioned earlier, for example, in which a mean 
difference of .4 on a subtest of the WAIS 
was secured between two different administra- 
tions of the test, the author offered some de- 
tailed (and speculative) reasons to account 
for this “significant” result. Results of such 
magnitude do not appear to require any 
lengthy interpretations of the data. 


Clinical Diagnosis: Some Concluding 
Comments 


Because the issue of clinical diagnosis is 
such a complex, controversial, and yet crucial 
issue, a few final words on this topic are in 
order. It is readily apparent that any research 
on diagnostic groups, for example, schizo- 
phrenia or manic-depressive psychosis, can 
be no better than the validity or meaning-
fulness of the diagnoses secured and used.
This problem has plagued research in this 
area for many years. On the one hand, sys- 
tematic and reliable classification of subjects 
facilitates research and the accumulation of 
potentially meaningful data about types of 
psychopathology. On the other hand, if the 
classification scheme used for such research 
is beset with problems of clarity, reliability, 
and validity, the results based on such classi- 
fication are bound to be limited in their use- 
fulness. An unreliable and loose scheme is 
bound to produce unreliable and variable re- 
sults. The most highly quantified data and 
the most exacting statistical analyses cannot 
provide worthwhile conclusions if the assump- 
tions or foundation on which they are based 
are weak.

If research based on clinical diagnosis is
to have any theoretical or clinical value,
attention will have to be paid to improving
the selection and specification of research sam-
ples. Otherwise, we will continue to have the
confusing array of conflicting and contradic- 
tory findings that makes its way into the 
published literature each year, despite the 
high rejection rate of journals published by 
the American Psychological Association.

Although it will not be easy to improve the
situation greatly, it is my belief that some
improvements can be made in the current 
situation. Editors should openly state that 
articles based on broad diagnostic categories 
will usually not be accepted for publication. 
That is, reports on “schizophrenia,” “brain 
damage,” and the like, will be considered too 
diffuse to merit publication. Instead, greater 
specificity will be required in the future. In 
line with the suggestions discussed earlier in 
this article, the authors of studies will be ex- 
pected to specify and describe their research 
subjects in ways that give more meaning to 
the diagnostic categories studied. In cases 
diagnosed as schizophrenic, for example, the 
patients will have to be described also in 
terms of measures such as process-reactive;
good or poor premorbids; the MMPI; IQ; 
factored scales such as those developed by 
Lorr, Klett, McNair, and Lasky (1962); and 
the like. Major symptoms of the subjects 
should also be described and rated on reliable 
scales. To treat schizophrenia as a single 
meaningful category or disorder is to court
chaos and disaster. 

In a similar manner, brain damage has for 
too long been treated as if it were a unitary 
disorder when actually a great variety of dis-
orders and impairments have been subsumed
under this crude designation. The site of the 
disorder in the brain, the type of lesion, the 
type of onset, and the source of the disorder 
or injury are all pertinent. Even though some 
progress in the direction suggested here has 
been made in recent years, much more needs
to be done along these lines. This is particu- 
larly true as new and equally vague categories 
such as the hyperactive child, minimal brain 
damage, infantile autism, and learning dis-
orders begin to receive both popular and re-
search attention.

Unless we pay attention to the research 
issues involved and try for as much specifica-
tion as possible of the disorders we are study-
ing, our research efforts will only produce
conflicting and disappointing results, and
clinical practice will be the loser. We can
learn from our past deficiencies and, hope-
fully, strive to improve the quality of our
investigative efforts.


Clinical Neuropsychological Research


Methodological problems and issues in clinical neuropsychological research are
discussed for four types of neuropsychological studies: (a) differential diagnosis,
(b) basic brain-behavior relationships, (c) effects of noxious agents or factors on
brain-behavior relationships, and (d) rehabilitation of neuropsychological deficits.
Recommendations as to how to handle these methodological problems are made. 
The characteristics of good case study reports are presented. 


Neuropsychology is the study of brain- 
behavior relationships; clinical neuropsychol- 
ogy is the application of empirically estab- 
lished facts concerning these relationships, 
and theories derived from them, to clinical 
problems. During the last decade we have 
witnessed an outpouring of research in clini- 
cal neuropsychology. All indications point to 
a continuation of this healthy and vigorous 
area of research. In this article we discuss 
some of the methodological problems that, if 
carefully considered and acted on, could sub- 
stantially improve the quality of the research. 
(And we include ourselves as recipients of 
that admonition!) In some instances the 
methodological points made will have much 
in common with any psychological experi- 
ment; in other instances problems that are 
fairly unique to neuropsychological research 
will be addressed. 

Before consideration of specific methodo-
logical problems, it is appropriate to point
out that competent clinical neuropsychological
research demands mastery of several areas.
The traditional areas of measurement of hu- 
man behavior (psychometrics) and psycho- 
pathology, coupled with a serious study of 
the central nervous system (e.g., neuroanat- 
omy, neurophysiology, and neuropathology) 
and exposure to clinical neurological concepts 
and procedures, are necessary, in our opinion, 
for the level of sophistication demanded by 
this complex field. This is not to say that one 
cannot conduct meaningful neuropsychologi- 
cal research without knowing the microscopic 
anatomy of the brain; we are saying that 
good neuropsychological research demands a 
background often lacking in psychologists’ 
training. This is a remediable condition. 
Given an adequate background, a second 
factor conducive to good research in this area 
is the continual recognition that behavior is 
the final common expression of a tremendous 
number of possible factors. We emphasize 
this somewhat trite and elementary maxim be- 
cause it is seemingly frequently ignored. For 
example, cognitive “deficits” are characteris- 
tically found in brain-damaged patients, but 
they are also present in patients with schizo- 
phrenia (G. Goldstein & Halperin, 1977), de- 
pression (Miller, 1975), anxiety (Chapman 
& Wolff, 1959), or culturally deprived or edu- 
cationally disadvantaged persons (Amante, 
VanHouten, Grieve, Bader, & Margules, 
1977). In the Halstead-Reitan Battery (Rei-
tan & Davison, 1974), much attention is
given to differential left-hand and right-hand
motor performance as an indication of lateral- 
ity of cortical damage. However, poor motor 
performance with either hand may result from 
peripheral motor and sensory injury of a sub-
tle nature. Unless this possibility is thoroughly
explored by careful history taking, erroneous 
inferences could be made about the brain 
state. Our point here is that if the neuro-
psychological researcher keeps this maxim in
mind, possible alternative explanations or con-
founding variables may be anticipated, thus 
resulting in a more acceptable research report. 
Let us consider, for example, five indepen- 
dent variables that most experimenters would 
agree are of importance in behavioral re- 
search: age, education, sex, socioeconomic 
level of subjects, and experimenter or ex- 
aminer characteristics. Are these variables also 
important in neuropsychological research? 


Age 


Heaton, Baade, and Johnson (1978) have 
recently reviewed 94 studies in which neuro- 
psychological test scores of psychiatric pa- 
tients and brain-damaged patients were com- 
pared. In 38% of these studies, age was either 
not mentioned or the groups differed with
respect to age! Given the well-known rela-
tionship of age to cognitive, perceptual, sen-
sory and perceptual-motor, and information-
processing variables, most of which are being
measured in neuropsychological studies, it is
vital that age receive close scrutiny. The
safest procedure is to equate groups being
studied on age both as to means and standard
deviations. If age is significantly different
among groups, and age matching by dropping or
adding subjects is not possible, there are sev-
eral options. First, it should be deter-
mined if the dependent variable is in fact
correlated with age. If it is not correlated in
the normal, control, or other experi-
mental groups, then it is likely that age could
be discounted as a significant factor.
Note, however, that a lack of correlation
in the brain-damaged group could not be used
to discount the influence of age. Heterogene-
ity of damage to the brain could obfuscate any
general relationship. For example, Prigatano
and Parsons (1976), in a cross-validated
study, found that the Category and Tactual
Performance Test measures on the Halstead- 
Reitan Battery correlated significantly with 
age in both brain-damaged and non-brain- 
damaged groups but that the Rhythm, Speech 
Perception, and Tapping tests correlated sig- 
nificantly only in the non-brain-damaged 
groups. Apparently, the brain-damage effect 
was sufficient to override the effects of age 
in the latter variable. Incidentally, the cor- 
relations of age with the Halstead Impair- 
ment Index in the non-brain-damaged groups 
were .57 and .64; in the brain-damaged group 
they were, respectively, .33 and .44. All cor-
relations were significant, but they were con- 
sistently lower in the brain-damaged group. 

Another method of handling age difference 
among groups, once the relationship between 
age and dependent variables has been estab- 
lished, is to adjust the dependent variable 
scores by analysis of covariance and then test 
for differences. Partial correlational techniques 
could also be used. To repeat, however, the 
most clear-cut solution is to equate groups on 
age means and variability. 
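
As one illustration of the partial correlational approach mentioned above, the first-order partial correlation can be computed from three zero-order correlations. The variable roles (group membership, test score, age) and the correlation values below are hypothetical.

```python
import math


def partial_correlation(r_xy: float, r_xz: float, r_yz: float) -> float:
    """Correlation between x and y with z held constant
    (first-order partial correlation)."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical values: x = group membership, y = test score, z = age.
r_group_score = 0.50
r_group_age = 0.60
r_score_age = 0.60
adjusted = partial_correlation(r_group_score, r_group_age, r_score_age)
print(f"group-score correlation with age partialed out: {adjusted:.3f}")
```

In this hypothetical case an apparent group-score correlation of .50 shrinks to about .22 once age is held constant, which is the kind of shift that equating groups on age is meant to prevent.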


Education 


Heaton et al. (1978) were “reassured” by 
the 60% of the studies that they reviewed, 
which stated that the groups under investi- 
gation had comparable educational back- 
ground. What about the 40% who did not? 
Does education make a difference? Two re- 
cent articles have addressed this question. 
Prigatano and Parsons (1976) correlated edu- 
cation and Halstead Battery performances in 
brain-damaged and other patient groups. 
Neither the brain-damaged nor the psychiatric 
group had significant correlations. However, 
in a group of non-brain-damaged general 
medical-surgical patients (N = 50), they
found significant correlations with six Hal- 
stead measures. It seems likely that psychi- 
atric disorder and brain damage introduce 
enough variance to reduce the correlation 
found in a nonpsychopathological population. 
The effects of education were demonstrated 
in a clear-cut fashion in a recent study by 
Finlayson, Johnson, and Reitan (1977). 
These investigators compared brain-damaged 
and control subjects, stratified in three levels
of education, on the Halstead Neuropsycho-
logical Battery. Both level of education and 
brain damage had a “pronounced effect” on 
scores; the lower education groups scored 
lower on the Halstead measures, as, of course, 
did the brain-damaged group. Our recommen-
dations to investigators as regards the possible
effects of education are the same as we dis- 
cussed for age. 


Sex 


Given the popularity of this variable in 
most dimensions of our lives, it seems rather 
strange that the possible differences between
males and females in neuropsychological stud-
ies have received little attention. In fact, in
the Heaton et al. (1978) article, sex is not
even mentioned, let alone introduced as an 
important methodological variable! Yet there 
have been at least two 
studies that have found that the sex variable 
is of some importance. Lansdell (1968) stud-


Socioeconomic Level


This variable is closely related to education,
and so in most adult studies, education is
sufficient. However, as education levels con-
tinue to rise and “social” promotions are the 
rule rather than the exception, it may be im- 
portant to again consider the broader variable 
of socioeconomic level. Another aspect is the 
occupation of the subjects. Has there been 
overlearning of specific skills, which in turn 
could lead to spurious results? For example, 
manual workers and bookkeeper clerks might 
well differ in patterns of strength in percep- 
tual-motor and verbal-calculation skills. De- 
pending on the titre of occupational represen- 
tation in a given sample of brain-damaged 
patients, many different inferences can be 
made. The socioeconomic variable is quite im-
portant in neuropsychological investigations
of children. Indeed, in a recent study, Amante
et al. (1977) concluded that levels of neuro- 
logical integrity vary along a socioeconomic 
gradient. The relevant factors associated with 
social causation include malnutrition, reduced 
environmental stimulation, and inadequate ob- 
stetrical and pediatric care. 
We recommend that adult studies in neuro-
psychology consider the occupational variable
along with education. Certainly with children, 
the general family socioeconomic background 
should be specified, as educational levels will 
be roughly equivalent for groups through 
age 16. 


Examiner Characteristics 


Another independent variable to be con-
sidered is that of the experimenter, or more
specifically the examiner or tester. Schafer
(1954) some years ago eloquently described
the various problems faced by the psychologi-
cal test examiner. Among them was the “pres-
sure” to obtain a “score,” a pressure that
can often lead to an inattention to the pa-
tient’s interpersonal needs during the exami-
nation or even to poor test administration.
Inexperienced or poorly trained examiners 
making such mistakes may well fail to elicit 
maximum effort or motivation on the part of 
the patient. Without the latter, the validity 
of tests such as the Halstead-Reitan Battery 
may be questioned (Reitan, Note 1). Studies 
in clinical neuropsychology seldom report
the degree of expertise of the examiners. Cer-
tainly there are varying degrees of sophisti-
cation of the neuropsychology technicians used
in the various clinical and research labora-
tories throughout the country. Yet, clinical
neuropsychological research is often published
with no mention of how this variable may
have influenced their reported findings.

The social psychological research of Rosen-
thal (1968) has certainly pointed to the pos-
sible impact that experimenter or examiner
expectations may have on subjects’ perform-
ance or the evaluation of subjects’ perform-
ance. Schachter (1964) has argued effectively
that the perceived emotional environment can
have a major influence on a person’s subjec-
tive emotional state and reactions, which in
turn can affect performance. Although there
are relatively few neuropsychological studies
that address these questions, Parsons and
Stewart (1966) found that brain-damaged
patients showed less improvement on a per-
ceptual-motor neuropsychological test when
examined by a “disinterested-factual” ex-
aminer as compared to one who was “suppor-
tive” and interested in reducing the patient’s
anxiety.

The demonstration that the interpersonal
climate is important in examining brain-dam-
aged patients is important in view of the grow-
ing use of technicians and computer-run lab-
oratories. A disinterested examiner or an in-
teraction with a computer might create the
type of atmosphere that interferes with the
maximum level of performance in some pa-
tients or groups of patients but not in others,
leading to conflicting results across labora-
tories.

We recommend that neuropsychological re-
search reports state the level of sophistication
or training of the persons who conduct the
examination. A short description of the inter-
personal climate would also be helpful (e.g.,
“patients were encouraged frequently”; “if
[illegible], patients took a short break”; “instruc-
tions were given clearly, and testing did not
get started until the patient demonstrated un-
derstanding”; “supportive reassurance was
appropriately given”; etc.).


Approaches to Neuropsychological 
Investigations 


Neuropsychological studies customarily
have fallen into three major types. In the
first type, the objective is the differential diag-
nosis of brain-damaged patients from non- 
brain-damaged control and other psychopatho- 
logical groups, an applied or clinical problem. 
The second type seeks to identify and de- 
scribe the general and specific changes in 
brain-behavior relationships as a function of 
brain states. The intent of these latter studies 
is usually to contribute to our basic under- 
standing of neuropsychological relationships. 
The third type consists of studies that at- 
tempt to ascertain the effects of certain noxi- 
ous agents (e.g., metabolic disease, drugs,
trauma, etc.) on the brain-behavior relation-
ships. Looming on the horizon, however, is a 
fourth type of studies, the new and challeng- 
ing field of rehabilitation of neuropsychologi- 
cal deficit behavior. Of course, there are many 
studies in which several objectives are present, 
but for our present purposes we will consider 
the four types separately. Finally, we shall 
present our notions of what constitutes a good 
case study. 


Differential Diagnosis 
A problem area that has preoccupied psy- 
chologists since the rise of clinical psychology 
in the late 1940s is that of identification of 
brain-damaged patients. The objective of 
these studies is to develop behavioral 
measures that will discriminate a group of 
patients typically identified as “brain dam-
aged” from patients who have another psycho-
pathological disturbance (e.g., schizophrenia,
depression, or anxiety) and from non-brain- 
damaged, nonemotionally disturbed control 
subjects. Although there are many investiga- 
tors who have wondered whether continued 
efforts in this area are warranted (Parsons, 
1970; Smith, 1975; Spreen & Benton, 1965), 
the continued interest and vigorous pursuit 
of such studies suggest that the working 
clinicians believe otherwise. 


Subject Characteristics 


The primary methodological problem is 
that of subject selection and subject charac- 
teristics. Let us consider the brain-damaged
patients. It is not unusual in neuropsychologi-
cal studies to find that the patients were de-
clared “clearly brain damaged” from the
medical records by unknown or at least un-
identified physicians or other persons (Heaton
et al., 1978). There seems to be an assump-
tion that perfectly reliable criteria were used. 
It should come as no surprise, however, to 
learn that even neurologists do not always 
agree on diagnosis of brain damage. In one 
of the classic studies (Fisher & Gonda, 1955), 
two skilled neurologists made a judgment as 
to whether damage existed rostral to the fo- 
ramen magnum in patients who were given 
thorough neurological and other biomedical 
tests. This judgment was based on review of 
the records and included clinic follow-up re- 
ports and occasional autopsy reports. The 
agreement was 86% under these superior di- 
agnostic conditions. Surely under less favor- 
able conditions the percentage of agreement 
would drop. 

Given the problems of cost, availability, and 
time, the preferred method of having two 
neurologists independently examine the pa- 
tient probably cannot be achieved by most 
neuropsychological researchers. One solution 
would be to have at least one neurologist re- 
view all available data. Criteria of positive 
neurological examination plus at least one 
positive biomedical test (e.g., computerized 
axial tomography [CAT] scan, skull films, 
arteriogram, pneumoencephalogram, electro-
encephalogram, and so forth) could be used.
A selected sample could be blindly rated again 
by the neurologist to give some indication of 
the reliability of his or her own diagnostic 
statements. The point is that more attention 
must be given to the defining criteria of brain 
damage, especially the qualifications and re- 
liability of the clinical judges, the latter be- 
ing preferably the relevant medical specialists 
(Jacobs, 1977). After all, the behavioral mea- 
sures can only be as effective in diagnostic 
statements as the reliability and validity of 
the criterion measure of brain damage permit. 
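
Raw percentage agreement can flatter a diagnostic criterion, because two judges who both call most patients brain damaged will agree often by chance alone. The following is a minimal sketch, in Python, of a chance-corrected index (Cohen's kappa) for two raters making yes/no judgments. The cell counts are hypothetical: they were chosen only to give an 86% raw agreement and are not taken from Fisher and Gonda (1955).

```python
def agreement_stats(both_yes, a_only, b_only, both_no):
    """Return (raw agreement, Cohen's kappa) for two raters' yes/no calls.

    both_yes / both_no: cases on which the raters agree.
    a_only / b_only: cases only rater A (or only rater B) called brain damaged.
    """
    n = both_yes + a_only + b_only + both_no
    observed = (both_yes + both_no) / n

    # Agreement expected by chance, from each rater's own rate of "yes" calls.
    a_yes = (both_yes + a_only) / n
    b_yes = (both_yes + b_only) / n
    expected = a_yes * b_yes + (1 - a_yes) * (1 - b_yes)

    if expected == 1:
        # Both raters always give the same single answer; kappa is undefined.
        return observed, float("nan")
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical counts for 100 patients (an assumption for illustration only).
observed, kappa = agreement_stats(both_yes=70, a_only=7, b_only=7, both_no=16)
print(f"raw agreement = {observed:.2f}, kappa = {kappa:.2f}")
```

With these assumed counts the chance-corrected value is noticeably lower than the raw agreement, which is one reason to report both when the reliability of the criterion diagnosis is described.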

As Piotrowski (1940) pointed out almost
4 decades ago, one of the best ways to deter-
mine the diagnostic utility of neuropsycho- 
logical tests would be to apply the tests to a 
sample of persons referred for suspected brain 
damage. Some of these people would show 
brain damage; others would not, the latter 
constituting what has more recently been
termed pseudo-neurologic (Matthews, Shaw,
& Kløve, 1966). If the neuropsychological
tests (independent of the diagnostic process) 
can distinguish brain-damaged from non- 
brain-damaged persons, the tests have a diag- 
nostic potential. Note that this is a more 
exacting procedure than the method men- 
tioned earlier (i.e., using clearly diagnosed 
patients) and much more difficult to achieve 
practically. As neuropsychological testing has 
become more widespread and of demonstrated 
effectiveness (Filskov & Goldstein, 1974; G.
Goldstein, 1974; Reitan & Davison, 1974;
Satz, Fennell, & Reilly, 1970; Smith, 1975), 
it undoubtedly influences decisions about di- 
agnoses. In retrospect these influences are 
quite difficult to partial out. Further, a non- 
invasive superior biomedical test, the CAT 
scan, is now in use at most centers. Patients 
who previously presented diagnostic problems, 
because of the physicians’ understandable re- 
luctance to use invasive techniques, are now 
diagnosed earlier in the process. Parentheti- 
cally, use of the CAT scan as an important 
criterion measure in neuropsychological stud- 
ies will be routine in the future. It is im- 
portant to note, however, that reliability 
studies of CAT scans are only now appearing 
and that normal values and ranges for given 
laboratories are frequently inadequate. At the 
present time, users of the CAT scan in neuro- 
psychological studies would be advised to 
quantify their measurements as much as pos- 
sible and to present interrater reliabilities. 


Base Rates 


To return to the problem of subject char- 
acteristics in studies of differential diagnosis, 
the ultimate criterion of a good diagnostic 
test is the “hit rate.” The more true positives 
(brain damaged) and true negatives (non 
brain damaged) and the fewer false positives
(not brain damaged called brain damaged) 
and false negatives (brain damaged called 
non brain damaged), the better the test. How- 
ever, interpretation of the efficacy of hit rates 
is highly dependent on the base rates in the 
populations sampled. Contrast, for example, 
the high incidence of brain-damaged patients 
on the neurology service with the typically 
low incidence seen on psychiatric services in
general hospitals. The cost efficiency of neuro-
psychological tests (Rimm, 1963) will differ
under these conditions. Clinical neuropsycho-
logical researchers should be thoroughly fa- 
miliar with the base-rate problem (Gordon, 
1977; Heaton et al., 1978; Krug, 1971; 
Rimm, 1963; Satz, 1966). 
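
The dependence of a test's usefulness on the base rate can be made concrete with a short calculation. The sketch below, in Python, applies Bayes' rule to one hypothetical test on two services. The sensitivity, specificity, and base rates are illustrative assumptions, not values reported in the studies cited above.

```python
def diagnostic_summary(sensitivity, specificity, base_rate):
    """Summarize a test's performance in a population with a given base rate.

    sensitivity: proportion of brain-damaged patients the test calls brain damaged.
    specificity: proportion of non-brain-damaged patients the test calls normal.
    base_rate:   proportion of the population that is brain damaged.
    """
    true_pos = sensitivity * base_rate
    false_neg = (1 - sensitivity) * base_rate
    true_neg = specificity * (1 - base_rate)
    false_pos = (1 - specificity) * (1 - base_rate)
    return {
        "hit_rate": true_pos + true_neg,
        # Probability that a "brain damaged" call is correct.
        "ppv": true_pos / (true_pos + false_pos),
        # Probability that a "not brain damaged" call is correct.
        "npv": true_neg / (true_neg + false_neg),
        # Accuracy of simply assigning everyone to the more common category.
        "majority_rule": max(base_rate, 1 - base_rate),
    }


# Hypothetical services: same test, very different base rates.
neurology = diagnostic_summary(0.75, 0.75, base_rate=0.60)
psychiatry = diagnostic_summary(0.75, 0.75, base_rate=0.10)
for name, result in (("neurology", neurology), ("psychiatry", psychiatry)):
    print(name, {key: round(value, 2) for key, value in result.items()})
```

Under these assumptions the overall hit rate is identical on both services, yet on the low-base-rate service most positive calls are wrong and the test does worse than calling every patient non-brain-damaged. This is why hit rates should be compared against base rates rather than reported alone.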


Test Variables 


What should guide the researcher in test 
selection in studies of differential diagnosis? 
First, researchers should be aware of the stud- 
ies that have compared hit rates with single 
and multiple tests. Spreen and Benton (1965),
in their review, reported that individual neu-
ropsychological tests discriminated brain-dam-
aged from normal controls with an average 
hit rate of 71%; with several tests the cumu- 
lative predictive value was 80%. They con- 
cluded that the search for screening devices 
had reached its culmination point and served 
its purpose. Heaton et al. (1978) reported a 
median hit rate of 75% for 84 classification 
attempts involving a variety of psychiatric 
disorders but omitting process or chronic 
schizophrenia. Single tests and combinations
of tests appear to have similar hit rates. The
Halstead Impairment Index, an index based
on seven tests, frequently gives the same level
of hit rate as individual component tests
(Smith, 1975).

In other words, for differential diagnosis
there seems little warrant for constructing a
large battery of tests. This is not to say that
large sampling of test behaviors is not im-
portant; rather it means that to answer the
specific question of differential diagnosis, most
single tests give practically the same level
of discrimination as combinations of
tests. Of course, for gaining information be-
yond that of diagnosis per se, enlarging the
range of behaviors studied (i.e., multiple
tests) may be highly desirable. In fact, in most
[illegible] the emphasis today is on
specifying the nature and location of the brain
damage. As the field moves toward a greater
emphasis on rehabilitation, as we will discuss
later, more extensive sampling of behavior
may be required.

Reliability of tests is always a concern for
the researcher. What do we know about the
reliability of neuropsychological tests? Mata- 
razzo, Matarazzo, Wiens, and Gallo (1976) 
presented test-retest data on the Halstead- 
Reitan Impairment Index for four groups of 
subjects: normals, schizophrenics, and two 
samples of organic patients. In the patient 
groups the test-retest reliability was reassur- 
ingly high (.83, .63, and .82), considering that 
test-retests in clinical populations are always 
attenuated by change in patients’ conditions 
over time. In the normals it was low (.08), 
but this was due to the latter’s extremely low 
Impairment Index scores; actually 29 out of 
29 were classified as normal in each testing, 
for 100% accuracy. Of course, if the investi- 
gator is devising a new test, test-retest relia- 
bility coefficients should be presented. If for 
some reason this is not possible, other esti- 
mates of reliability such as split-half correla- 
tions can be used. As we will discuss later, 
reliability of the test is significant in speci- 
fication of group difference in performance. 
We are past the day when investigators try 
out a new test in brain-damaged and control 
groups, find large differences, and optimisti- 
cally submit the results for publication with- 
out considering such basics as reliability. 
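
The contrast between a low test-retest correlation and perfect classification agreement in a group with a restricted range of scores can be illustrated with a small sketch in Python. The scores and the cutoff of 0.5 below are hypothetical values chosen for the illustration; they do not reproduce the Matarazzo et al. (1976) data.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def classification_agreement(x, y, cutoff):
    """Proportion of subjects falling on the same side of the cutoff both times."""
    same = sum((a >= cutoff) == (b >= cutoff) for a, b in zip(x, y))
    return same / len(x)


# Hypothetical index scores for eight normal subjects, all well below the cutoff.
first_testing = [0.0, 0.1, 0.2, 0.3, 0.0, 0.1, 0.2, 0.3]
second_testing = [0.1, 0.3, 0.0, 0.2, 0.2, 0.0, 0.3, 0.1]

r = pearson_r(first_testing, second_testing)
agreement = classification_agreement(first_testing, second_testing, cutoff=0.5)
print(f"test-retest r = {r:.2f}, classification agreement = {agreement:.2f}")
```

In this constructed example the scores shuffle within a narrow band, so the correlation is essentially zero even though every subject is classified the same way on both occasions. A reliability coefficient is therefore best interpreted alongside the score range of the group on which it was computed.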


Statistical Analysis 


In recent years the advent of computer 
analyses has enabled multivariate statistics to 
be applied to problems in differential diag- 
noses. These analyses have the advantage of 
maximizing differences among groups and 
identifying the measures on which groups 
differ most (G. Goldstein & Halperin, 1977; 
Swiercinsky & Warnock, 1977). It should be 
recognized that the sensitivity of multivariate 
analyses in accentuating group differences 
capitalizes on the chance variance that may 
be present as much as on true variance dif- 
ferences. Any study that purports to have 
heightened discrimination through multivari- 
ate analysis should be cross-validated. For 
example, in discussion of the use of stepwise 
regression multivariate analysis, Cohen and 
Cohen (1975) advise that if the results are 
“to be substantively interpreted, a cross-vali- 
dation of the stepwise analysis in a new sam- 
ple be undertaken, and only those conclusions 
which hold for both samples should be drawn”
(p. 104). 
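
Capitalization on chance can be demonstrated with a simulation. The Python sketch below is a deliberately simplified stand-in for a multivariate analysis: rather than fitting a stepwise regression, it picks the single best of 50 test signs in a derivation sample and then applies that rule to a new sample. Every sign and every diagnosis is a coin flip, so no sign has any true validity. The sample sizes, number of signs, and random seed are arbitrary assumptions.

```python
import random


def make_sample(n_subjects, n_signs, rng):
    """Each subject is (diagnosis, signs); every value is an independent coin flip."""
    return [
        (rng.randint(0, 1), [rng.randint(0, 1) for _ in range(n_signs)])
        for _ in range(n_subjects)
    ]


def hit_rate(sample, sign_index, flipped):
    """Proportion of subjects whose diagnosis matches the (possibly inverted) sign."""
    hits = 0
    for diagnosis, signs in sample:
        call = (1 - signs[sign_index]) if flipped else signs[sign_index]
        hits += call == diagnosis
    return hits / len(sample)


def best_rule(sample, n_signs):
    """Pick the sign and direction with the highest hit rate in this sample."""
    candidates = [(i, flipped) for i in range(n_signs) for flipped in (False, True)]
    return max(candidates, key=lambda rule: hit_rate(sample, *rule))


rng = random.Random(0)
N_SIGNS = 50
derivation = make_sample(40, N_SIGNS, rng)      # small derivation sample
new_sample = make_sample(1000, N_SIGNS, rng)    # large cross-validation sample

rule = best_rule(derivation, N_SIGNS)
derivation_rate = hit_rate(derivation, *rule)
cross_validated_rate = hit_rate(new_sample, *rule)
print(f"derivation hit rate: {derivation_rate:.2f}")
print(f"cross-validated hit rate: {cross_validated_rate:.2f}")
```

Because the rule was selected for its performance in the derivation sample, its hit rate there is inflated, whereas in the new sample it falls back toward the 50% expected from chance. The same shrinkage, in less extreme form, is to be expected whenever weights or variables are selected to maximize discrimination in a single sample.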

In our opinion the search for the Holy Grail 
(i.e., the neuropsychological test(s) that will 
give a high discrimination of brain-damaged 
patients from controls or other psychopatho- 
logical groups in all types of settings, with 
all socioeconomic levels) is not likely to be 
as productive as other research approaches. 
If one feels compelled to continue this search, 
then we suggest that the investigator follow 
these basic suggestions: (a) Give specific in- 
formation on the reliability of the diagnoses 
(done preferably by relevant specialists) for 
all of the patient groups studied; (b) give 
some estimate of the base rate of brain dam- 
age in the samples studied and compare hit 
rates with them; (c) give some indication of 
the reliability of the tests used for measure- 
ment; and (d) conduct a cross-validation 
study to determine whether the statistically 
maximized hit rates will hold for another 
sample. 


Studies Elucidating Basic Brain-Behavior 
Relationships 


In these studies the concern is with gain- 
ing basic understanding of the qualitative dif- 
ferences in brain-behavior relationships. What 
are the functions of the right and left hemi- 
spheres? Are there different behavioral pat- 
terns characteristic of localized lesions? These 
and many other questions are answered by 
studying patients who are identified on the 
basis of known damage to the brain in certain 
areas (e.g., left-hemisphere lesion vs. right- 
hemisphere lesion patients). The patients are
given neuropsychological tests and, if differ- 
ential results are obtained, inferences are 
made about the role of those affected neuro- 
anatomical regions in behavior.


General Versus Specific Effects 


At the outset it is important to recognize
that there are both general and specific effects
of brain damage (Adams, 1969; Chapman &
Wolff, 1959; Parsons, 1970; Smith, 1975)
and that both effects are dependent on the
size or extent of the lesion. Chapman and
Wolff (1959), in a classic article that should
be read and reread by researchers in clinical
neuropsychology, have provided convincing 
evidence of the relationship; the larger the 
amount of tissue removed from the brain, the 
greater the postoperative impairment of the 
patient. Specific effects on neuropsychological 
tests will always be embedded in the larger 
context of these general effects; comparisons 
of different lesion locations should be made 
with patients who have roughly equivalent 
amounts of dysfunctional or destroyed tissue. 
If patients are being compared who have not 
had neurosurgical intervention, this may be a 
difficult condition to meet. However, some 
estimate could be made by neurologists’ rat- 
ings of severity of the effects of the damage 
(Parsons, Vega, & Burn, 1969). 


Variables Related to the Changed Brain 


There are a number of aspects of the lesion 
or disease process to be considered. One vari- 
able of utmost importance is duration of the 
lesion or time since insult to the brain. Smith 
(1975) provided a clear discussion on the 
effects of diaschisis (i.e., the effects of local- 
ized lesions on more distant parts of the 
brain). He cited studies indicating disturbed 
cerebral hemisphere blood flow and cerebral 
metabolism in both the diseased and healthy 
hemispheres of patients. Diaschisis is related
to the “general” effects noted above but may 
be separable in that it contributes to the 
general effects in the period immediately fol- 
lowing the insult but diminishes in effect 
over time. 

Another aspect of the duration of lesion 
variable is the age at which the insult occurs. 
It is well-known that effects of brain dam- 
age as measured in adulthood are dependent 
on whether the damage was incurred as a
child, while the brain was still developing,
or after physical maturity of the brain has
been reached, that is, by age 15 or 16. (See
various chapters in Reitan & Davison, 1974.)
Even in adulthood it would appear that dura-
tion of several years is a critical value. Over
time, static lesions lead to less discriminating
differences than lesions that are relatively
acute (Reitan, 1966, 1974). Rapidly growing
or worsening lesions are also likely to lead to
much greater deficits than static lesions even
when the duration is equivalent (Fitzhugh, Fitz-
hugh, & Reitan, 1961; Reitan, 1966). Com-
bining brain-damaged subjects who may differ
in terms of age of onset, time since insult,
and static or changing nature of lesion introduces
diverse sources of variance. Either inaccurate
attenuation or accentuation of specific brain-
behavior relationships could occur.

The nature of the disease process or dys-
functional brain state is important. Reitan
(1966, 1974) has pointed out that cerebro-
vascular disease, neoplasms, trauma, and de-
generative diseases give rise to different pat-
terns of neuropsychological deficits. Different
percentages of such patients in any specific
localization studies could give rise to mis-
leading generalizations of conclusions regard-
ing specific brain-behavior relationships.


Handedness, Aphasia, Visual Field Defects, and
Emotional Responses


Four other subject variables deserve not-
ing: handedness, presence of aphasia or vis-
ual field defects, and the patients’ emotional
responses. It is customary to specify the
handedness of the population being studied.
(Eyedness and footedness are also frequently
noted, but their relationship to performance is
unclear.) The difference in lateralization ef-
fects as a function of handedness is currently
the subject of many investigations. (See any
recent issues of Cortex and Neuropsycho-
logia.) The relationship is complex and not
at all clearly understood, but one conclusion
is obvious: Groups to be compared should
be composed of similar percentages of right-
handed and left-handed subjects.

The presence of aphasia is important in
studies of lateralization or localization. Dis-
turbed language functioning may grossly af-
fect understanding of oral or written in-
structions or the communication of answers
or both (Zangwill, 1969). Infer-
ences about disturbed, nonverbal, higher
functioning in such patients may be
erroneously made. It is of importance to iden-
tify aphasic subjects and the nature of
their language disturbances in studies of general
effects of brain damage (in which estimates
are made of overall levels of intellectual func-
tioning) as well as in lateralization or locali-
zation studies. A similar caveat can be made
for visual perceptual studies in which patients 
with visual field deficits are used. Faglioni, 
Scotti, and Spinnler (1969) have presented 
convincing evidence of the clarification of re- 
search findings by considering this variable. 

Finally, we should consider the role of the 
patient’s reaction to his or her changed brain 
state. Parsons (1970) has discussed this issue 
rather thoroughly, as have other workers 
(Chapman & Wolff, 1959; K. Goldstein, 
1959; Reitan & Davison, 1974; Smith, 1975). 
Although the typical emotional response of 
brain-damaged patients to their condition is 
depression (Parsons, 1970), there are a vari- 
ety of other effects as described by Chapman 
and Wolff (1959), including emotional and 
physical withdrawal, circumscribed interests, 
increase in premorbid defenses, and lowered 
frustration tolerance. Kurt Goldstein’s (1959) 
discussion of the organismic changes of the 
brain-damaged patient places this problem 
in an even broader holistic framework. Com- 
plicating the picture, however, is the recent 
evidence (Galin, 1974), which suggests that 
left and right hemispheres may be differen- 
tially salient in the control of dysphoric and 
depressive versus euphoric and sense of well- 
being feeling states.

In studies that attempt to uncover basic 
brain-behavior relationships, care should be 
taken to distinguish general from specific ef- 
fects. Unless controlled or otherwise accounted 
for, a number of variables can influence the 
results and hence the interpretation of specific 
brain-behavior findings. These variables in- 
clude age at which the damage occurred, time 
since onset of lesion, static versus changing 
(worsening) condition, type of disease or in- 
jury, handedness, presence of aphasia and 
visual-field deficits, and the patient’s emo- 
tional responses. We suggest that these vari- 
ables be assessed or otherwise noted in re- 
search reports. 


Studies of the Effects of Noxious Agents or
Factors on the Brain 


In this class of studies, the experimenter is
usually concerned with the question, what is 
the effect of “X” (e.g., anoxia, hypoglycemia, 
LSD, alcohol, liver disease, periods of uncon- 
sciousness from trauma, penetrating head
wounds, unilateral electroconvulsive therapy, 
etc.) on functioning of the brain? Whether
the study is acute, as with acute
alcohol ingestion, or chronic, as with
the neuropsychological changes in chronic
alcoholics, the intent of these studies is us-
ually to specify the brain-behavior relation-
ships affected so that appropriate prevention 
and remediation techniques could be used. 
Both general and specific effects are likely 
to be investigated. Most if not all of the meth- 
odological variables previously described are 
pertinent here. Among the specific effects, a 
popular current line of research is the dif- 
ferential effects of various agents on hemi- 
spheric laterality of function. Parsons (1975), 
for example, has suggested that both the acute 
and chronic effects of alcohol are more de- 
tectable in behaviors associated with right- 
hemisphere saliency as opposed to behaviors 
under control of the left hemisphere. If we 
are interested in whether variable X does re- 
sult in differential hemispheric test perform- 
ance, it is advantageous to have tests that 
differ in content (e.g., verbal and visual paired 
associates) but are of the same general level 
of reliability and item difficulty. 

Chapman and Chapman (1975) have 
shown how the more reliable test will lead 
to the greatest differences among the groups. 
For example, if the verbal paired associates 
are of greater reliability than the visual paired 
associates, even though both might be de- 
pressed by variable X, the more reliable test 
will give rise to the greater and more signifi- 
cant differences. Further, they point out, item 
difficulty also helps determine the mean dif- 
ference between groups on tests. Item dif- 
ficulty affects task variance; the larger the 
task variance, the larger the separation of 
groups of high and low ability on the test.
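
The effect of reliability on the size of a group difference can be sketched with the standard attenuation relation from classical test theory. This formulation is an assumption used here for illustration and is not taken from Chapman and Chapman (1975): if reliability is the ratio of true-score variance to observed-score variance, an unreliable test inflates the observed standard deviation, and the standardized difference between groups shrinks by the square root of the reliability. The reliabilities and the true effect in the Python example are hypothetical.

```python
import math


def observed_effect(true_effect, reliability):
    """Standardized group difference expected on observed scores.

    true_effect: difference between group means in true-score SD units.
    reliability: ratio of true-score variance to observed-score variance (0-1].
    """
    if not 0 < reliability <= 1:
        raise ValueError("reliability must be greater than 0 and at most 1")
    # Observed SD = true SD / sqrt(reliability); the mean difference is unchanged.
    return true_effect * math.sqrt(reliability)


# Hypothetical case: variable X depresses both abilities by the same true amount.
verbal = observed_effect(true_effect=1.0, reliability=0.90)
visual = observed_effect(true_effect=1.0, reliability=0.60)
print(f"verbal paired associates: d = {verbal:.2f}")
print(f"visual paired associates: d = {visual:.2f}")
```

Although the underlying deficit is identical in this example, the more reliable test shows the larger observed difference. An apparent differential deficit could thus arise from the psychometric properties of the tests alone.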

In other words, to examine the effect of 
variable X on tests measuring right- and left- 
hemisphere functioning or other comparisons 
such as frontal versus postfrontal function- 
ing, it would be highly desirable to use tests 
that are equivalent in reliability and item 
difficulty. The Wechsler Adult Intelligence 
Scale (WAIS) satisfies, at least in part, these
requirements, which may be one of the rea-
sons why the WAIS continues to be widely
used in neuropsychological research even
though it was designed for a different pur- 
pose. Although we cannot hope to achieve 
the comparable psychometric status of the 
WAIS for most of our neuropsychological 
tests without national and federally supported 
effort, we believe that the kind of careful 
analysis of the psychometric properties of the 
tests (reliability and item difficulty) as ad- 
vocated by Chapman and Chapman (1975) 
provides a rather clear direction for more 
sophisticated research in this area. 


Rehabilitation of Patients with 
Neuropsychological Deficits and 
the Case Study Method 


Clinical neuropsychologists are becoming 
more aware of their potential contribution to 
the field in identifying behavior changes fol- 
lowing brain injury and developing rehabilita- 
tion programs to improve neuropsychological 
functioning (Diller, 1976; Lewinsohn, 1973; 
Diller et al., Note 2). Strategies for improv- 
ing body image perception, adapting to vis- 
ual field deficits (Diller et al., Note 2), and 
improving memory functioning (Lewinsohn, 
Danaher, & Kikel, 1977) have appeared, 
which provide excellent examples of what can 
be done when specific cognitive and perceptual 
retraining programs are attempted with brain- 
injured patients. Moreover, there is thera- 
peutic value in providing specific information 
concerning the nature of a patient’s deficits 
when consulting with patients, their families, 
and employers. A recent article by Fuld and 
Fisher (1977) makes this and other important 
points regarding the management of children 
with closed head injury. In general, the meth- 
odological problems in this area are similar to 
those discussed previously and will not be re- 
peated. However, as in all cases of remedia- 
tion, the major problem is identifying the 
factor(s) at work in change in behavior over 
time. Without adequate control groups it is
difficult to describe causal effects due to the 
training procedures used. A second problem 
will be to develop repeatable techniques in
which effects of practice can be distinguished
from improvement. Our prediction is that this 
area will gradually become one of the major 
thrusts of neuropsychology. 
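
One simple way to separate training effects from practice or spontaneous recovery is to compare the gain of a trained group with the gain of an untrained control group retested over the same interval. The Python sketch below shows that comparison. The scores are hypothetical, and a real study would also need a significance test and comparable groups.

```python
from statistics import mean


def training_effect(trained_pre, trained_post, control_pre, control_post):
    """Return (trained gain, control gain, gain attributable to training).

    The control gain estimates practice plus recovery with the passage of time;
    subtracting it from the trained gain leaves the part specific to training.
    """
    trained_gain = mean(post - pre for pre, post in zip(trained_pre, trained_post))
    control_gain = mean(post - pre for pre, post in zip(control_pre, control_post))
    return trained_gain, control_gain, trained_gain - control_gain


# Hypothetical memory-test scores before and after the retraining period.
trained_gain, control_gain, effect = training_effect(
    trained_pre=[10, 12, 9, 11],
    trained_post=[16, 17, 14, 17],
    control_pre=[11, 10, 12, 9],
    control_post=[14, 12, 15, 12],
)
print(f"trained gain = {trained_gain}, control gain = {control_gain}, "
      f"training effect = {effect}")
```

In this constructed example half of the trained group's improvement is matched by the controls, so only the remainder can reasonably be credited to the training procedure.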


Case Study Method


Although group studies in this area are cer- 
tainly desirable, the time is ripe for the ap- 
pearance of systematic individual case studies 
that report on the patterns of neuropsycho- 
logical recovery following various brain in- 
juries. (For an example of this, see Prigatano, 
Note 3.) Also, case studies that attempt to 
assess the efficacy of various treatment or re- 
habilitative efforts on improving rate and level 
of recovery are needed. The possible experi- 
mental designs that might be used in the 
single case study are described in detail by 
Barlow and Hersen (1973). A good example 
of how individual case studies can lead the 
way to more systematic group research is seen 
in the work of Lewinsohn and his associates. 
Lewinsohn et al. (Note 4) described in some 
detail a model for assessing memory deficits
and their treatment in individual patients.
Later, after several patients had been evalu-
ated and treated, group analysis of the data
was made, and some important points regard-
ing the value of visual imagery in working
with these patients were documented (Lewin-
sohn et al., 1977).

Also, the use of the case study method to
convey the human struggle involved in the
recovery process can be a powerful source of
inspiration to other workers in the field.
Luria’s (1972) description of one man’s life-
long determination to overcome neuropsycho-
logical deficits secondary to a left-temporal-
[illegible] injury is an account that few readers
will forget. It serves as an important reminder
of what can be done when human will and
[illegible] are combined.

The value of the case study method
in illustrating theoretical points is no better dem-
onstrated than in the famous case of H.M., in
which Milner, Corkin, and Teuber (1968)
showed that specific long-lasting deficits
in memory can take place without concom-
itant deficits in other global cognitive
functions (e.g., WAIS IQ scores). Other stud-
ies have used the case study method to
make theoretical as well as practi-
cal points (e.g., Geschwind &
Kaplan, [date illegible]; [name illegible], 1977).

With this background, it may be of some
value to review what goes into a methodologi-
cally sound case study and the advantages and
disadvantages of this approach. A methodo- 
logically sound individual case study clearly 
defines why the in-depth analysis of a given 
patient’s behavior is generally important to 
the field. Without this, it is simply an ex- 
plication of behavior that may or may not 
have been seen by other investigators and 
adds little scientific knowledge. Next, the case 
study must present a clear statement of the 
patient’s history, symptoms, and laboratory 
data to allow the reader to come to some con- 
clusions about the patient’s situation. If this 
cannot be agreed on, whatever point is being 
made by the case study will inevitably be ob- 
scured. Third, reliable and validated mea- 
sures, repeated observations, and/or experi- 
mental designs devised for individual case 
studies (see Barlow & Hersen, 1973) should 
be used whenever possible. This allows the 
reader to objectively evaluate data, not clini- 
cal impressions, about a patient. 

The advantages of the case study method 
are that it (a) provides a detailed look at 
several interacting variables, which is often 
impossible within the confines of group stud- 
ies; (b) allows for the appraisal of clinical as 
opposed to the purely “statistical” signifi- 
cance of a phenomenon; (c) shows documenta- 
tion of clinical material, which has theoretical 
importance; and (d) serves as a reminder to 
investigators that the field must eventually in- 
clude some explanation of the individual’s be- 
havior. The disadvantages are (a) the po-
tential unreliability of a single case and (b) 
the inability to quantify the effects of poten- 
tially relevant variables. With these considera- 
tions in mind, the use of the individual case 
study method can be a useful adjunctive 
method in the field of clinical neuropsychol- 
ogy. We hope to see more of such studies in
the future.


Conclusion 


Methodological problems in clinical neuro- 
psychological research are similar in some 
respects to those encountered in all of behav- 
ioral research. Among these are subject vari- 
ables such as age, education, sex and socioeco- 
nomic level, and experimenter characteristics, 
such as degree of training and attitude. Dif- 
ferent types of neuropsychological investiga-
tions pose different methodological problems 
while sharing certain common concerns. Stud- 
ies of differential diagnosis require attention 
to reliability of diagnosis, statements as to hit 
rates in relation to base rates, and reliability 
of tests. Cross-validation of new findings, espe-
cially if multivariate analyses are used, is
almost mandatory. Studies that attempt to
elucidate basic brain-behavior relationships
should be concerned with distinguishing gen- 
eral from specific effects, amount of dysfunc- 
tional tissue, duration of lesion, age of onset, 
type of disorder, handedness, presence of 
aphasic or visual field defects, and the pa- 
tients’ emotional reaction. Investigations of 
the effects of noxious agents on brain func- 
tion, especially differential effects that depend 
on different neuroanatomical regions, should 
contain tests of appropriate content (e.g., 
verbal vs. visuospatial) that are equivalent
in reliability and item difficulty. The emergent
field of rehabilitation of neuropsychological 
deficits will have a major problem in demon- 
strating that improvement in neuropsychologi- 
cal performance is due to the training and not 
to practice or recovery associated with pas- 
sage of time.


Common Methodological Problems in MMPI Research


James N. Butcher and Auke Tellegen 
University of Minnesota 


Research with the Minnesota Multiphasic Personality Inventory (MMPI) con- 
tinues at a high rate. Unfortunately, too many articles submitted and even ac- 
cepted for publication are methodologically weak. In this article we discuss some 
common methodological problems involving the use of the MMPI, encountered 
in the course of reviewing articles submitted for publication. A number of relevant
issues are discussed, and some suggestions for improving research designs
are made.


Research with the Minnesota Multiphasic 
Personality Inventory (MMPI) has continued 
at a high level in recent years. As a self-report 
inventory that includes several measures of 
psychopathology, the MMPI, or some deriva- 
tive of it, has become widely used as a descrip- 
tive instrument or a criterion measure in a 
vast array of clinical and research investiga- 
tions. In 1972 Buros reported that over 200 
books and articles on the MMPI are pub- 
lished annually. Dahlstrom, Welsh, and Dahl- 
strom (1975) cited over 6,000 references to 
the MMPI; Butcher and Pancheri (1976) re- 
ported over 600 recent references in cross-na- 
tional MMPI research alone. Butcher and 
Owen (in press) recently reviewed and clas- 
sified MMPI research for the past 5 years. As 
shown in Table 1, they found that over one 
fourth of the research was focused on two 
areas of “popular” investigation: alcohol and 
drug abuse, and crime and delinquency. Ap- 
proximately 37% of the studies focused on 
the use of the MMPI to study nonpsychi- 
atric populations—medical patients, parents, 
women, ethnic groups, college students, and 
the aged. The MMPI seems to have been 
widely accepted as a criterion measure of psy- 
chopathology by researchers who want to 
measure psychological problems in a variety
of nonpsychiatric groups. But studies pointing 
to the need for revision or modification of the 
MMPI are scarce. This absence of revision- 
oriented research is impressive, since the 
MMPI is around 40 years old and can be ex- 
pected to show some signs of aging. The only 
serious efforts to modify the MMPI have been 
directed at shortening it: About 12% of the 
published research in the past 5 years concerned
the development or use of shorter versions
of the instrument.

There are many reasons for the extensive 
use of the MMPI: Its administration is relatively
effortless; its scoring is objective; generally
straightforward objective interpretive
procedures are available; and its validity as
a criterion measure is comparatively well
founded. But some of the factors encouraging 
researchers to use a self-report personality 
measure such as the MMPI also result in the 
kinds of methodological flaws that this article 
may help to prevent. The MMPI is so easy 
to administer as a part of an ongoing study 
that researchers (even if they have little or 
no research background with the instrument) 
feel encouraged to include it. Furthermore,
since good criteria in psychopathology or personality
research are scarce, the MMPI scales
are sometimes too readily adopted as expedient
proximate criteria. (In some respects,
one might almost consider the MMPI an “attractive
nuisance” in the legal sense, to be approached
with appropriate caution.)

The remainder of this article will focus on
some methodological problems that are often
associated with research using the MMPI. We
hope to provide some guidelines that could
help avoid these difficulties, by providing minimal
criteria. Some of these criteria emerged
as a result of our experiences as reviewers of
MMPI manuscripts that were submitted for
publication in several psychological journals.
It will not be possible to provide an exhaustive
review of all pertinent issues nor to avoid
seeming at times somewhat dogmatic. Yet it
is not our intention to “dictate” which areas
of investigation are important or which ones
should be avoided, or to impose idiosyncratic
preferences for certain research methods when
equally defensible ones are available.


What Kind of Instrument Is the MMPI?


The MMPI is often mistakenly considered
to be an all-purpose personality assessment instrument
that is sensitive to “normal range”
personality attributes. Consequently, some researchers
use the MMPI with groups for
which a different instrument might be more
appropriate. The standard clinical MMPI
scales are measures of psychopathology, not
normal personality. The MMPI should not be
used to do what it is not designed to do.
Rather than use the MMPI inappropriately,
it would be better to consider alternative or
supplementary measures such as the California
Psychological Inventory (CPI) or Personality
Research Form, which focus on normal
personality dimensions.

Several kinds of MMPI measures can or
should be distinguished. The eight clinical scales and
the three validity scales plus Mf and Si are
the best known and most widely used
measures. In view of the many problems
encountered in the computation
of clinical scale scores, two suggestions are in
order:

1. In research computations more understandable
results will often be obtained if
one does not use the K correction, but uses K
as a separate indicator. This simply means
that the role of K as a “suppressor” is not assumed.
By contrast, appropriate multivariate
analysis of the data will permit determination
whether K in fact does function as a
suppressor in one’s particular data set. The K

Table 1
Classification of Recent Minnesota Multiphasic
Personality Inventory Research

Category                                              No. articles

Alcohol and drug abuse: detection and treatment            96
Short forms                                                66
Use with other tests or rating scales                      62
Crime and delinquency                                      56
Diagnostic considerations: rules, profiles,
  and code types                                           50
Psychometric characteristics                               49
Medical patients: physical disorders and symptoms          47
Correlation with diverse criteria                          46
Women                                                      42
Treatment variables and therapy outcome                    38
Depression and suicide                                     38
College students and adolescents                           34
Parents, couples, and families                             33
Drug therapy: choice and effect                            31
Race: cross-cultural, ethnic                               31
Automated interpretation                                   26
Employment screening and job performance                   24
Anxiety and stress                                         18
Brain damage                                               18
Sleep                                                      10
Sexual deviation                                            9
Aging                                                       5

Note. Adapted from Butcher and Owen (in press).


score will often be found to be an additional 
measure of good adjustment rather than a 
measure of invalid variance. If non-K-cor- 
rected T scores are used, use the correct con- 
version tables. Occasionally manuscripts are 
received containing figures with non-K-cor- 
rected profiles drawn according to incorrect 
norms. Do not use the raw score indications 
printed on the standard profile sheet unless 
your scores are actually K corrected! If you 
are plotting non-K-corrected scores for Scales 
Hs, Pd, Pt, Sc, and Ma and plan to display 
these on the standard profile sheet, it is neces- 
sary to obtain the correct T-score elevation 
from a conversion table (e.g., in Dahlstrom,
Welsh, & Dahlstrom, 1972) and draw the appropriate
elevation from the T-score indications
on the side of the profile sheet. In spite
of the above general recommendation about
using non-K-corrected scores, it is sometimes
useful for comparisons with other studies to
do a parallel set of analyses using K-corrected
scores.

1 Researchers interested in obtaining access to the
item response data of the original Minnesota standardization
sample should contact W. Grant Dahlstrom,
Department of Psychology, University of
North Carolina, Chapel Hill, North Carolina 27514.

2. Use raw scores in research computations 
rather than T scores unless the use of T scores 
is specifically indicated, for example, in the 
analysis of code types or when combining data 
from both sexes.
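
The difference between K-corrected and non-K-corrected raw scores can be sketched as follows. This is only an illustration: it assumes the conventional K fractions for the five corrected scales (.5 for Hs, .4 for Pd, 1.0 for Pt and Sc, .2 for Ma), which are not listed in the text above, and the function name and sample raw scores are hypothetical. Rounding conventions for the K fraction are deliberately left out.

```python
# Conventional K-correction weights (an assumption; not listed in the article).
K_WEIGHTS = {"Hs": 0.5, "Pd": 0.4, "Pt": 1.0, "Sc": 1.0, "Ma": 0.2}


def k_corrected(raw, k):
    """Return raw scale scores with the K fraction added where one applies."""
    return {scale: score + K_WEIGHTS.get(scale, 0.0) * k
            for scale, score in raw.items()}


# Hypothetical raw scores for one respondent, with a raw K score of 10.
raw_scores = {"Hs": 6, "D": 22, "Pd": 18, "Pt": 14, "Sc": 12, "Ma": 17}
corrected = k_corrected(raw_scores, k=10)
```

Keeping both the uncorrected and the corrected scores makes it straightforward to run parallel analyses and to enter K as a separate variable.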

The eight clinical scales were developed pri- 
marily by means of empirical keying methods, 
using clinical diagnoses as criteria. Because of 
the primary reliance on external criteria 
rather than on internal structure, these scales 
are for the most part quite heterogeneous. 
Consequently, although these scales, particu- 
larly when used in combination, provide use- 
ful information as to the diagnostic status of 
a person (Dahlstrom et al., 1972; Overall,
Note 1), they do not in any simple way
reflect “what the patient is saying,” that is,
how the patient is describing herself/himself
through the medium of the MMPI items.
This, too, is important information. Nowadays
we recognize more clearly than in the
heyday of empirical keying and “dust bowl 
empiricism” that important and clinically relevant
direct relations exist between the manifest
content of a person's self-description and
clinical [...]

2. use of so-called “critical items” (Grayson,
1951; Koss & Butcher, 1973; Koss,
Butcher, & Hoffmann, 1976).

Both approaches can serve a very useful role
complementary to that of the clinical scales.


Scale Proliferation 


The development of additional MMPI 
scales has been a rather popular pastime for 
psychologists. There are presently more 
MMPI scales than there are items on the in- 
ventory! Unfortunately, much of the MMPI 
scale development does not derive from a 
sound conceptual framework. Many scales 
have been constructed by contrasting different 
“samples of convenience” (often of heteroge- 
neous makeup or with important character- 
istics unknown). Often these scales are not 
cross-validated, and more often than not their 
psychometric properties and interrelations 
with other scales are not reported. Many new 
scales, since they are, after all, derived from 
the same item pool, prove to be largely redundant
alternative versions of existing scales,
although sometimes of poorer quality. 

A researcher who is tempted to add to this 
plethora of scales would find his or her contribution
to the MMPI literature better received
and used if:

1. The investigator has very carefully considered
the question of whether the MMPI
item pool adequately covers the domain of
content of the construct to be measured. If
the answer is not affirmative, then the MMPI
will obviously not serve the investigator's purpose
very well, and the new scale will reflect
more the characteristics of the MMPI item
pool than the domain of the intended construct.
We believe that this is in fact the case
with some new MMPI scales.

2. The investigator can show the scale to
be conceptually interesting.

3. The scale is developed and cross-validated
on reasonably large, well-defined samples
(whether it is an empirically keyed scale
or one derived by internal consistency methods,
e.g., factor analysis).

4. The internal structure of the scale is reported
(internal consistency, possibly factor
structure), and the relation to other MMPI
scales is reported. These relations along with
validity data would ideally reveal that the
scale either is a superior alternative to an
existing MMPI scale or is a measure of a disposition
or state not measured by existing
scales.

5. In addition to publishing statistics on
these relationships, the author reports substantial
external validational data. It is not
sufficient to know that a scale correlates appreciably
with a criterion, since, ideally, both
convergent and discriminant validity information
should be reported (Campbell & Fiske,
1959) for the new scale. It is also important
to know how well a scale predicts behavior for
the individual in various contexts. Thus some
data reporting prediction success and failure
in specific settings (e.g., in the form of hit-rate
tables) are desirable.

New MMPI scales, like many of the old
ones, should not be assumed to measure the
characteristics suggested by their names or by their
authors. The researcher should be aware of the
track record of a particular scale before he or
she gambles on it. Block (1965) provided an
interesting illustrative example:


„ ease of analysis should not mean casual- 
regard to the scales chosen for study. On oc- 
Scales have been employed in circumstances 
can only mean ignorance or naivete. These are 
Words but consider the following: several cor- 
Studies in the response set domain have 
d both the Prejudice (Pr) and Tolerance 
“Scales of the MMPI. Both of these scales are 
Gough (1951; 1952). The Pr scale was de- 
and validated as an MMPI measure to cor- 
With the California Ethnocentrism-Fascism 
le (Adorno et al., 1950), Later, Gough decided 
é the Pr scale slightly and for entirely ap- 
r conceptual reasons took the occasion to re- 
he scale as a measure of tolerance, reversing the 
n of scoring of the items. In the To scale, 29 
O items overlap with the Pr scale but scored 
tely, 
the Sample of scales selected for study can de- 
the shape of the results obtained, simple con- 
of availability must be bolstered by actual 
of candidate scales. To use both the Pr 
To scales in the same study and draw 
ty Conclusions from their opposite but identical 
La s is informative only about the in- 
. (p. 118) 


The MMPI wears other clothes as well.
Many scales in the personality research domain
have been generated from the MMPI
item pool or from constructs so close to those
measured by the MMPI that the items are
virtually the same. Often the new researcher
is unaware of the original relationships or
similarities and then conducts an “empirical” 
study rediscovering the affinity of the tests in 
question. A few instruments that are wholly 
or in part derived from the MMPI or are 
closely related are the CPI (over 200 items 
are common), the Taylor Manifest Anxiety 
Scale, and Lanyon’s Psychological Screening 
Inventory. 


MMPI Short Forms 


A large number of recent studies have used 
one of the several MMPI short forms. It
should be noted that the only MMPI short 
form recognized by the test authors and the 
publishers consists of a reduced number of 
items (around 400) that include all of the 
items required to score the 3 validity and 10 
standard scales. The items excluded are those 
at the end of the booklet that are not scored 
on the basic scales. Some short forms have 
been the object of a large amount of recent 
MMPI work and have been developed by 
varying methods and for different purposes 
(the Mini-Mult, Midi-Mult, the Faschingbauer
Abbreviated MMPI, MMPI-168, etc.).

Several recent studies have pointed out the 
limitations of MMPI short forms (Fillenbaum 
& Pfeiffer, 1976; Hedlund, Won Cho, & 
Powell, 1975; Hoffmann & Butcher, 1975). 
Although MMPI short forms may correlate 
significantly with the full MMPI, the resulting 
code-type congruence (hit rate) between the 
two forms is quite low (from 33% to 49%), 
too low to result in very similar individual clin- 
ical decisions. Although these studies indicate 
the need for caution in not accepting short 
forms at face value as adequate substitutions 
for full-length MMPI measures, many investi- 
gators continue to consider an MMPI short 
form as a near-equivalent set of scales. It is 
conceivable that some short forms could serve 
in a much more limited way as measures of
global psychopathology.


How to Find Relationships: Problems in 
Discovering and Appraising Relations 
Between the MMPI and 
Other Variables 

All MMPI research can be viewed as essentially
correlational research, whether the
MMPI scales are the “dependent” or “independent”
variables and whether we compute
correlations explicitly or, for example, do t
tests, comparing one group with another on 
certain MMPI measures (e.g., patients vs. 
normals), or determine hit rates obtained 
with certain MMPI-based diagnostic rules. 
Sometimes the strength of the relation investi- 
gated is immediately given by our statistics 
(e.g., correlation coefficient, hit rate), sometimes
it is not (t test, F ratio). It is known
but easily forgotten that some measure of 
strength of relation is always necessary if we 
are to determine the importance of our find- 
ings. Demonstrating statistical significance is 
necessary, but it is not enough. Not only 
should the magnitude of, for example, a corre- 
lation coefficient be considered but also its 
confidence interval. Yet one suspects that an 
inverse relation exists between the sample 
sizes used in published studies and the 
strength of the relations that are claimed to 
exist between certain variables. The reason, 
of course, is that a good way of “discovering” 
dramatically strong relationships in a set of 
data is to collect several measures, preferably 
those that can be expected to show some 
moderate interrelations, using a small sample. 
Chance alone will move a few of these ex- 
pectable relations up into the “strikingly 
high” range (the “trade-off” being that one 
can expect to “lose” some of the other relationships
by fluctuations in the opposite direction).
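
How easily chance produces a "strikingly high" correlation in a small sample can be approximated numerically. The sketch below is only an illustration under stated assumptions: it uses the Fisher z approximation (standard error 1/sqrt(n - 3)), and the true correlation of .30, the threshold of .60, and the sample sizes are hypothetical values chosen for the example.

```python
import math


def prob_r_exceeds(rho, threshold, n):
    """Approximate P(sample r > threshold) when the true correlation is rho,
    using the Fisher z approximation with standard error 1/sqrt(n - 3)."""
    se = 1.0 / math.sqrt(n - 3)
    z = (math.atanh(threshold) - math.atanh(rho)) / se
    return 0.5 * math.erfc(z / math.sqrt(2.0))  # upper-tail normal probability


# Hypothetical case: a moderate true relation of .30 and a cutoff of .60.
p_small = prob_r_exceeds(rho=0.30, threshold=0.60, n=15)
p_large = prob_r_exceeds(rho=0.30, threshold=0.60, n=200)
```

With 15 subjects the probability is roughly .09 per measure, so a study that collects about ten such measures can expect on the order of one inflated correlation by chance alone; with 200 subjects the probability is negligible.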

In short, sample sizes should be large 
enough to permit conclusions about the 
strength of the relationships under considera- 
tion that go beyond the inference that the null 
hypothesis is false. The rejection of this null
hypothesis is a trivial contribution, particu- 
larly if the observed relationship is in the 
(more or less) expected direction, because, as 
has been pointed out, we practically know the
null hypothesis to be false: almost everything
is somewhat related to everything
(Lykken, 1968; Meehl, 1967). Example: Suppose
one collects data on a (small) sample of
19 subjects and “discovers” a somewhat unexpected
correlation of .50 between Scale 7 and
a psychophysiological measure of “arousal,”
skin conductance level. The corresponding z
value of .50 is .55 with a standard error of .25.
The .05 (two-tailed) confidence interval for
this correlation, then, in z values, is .55 ± .49.
In correlational values the corresponding 
range is between .06 and .78. The finding of 
a possibly very weak correlation, one that is 
only significantly greater than .06, we submit, 
is hardly a contribution (particularly not in 
this case, since the direction of the relation 
was more or less anticipated), in spite of its 
statistical significance and observed magnitude.
Large sample sizes are essential to
achieve less trivial results. Admittedly, if an 
observed relation is definitely not in the ex- 
pected direction, and is definitely “significant” 
(even though the estimate of its strength still 
has a large margin of error), and involves a 
truly interesting relationship, then we might 
speak of a finding that deserves publication at 
this point to stimulate further research in 
what now appears to be a promising area. 
Nevertheless, it would still be better to follow 
up one’s own findings with a replication per- 
mitting a more precise estimate and demon- 
strating a relationship of nontrivial strength. 
(After all, the more interesting our findings 
are to us, the less expected they must have 
been, and therefore the less reason we ap- 
parently have to assume their replicability 
without further test!)
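
The confidence interval in the Scale 7 example can be reproduced with a few lines of code. This is a minimal sketch using the Fisher z transformation described in the example; the function name is hypothetical, and the critical value 1.96 is the usual two-tailed .05 value.

```python
import math


def fisher_ci(r, n, z_crit=1.96):
    """Confidence interval for a correlation via the Fisher z transformation."""
    z = math.atanh(r)            # r = .50 gives z of about .55
    se = 1.0 / math.sqrt(n - 3)  # n = 19 gives a standard error of .25
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


low, high = fisher_ci(0.50, 19)
```

For r = .50 and n = 19 the limits come out at about .06 and .78, as in the text. Calling `fisher_ci(0.50, 200)` gives roughly .39 to .60, which shows how much a larger sample narrows the estimate.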

One fairly common practice is to calculate 
multiple correlations, often on a small sample, 
and to announce their statistical significance 
without reminding the reader that the magni- 
tude of the obtained multiple correlations may 
be substantially inflated. Without cross-vali- 
dation or an appropriate “shrinkage” estimate, 
this procedure can be grossly misleading. (See 
Schmitt, Coyle, & Rauschenberger, 1977, for 
a recent discussion of this topic.) 
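
A "shrinkage" estimate can be illustrated with the common adjusted R-squared formula. The article does not say which shrinkage formula is meant, so the choice of formula here is an assumption, and the numbers are hypothetical.

```python
def adjusted_r_squared(r_squared, n, k):
    """Shrinkage estimate of the population squared multiple correlation
    for n subjects and k predictors (the common adjusted R-squared formula)."""
    return 1.0 - (1.0 - r_squared) * (n - 1) / (n - k - 1)


# Hypothetical case: R = .60 (R squared = .36) with 10 predictors.
shrunk_small = adjusted_r_squared(0.36, n=30, k=10)
shrunk_large = adjusted_r_squared(0.36, n=300, k=10)
```

With 30 subjects the estimate shrinks from .36 to about .02; with 300 subjects it shrinks only to about .34. Cross-validation on a new sample remains the more direct check.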

Almost all MMPI research is multivariate 
(whether or not multivariate statistics are actually
used). This, in combination with the
frequent availability of “samples of conve- 
nience” (consisting, e.g., of individuals who 
have completed the MMPI as part of a screen- 
ing procedure in some clinical or counseling 
setting), inevitably invites “look-see” or
“shotgun” studies, guided only by the in- 
vestigator’s vague hope of finding something 
somewhere among the many possible relations 
that might be subjected to scrutiny. Particularly
if the sample is small, chance findings
will often occur, and if one’s sample of convenience
is, in addition, poorly defined, then
even nonchance findings may well prove not
replicable in the absence of clear understanding
of to what population the results are to
be generalized. Unfortunately, a rather large
number of manuscripts describing “one shot” 
studies using small and haphazardly collected 
samples continue to be submitted for publica- 
tion—and to appear in print. 

Another undesirable practice is to boost
sample sizes by lumping together subjects differing
on some potentially important characteristics
such as sex, age, race, and socioeconomic
status (Carlson, 1971). Sometimes
these variables may in fact prove not to be
important for the relations of interest to the
investigator. But this cannot be assumed. At
the very least sex ought to be included among
one’s variables. However, this would still not
allow for the possibility that relationships between
the MMPI and other variables, for example,
response to treatment or diagnostic
status, might differ for the two sexes, for different
age groups, and so on. If sample sizes
permit, the simplest first step would be to subdivide
one’s sample into relatively homogeneous
subgroups in respect to major differentiators
(sex, age, and so forth) and to evaluate
the relationships in the different subgroups
for possible differences.
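
The subgroup check described above can be sketched as follows. The records, field names, and helper functions are hypothetical and serve only to show how a pooled correlation can hide opposite relations in two subgroups.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def r_by_subgroup(records, group_key, x_key, y_key):
    """Compute the x-y correlation separately within each subgroup."""
    groups = {}
    for rec in records:
        groups.setdefault(rec[group_key], []).append(rec)
    return {g: pearson_r([row[x_key] for row in rows],
                         [row[y_key] for row in rows])
            for g, rows in groups.items()}


# Hypothetical records: a scale score and an outcome rating, by sex.
records = [
    {"sex": "F", "scale": 10, "outcome": 1},
    {"sex": "F", "scale": 14, "outcome": 2},
    {"sex": "F", "scale": 18, "outcome": 3},
    {"sex": "M", "scale": 10, "outcome": 3},
    {"sex": "M", "scale": 14, "outcome": 2},
    {"sex": "M", "scale": 18, "outcome": 1},
]
by_sex = r_by_subgroup(records, "sex", "scale", "outcome")
pooled = pearson_r([rec["scale"] for rec in records],
                   [rec["outcome"] for rec in records])
```

In this contrived data set the pooled correlation is zero although the relation is perfectly positive in one subgroup and perfectly negative in the other. Real subgroups need to be large enough for each correlation to be estimated with reasonable precision.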

Because of the MMPI “code type” tradition,
it is not uncommon to analyze relationships
by presenting MMPI correlates in the
form of relevant base rates or means associated
with each of several distinct code types.
(The atlases of Marks, Seeman, & Haller, 1974,
and Gilberstadt & Duker, 1965,
are well-known examples.) One distinct potential
advantage of this approach is that certain
“configural” relations, reflecting interactions
among MMPI scales in relation to criterion
variables, will be used. For example, if
correlates of Scale 7 differ depending on the
scores on additional scales, say 2, 4,
and 8, then the use of different code types such
as 2-7-8, and so on, permits one to
capitalize on these interactions.


Research centered on code types has disadvantages,
too. One is that some information is inevitably
lost when a profile is assigned to a
code type. Another disadvantage is the small
number of classifiable subjects for many code
types, even in rather extensive studies. These 
classification problems are often unnecessary, 
since many important relations between 
MMPI and external criteria are often not con- 
figural but essentially linear. That is, impor- 
tant relations may be adequately described by 
the linear relations existing in the total sample 
between one or more MMPI variables and 
other measures. One example is Goldberg’s 
(1965) analysis of methods of distinguishing 
psychotic from neurotic patients on the basis 
of the MMPI. Goldberg showed that a simple 
linear combination of several clinical scales 
did just as well as the highly configural 
Meehl-Dahlstrom rules. Although the Gold- 
berg analysis is one of the few well-docu- 
mented examples, we have little doubt that 
(e.g., in the material from which the Marks et
al., 1974, and Gilberstadt & Duker, 1965,
atlases have been derived) a number of linear 
relations between single or several MMPI 
scales and important nontest data are hidden, 
other than the psychotic versus neurotic dis- 
tinction. One example is an apparent straight- 
forward (linear) relationship between scores 
on Scale 4 and alcoholism in the Gilberstadt 
and Duker (1965) sample (pointed out by 
them). This example illustrates another dis- 
advantage of exclusively relying on the “code- 
type approach”: The method may exploit but 
does not reveal the nature of the relations, 
linear or nonlinear, that may exist in one’s 
data. 
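
A linear combination of scale scores of the kind Goldberg examined can be expressed very compactly. The sketch below assumes the composite widely cited as the Goldberg index, (L + Pa + Sc) - (Hy + Pt) on T scores, with a cutoff of 45; neither the formula nor the cutoff appears in the text above, so both should be treated as assumptions, and the two profiles are hypothetical.

```python
def goldberg_index(t):
    """Linear composite of T scores: (L + Pa + Sc) - (Hy + Pt)."""
    return (t["L"] + t["Pa"] + t["Sc"]) - (t["Hy"] + t["Pt"])


def classify(t, cutoff=45):
    """Label a profile by comparing the composite with a cutoff (assumed 45)."""
    return "psychotic" if goldberg_index(t) >= cutoff else "neurotic"


# Hypothetical T scores for two profiles.
profile_a = {"L": 50, "Pa": 75, "Sc": 80, "Hy": 60, "Pt": 65}
profile_b = {"L": 45, "Pa": 55, "Sc": 58, "Hy": 72, "Pt": 70}
labels = [classify(profile_a), classify(profile_b)]
```

Unlike a set of configural rules, a composite of this kind states the nature of the relation explicitly: each scale contributes additively, with a fixed sign and weight.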

Goldberg’s monograph is one example of 
how such relations can be pursued systemati- 
cally, albeit in this case with negative results 
with respect to nonlinear relations. However, 
we believe that even nonlinear relations may 
emerge from systematic analyses provided the 
search is guided by a certain degree of the- 
orizing permitting a focus on a smaller set of 
possible configurations, thus reducing the 
probability of drowning a few actual con- 
figural relations in a sea of unreplicable chance 
relationships. The reason is that without some 
theoretical constraints, the number of con- 
figural patterns that would have to be con- 
sidered in an exhaustive search easily becomes 
extremely and unmanageably large, thus re- 
quiring unattainably large sample sizes to 
minimize the occurrences of chance patterns. 

We find, then, that code and profile types
provide convenient ways of assimilating the
information contained in a set of MMPI
scores. But in spite of the continuing advocacy
(and plausibility) of typological classifications,
acceptance of such classifications should
be provisional rather than uncritical. We still
do not know a great deal about how much
and when type membership would add to
the information already contained in a (more
parsimonious and psychometrically conservative)
dimensional predictor, for example, an
appropriate linear regression formula. Research
on this question seems quite feasible
and could give interesting results.


Assessing Profile Change 


The measurement of change is methodologi- 
cally complex, and the researcher engaged 
in such measurement should become aware 
of the difficulties involved (Cronbach & 
Furby, 1970; Fiske & Rice, 1955; Mauger, 
1972; see especially Dahlstrom et al., 1972, 
chap. 7). We cannot review all of the relevant 
issues in the limited space of this article. 
However, it is important to keep in mind that 
a simple difference in MMPI mean scale 
scores on retest may not indicate that the in- 
dividuals involved have necessarily changed 
as a result of some interposed treatment. Fa- 
miliarity with and critical understanding of 
such concepts as “residual gain” are essential 
(e.g., Cronbach & Furby, 1970). To balance
somewhat the note of pessimism in Cronbach 
and Furby’s article, it should be pointed out 
that experimental designs containing com- 
parison groups can be effective tools for un- 
derstanding change. One substantive issue 
that the investigator should keep in mind is 
that MMPI measures appear to tap a mixture 
of changeable and stable characteristics. Some 
of the scales (e.g., D) appear to be reflective
of “state” as well as “trait” characteristics 
and are sensitive to mood changes. Other 
scales focus on “biographical” factors (Si or 
Pd) and seem relatively more stable. What is 
the meaning of a particular MMPI scale 
change? If not a statistical artifact, does it 
reflect a change in affective state, a perma- 
nent change, a change in self-presentation? 
In addition, a change on a particular scale 
is not necessarily best described in terms of
the name of the scale in question. For exam- 
ple, a significant change on the Pd scale may 
be due primarily to changes on a subset of Pd 
items reflecting a negative mood state; con- 
tent-homogeneous MMPI scales are certainly 
less potentially misleading in the interpreta- 
tion of change than the more heterogeneous 
clinical scales. These problems should be ad- 
dressed in any thoughtful discussion of MMPI 
data pertaining to change. As a general point, 
it is good to keep in mind that some scales 
may be useful for reflecting change, say, re- 
sulting from intervention, but other scales 
may be more useful for predicting response 
to treatment (without necessarily reflecting 
the effects of treatment). 
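
The idea of "residual gain" can be made concrete with a short sketch: each person's retest score is compared with the retest score predicted from the initial score by a regression line fitted to the sample. The data and function name are hypothetical, and the sketch does not address the reservations about such scores raised by Cronbach and Furby (1970).

```python
def residual_gains(pre, post):
    """Residualized gain: post score minus the post score predicted
    from the pre score by a least-squares line fitted to the sample."""
    n = len(pre)
    mean_pre, mean_post = sum(pre) / n, sum(post) / n
    sxy = sum((x - mean_pre) * (y - mean_post) for x, y in zip(pre, post))
    sxx = sum((x - mean_pre) ** 2 for x in pre)
    slope = sxy / sxx
    intercept = mean_post - slope * mean_pre
    return [y - (intercept + slope * x) for x, y in zip(pre, post)]


# Hypothetical raw scores on Scale 2 (D) before and after treatment.
pre = [30, 26, 22, 34, 28]
post = [24, 25, 20, 27, 19]
gains = residual_gains(pre, post)
```

By construction these residuals average zero and are uncorrelated with the initial scores, so they describe change relative to what the sample as a whole would lead one to expect rather than absolute change. A comparison group is still needed to attribute the change to treatment.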

On the whole it may be true that the origi- 
nal method of developing the MMPI favored 
the selection and construction of relatively 
stable items and scales, respectively. None of 
the original work was directly focused on mea- 
suring change. Subsequent efforts at using 
MMPI items to detect changeability or to 
develop scales that would give clues to poten- 
tial for change have not met with much suc- 
cess (Mauger, 1972; Pepper, 1964). The 
“stability” of the MMPI is shown by the 
fairly consistent finding that about 87% of 
the MMPI items are answered in the same 
direction on retest (Butcher & Gur, 1974; 
Goldberg & Jones, 1969; Schofield, 1948; 
Ullmann & Wiggins, 1962). Consequently, 
the number of items that can be expected to 
vary on retest is relatively small. 
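
The item-stability figure cited above is simply the proportion of items answered in the same direction on two occasions. A minimal sketch with hypothetical responses:

```python
def proportion_same_direction(test, retest):
    """Share of items answered the same way (True/False) on both occasions."""
    if len(test) != len(retest):
        raise ValueError("test and retest must cover the same items")
    same = sum(a == b for a, b in zip(test, retest))
    return same / len(test)


# Hypothetical answers to eight items on two occasions.
first = [True, False, True, True, False, False, True, False]
second = [True, False, False, True, False, False, True, False]
stability = proportion_same_direction(first, second)
```

Here seven of eight answers agree, a proportion of .875.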


Reporting Group Data 


What do we know from a group mean pro- 
file that summarizes a set of MMPI scores? 
Can we assume that if the group average is
a “278” profile type that the established personality
correlates for the “278” profile necessarily
fit the group as a whole or some subset
of individuals in the group? No, yet sometimes
we fall into the trap of drawing such a conclusion.
It may actually be the case that no
individual in the group has the code type corresponding
to the group mean score. Researchers
who are not content to present
merely the means and standard deviations of
individual scales could report the percentages
of the different code or profile types occurring
in their sample.
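
That a group mean profile may correspond to no individual in the group is easy to demonstrate. In the sketch below the code type is defined, as a simplifying assumption, as the two highest scales in descending order; the T scores are hypothetical.

```python
def code_type(profile, n_points=2):
    """Return the highest scales, in descending order, joined by hyphens."""
    ranked = sorted(profile, key=profile.get, reverse=True)
    return "-".join(ranked[:n_points])


def mean_profile(profiles):
    """Average each scale across a list of profiles."""
    return {s: sum(p[s] for p in profiles) / len(profiles)
            for s in profiles[0]}


# Hypothetical T scores on Scales 2, 4, and 7 for two individuals.
group = [
    {"2": 92, "4": 60, "7": 40},
    {"2": 40, "4": 60, "7": 90},
]
individual_codes = [code_type(p) for p in group]
group_code = code_type(mean_profile(group))
```

The two individuals have code types 2-4 and 7-4, whereas the mean profile has code type 2-7, which describes neither of them. Tabulating the individual code types, as suggested above, avoids this trap.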

In summary, the MMPI continues to be
a widely used research instrument as well as
a clinical assessment inventory. Reasons for
its wide application include its easy administration,
its objective scoring, the volume of
research demonstrating its validity, and the automation
of its interpretation. The MMPI is
such an easily used research instrument that
it is sometimes misapplied or the data obtained
with it are incorrectly analyzed by researchers
who are not familiar with some of
its limitations or peculiarities. The MMPI
can play and has played a useful role in significant
research. The main prerequisites are
the investigator’s long-term and genuine concern
with an important problem and her/his
ability to draw on the store of substantial
methodological knowledge and wisdom, accumulated
in the field of personality assessment
in general (e.g., Wiggins, 1973) and in
the area of MMPI research in particular
(e.g., Dahlstrom et al., 1972, 1975).


Reference Note


1. Overall, J. E. Implementation of an actuarial
diagnostic program in a clinical setting. Paper presented
at the 11th Annual Symposium on Recent
Developments in the Use of the MMPI, Minneapolis,
April 1976.


Methodological and Interpretive Problems of
Single-Case Experimental Designs 


Alan E. Kazdin 


Pennsylvania State University 


Single-case experimental designs have been used with increasing frequency in 
clinical research. These designs are uniquely suited to evaluating treatment effects
with individual clients. Although treatment is evaluated by comparing
baseline and treatment phases, the manner in which this is accomplished varies 
as a function of the specific design. Typically, the comparison is replicated over 
time (ABAB design) or across different behaviors (multiple-baseline design). 
Several methodological problems frequently arise in single-case designs, such 
as deciding when to alter phases or conditions in the experiment, ensuring that 
the intervention is implemented, comparing alternative treatments unconfounded 
by sequence effects, and ensuring that data are collected reliably. Many interpre- 
tive problems of single-case designs stem from the criteria used to evaluate 
treatment. Whether treatment produced a reliable effect usually is determined 
by visual inspection of the data. In addition, single-case research has been con- 
cerned with the clinical importance of treatment effects. The ambiguity of these 
criteria, relative to statistical tests used in group designs, presents unique prob-
lems for evaluating treatment. The present article considers methodological and
interpretive problems that frequently arise in single-case experiments.


Single-case experimental designs are used
increasingly in clinical research. Although
these designs can evaluate interventions with
groups of subjects, their unique contribution
is that they can experimentally evaluate in-
terventions for the individual client. Hence,
the designs offer a distinct methodological ad-
vance over the traditional uncontrolled case
study, which provides only suggestive infor-
mation about the effects of treatment on a
client’s behavior.

Several characteristics and problems of
single-case experimental designs are still un-
familiar to many investigators who might
profit from their use. The designs and their 
execution are not inherently complex. How- 
ever, they do differ in important ways from 
traditional group research with which most 
investigators are familiar. The present article 
describes several methodological problems 
that frequently arise in single-case experi- 
ments. These encompass planning the re- 
search, making decisions during the investi- 
gation, and evaluating the results. To under- 
stand these methodological problems, the 
basic rationale of single-case experiments and 
the manner in which intervention effects are 
demonstrated need to be discussed briefly. 


Overview of Single-Case Experimental
Designs


Basic Rationale


The rationale of single-case designs is simi- 
lar to traditional group research. In tradi- 
tional experimentation, the effect of an inter- 
vention is assessed by comparing performance 
under the influence of different levels of a
given variable. Typically, different groups of 
subjects or the same group are exposed to dif- 
ferent experimental conditions. The essential 
feature of traditional research, as well as 
single-case research, is a comparison of per- 
formance under different conditions. The man- 
ner in which the comparison is made in single- 
case experiments departs from traditional 
designs. 

Single-case experimental designs usually be- 
gin by observing a client’s behavior before 
treatment for a period of at least several 
days. This period, referred to as a baseline 
phase, serves a twofold purpose. First, the 
data collected during baseline describe the 
existing level of performance and hence pro- 
vide information about the extent of the 
client’s problem. Second, the data serve as a 
basis for predicting the level of performance 
for the immediate future if treatment is not 
provided. Even though the descriptive func-
tion of baseline data is important for charac-
terizing the extent of the problem, from the
standpoint of design, the predictive function 
is central. 

To evaluate treatment, it is important to 
have a clear idea of what behavior would be 
like in the immediate future with no inter- 
vention. A projection of baseline performance 
into the future is the implicit criterion against 
which treatment is evaluated. Thus, a stable 
level of performance during baseline is impor- 
tant before beginning treatment. A stable rate 
of performance is characterized by the absence 
of trend (slope) in the data and only slight 
or moderate variability in performance. Once 
a stable rate is obtained, treatment can be 
implemented. Assessment of behavior is con-
tinued while treatment is in effect to deter-
mine whether performance departs from base-
line levels. If treatment is effective, the actual
level of behavior should deviate from the
projected level of behavior from baseline per-
formance. Data during the treatment phase
serve another function. These data also pro-
vide a new level of performance and can be
used to predict how the client will behave in
the immediate future if treatment is con-
tinued. After performance stabilizes, the treat-
ment can be withdrawn to reassess whether
performance under these conditions deviates
from the predicted level. Also, by withdrawing
treatment, the investigator tests directly
whether the original baseline level of perform-
ance would have continued at a given level.

Essentially, data in separate phases of 
single-case designs provide information about 
present performance, predict the probable 
level of future performance, and test the ex- 
tent to which predictions of performance from 
previous phases were accurate. By repeatedly 
altering experimental conditions in the de- 
sign, there are several different opportunities 
to compare phases and to test whether per- 
formance is altered by the intervention. If 
performance changes in response to alterations 
of the experimental conditions, the change 
can be more parsimoniously accounted for by 
the experimental conditions than by extra- 
neous events. 
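
The logic of these phase comparisons can be sketched in a few lines of Python. The sketch is illustrative only: the daily counts are invented, the phase labels (A1, B1, A2, B2) and the function name are assumptions, and the projected level for each phase is taken to be the mean of the preceding phase, which presupposes that the preceding phase was stable.

```python
from statistics import mean


def phase_changes(phases):
    """Return the change in mean level at each phase transition.

    `phases` is an ordered list of (label, observations) pairs, for
    example the four phases of an ABAB design. The projected level for
    a new phase is the mean of the phase before it (an assumption that
    only makes sense when that earlier phase shows no trend).
    """
    changes = []
    for (prev_label, prev_data), (label, data) in zip(phases, phases[1:]):
        projected = mean(prev_data)  # predicted level if conditions had not changed
        observed = mean(data)        # level actually obtained in the new phase
        changes.append((prev_label + "->" + label, observed - projected))
    return changes


# Hypothetical daily counts of a target behavior (made-up numbers).
abab = [
    ("A1", [2, 3, 2, 3]),
    ("B1", [7, 8, 8, 9]),
    ("A2", [3, 4, 3, 2]),
    ("B2", [8, 9, 9, 8]),
]

for transition, change in phase_changes(abab):
    print(transition, change)
```

In this invented record, the level rises when treatment is introduced, falls when it is withdrawn, and rises again when it is reinstated, which is the pattern of replicated change that the design looks for.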


Illustrative Designs 


The manner in which treatment effects are 
demonstrated varies as a function of different 
designs. There are several designs, each of 
which includes many variations (see Baer, 
Wolf, & Risley, 1968; Hartmann & Hall, 
1976; Hersen & Barlow, 1976; Kazdin, 
1978b; Leitenberg, 1973; Ulman & Sulzer- 
Azaroff, 1975). Two designs warrant brief 
mention because they are the most commonly 
used in single-case experimentation and be- 
cause they serve as a useful point of depar- 
ture for describing methodological problems 
for single-case experiments. These designs are
the ABAB, or reversal, design and the multi- 
ple-baseline design. 

The ABAB design evaluates an interven- 
tion by alternating the baseline condition (A 
phase) when no treatment is in effect with the 
intervention (B phase). Baseline data are 
collected until the client’s performance sta- 
bilizes. At this point, the intervention is im- 
plemented. When performance again stabilizes 
or shows a trend that clearly departs from 
the projected performance of baseline, the in- 
tervention usually is withdrawn (second A 
phase). Finally, as performance shifts to a
new level, the intervention is reinstated (sec-
ond B phase). Typically, behavior again
changes in response to the intervention. Sys-
tematic changes in behavior associated with
variation of the experimental conditions, par-
ticularly when replicated at different points
in the design, strongly suggest that the inter-
vention accounts for the results.

The multiple-baseline design, used less fre-
quently than the ABAB design, demonstrates
the effect of the intervention without with-
drawing treatment. In this design, the effect
of the intervention is demonstrated by show-
ing that behavior change accompanies intro-
duction of the intervention at different points
in time. Although there are different versions
of the design, a commonly used version is
the multiple-baseline design across behaviors.
In this version, baseline data are gathered
across two or more different responses of a
single subject (or group of subjects). After
each behavior shows a stable rate, the inter-
vention is applied to only one of the behav-
iors. Baseline conditions remain in effect for
the other behavior(s). Typically, the behavior
to which the intervention was applied changes,
whereas the other behaviors remain at base-
line levels. When all responses show a stable
rate, the intervention also is applied to the
second behavior; remaining behaviors con-
tinue under baseline conditions. This proce-
dure is continued so that the intervention is
introduced to one response at a time. A causal
relation between the intervention and behav-
ior is demonstrated if each response changes
when and only when the intervention is intro-
duced. To the extent that a change in behavior
at the point the intervention is introduced is
replicated across several behaviors, this provides
a demonstration that the intervention is
responsible for behavior change.

There are other designs that demonstrate
treatment effects in ways that vary in de-
gree from the ABAB and multiple-baseline
designs described above. In each of
these designs, the fundamental comparison to
evaluate treatment is the performance of
baseline and intervention phases. The com-
parison is based on replications within
the design so that repeated opportunities
are available to assess whether the interven-
tion accounts for change. It is important to
note that the basic rationale for evaluating
treatment transcends the diverse designs,
because ambiguities in results of single-case
experiments frequently derive from not meet-
ing the conditions required by the designs.


Common Methodological Problems 


Several methodological problems are com- 
mon in investigations using single-case ex- 
perimental designs. Many problems are inde- 
pendent of the specific designs that are used. 
Hence, it is not important to fully detail the 
specific designs available in single-case ex- 
perimentation as long as the rationale is il- 
lustrated with samples of the most frequently 
used designs. Areas in which problems fre-
quently arise are altering phases in the de-
signs, checking whether the intervention was
manipulated correctly, comparing alternative 
interventions, and gathering reliable data on 
client behavior. 


Altering Phases or Conditions 
During the Experiment 


Traditionally, research designs are pre- 
planned so that most of the details are ar- 
ranged before subjects are studied. In single- 
case experimental designs, a number of im- 
portant decisions can be made only as the 
data are collected. Decisions such as how 
long baseline data should be collected and 
when to present or withdraw experimental 
conditions are made during the investigation 
itself. The experimenter needs to decide when 
to alter phases in the design in such a way as 
to maximize the clarity with which inferences 
can be drawn about the intervention. Each of 
the single-case designs usually begins with a 
baseline phase. Treatment is evaluated ulti- 
mately by comparing performance across 
baseline and intervention phases, as described 
earlier. For these comparisons to be made 
easily, one has to be sure that the changes 
from one phase to another are likely to be 
due to the intervention rather than to a con- 
tinuation of an existing trend. A fundamental 
design issue is when to change phases to maxi- 
mize the clarity of data interpretation. 

There are no clear rules for altering phases.
Understandably, investigators differ in when
they alter conditions. Yet, the point at which
conditions are changed is a very important
issue, because subsequent evaluation of inter-
vention effects depends completely on how
clear the behavior changes are across phases.
The usual rule of thumb is to alter phases
when the data are stable. Stability refers to
the absence of trend and relatively small vari- 
ability in a given level of performance. Trends 
and extensive variability during any of the 
phases, particularly baseline, can interfere 
with evaluating treatment. 
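
One way to make this notion of stability concrete is sketched below in Python. It is a minimal sketch under stated assumptions: trend is summarized by the least-squares slope across sessions, variability by the range of the observations, and both cutoffs (`max_abs_slope`, `max_range`) are hypothetical values that an investigator would have to choose; the article specifies no numeric criteria.

```python
def slope(values):
    """Least-squares slope of the values against session number 0, 1, 2, ...

    Requires at least two observations.
    """
    n = len(values)
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    return sxy / sxx


def is_stable(values, max_abs_slope, max_range):
    """True when a phase shows little trend and little variability.

    Both thresholds are illustrative assumptions, not published criteria.
    """
    little_trend = abs(slope(values)) <= max_abs_slope
    little_variability = (max(values) - min(values)) <= max_range
    return little_trend and little_variability


# Hypothetical baselines (made-up numbers).
print(is_stable([5, 6, 5, 6, 5], max_abs_slope=0.2, max_range=2))    # flat, narrow
print(is_stable([2, 4, 6, 8, 10], max_abs_slope=0.2, max_range=2))   # clear trend
print(is_stable([0, 10, 0, 10, 0], max_abs_slope=0.2, max_range=2))  # no trend, wide swings
```

The third series illustrates why both conditions are needed: its slope is zero, yet its variability would make any projection of baseline performance uninformative.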

Trends in the data. Ideally, baseline data
should show no trend or slope prior to im-
plementing the intervention. One of two differ-
ent trends may be apparent. First, behavior
may be changing in the direction opposite 
from that which is to be achieved during the 
intervention. For example, a psychotic patient 
may show a reduction in rational statements 
during baseline. Because the intervention will 
attempt to alter behavior in the opposite di- 
rection (increase rational statements), this 
initial trend in baseline is not likely to inter- 
fere with subsequent conclusions about the 
intervention. As a general rule, when the in- 
tervention is designed to change behavior in 
a direction opposite from the trend in base- 
line, the trend is not problematic. The rule 
can be extended to all changes in conditions 
so that trends, opposite from what the antici- 
pated data will show in the next phase, 
should not interfere with drawing inferences 
about treatment. 

In contrast, the baseline trend may be in 
the same direction that the intervention is 
likely to produce. Essentially, baseline may
show improvements in behavior, and one 
might question the need to intervene at all. 
However, even if improvements are made, as 
might be the case with an autistic child’s 
severe self-injurious behavior during a base-
line period when attention for such behavior
is inadvertently withheld, the changes may
be so slow that some intervention is needed.
For example, a child engaging in frequent
head banging might gradually decrease the
response, but the reduction could be so grad- 
ual that serious self-injury might be inflicted 
unless the behavior is quickly eliminated. 
Despite the desirability of intervening in 
many situations in which baseline trends 
move in the direction of therapeutic change, 
evaluating the effect of the intervention in
these situations is extremely difficult. The in-
tervention has to produce very marked change
to draw unambiguous conclusions. Because of
this difficulty in evaluating interventions with
systematic trends in the direction of thera- 
peutic change in baseline, the usual recom- 
mendation is to wait for baseline to stabilize 
so that there is no trend before intervening 
(Baer et al., 1968). This cannot be done in 
many clinical situations in which treatment 
is needed quickly. 

When baseline shows a trend toward im-
provement, three major alternative strategies
are available. First, an ABAB design can be
used in which a procedure for changing be- 
havior in the opposite direction can be alter- 
nated with the intervention. For example, 
baseline can consist of reinforcing decreases 
in the desirable behavior, whereas interven- 
tions can reinforce increases in the same be- 
havior. This alternative can dramatically dem- 
onstrate that treatment accounted for change 
(e.g., Ayllon & Haughton, 1964). The design
is experimentally sound but clinically untena- 
ble because it includes specific provisions for 
making the client’s behavior worse.

A second solution is to select designs in 
which trends in the data are unlikely to affect 
each of the baselines observed (multiple-base- 
line design) or in which trends are not rele- 
vant to evaluate different treatments (simul- 
taneous treatment design). A third solution 
is to use statistical techniques that can eval- 
uate the effect of the intervention relative to 
the baseline trend. Specific statistical tech- 
niques such as time-series analysis can take 
into account baseline trends and assess 
whether the intervention has made a reliable 
change over and above what would be ex-
pected from continuation of the trend (see
Glass, Willson, & Gottman, 1974; Jones, 
Vaught, & Weinrott, 1977; Kazdin, 1976). 
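
The idea of judging the intervention against a continued baseline trend can be illustrated with a deliberately simplified Python sketch. It is not the time-series analysis described in the sources cited above, which also models the serial dependency among observations; it merely fits a straight line to baseline, extends that line into the intervention phase, and averages the departures from it. The function names and data are hypothetical.

```python
def fit_line(values):
    """Least-squares intercept and slope for values at sessions 0, 1, 2, ...

    Requires at least two observations.
    """
    n = len(values)
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    b = sxy / sxx
    return mean_y - b * mean_x, b


def change_beyond_trend(baseline, intervention):
    """Mean departure of intervention data from the extended baseline trend."""
    intercept, b = fit_line(baseline)
    start = len(baseline)  # session index of the first intervention observation
    departures = [
        y - (intercept + b * (start + i)) for i, y in enumerate(intervention)
    ]
    return sum(departures) / len(departures)


# Hypothetical data: baseline already improving by one unit per session.
baseline = [1, 2, 3, 4]
print(change_beyond_trend(baseline, [5, 6, 7]))   # improvement only continues the trend
print(change_beyond_trend(baseline, [8, 9, 10]))  # improvement exceeds the trend
```

In the first case the intervention data lie exactly on the projected baseline trend, so no change can be attributed to treatment even though performance kept improving.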

Intrasubject variability. In addition to
trends, excessive variability in the data can
interfere with drawing conclusions about
treatment. As a general rule, the greater the
variability in the data, the greater the dif-
ficulty in demonstrating behavior change.
Excessive variability is a relative matter that 
depends on the initial level of behavior dur- 
ing baseline, the magnitude of change 
achieved during the intervention, the degree 
of change desired, and other factors. In the 
extreme case, baseline performance of the
target behavior may vary daily from 0 to
100%, that is, to each extreme of an assess-
ment continuum. Such baseline data proba-
bly cannot be used to predict a specific level
of future performance and hence would not
provide an adequate base against which the 
data during an intervention phase could be 
compared. 

To facilitate treatment evaluation, investi-
gators sometimes reduce the appearance of
variability in graphical presentation of the
data by averaging data points across consecu-
tive days or sessions. The procedure consists
merely of combining days of data and averag-
ing across the number of combined data
points. By representing 2 or more days with
a single averaged data point, fluctuations are
reduced substantially and the data appear
more stable. Although such a procedure dis-
torts the day-to-day pattern of performance,
the data in different phases can be readily
compared.
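
The averaging procedure just described amounts to the following, shown here as a small Python sketch. The function name and the data are hypothetical, and the handling of a leftover partial block (averaged over however many points it contains) is an assumption.

```python
def block_average(values, block_size):
    """Average consecutive observations in non-overlapping blocks.

    A final partial block, if any, is averaged over the points it has.
    """
    averaged = []
    for start in range(0, len(values), block_size):
        block = values[start:start + block_size]
        averaged.append(sum(block) / len(block))
    return averaged


# Hypothetical daily data that alternate between low and high values.
daily = [2, 8, 3, 9, 4, 10]
print(block_average(daily, 2))  # -> [5.0, 6.0, 7.0]
```

The day-to-day alternation disappears in the averaged series, which is exactly the distortion noted above: the plotted data look more stable than the client's daily performance actually was.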

Whenever possible, it is better to identify
and control sources producing variability than
merely to average the data. The variability
may result from sources that are important to
identify before implementing treatment. For
example, unreliability in scoring behavior may
contribute to excessive variability. The client
may perform the target response relatively
consistently. Yet observers may change cri-
teria in scoring the behavior, which may add
to the variability. Such variability could be
controlled by ensuring that responses are
scored consistently across observers and
over time. Other sources may account for
variability, such as changes in the environ-
ment. For example, the presence of one
person rather than another (e.g., mother
rather than father) may control the rate of
behavior. If variability is excessive, an
attempt should be made to assess the stimu-
lus conditions that mediate differences in re-
sponse rate. Identification of some of the
factors that contribute to variability also
may reveal the factors controlling behavior
and provide practical suggestions for
treatment.

For whatever reason, behavior may simply
be quite variable. Indeed, the goal of the pro-
gram may be to alter the variability of the
client’s performance rather than the mean
rate. Usually, large data fluctuations, espe- 
cially during baseline, make evaluation of the 
intervention difficult. Such fluctuations are a 
signal to the investigator to either look for 
the controlling variables or to redefine the
behavior or conditions of assessment to reduce 
variability. Continuation of the program in 
the face of excessive variability eliminates 
the possibility of drawing inferences about 
treatment. 

Duration of the phases. How long phases 
will be in a single-case experimental design 
usually is not specified in advance of the in- 
vestigation. The reason for this is that the 
investigator needs to examine the data and 
to determine whether the information is suf- 
ficiently clear to make predictions about per- 
formance. The presence of trends or excessive 
variability indicates that the information may 
not be sufficiently clear. Hence, it is useful 
to wait for a more stable rate of performance 
to emerge. A common methodological prob- 
lem is altering phases before a clear pattern 
emerges. For example, most of the data may 
indicate a clear pattern for baseline. Yet, after 
a few days of relatively stable baseline per- 
formance, one or two data points may be 
higher or lower than the previous data. The 
question that immediately arises is whether 
a trend is emerging in baseline or whether 
the data points are merely a part of the nor- 
mal variability. It is wise to continue the 
condition without shifting phases just to be 
sure. If 1 or 2 more days of data reveal that 
there is no trend, the intervention can be im- 
plemented as planned. The increase of data 
provides an increase in confidence that there 
was no emerging trend and facilitates subse- 
quent evaluation of intervention effects. 

Occasionally, an investigator may obtain 
an extreme data point during baseline in the 
opposite direction of the change anticipated 
with treatment. This extreme point may be 
interpreted as suggesting that if there is any 
trend, it is in the opposite direction of treat- 
ment effects. Thus, investigators often shift 
phases when an extreme data point is noted 
in the previous phase in the direction opposite 
from predicted effects of the next phase. Yet, 
extreme scores in one direction are likely to 
be followed by scores that revert in the di-
rection of the mean. Thus, if the observation
indicates a particularly high score on one
occasion, the next occasion is likely to show
a less extreme score. This characteristic is
known as statistical regression and is a func-
tion of the correlation between consecutive
data points.¹

It is important to be aware of the possibil- 
ity of regression. Because the extreme score 
in one direction is likely to be followed by a 
much less extreme score that reverts toward 
the mean, it is unwise to shift phases fol- 
lowing highly extreme scores. Shifting phases 
at this point would capitalize on regression. 
This improvement in performance would ap- 
pear to be a function of shifting from base- 
line to intervention phases. However, the im- 
provement might be a function of regression. 
As data continue to be collected in the new 
phase, the investigator could, of course, see 
whether the intervention is having an effect 
on behavior. However, if mean levels of per- 
formance are compared across phases, shifting 
phases at points of extreme scores could sys- 
tematically bias the average performance in 
each of the phases. This could influence the 
conclusions that are drawn, especially if the 
phases are relatively brief. 
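
The size of the expected reversion can be illustrated with a short Python sketch. It rests on a simplifying assumption that the article does not state: consecutive scores are taken to have the same mean and variance and to be linearly related with correlation r, in which case the expected next score is the mean plus r times the current deviation from the mean. The function name and numbers are hypothetical.

```python
def expected_next_score(score, mean, r):
    """Expected next observation under simple linear regression toward the mean.

    Assumes consecutive scores share the same mean and variance and are
    linearly related with correlation `r` (an illustrative assumption).
    """
    return mean + r * (score - mean)


# Hypothetical baseline with a mean of 10 and one extreme score of 20.
for r in (1.0, 0.5, 0.0):
    print(r, expected_next_score(20, 10, r))
```

With a correlation of 0.5, a score of 20 against a mean of 10 is expected to be followed by a score of 15, and with no correlation at all by a score at the mean. A phase change made immediately after the extreme score would therefore appear to produce a drop of 5 to 10 points that the intervention did nothing to cause.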

Currently, there are no agreed upon ob- 
jective decision rules for alternating phases in 
single-case experimental designs. Occasionally, 
investigators have attempted to specify ob- 
jective criteria in advance that indicate the 
conditions under which baseline and experi-
mental conditions will be alternated (e.g.,
Scott et al., 1973; Wincze, Leitenberg, &
Agras, 1972). For example, phases can be 
shifted when variability about the mean level 
of performance falls within a specific range 
for a period of 5 days or when a given num- 
ber of consecutive data points are not in the 
same direction above (or below) the mean.
Attempts to develop an objective preset rule
for an important design decision that has re-
mained primarily subjective are laudable.
However, the rules have not been sufficiently 
developed to this point to routinely exclude 
emerging trends or the influence of regression 
due to extreme scores in the data. 
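
A preset rule of the first kind mentioned above might be written as follows. This Python sketch is only one possible reading of such a rule: the window of 5 observations follows the example in the text, but the tolerance, the choice to center the range on the mean of the most recent observations, and the function name are all assumptions.

```python
def within_range_rule(values, window=5, tolerance=1.0):
    """True when each of the last `window` points lies within
    `tolerance` of the mean of those same points.

    The tolerance and the use of the recent-window mean are
    illustrative assumptions, not a published criterion.
    """
    if len(values) < window:
        return False  # not enough data to apply the rule
    recent = values[-window:]
    center = sum(recent) / window
    return all(abs(v - center) <= tolerance for v in recent)


# Hypothetical baseline records (made-up numbers).
print(within_range_rule([9, 4, 5, 5, 6, 5, 4]))  # last 5 days settle near 5
print(within_range_rule([5, 5, 6, 5, 9]))        # an extreme final day blocks the shift
```

A rule of this form refuses to shift phases on the day of an extreme score, but, as noted above, it does not by itself rule out a gradual trend that stays inside the tolerance band.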


Checking the Intervention 


In single-case experiments, the investigator 
manipulates specific conditions and assesses 
their effects on behavior. If the dependent
measure reflects change and the replication 
requirements within the design are met, the 
investigator is likely to assume that the in- 
tervention was responsible for behavior 
change. In most behavior modification inter- 
ventions, in which single-case experimental 
designs are used with the greatest frequency, 
the intervention consists of the behavior of 
individuals (e.g., parents, teachers, peers) 
who interact with the client (e.g., child). For 
example, parental attention can be used to 
alter the behaviors of conduct problem chil- 
dren at home. When the intervention consists 
of behaviors in natural situations, the inves- 
tigator has no assurance that the intervention 
is carried out correctly, or indeed is even im- 
plemented at all. This situation is to be con- 
trasted with laboratory experiments in which 
the intervention can be implemented by read- 
ing instructions or playing an audio or video 
tape recorder and in which standardization of 
the intervention can be more readily assured. 

When the behavior of individuals in contact 
with the client is the independent variable, 
it is especially important to gather data to 
ensure that the intervention is carried out. 
This is essential for at least three reasons. 
First, the behaviors of individuals who inter-
act with the clients in naturalistic settings of-
ten are difficult to change (e.g., Breyer &
Allen, 1975). Thus, an investigator cannot 
automatically assume that behavior-change 
agents will engage in the desired behaviors 
after receiving instructions or reminders. Sec- 
ond, when changes are made in behavior- 
change agents, they often are short-lived 
(e.g., Kazdin & Moyer, 1976). There is no 
assurance that once the intervention is im- 
plemented it will continue to be carried out 
in the appropriate fashion. Finally, individuals 
responsible for altering client behavior may 
show changes in their behaviors that extend
beyond the confines of the intervention
(Chadwick & Day, 1971; Trudel, Boisvert,
Maruca, & Leroux, 1974).

1 The lower the correlation between consecutive
data points, the greater the regression toward the
mean. The reason for this is that the lower the correla-
tion between consecutive data points, the greater
the amount of error in the scores. The random error
that is likely to be present in an extreme score on
one occasion is not likely to be present on the next
occasion. Thus, a score that departs considerably
from the overall mean is likely to be followed by a
much less extreme score.

Even if experimental results meet the de- 
sign requirements and are replicated across 
phases, precision can be increased by supple- 
mentary data showing that the intervention 
was carried out in the desired fashion. With- 
out these data, the systematic changes in 
client behavior may be the result of some
other influence in the setting. Indeed, it is 
quite possible given existing evidence on the 
performance of behavior-change agents that 
some aspect of their behavior other than the 
intended intervention led to change (Kazdin, 
1977d). 

Often it is desirable to have information 
about variables that may covary with the in- 
tervention and could plausibly account for 
change. For example, verbal approval and 
disapproval have proven to be very potent 
variables in altering child behavior. In pro-
grams evaluating other interventions (e.g.,
token reinforcement, nonverbal approval), it
is essential to collect data on verbal behav-
iors of the behavior-change agents as well to
assess whether they have covaried with the
intervention (e.g., Kaufman & O'Leary, 1972;
Kazdin & Klock, 1973). This assessment is
important in situations in which behavior
change might result from events that already
have been shown to influence behavior in pre-
vious research and may be more plausible as
influences than the variable of direct interest.


Comparing Alternative Interventions 


Investigators using single-case designs fre-
quently are interested in assessing the relative
efficacy of different treatments with a given
client. Comparing different treatments within
a group of subjects or a single subject is
difficult because of the likely confound of
treatments with sequence or order effects. It
is common to see investigators compare
the effects of different interventions in a ver-
sion of the ABAB design. As a simple case,
the design may be represented as an ABCA
design (where A refers to baseline and B and
C refer to alternative treatments). If per-
formance levels of the client differ under the
different treatments, the investigator can make
statements about the relative efficacy of the
treatments. However, two treatments invaria-
bly are confounded with the order of appear-
ance. One cannot interpret whether the first
(or second) intervention led to more (or
less) behavior change because of the treat-
ment itself or because of the sequence in
which it appeared.

A common technique is to add a return-to- 
baseline condition (an A phase) between the 
two interventions to show that original base- 
line level of performance is recovered (an 
ABACA design). Even if baseline levels are 
the same prior to each (B,C) intervention, 
this still does not correct for the confound 
of sequence effects. Merely because baseline 
phases prior to each intervention were equal 
in the level of performance does not mean that 
they are comparable in all other respects. Be- 
havior change is not equally difficult to 
achieve on repeated occasions in which a 
treatment is presented and withdrawn. Addi- 
tional interventions may need to be more or 
less effective than prior ones to effect change. 

Investigators often conduct a more com- 
plex version of the design in which the ef- 
fects of treatment are replicated at different 
points in the design for the subject. For ex- 
ample, in an ABCABC design, each treatment 
(B,C) is given twice. Even here, any conclu- 
sions reached about treatments are restricted 
to the particular order of treatment. The 
efficacy of C may result from the fact that it 
has followed B. The effects of C when pre- 
sented without this prior history may be 
completely different. In short, the fixed se- 
quence of treatments may lead to multiple- 
treatment interference (see Campbell & Stan- 
ley, 1963). Conclusions about the treatments 
must be tempered by acknowledging the pos- 
sible contribution of one treatment on later 
treatments.²

2 Although the discussion of sequence effects may
seem of academic interest, this is not the case. Both
laboratory animal investigations (Grice & Hunter,
1964) and applications with clinical problems (White,
Nielson, & Johnson, 1972) have shown that the ef-
fects of a given intervention may be affected by the
treatments presented immediately before.

The main problem with sequence effects in
single-case experimental designs stems from
the fact that investigators frequently rely on
versions of the ABAB design. If this design
were used to compare treatments, more than
one subject would be needed to ensure that 
the treatments are counterbalanced. Only if 
the treatments are presented in a different se- 
quence can the investigator be assured that 
the interventions are differentially effective 
independent of the order in which they ap- 
pear. Actually, using more than one subject 
is not a very useful solution. Unlike group de- 
signs in which the different sequences are 
replicated across several subjects, a single-case 
design with only two subjects would not al- 
low separation of the sequence effects from 
unique subject characteristics. That is, if two 
treatments produce different effects across two 
subjects, the investigator cannot determine 
whether the order was important or whether 
the subjects merely responded differently to 
the treatments. 

If the investigator is interested in single- 
case designs for comparing treatments, the 
ABAB design and its variations probably 
should be avoided. There are alternative 
single-case experimental designs suited to 
comparisons of different treatments. These 
include the multiple-element baseline or mul-
tiple-schedule design (Leitenberg, 1973; Ul-
man & Sulzer-Azaroff, 1975) or the simul-
taneous-treatment design (Kazdin, 1977c).
In these designs, the separate interventions
can be presented to an individual subject in
such a way that the interventions are not
confounded by sequence effects. Specifically,
separate interventions are presented concur-
rently but can be balanced across conditions
of administration.


Interobserver Agreement 


In most single-case experimental designs,
overt client behavior is assessed over
the course of baseline and intervention con-
ditions. In the majority of cases, observers
record whether behavior has occurred based
on their judgment and the definition of the
response as originally specified. Occasionally,
automated recording devices (e.g., time clock,
plethysmograph), equipment requiring little
or no judgment (e.g., scale), or devices that
make scoring merely a clerical task (e.g., key
to a paper-and-pencil test) are used, and there
is little question whether observations are con-
sistent and accurate. In other cases, in which
human judgment may figure more promi- 
nently in scoring the response, doubt can be 
raised about the adequacy of the assessment 
procedure and the consistency with which ob- 
servations are made. In these latter cases, it 
is essential to assess the extent to which ob- 
servers are collecting data consistently and 
accurately. Actually, assessment of client be- 
havior and interobserver agreement might be 
viewed as purely measurement problems and 
be distinguished from design issues. However, 
single-case designs rely on behavioral observations,
and assessment issues directly relate
to evaluation of intervention effects.³
Interobserver agreement assesses whether 
the client’s behavior is observed consistently. 
With low agreement, the data may differ 
greatly depending on who is scoring the be- 
havior. Variation in the data due to the ob- 
server adds to any variability in client be- 
havior and obscures actual performance. As 
noted earlier, evaluation of the intervention 
depends on obtaining data that have relatively 
little variability. Measurement error contrib- 
utes to variability and makes subsequent 
evaluation more difficult. Indeed, if variability 
due to assessment error is extremely large, 
establishing a stable rate of behavior and 
evaluating the intervention may be impossible. 
Evidence suggests that observers may stray 
from original behavioral codes in scoring 
client behavior (Kent & Foster, 1977). To 
ensure sustained accuracy and consistency in 
observer assessment, it is useful to check each 
observer against an independent standard and 
calculate the amount of agreement. Aside from 
a general trend to deviate from the original 
behavioral codes, observers may develop idio- 
syncratic coding tendencies so that they may 


³ The assessment of interobserver agreement in
single-case experiments has been a topic of major
discussion in recent years. Basic issues are still actively
discussed, including the manner in which interobserver
agreement should be assessed, the conditions
under which such methods are appropriate, the
sources of bias that obscure interpretation of agreement
data, and similar factors. These issues have
been described in several publications (e.g., Hawkins
& Dotson, 1975; Kent & Foster, 1977) including an
invited series of articles in the Journal of Applied
Behavior Analysis (1977, 10[41). 


see behaviors different from other observers.
Periodic checks on interobserver agreement
may help evaluate whether changes in scoring
are occurring and whether particular observers
are responsible for any of these changes.
Actually, research on interobserver agreement
has revealed several factors that need
to be taken into account when obtaining
checks on observers (see Kazdin, 1977a; Kent
& Foster, 1977). Observer awareness that
agreement is being checked, who the other observer
is, and reactions of the experimenter
to the data that are reported have influenced
interobserver agreement and the nature of
the data obtained. As a general rule, a minimal
condition for observation of overt behavior
is the assessment of interobserver agreement
at different points throughout the experiment.
Agreement checks need to be
dispersed over each of the phases to ensure
that observer biases and changes in criteria
for scoring responses are not likely to be
confounded with treatment. Ideally, interobserver
agreement should be assessed unobtrusively,
because observer awareness that
agreement is being checked influences the
data. However, experimental arrangements
required to accomplish unobtrusive agreement
checks are difficult to meet (Kazdin, 1977a).
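
As a rough illustration of agreement checks dispersed over the phases, the sketch below computes interval-by-interval percent agreement for one check in each phase of a hypothetical ABAB study. The interval records, the phase names, and the 80% flagging threshold are all assumptions made for the example; the text above does not prescribe a particular index or cutoff.

```python
def percent_agreement(obs_a, obs_b):
    """Interval-by-interval agreement between two observers, as 0-100."""
    if len(obs_a) != len(obs_b) or not obs_a:
        raise ValueError("records must be non-empty and of equal length")
    agreements = sum(a == b for a, b in zip(obs_a, obs_b))
    return 100.0 * agreements / len(obs_a)


# Hypothetical records: 1 = behavior scored in the interval, 0 = not scored.
# One agreement check per phase, so checks are spread across the design.
checks = {
    "baseline_1": ([1, 0, 1, 1, 0, 1, 0, 0, 1, 1],
                   [1, 0, 1, 1, 0, 1, 0, 1, 1, 1]),
    "treatment_1": ([0, 0, 1, 0, 0, 0, 1, 0, 0, 0],
                    [0, 0, 1, 0, 0, 0, 1, 0, 0, 0]),
    "baseline_2": ([1, 1, 0, 1, 0, 1, 1, 0, 1, 0],
                   [1, 0, 0, 1, 1, 1, 1, 0, 0, 0]),
    "treatment_2": ([0, 1, 0, 0, 0, 0, 0, 1, 0, 0],
                    [0, 1, 0, 0, 0, 0, 0, 0, 0, 0]),
}

by_phase = {phase: percent_agreement(a, b) for phase, (a, b) in checks.items()}

# Illustrative cutoff only: flag phases in which agreement looks low.
THRESHOLD = 80.0
low_phases = [phase for phase, pct in by_phase.items() if pct < THRESHOLD]

for phase, pct in by_phase.items():
    print(f"{phase}: {pct:.0f}% agreement")
print("phases needing attention:", low_phases)
```

Reporting agreement separately for each phase, rather than as one pooled figure, makes it easier to see whether a drop in agreement coincides with a change of condition.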


Issues in Data Interpretation 


A major issue in evaluating treatment effects
in research is the internal validity of
the findings (Campbell & Stanley, 1963).
Internal validity refers to assessing
whether the intervention accounted for the
results and essentially pertains to experimental
design considerations that rule out
internal sources of confound. The issue of
internal validity is central to all research.
However, single-case designs differ from traditional
group research in the criteria for determining
whether an intervention accounted for the
results. Failure to understand the criteria for
evaluating results constitutes a major methodological
issue.


Criteria for Evaluating Change


Single-case experimental designs attempt to
arrange the situation so that the effects of
the intervention and extraneous variables can
be distinguished. The replication of interven- 
tion effects over time or across baselines is 
designed to make implausible the influence 
of various threats to internal validity. Just 
as the designs depart from traditional group 
experimentation, so do the criteria for evaluat- 
ing the effectiveness of an intervention on 
behavior. Two criteria have been used to 
evaluate treatment effects in single-case ex- 
periments, namely, the experimental criterion 
and the therapeutic criterion (Kazdin, 1978b; 
Risley, 1970). 

Experimental criterion. The experimental 
criterion refers to a comparison of behavior 
during the intervention with what it would 
be like if the intervention had not been im- 
plemented. This criterion, of course, charac- 
terizes group research designs as well. How- 
ever, single-case experiments usually do not 
fulfill this criterion by appealing to statistical 
analyses. Rather, the experimental criterion 
is met by replicating treatment effects over 
time. Repeatedly showing that alteration of 
contingency changes the level of performance 
in relation to the previous phase fulfills the 
experimental criterion. The strength of the 
demonstration derives from showing that per- 
formance during a given phase violates the 
predicted level of performance of the prior 
phase before the intervention was introduced. 
Each design accomplishes the replication of 
treatment effects in a different way. For ex- 
ample, in the ABAB design, the intervention 
is replicated over time for a single subject or 
group of subjects. Similarly, in a multiple- 
baseline design, the effect is replicated across 
separate baselines. 

In practice, whether the results meet the 
experimental criterion is determined in vari- 
ous ways. First, if performance during an in- 
tervention phase does not overlap with per- 
formance during the baseline phase when these 
data points are plotted over time, the effects 
usually are regarded as reliable. The replication
of nonoverlapping distributions during
different treatment phases strongly argues for
the effects of treatment. Second, a more typi- 
cal criterion for experimental evaluation is 
related to the trends in each phase. If base- 
line and intervention conditions show changes 
in trends as the phases are alternated, the reliability
of the data and effects of the intervention
usually are inferred.
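
The two practical checks just described, nonoverlapping data across adjacent phases and changes in trend as phases alternate, can be sketched as follows. The session values are hypothetical, and the use of a least-squares slope as the index of trend is an assumption made for illustration, not a procedure taken from the sources cited.

```python
def ranges_overlap(phase_a, phase_b):
    """True if the two phases share any part of their value ranges."""
    return max(min(phase_a), min(phase_b)) <= min(max(phase_a), max(phase_b))


def slope(values):
    """Least-squares slope of values against session number (needs >= 2 points)."""
    n = len(values)
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    den = sum((x - mean_x) ** 2 for x in range(n))
    return num / den


# Hypothetical ABAB data: frequency of a target behavior per session.
phases = [
    ("A1", [12, 14, 13, 15, 14]),
    ("B1", [9, 7, 5, 4, 3]),
    ("A2", [10, 12, 13, 14, 15]),
    ("B2", [8, 6, 4, 3, 2]),
]

# Criterion 1: no overlap between each pair of adjacent phases.
nonoverlap = [
    not ranges_overlap(a_vals, b_vals)
    for (_, a_vals), (_, b_vals) in zip(phases, phases[1:])
]

# Criterion 2: the direction of trend reverses whenever the phase changes.
slopes = [slope(vals) for _, vals in phases]
trend_reverses = [s1 * s2 < 0 for s1, s2 in zip(slopes, slopes[1:])]

print("adjacent phases nonoverlapping:", nonoverlap)
print("trend reverses at each phase change:", trend_reverses)
```

Neither check replaces judgment about variability or phase length; they only make explicit what visual inspection of a plot is usually looking for.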

One of the problems with experimental eval- 
uation of single-case method designs is that 
they have relied almost exclusively on visual 
inspection to determine whether the magni- 
tude of change across phases is significant. 
The problem with visual inspection is that 
individuals who peruse the data may not see 
eye to eye. For example, Jones, Weinrott, and 
Vaught (Note 1) demonstrated considerable 
disagreement among judges asked to visually 
inspect the data from the same single-case 
experimental design. 

Recently, statistical evaluation of single- 
case data has received increased attention 
(Kazdin, 1976; Michael, 1974). Such evalua- 
tions are likely to be useful in situations in 
which the ideal conditions such as stable baselines
and relatively small variability cannot
be met. Thus, the reliability of treatment ef- 
fects can be examined in situations in which 
visual inspection might be weak or unreliable.
However, the use of statistics with single-case
designs is controversial (Michael, 1974). Also,
statistical tests appropriate for single-case
analyses are less familiar to most investigators
because they are not the ones commonly
taught in graduate statistics in psychology.
Further, the tests include relatively restrictive
assumptions that may dictate selection
of the design and the manner in which treatment
is implemented (see Kazdin, 1976). Because
of the infrequent use of statistics in
single-case experiments and lack of familiarity
with the statistical properties of the tests, it
is no surprise that inappropriate applications
of statistical tests already have entered the
literature. In any case, statistical tests are
available for evaluating the reliability of in- 
tervention effects in single-case experimental 
designs. These tests have been illustrated at 
ory elsewhere (Jones et al., 1977; Kazdin, 

Therapeutic criterion. The therapeutic criterion
for evaluating single-case experimental
designs refers to the value or importance of 
behavior change, that is, the clinical signifi- 
cance of treatment effects. This criterion re- 
fers to whether the extent of behavior change 
achieved during treatment enhances the 
client’s functioning in everyday situations
(Risley, 1970). Implicit in this criterion, of
course, is that the behavior selected for al- 
teration is itself of clinical or social im- 
portance. 

The therapeutic criterion is more difficult 
to satisfy than is the experimental criterion. 
Showing a reliable treatment effect, through 
visual inspection or statistical analysis, has 
no necessary bearing on the importance of 
the change. For example, a single-case dem- 
onstration might well show a reduction of 
self-destructive behavior from 100 to 50 in- 
stances per hour. Even though the reduction 
is relatively large and might meet the most 
stringent statistical test, the remaining level 
of behavior is far removed from the “normal 
social interaction” to which the individual 
might someday return. To be of unequivocal 
therapeutic significance, the intervention 
would need to eliminate self-destructive be- 
havior. 

The ease of evaluating the importance of 
clinical change in the above example derives 
from the fact that intense destructive behav- 
ior is maladaptive whenever it occurs. Yet, 
for many behaviors, the level and intensity 
of behavior rather than merely its presence 
or absence dictate whether it is acceptable. 
This makes satisfying the therapeutic cri- 
terion much more difficult. Indeed, the pre- 
cise criterion for evaluating whether the ef- 
fects of treatment are clinically important is 
difficult to specify. Part of the reason is that 
individuals in everyday life (e.g., parents, 
teachers, friends) and the individual himself 
or herself determine the level of behavior that 
is acceptable or deviant. 

Recently, single-case experimental method- 
ology has made efforts to assess objectively 
whether the effects of treatment are clinically 
significant. The procedures for assessing the 
clinical importance of treatment effects are 
referred to as social validation (Wolf, Note 
2) and have been incorporated into an in- 
creasing number of investigations (Kazdin, 
1977b). Generally, social validation of treat- 
ment effects consists of determining whether 
behavior changes are clinically important in 
the social context in which the client func- 
tions. This is accomplished in one of two 
ways, referred to as social comparison and 
subjective evaluation methods (Kazdin,
1977b). With the social comparison method,
the behavior of the client before and after
treatment is compared with the behavior of
“nondeviant” peers. The question asked by
this comparison is whether the client’s behavior
after treatment is distinguishable from the
normative range of behavior of his or her
peers. With the subjective evaluation method,
behavior is evaluated by individuals who are
likely to have contact with the client to determine
whether the change made during
treatment is important. The question addressed
by this method is whether behavior
changes demonstrated in treatment lead to
qualitative differences in how the client is
viewed by others.
In using the social comparison method to
evaluate treatment, it is essential to identify
individuals who are similar to the client in
subject and demographic variables but who
have been identified as showing acceptable
behavior. Presumably, peers of the client identified
in such a fashion should differ markedly
from the client in the target behavior prior to
treatment. After treatment, the behavior of
the client and peers can be compared again.
If treatment has effected clinically significant
change, this should be demonstrated by
showing that the client’s behavior has moved
to acceptable, that is, “normative,”
levels of functioning. Several studies have
shown that treatment effects bring initially
deviant levels of behavior within normative
levels for social interaction or conduct problems
(e.g., O'Connor, 1972), basic
skills of the mentally retarded (Azrin
& Armstrong, 1973), locus of control and
self-concepts of delinquents (Eitzen, 1975),
assertive behavior of unassertive college students
(McFall & Twentyman, 1973), and
others (see Kazdin, 1977b, for a review).

With the subjective evaluation
method, individuals who normally interact
with the client or who are in a special position
(e.g., through expertise) to judge a particular
behavior provide global evaluations of
how well the client is functioning. Essentially,
global appraisal is used to evaluate
the behaviors changed
or how the client is viewed.
Obviously, the global ratings are
useful only to supplement objective behavioral
data, since studies have demonstrated that
global judgments are much more likely to re- 
flect bias than are specific objective measures 
(Kazdin, 1973; Kent, O’Leary, Diament, & 
Dietz, 1974; Schnelle, 1974). 
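
The social comparison method described above lends itself to a simple numerical sketch. Here the "normative range" is taken, purely as an assumption for illustration, to be the peer mean plus or minus one standard deviation; the peer and client values are hypothetical, and other definitions of the range would be equally defensible.

```python
from statistics import mean, stdev


def within_normative_range(client_value, peer_values, k=1.0):
    """True if the client falls within mean +/- k SD of the peer sample."""
    centre = mean(peer_values)
    spread = stdev(peer_values)
    return (centre - k * spread) <= client_value <= (centre + k * spread)


# Hypothetical rates of the target behavior per observation session.
peers = [4, 6, 5, 7, 5, 6, 4, 5]   # nondeviant peers matched to the client
client_before = 15                 # client prior to treatment
client_after = 6                   # client after treatment

print("before treatment, within range:",
      within_normative_range(client_before, peers))
print("after treatment, within range:",
      within_normative_range(client_after, peers))
```

A change that is reliable by the experimental criterion would satisfy this check only if the posttreatment level also lands inside the range shown by the peer sample.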

The subjective evaluation method can be 
used both to help identify important behaviors
to assess and focus on in treatment as
well as to evaluate qualitative improvements 
associated with behavior change. Studies have 
used global assessments prior to treatment to 
determine whether the specific behaviors that 
are observed and included in treatment are 
relevant to individuals in contact with the 
client in everyday situations (e.g., Werner et 
al., 1975). More commonly, subjective eval- 
uations are used to evaluate pretreatment and 
posttreatment levels of behavior. Several 
studies have shown that behavior changes 
have improved qualitative global judgments 
of individuals in contact with the clients for 
programs training delinquent girls how to 
communicate appropriately with others (e.g.,
Maloney et al., 1976), developing public- 
speaking skills in adults (Fawcett & Miller, 
1975), training problem-solving skills of lower 
socioeconomic adults in positions that may 
influence decision making (Briscoe, Hoffman, 
& Bailey, 1975), and altering performance of 
conduct problem children (Kent & O'Leary, 
1976). 

Social comparison and subjective evalua- 
tion methods are not without problems. For 
example, identifying appropriate peer groups 
to which client behavior can be compared, 
deciding whether normative levels are the ap- 
propriate standard or themselves should be 
subject to change, and determining precisely 
what global judgments actually mean are only 
some of the problems for evaluating the thera- 
peutic criterion (Kazdin, 1977b). However 
imperfect, the methods show considerable 
promise in assessing the clinical importance 
of behavior change and need to be incorporated
routinely into clinical research. Although
the bulk of studies using social validation procedures
relied on single-case experimental designs,
many of the studies were traditional
between-group designs (e.g., Kent & O'Leary,
1976; McFall & Twentyman, 1973). The validation
procedures are not restricted to any
particular design and are emphasized here
only because single-case experimental research
in behavior modification has embraced a 
therapeutic criterion as a central feature of 
treatment evaluation. 


Generality of the Results 


An important issue, often noted in the form 
of an objection to single-case experimental 
designs, is that the results from an individual 
case may not necessarily apply or be general- 
izable to other subjects. The generality of 
experimental findings in between-group and 
single-case research is a weighty topic that 
cannot be developed fully here (see Hersen 
& Barlow, 1976; Kazdin, 1973). Suffice it to 
say that mere numbers of subjects do not, 
in their own right, guarantee generality of the 
results. Indeed, the vast majority of between-group
investigations evaluate results on the
basis of average (mean) group performance, 
which does not provide insights about the 
generality of a particular effect within a given 
experimental condition. Yet, this latter prob- 
lem is a function of not looking at all of the 
individual subject data within a between- 
group investigation. With single-case demon- 
strations, there is no immediate possibility to 
assess generality, by definition. 

Although it is quite possible that the ef- 
fects of treatment demonstrated with the in- 
dividual case may not generalize to others, 
this does not appear to be a problem in con- 
temporary research. Findings obtained in 
single-case demonstrations appear to be as
generalizable as, if not much more so than, demonstrations
using other designs. The reason
for this does not seem to be related to the
designs per se but rather to the type of interventions
that are commonly evaluated. Investigators
who use single-case designs have
made a special point to look for interventions
that produce dramatic changes in behavior.
Interventions with such effects for
the single case are likely to generalize more
broadly than are interventions that meet the 
relatively weaker criterion of statistical sig- 
nificance based on group averages that char- 
acterize between-group research. 

Although generality of the results in single-case
designs is not an inherent problem, investigation
of the dimensions along which
generality might occur is difficult. The designs
characteristically are weak in revealing sub- 
ject characteristics that may interact with 
a specific treatment. Focusing on one subject 
does not allow for the systematic comparison 
of different treatments across multiple sub- 
jects who differ in various characteristics, at 
least within a single experiment. Examina- 
tion of subject variables is more readily ac- 
complished by between-group research, spe- 
cifically, factorial designs in which interven- 
tion effects can be analyzed separately 
according to characteristics of the subjects.
Yet, single-case experiments have focused on 
developing potent interventions that tend not 
to depend heavily on subject characteristics. 
This has been a matter of philosophy rather 
than a simple design consideration (Kazdin, 
1978a). The ultimate test of generality of 
findings among subjects or any other condi- 
tion is replication. Single-case experiments 
commonly replicate intervention effects across 
a wide range of clients (e.g., Kazdin, 1977d). 


Concluding Comments 


Single-case experimental designs offer 
unique advantages for research in clinical 
psychology because they provide an empirical 
and scientific basis for investigating interven- 
tions with individual clients. The diverse de- 
signs permit examination of different outcome 
questions, such as the effects of an overall 
treatment package, analysis of components of 
a package, and comparison of different treat- 
ments. Single-case designs make available 
areas not easily studied in group research. For 
example, many clinical problems cannot be 
studied on a group basis simply because few 
clients come for treatment or the behaviors 
are extremely rare.

Single-case designs make demands on an in- 
vestigator that may not be easily met in many 
clinical situations. Initially, client behavior
needs to be observed continuously, a task


⁴ It is only fair to point out that many “single-case”
designs utilize more than one individual; indeed,
large groups often are included in such designs.
Also, some design variations including
the multiple-baseline designs routinely rely on more
than one individual.


more difficult to implement than traditional
pretreatment and posttreatment assessment.
Second, behavioral measures usually are constructed
to assess the individual client’s problem.
Third, some designs require reinstating
conditions that compete with effecting durable
therapeutic changes. Finally, many important
questions are not easily addressed, such as
the interaction of client, therapist, or setting
variables with the intervention. Interactions
of this sort are more readily addressed in
group factorial designs.

Single-case designs appear relatively
straightforward, a characteristic that may foster
rather than ameliorate the methodological
and interpretive problems highlighted in this
article. Complexities of the designs derive
from the lack of clear rules for deciding how
long to gather data, when to implement treatment,
and the criteria for evaluating change.
Quantitative guides for making decisions
about changing from baseline to intervention
phases and for evaluating the statistical reliability
of interventions are available but are
infrequently used in the literature. The
best guide for appropriate use and interpretation
of single-case experiments is recognition
of the basic logic of the designs and common
methodological problems. The present article
addressed some of the major problems that
impede interpretation of findings with single-case
experiments.


Stimulus Sampling in Clinical Research:
Representative Design Reviewed


Brendan A. Maher

Harvard University


Brunswik’s concept of representative design is reviewed with special reference to
studies of clinical bias. The limitations of single-stimulus, actor-script, and serial
replications are discussed. No satisfactory alternatives exist to adequate sampling
of stimulus persons.


Acknowledgment is due to Winifred Barbara Maher
for a reading of early drafts of this article.
Requests for reprints should be sent to Brendan A.
Maher, Department of Psychology and Social Relations,
Harvard University, Cambridge, Massachusetts.


More than 30 years ago, Egon Brunswik
(1947) pointed out that if we wish to generalize
the results of a psychological experiment
to populations of subjects and to populations
of stimuli, we must sample from both populations.
This argument was elaborated by him
in other articles and was summarized cogently
in a short article by Hammond (1948). The
purpose of this article is to review the issues
that Brunswik raised and to examine some of
their implications for contemporary research
in clinical psychology.

Brunswik’s thesis is very simple. When we
conduct an experiment intended to investigate
the effect of different values of an independent
variable on a population, we always take care
to draw a sample of subjects that is representative
of the population in question. We do
this naturally, because we recognize the range
of variation that exists in populations of individuals.
We wish to make sure that deviant individual
values do not distort our estimate of
the parameters of the population. If the stimuli
that we use are defined in physical units,
we are (or should be) careful to confine our
generalizations to the range of values actually
used in the study. When physical units
are used, we have relative confidence that
the stimulus can be replicated by another investigator,
provided that the detailed description
of the stimulus is followed carefully.
Should a subsequent investigator change one 
or more of these attributes, we are not sur- 
prised if there is a concomitant change in the 
responses that are made to the stimulus. 
When the stimuli to which the subjects re- 
spond cannot be defined in physical units and 
are likely to vary within a population, a differ- 
ent situation arises. Outstanding examples are 
to be seen in research directed to the inves- 
tigation of the effects of human beings as 
stimuli that elicit behavior from other human 
beings. Consider some instances drawn from 
recent volumes of this journal. Acosta and 
Sheehan (1976) reported that Mexican 
American subjects viewed an Anglo American 
professional therapist as more competent than 
a Mexican American professional when all 
other variables were matched. Babad, Mann, 
and Mar-Hayim (1975) reported that trainee 
clinicians who were told that a testee was a 
high-achieving upper-middle class child as- 
signed higher scores to Wechsler Intelligence 
Scale for Children (WISC) responses than did 
another sample of clinicians who were led to 
believe that the same responses had been 
made by an underachieving deprived child. 
Research of this kind is generally cast in 
terms of a hypothesis that members of a 
specified population respond in discriminatory 
fashion to members of certain other popula- 
tions. Thus, for example, we encounter such 
questions as, Do physicians give less adequate 
medical care to ex-mental patients than they 
do to normal medical patients? (Farina, 
Hagelauer, & Holzberg, 1976) and Are therapists
with a behavioral orientation less affected
by the label patient when evaluating observed 
behavior than are therapists of psychody- 
namic persuasion? (Langer & Abelson, 1974). 


Single-Stimulus Design 


Human attributes are generally distributed 
in such a fashion that any one of them is 
likely to be found in conjunction with a wide 
variety of others. Let us consider an investiga- 
tion of bias toward ex-mental patients. To be- 
labor the obvious a little, we can note that 
the attribute ex-mental patient can be as- 
sociated with any measure of intelligence, age, 
sex, education, socioeconomic status, physical 
attractiveness, and so forth. It is true that 
some of these attributes may have significant 
correlations with each other; a patient of 
upper socioeconomic status is quite likely to 
have had substantial education, for example. 
Nonetheless, even the largest of these correla- 
tions is quite modest, and the population of 
ex-mental patients to which we wish to gen- 
eralize will have a wide range of values on 
these attributes. 
When we employ only one person as a stim- 
ulus, we are faced with the fact that the 
specific values of some of the other attributes 
possessed by this person will also have stimu- 
lus value that will be unknown and uncon- 
trolled. Responses made by a sample of the 
normal population to an ex-mental patient 
who is female, young, attractive, articulate, 
and intelligent may well be different from 
those made to a normal control who is male, 
old, ugly, incoherent, and dull. These differ- 
ences cannot be assigned to the patient/non- 
patient status of the two stimulus persons, as 
many other unidentified differences were uncontrolled.
At first sight it may appear that
this problem is solved by the simple expedient 
of matching the patient and the control on all 
variables other than that of patient status. 
Unfortunately, this can only be achieved at 
the cost of further difficulties. We do not
know the full range of variables that should 
be matched, and hence this solution neces- 
sarily involves resort to an actor and a script, 
barring the unlikely availability of discordant 
monozygotic twins for research purposes! 
Scripts bring with them some special problems,
of which more will be said later. The
main point to note here is that the use of a 
single human stimulus acting as his or her 
own control fails to deal with the problem of 
the interaction of the attribute under in- 
vestigation with those that have been con- 
trolled by matching. Pursuing for a moment 
the example of responses to the label ex-men- 
tal patient, let us consider a hypothetical 
study using a male actor with athletic phy- 
sique and vigorous movements. The willing- 
ness of a normal subject to accept this indi- 
vidual as a fellow worker, neighbor, or friend 
may well be influenced by the perception that 
the ex-patient, if violent, could be dangerous. 
Had the actor been older and visibly frail, the 
reaction might well be different. Under the 
first set of circumstances, the bias hypothesis 
would probably be confirmed, and under the 
second set, the null hypothesis might fail to be 
rejected. 

An additional difficulty is incurred by the 
single-stimulus own-control strategy. We can- 
not determine whether a finding of no differ- 
ence between group means is due to the weak- 
ness of the hypothesis, errors of method, or 
the inadvertent selection of an atypical stimu- 
lus person to represent one or both conditions. 
An example of the complexities of interpreta- 
tion with this design can be found in Farina 
et al. (1976). These investigators hypothe- 
sized that physicians would provide less ade- 
quate medical care to former mental patients 
than to normal medical patients. To test this
hypothesis one stimulus person, a 23-year-old 
male graduate student, approached 32 medical 
practitioners. In each case he entered the doc- 
tor’s office 


carrying a motorcycle helmet and a small knapsack, 
. . . The same symptoms were reported to all doctors.
Stomach pains suggestive of ulcers were selected
to be neither clearly psychiatric nor unrelated
to the mind. . . . Every other practitioner was told
the pains had first occurred 9 months earlier while 
the patient was traveling around the country. The re- 
maining 16 doctors were also informed that the pains 
had appeared 9 months earlier, but at that time the 
patient reported being in a mental hospital. (Farina 
et al., 1976, p. 499) 


No significant difference of any relevance was
found in the kind of medical care given by
the practitioners under either condition. In conclusion,
the authors stated that “a former
mental patient seems to receive the same
medical treatment as anyone else” (p. 499).

Logically, several conclusions are compatible
with this finding. One obviously valid conclusion
is that a young male motorcyclist with the
symptoms of ulcers receives a certain class of
treatment whether or not he describes himself
as a former mental patient. We cannot
tell whether this treatment is the same, better,
or worse than that typically given to a
random sample of the normal population of
patients who seek treatment for stomach
pains, as no such sample was obtained. A substantial
number of physicians may have had
opinions about motorcyclists as unfavorable as
those that they were hypothesized to have
about former mental patients, and hence both
conditions produced equally inadequate medical
care. Alternatively, the physicians may
have felt the necessity to be unusually careful
in providing care to individuals who might be
expected to be irresponsible (such as motorcyclists
and mental patients), and hence they
provided better than average care. Finally,
medical practice may be sufficiently precise
about the adequate procedures to follow with
patients who complain of stomach pains that
little room for bias exists, the treatment
being the same as would be given to
any group of patients.

We can summarize the limitations of the
single-stimulus design as follows:

A significant difference may be due to the
validity of the hypothesis or to the effects
of the action of uncontrolled variables in
combination with the intended independent
variable. No method of distinguishing between
these alternatives is possible.

A lack of difference may be due to the
invalidity of the hypothesis, to undiscovered
methodological errors such as subject sampling,
or to the presence of an uncontrolled stimulus
attribute that, together with the intended
independent variable, raises the effect to a
ceiling value in both experimental and control
situations.
It is apparent that the problem of
uncontrolled attributes occurring in a single
stimulus person can only be solved by the
provision of an adequate sample of stimulus
persons, since they will tend to cancel each
other out. No satisfactory solution is possible
within the single-stimulus design.


Scripts and Manticores 


Some investigators have attempted to solve 
the problems of the single-subject stimulus by 
fabricating scripts without the use of a hu- 
man actor to present them. Case histories, dos- 
siers, vignettes, audiotapes, or other devices 
have been used to reduce the effects of the 
uncontrolled aspects of a human stimulus. 
Thus, in a study by Babad et al. (1975), the 
trainee clinicians were given only the WISC 
protocol and did not see the child who was 
alleged to have been tested. These manufac- 
tured materials may be termed scripts. Scripts 
may be taken from existing sources of genu- 
ine material, such as clinical files; they may 
be created de novo in accordance with prior 
theoretical guidelines or in an attempt to 
present an ideal “typical” case. 

When the script is drawn from original clin- 
ical files, the investigator is assured that at 
least one such case exists in nature. The lim- 
itations on the results obtained from such 
scripts are, in principle, the same as those that 
plague any single-stimulus design. Some minor 
advantage accrues to the method, however, in 
that the number of uncontrolled accidental at- 
tributes has been reduced by the elimination 
of those attributes associated with physical 
appearance, dress, and so forth. When the 
script is fabricated for research purposes, a 
new problem develops, namely, that in devising
material according to theoretical guidelines,
a case is created that, like the manticore,
may never have existed in nature. We can
imagine a hypothetical investigation of the 
attitudes of males toward females of varying 
degrees of power. Varying naval ranks with 
male and female gender of the occupant of 
each rank, we create the dossier of an imagi- 
nary female Fleet Admiral. Whatever our male 
subject’s response to this dossier may be, we 
have no way of knowing whether it is due to 
the theoretically important combination of 
high rank with female gender or to the singu- 
larity of a combination that is, as yet, un- 
known to human experience. 

For a recent illustration of this problem, we 
can turn to Acosta and Sheehan (1976). They 
presented groups of Mexican American and
Anglo American undergraduates with a video- 
taped excerpt of enacted psychotherapy. Each 
group saw an identical tape, except that in 
one version the therapist spoke English with 
a slight Spanish accent and in the other ver- 
sion the accent was standard American En- 
glish. Some subjects were told that the thera- 
pist was a highly trained professional; the 
others were told that the therapist was a para- 
professional of limited experience. There were 
thus four experimental conditions and two 
kinds of subjects. The Spanish-accent tape of 
a trained professional was introduced with a 
background vignette describing the therapist
as American born of Mexican parentage and
as having a Harvard doctorate in his field
and a distinguished professional record. For
the American-English-accent tape, the therapist
was introduced with the same vignette
but with an Anglo-Saxon name and parentage 
identified as Northern European. Anglo Amer- 
ican ratings of the therapist’s competence 
were uninfluenced by the ethnic identification, 
whereas Mexican Americans rated the Mexi- 
can American therapist less favorably than 
the Anglo American therapist. 

In their discussion of this somewhat surpris- 
ing result, the authors noted that the number 
of Mexican American therapists actually in 
practice in the United States shortly before 
the study was done was 48 (28 psychologists 
and 20 psychiatrists). We do not know what 
characteristics would be typical of this popula- 
tion, and no attempt seems to have been made 
to ascertain them before preparing the script. 
There is, therefore, no way to be sure that 
the therapeutic style, choice of words, gesture, 
and so forth, were authentically typical of ac- 
tual Mexican American therapists. Given that 
essentially the same script was used for both 
ethnic conditions, we must conclude that 
either one or the other version of the script 
was ethnically inaccurate or, less likely, that 
the only actual difference that would be seen
in the comparative behaviors of Mexican 
American and Anglo American therapists 
would be their accent. In brief, we cannot ignore
the possibility that the Mexican American
subjects disapproved of the Mexican
American therapist not because he was Mexican
American but because his behavior was
not representative of that of actual Mexican
Americans. Like the woman admiral, he may
have presented a combination of character- 
istics that is theoretically possible but un- 
known in the experience of the subjects re- 
sponding to it. The only guarantee that a 
script is free from impossible or improbable 
combinations of variables is when it is directly 
drawn from an actual clinical case or other 
human transaction. We cannot produce a 
fictional script of a psychotherapeutic session 
with any confidence that it is as representative 
as a transcript of an actual session. The ideal 
or typical therapeutic interview may be as 
rare as the perfect textbook case of conversion 
hysteria or as a stereotypical Mexican Ameri- 
can. This rarity or implausibility may well de- 
termine a subject’s response far more than the 
attributes that were planned to make it appear 
typical. Our hesitation in generalizing from a 
single stimulus case to a population of cases 
is increased substantially by the prospect of 
generalizing from a case that is not known to 
have existed at all. 


Representative Design 


The moral to the foregoing review is simple.
If we wish to generalize to populations of
stimuli, we must sample from them. Only in
this way can we be confident that the various
attributes that are found in the population
will be properly represented in the sample.
Those attributes that are significantly correlated
with membership in the population will
appear in appropriate and better-than-chance
proportions; those attributes that are uncorrelated
with population membership will appear
in chance proportions but will not affect
the outcomes. If we intend to draw conclusions
about the way in which physicians treat
former mental patients, we must sample physicians
and former mental patients. If we wish
to know what Mexican American students
think of Mexican American therapists, we
must sample students and therapists. This is
the essence of Brunswik’s (1947) concept of
representative design. There is no satisfactory
alternative to it. Nonetheless, the use of representative
design is rarely, if ever, seen in reported
research. There are, in my opinion,
three reasons for this. First, many clinical psychologists
are unaware of Brunswik’s work.
The remedy for this is obvious and easy to
apply. Second, there is a common failure to
understand that the replication of single-stimulus
studies with additional single-stimulus
studies cannot create accumulated representative
design unless the selection of single-stimulus
persons was achieved by sampling.

Let us consider a hypothetical series of
studies of the effect of examiner gender on
children’s test responses. In the population of
examiners, there are likely to be attributes
that distinguish males from females in addition
to those that are inseparable from gender.
Thus the proportions of married and single
persons, prior experience with children, knowledge
of various hobbies, mean age, prior locale
of undergraduate education, and so forth, may
differ between the two groups. In the first
study we use one male examiner and one female
examiner, each with 1 year of experience.
Using samples of male and female children,
we find differences in test responses attributable
to examiner gender. Conscious of the
fact that we included inexperienced examiners,
we replicate the study with one male examiner
and one female examiner, each with 3 years of
experience. Now we find no difference. Our
series ends when we have made gender comparisons
for examiners with 1, 3, 5, 7, 9, 11,
13, 15, 17, and 19 years of experience. We
find significant examiner effects at every
level of experience except 3 and 5 years. As 8
of the 10 studies have found significant differences
attributable to gender, we conclude that there
is a generalizable finding. We might even treat
the whole series as a single experiment comparing
the group of 10 male examiners with
the group of 10 female examiners and find a
significant difference between the
responses elicited by one group
and those elicited by the other.
To evaluate this conclusion it is first necessary
to know what the true proportion of the
population of examiners at each level of
experience is. If the experience range of 3-5
years includes the majority of all examiners, our best
conclusion is that gender differences have not
been established. The reason is, of course,
that the sample of examiners was not representative
of the population to which it is intended
to generalize, being underrepresented
in the 3 to 5 year experience range. Note that
we cannot correct this by some proportional
weighting of the data obtained from the examiners
with 3-5 years of experience, as the
results obtained from those comparisons suffer 
from the limitations of single-stimulus design 
and might well be due to the effects of uncon- 
trolled differences between examiners other 
than gender. 
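
The hazard described above can be made concrete with a small simulation. The
sketch below is an illustration under stated assumptions, not an analysis from
any study cited here: the function name, the size of the idiosyncratic examiner
effect (`examiner_sd`), and the number of children are all hypothetical. The
true effect of examiner gender is set to zero, so every "significant" result is
a false alarm produced by uncontrolled differences between individual examiners.

```python
import random
import statistics

def false_alarm_rate(examiners_per_gender, children_per_gender=50,
                     examiner_sd=0.5, n_studies=500, seed=0):
    """Share of simulated studies that declare a gender difference
    although the true gender effect is exactly zero."""
    rng = random.Random(seed)
    per_examiner = children_per_gender // examiners_per_gender
    false_alarms = 0
    for _ in range(n_studies):
        group_scores = []
        for _gender in ("male", "female"):
            scores = []
            for _ in range(examiners_per_gender):
                # Idiosyncratic examiner attribute, unrelated to gender
                quirk = rng.gauss(0.0, examiner_sd)
                scores.extend(quirk + rng.gauss(0.0, 1.0)
                              for _ in range(per_examiner))
            group_scores.append(scores)
        a, b = group_scores
        # Naive two-group z test that treats the children as the only
        # source of sampling error
        se = (statistics.variance(a) / len(a)
              + statistics.variance(b) / len(b)) ** 0.5
        z = (statistics.mean(a) - statistics.mean(b)) / se
        false_alarms += abs(z) > 1.96
    return false_alarms / n_studies

single_stimulus = false_alarm_rate(examiners_per_gender=1)
sampled_stimuli = false_alarm_rate(examiners_per_gender=25)
print(f"one examiner per gender:  {single_stimulus:.2f}")
print(f"25 examiners per gender:  {sampled_stimuli:.2f}")
```

Under these assumptions the single-examiner design reports spurious gender
effects far more often than the design that samples examiners, because the
idiosyncrasies of a sample of stimulus persons tend to cancel each other out.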

A third reason for the failure to use repre- 
sentative design is that it is laborious and ex- 
pensive. Providing an adequate sample of 
stimulus persons, each of whom is to be ob- 
served by an adequate sample of subjects, 
necessarily involves large numbers and long 
hours. For some investigators it is, as one of 
my correspondents put it, “too hard to do 
it right.” 

There is, however, no satisfactory alterna- 
tive to doing it right. Clinical psychology is 
concerned with real people and not with hypo- 
thetical collections of attributes. Our research 
into the behavior of patients, therapists, diag- 
nosticians, normal persons, and the like, must 
produce generalizations that are valid for 
actual populations of these people. Conclu- 
sions based on inadequate sampling may be 
worse than no conclusions at all if we decide 
to base our clinical decisions on them. If the 
patience and time that it takes to do it right 
create better science, our gratitude should not 
be diminished by the probability that fewer 
publications will be produced. 
Common Methodological Problems in Factor Analytic Studies 


Andrew L. Comrey 
University of California, Los Angeles 


Investigators are urged to plan factor analytic studies prior to collecting the data, 
to formulate a hypothesized factor structure, to develop several relatively pure 
measures of each factor expected, and to select an appropriate sample of at 
least 200 cases. Continuous variables should be used rather than dichotomous 
variables wherever possible. Programmatic series of studies are preferred over 
one-shot investigations. Putting unities in the diagonals and rotating all factors 
with eigenvalues of one or more is discouraged, because this procedure tends to 
give communalities that are too high, produces too many factors, and distorts 
the rotational solution, especially when analytic rotational programs are used. 
In some situations, a computer-assisted hand rotational solution is most likely 
to give satisfactory results. Mathematical algorithms designed to approximate 
simple structure work well only in situations properly designed for their application.


Despite the fact that factor analysis has 
been roundly condemned by many behavioral 
scientists as being without redeeming social or 
scientific value, the number of factor analytic 
investigations reaching manuscript form ap- 
pears to be on a geometric upward course. 
Why? For one thing, critics of factor analysis 
often are not sufficiently knowledgeable about 
factor analysis to discriminate between a 
proper and an improper use of the technique. 
They are ready, therefore, to generalize from 
a collection of poor studies to the conclusion 
that all factor analytic studies are worthless. 
There are situations in which benefits can be 
derived from the proper use of factor analysis, 
and this fact can account at least to some ex- 
tent for its expanded use despite persistent 
and heavy criticism. Probably a greater reason
for the increase in use, however, is the ready 
availability of programs at various computer 
centers that will produce factor analytic re- 
sults cheaply. The temptation to use such a 
convenient resource in this publish-or-perish 
age is apparently overwhelming. If all else 
fails in dealing with a body of data, there is
always the computer and factor analysis standing
ready to rescue an otherwise fruitless quest
for publishable results. In the face of an 
avalanche of poor factor analytic studies, it is 
not surprising that many thoughtful scientists 
and journal editors have come to distrust the 
method. For those who are willing to take the 
trouble to learn how to use such methods
properly, however, factor analysis can serve 
as a powerful analytic tool to aid the scientist 
in his search for reliable knowledge. The pur- 
pose of this article is to warn inexperienced 
users about some of the pitfalls that they may 
encounter in trying to use factor analytic 
methods in their research. 


Design of Factor Analytic Investigations 


In experimental studies it is rather unlikely 
that a statistical treatment procedure can be 
found after the data have been collected that 
will be entirely satisfactory unless some 
thought was given to the analysis when the 
study was designed. It is obviously best to design
the data collection for use with a particular
method of statistical analysis. So it is
with factor analysis, too. The variables should
be selected with some particular theory or conceptual
framework in mind, the data should
be collected in such a way that appropriate
correlational methods can be used, and a suitable
sample of individuals should be selected
to allow for appropriate application of
the chosen method of analysis. Unfortunately,
all too often, factor analysis is thought of only
after the data are already in. At this point, it
is next to impossible to carry out a good
factor analytic investigation.

The moral of the story, therefore, is to plan
ahead when using factor analysis. The investigator
should start with a first-stage conception
of what the factor structure should be like
in the domain he is studying (Guilford &
Hoepfner, 1971; Thurstone, 1947). This first-stage
conception may be generated by theory,
intuition, past results, or a combination of
these elements. The fact that an investigator
should start with a tentative conception of
what the ultimate factor structure is apt to be
does not mean, of course, that he should
try to force the results to confirm that original
conception. He should at all times be alert to
evidence that the data poorly confirm that
conception and take steps to revise it accordingly.

Variables should be selected for the analysis
that will provide good representative measures
for each of the expected factors. Insofar
as possible, it is preferable to have variables
that will measure one and only one of the expected
factors in any substantial way. Complex
variables that measure several different
factors will be of little value in locating the
correct factor structure in the analysis. After
the variables have been selected, it is important
to develop good measures of these
variables that will produce reliable scores with
relatively continuous distributions in the sample
of individuals to be studied. Without good
continuous measures, the correlation matrix
to be factor analyzed will be a poor one.
Finally, the sample must be carefully selected
to be representative of the kind of population
to which the investigator wishes to generalize
his results. It also must be large enough to
give stable correlation coefficients. If possible,
there should be at least five times as many
variables as the number of expected factors,
and there should be at least 200 subjects. I
have seen continued reduction in the perturbations
of factor analytic results with samples of up to
4,000 cases before the factor structure
stabilized. Factor analytic results based
on small samples can be considered at best as 
crude hypotheses to be tested in further in- 
vestigations. 
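
To give a rough sense of why small samples yield only crude hypotheses, the
sketch below computes an approximate 95% interval around a single correlation
at several sample sizes. It uses the standard Fisher z approximation, in which
the standard error of the transformed coefficient is 1/sqrt(N - 3). The
function name and the illustrative value r = .30 are assumptions introduced
here, not figures from the text.

```python
import math

def correlation_interval(r, n, z_crit=1.96):
    """Approximate 95% confidence interval for a Pearson r (Fisher z)."""
    z = math.atanh(r)                 # Fisher transformation of r
    se = 1.0 / math.sqrt(n - 3)       # standard error on the z scale
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)

for n in (50, 200, 1000):
    lower, upper = correlation_interval(0.30, n)
    print(f"N = {n:4d}: r = .30, interval roughly {lower:.2f} to {upper:.2f}")
```

Every coefficient in the matrix carries this kind of uncertainty, so a factor
solution computed from a small sample rests on many unstable entries at once.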

Beyond planning ahead in designing a fac- 
tor analysis, it is important to understand 
that the best use of factor analysis takes place 
in a programmatic series of investigations in 
which the researcher is constantly refining his 
conception of the factor structure and the 
variables that represent the factors, improving 
the extent to which the variables measure the 
factors, and testing hypotheses in subsequent 
studies generated by results in prior investiga- 
tions. A hypothetico-deductive approach is 
recommended. The initial investigation gives 
a first approximation to what the factor struc- 
ture should be. On the basis of this concep- 
tion, modifications in existing variables are 
made, new variables are added, and so on, and 
predictions are made about what will happen 
in the next study as a result of these changes. 
The next study verifies or fails to verify these 
predictions, thereby generating further sug- 
gested changes. In this manner, the researcher 
gradually clarifies what the factors are in a
given domain and what variables measure 
them in the most effective way. His results 
are stabilized over a series of investigations 
using different samples. 

This is not to say that a one-shot, after-the- 
fact treatment of existing data using factor 
analysis in a hunting expedition is always 
useless. Such results are useful, however, 
mainly as hypotheses to be followed up by 
more adequate programmatic research. In and 
of themselves, they have little value from a 
scientific standpoint unless they can be veri- 
fied in other investigations. It should also be 
emphasized that good programmatic factor 
analytic research usually will not be sufficient 
in and of itself to establish the ultimate utility 
of the identified factors. Internal analysis of
correlational data usually must be supple- 
mented by experimental and other kinds of 
non-factor-analytic investigations to test hy- 
potheses related to the factors before their 
scientific value can be clearly demonstrated. 


Selection of the Sample 


The particular sample of individuals studied 
in a factor analytic investigation can have a 
profound effect on the outcome of the analysis.
If we give ability tests to a sample of
children ranging in age from 8 to 14, for ex- 
ample, a huge general factor may be gen- 
erated, since most of the 14-year-olds will 
probably do better than most of the 8-year- 
olds on all the tests. The correlations between 
all tests will be very high, and a single factor 
will account for most of the variance. An un- 
wary investigator might call this a general 
intelligence factor, but in reality it is only a 
maturation factor. If all the tests were to 
correlate substantially within a sample of 
children of the approximate same age, on the 
other hand, this would be indicative of a gen- 
eral intelligence factor running through these 
cognitive measures. If, however, students from 
a very selective educational institution of 
higher learning were given tests of a similar 
kind, the general factor might disappear en- 
tirely because variance would be so restricted 
on general intelligence. Since all of the students
would be very bright in such a sample,
the general factor would be lost, leaving only 
group factors. 
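
The maturation effect described above is easy to reproduce with artificial
data. In the sketch below, two hypothetical tests both improve with age but
share nothing else: within any single age group their scores are independent.
The slope, the noise level, and the group size are arbitrary assumptions chosen
only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_per_age = 100
ages = np.repeat(np.arange(8, 15), n_per_age)      # ages 8 through 14

# Each score rises with age; the noise terms are independent, so the
# tests share no common ability within an age group.
test_a = 5.0 * ages + rng.normal(0, 5, ages.size)
test_b = 5.0 * ages + rng.normal(0, 5, ages.size)

pooled_r = np.corrcoef(test_a, test_b)[0, 1]

# Average of the correlations computed separately within each age group
within_r = np.mean([
    np.corrcoef(test_a[ages == a], test_b[ages == a])[0, 1]
    for a in range(8, 15)
])

print(f"pooled across ages 8-14: r = {pooled_r:.2f}")
print(f"mean within-age r:       r = {within_r:.2f}")
```

The pooled correlation is high while the within-age correlation is near zero,
which is the pattern that would tempt an unwary investigator to label a
maturation factor as general intelligence.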

It is very important to avoid idiosyncratic 
collections of subjects that can have gross ef- 
fects on the correlation matrix. For example, 
if two subjects who have much higher (or 
lower) scores than all of the other subjects 
on several of the variables are in the same 
sample, this will greatly elongate the scatter 
plot ellipses, generating very high correlations 
for those pairs of variables and injecting a 
great deal of spurious common factor variance 
into the matrix. The investigator cannot necessarily
anticipate all of the special circumstances
that might give him spurious results.
He can be suspicious, however, of any unexpectedly
high correlations, or low ones for
that matter, and investigate thoroughly to determine
if there are special problems with his
data leading to distortions in the correlation
coefficients.
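
A minimal sketch of the two-deviant-subjects problem follows. The data are
simulated, the two variables are independent by construction, and the extreme
value of 8 standard deviations is an arbitrary assumption made for
illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
x = rng.normal(0, 1, n)
y = rng.normal(0, 1, n)          # independent of x by construction

r_clean = np.corrcoef(x, y)[0, 1]

# Replace two subjects with cases that are extreme on both variables
x_out, y_out = x.copy(), y.copy()
x_out[:2] = 8.0
y_out[:2] = 8.0
r_outliers = np.corrcoef(x_out, y_out)[0, 1]

print(f"without the deviant cases: r = {r_clean:.2f}")
print(f"with two deviant cases:    r = {r_outliers:.2f}")
```

Two cases out of 200 are enough to produce a sizable correlation between
otherwise unrelated variables, which is why unexpectedly high coefficients
deserve a look at the scatter plot before factoring.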


Selection of the Variables 


It has already been indicated that a preliminary
conception of the probable factor
structure in the domain under investigation 
should be formulated, and then variables 
should be selected for the analysis that will 
provide good definitions of these putative factors.
Ideally, a variable for a factor analytic
study should measure one and only one factor
in the domain to any substantial degree; that
is, it should be a pure-factor measure. This is
not usually possible for all of the variables in 
the matrix, since it is difficult to develop pure- 
factor measures. In particular, it is not likely 
to be the case when the investigator is bound 
to use existing data that were not collected 
with a factor analysis in mind. When the 
analysis is planned in advance, the investi- 
gator can and should carefully plan which 
variables he wishes to include, selecting ap- 
proximately an equal number of variables for 
each hypothesized factor and trying to make 
each variable as pure a measure of one factor 
as he can. He should seek particularly to 
avoid the situation in which a substantial percentage
of the included variables are complex
measures, each with substantial expected con- 
tributions to two or more factors. It is very 
difficult, if not impossible, to attain an ap- 
propriate rotational solution with any known 
analytic method when a majority of the vari- 
ables in the analysis are complex. 

Beyond getting the most appropriate col- 
lection of variables in the matrix, it is very 
important to develop good, reliable measures
of those variables. All too often, there is an 
attempt made to factor analyze poor data 
variables, such as two-choice questionnaire 
items with poor splits in the proportion of yes 
and no responses. Proper variables for factor 
analytic purposes should be relatively con- 
tinuous; that is, they should have many pos- 
sible categories of response and reasonably 
normal distributions. Also, the regressions for
every pair of variables should be linear and 
reasonably free of gross departures from the 
normal expected elliptical pattern of data 
points in the scatter diagram. When two- 
choice response variables are used, it is not 
even possible to investigate linearity, and 
furthermore severe distortion can be introduced
into the correlation matrix with a consequent
dramatic effect on the factor analytic
solution. If one variable is represented by a
measure that splits 50-50 while another is
represented by a measure that splits 95-5, the
maximum possible correlation between the two
variables is limited to an absolute value of
approximately .23. With more appropriate
measures on continuous scales, these two variables
might correlate much higher. The form
of measurement of these two variables imposes
an artificial limit on the size of the correlation
that could introduce a serious distortion in the
obtained factor structure.
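
The ceiling of about .23 can be checked directly. For two dichotomous items
with "yes" proportions p1 and p2 (p1 the smaller), the largest attainable phi
coefficient is sqrt[p1(1 - p2) / (p2(1 - p1))], reached when everyone who
endorses the rarer item also endorses the more common one. The function name
below is a hypothetical label for this standard formula.

```python
import math

def phi_max(p1, p2):
    """Largest attainable phi between two dichotomous items.

    p1 and p2 are the proportions answering "yes" to each item. The
    maximum occurs when the "yes" group of the rarer item is entirely
    contained in the "yes" group of the more common item.
    """
    small, large = sorted((p1, p2))
    return math.sqrt((small * (1 - large)) / (large * (1 - small)))

print(round(phi_max(0.50, 0.95), 2))   # 0.23
```

Two items with identical splits have a ceiling of 1.00, which is one reason
for keeping dichotomous measures as close to 50-50 as possible when they
cannot be avoided.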
Even more dramatic distortions may occur,
however, because grossly inflated correlation
values also can be introduced by poor measures
of the variables used. Again using two-choice
variables as an illustration, suppose
that in a sample of 200 cases, 199 subjects
answered no to two items and one person answered
yes to both. In this case, the correlation
between the two variables would be 1.00
because of 1 deviant individual out of 200. If
that person had said no to both items, the correlation
would have been .00 instead of 1.00.
Thus, with two-choice data, the correlations
can be artificially limited in size or they can
be grossly inflated, depending on the situation
encountered, as compared with what probably
would have occurred with continuously measured,
normally distributed variables. The investigator,
therefore, should seek to use variables
for analysis that are continuous rather
than dichotomous. I have personally given up
using two-choice items wherever possible, preferring
to work with seven-choice items if they
are to be factor analyzed. This substantially
reduces the possibility of gross distortions in
correlation coefficients like those indicated above.
In factor analytic studies with
item data, I prefer to use variables
that are total scores over several
homogeneous items that have been shown to
measure the same variable. These total
scores have many more possible data points
than the scores associated with a single item.
They are more nearly continuous, tend to be
more normally distributed, and are almost always
more reliable than individual item scores.
Factor analytic results for such improved
measures are less subject to distortion
than those based on dichotomous measures
or single items.
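
The 199-to-1 illustration given earlier can be verified in a few lines. The
item names are placeholders; the data are exactly the hypothetical case
described, with one subject answering yes to both items and everyone else
answering no to both.

```python
import numpy as np

item_a = np.zeros(200)
item_b = np.zeros(200)
item_a[0] = 1      # the single subject who answers "yes" to item A
item_b[0] = 1      # ...and also answers "yes" to item B

r = np.corrcoef(item_a, item_b)[0, 1]
print(round(r, 2))   # 1.0
```

One practical note: if that subject had answered no to both items, neither
item would vary at all, and most software (NumPy included) reports the
coefficient as undefined rather than as a computed zero. Either way, a single
response determines the entire result.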

Efforts have been made to correct for
distortions in dichotomous variable intercorrelations
by such devices as phi over phi
max and tetrachoric correlations, but these
methods can theoretically only deal with the
problem of artificial limits on the size of the
correlations. These methods do not even ad- 
dress the problem of spuriously high correla- 
tions. Comparisons of the results of factor 
analyses of the same data using phi coef- 
ficients (Pearson correlations with 1 and 0 
values assigned to the two dichotomous cate- 
gories), phi over phi max, and tetrachoric 
correlations show that the latter two coef- 
ficients can grossly inflate the amount of 
common factor variance in the matrix in cer- 
tain situations (Comrey & Levonian, 1958). 
The sad fact is that nothing can be done to 
make good data out of poor data. Wherever 
possible, the investigator should make plans 
in advance to obtain continuously measured 
scores for the variables that he wishes to 
factor analyze. If dichotomous measures must 
be used, try to get them as close to 50-50 
splits as possible for all measures and then 
interpret the results very cautiously. In par- 
ticular, look for any large correlations that 
might have been artificially elevated by idio- 
syncratic response distributions in the mea- 
sures correlated. 


Extraction of Factors 


What to put in the diagonal cells as initial 
communalities and how many factors to ex- 
tract are the two decisions that create the 
most difficulty for typical research users of 
factor analysis. Many popular computer pro- 
grams offer the option of inserting 1.0 for each 
communality (diagonal) cell of the correla- 
tion matrix followed by the extraction of all 
factors by the principal factor method that 
have eigenvalues greater than or equal to 1.0 
(called the “eigenvalue-one criterion” by 
Rummel, 1970, p. 363). That is, the sum of 
the squares of all factor loadings for the ex- 
tracted factor is 1.0 or greater. One justifica- 
tion for this procedure is that each variable 
included in the matrix adds 1.0 to the total 
communality in the entire matrix, hence any 
retained extracted factor should contribute at 
least as much as the effect of adding one vari- 
able to the matrix. This option has become 
very popular, probably because it is readily 
available in existing computer programs that 
are widely used, and because it requires no 
judgment on the part of the investigator. 
There are problems with this approach, however,
that can lead the unwary investigator
astray (Comrey, 1973; Guertin & Bailey, 
1970; Lee & Comrey, Note 1; Comrey & Lee, 
Note 2). First of all, each variable has 1.0 as 
its potential communality after all possible 
factors have been extracted. This is true 
whether the variable has much or little actual 
common factor variance. By the time all fac- 
tors have been extracted with eigenvalues 
greater than 1.0, it not infrequently happens 
that an amount of common factor variance has 
been extracted for certain variables that is 
inappropriately high given the size of the cor- 
relations with the other variables in the ma- 
trix. This is particularly apt to occur when 
items of a two-choice nature are being corre- 
lated. These variables are notoriously unre- 
liable and often have low correlations with 
each other. The factor results based on analy- 
sis of such matrices with the eigenvalue-one 
procedure described above, however, often 
show very high communalities and large factor 
loadings for these dichotomous variables. 
When this spurious extra common factor vari- 
ance is introduced into the matrix, the number 
of factors that seem worthy of retention, even
after rotation, is often too high, further distorting
the solution. It is my conclusion that
the eigenvalue-one procedure should not be 
used unless the variables in the matrix are all 
good, reliable, continuous measures that cor- 
relate substantially with each other, leading to 
rather high true communalities for all vari- 
ables. In this case, the distortion introduced 
by the eigenvalue-one procedure is less likely 
to represent a problem.
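
The tendency of the eigenvalue-one procedure to retain too many factors can be
seen with data that contain no common factors at all. The sketch below is an
illustration under assumptions of my own choosing (200 subjects, 20 mutually
independent normal variables); it is not a reanalysis of any data set cited
here.

```python
import numpy as np

rng = np.random.default_rng(0)
n_subjects, n_variables = 200, 20

# Independent variables: the population correlation matrix is the identity,
# so there is no common factor variance to be found.
data = rng.normal(size=(n_subjects, n_variables))
R = np.corrcoef(data, rowvar=False)          # unities in the diagonal

eigenvalues = np.linalg.eigvalsh(R)[::-1]    # sorted largest first
n_retained = int(np.sum(eigenvalues >= 1.0))

print("sum of eigenvalues:", round(float(eigenvalues.sum()), 2))
print("factors retained by the eigenvalue-one rule:", n_retained)
```

Because the eigenvalues must sum to the number of variables, sampling error
alone pushes a good share of them above 1.0, and the rule retains several
"factors" from pure noise.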

The distortion introduced by this method 
can be mitigated by iterating the communal- 
ities, that is, after extracting factors with 
eigenvalues greater than or equal to 1.0, re- 
insert the accumulated communalities for that 
number of factors, reextract the same number 
of factors, and repeat (Lee & Comrey, Note 1; 
Comrey & Lee, Note 2). Unfortunately, the
values to which the communalities converge,
if they do, will not necessarily be, and often
will not be, the same values to which they would
converge starting from different initial communality
estimates. If unities are used in the
diagonals to begin with for a common factor
analysis, however, I believe that it is better to
iterate the communalities to stability, since
this will typically introduce less distortion into 
the solution than failing to iterate. 
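
The iteration just described can be sketched as follows. This is a minimal
illustration, not the program used in the cited work: the function names, the
convergence tolerance, and the example matrix are assumptions. The example
matrix is built from a hypothetical one-factor model with known loadings so
that the true communalities are available for comparison.

```python
import numpy as np

def principal_factors(R, n_factors, diag):
    """Principal factor loadings from R with `diag` placed in the diagonal."""
    reduced = R.copy()
    np.fill_diagonal(reduced, diag)
    values, vectors = np.linalg.eigh(reduced)
    order = np.argsort(values)[::-1][:n_factors]
    # Clip tiny negative eigenvalues before taking square roots
    return vectors[:, order] * np.sqrt(np.clip(values[order], 0, None))

def iterate_communalities(R, n_factors, start, max_iter=500, tol=1e-6):
    """Reinsert communalities and reextract the same number of factors
    until the communalities stop changing."""
    h2 = np.asarray(start, dtype=float)
    for _ in range(max_iter):
        loadings = principal_factors(R, n_factors, h2)
        new_h2 = np.sum(loadings ** 2, axis=1)
        if np.max(np.abs(new_h2 - h2)) < tol:
            break
        h2 = new_h2
    return new_h2, loadings

# Hypothetical one-factor structure with known loadings
true_loadings = np.array([0.8, 0.7, 0.6, 0.5, 0.4])
R = np.outer(true_loadings, true_loadings)
np.fill_diagonal(R, 1.0)

first_pass = np.sum(principal_factors(R, 1, np.ones(5)) ** 2, axis=1)
iterated, _ = iterate_communalities(R, 1, start=np.ones(5))

print("true communalities:     ", np.round(true_loadings ** 2, 2))
print("unities, not iterated:  ", np.round(first_pass, 2))
print("unities, iterated:      ", np.round(iterated, 2))
```

The number of factors is held fixed throughout the loop. In this tidy example
the iterated values recover the true communalities; with real data, as noted
above, the values reached may depend on the starting estimates.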

It cannot be assumed, however, that the 
correct number of factors is obtained by stop- 
ping extraction when the eigenvalues are less 
than 1.0. This procedure would seldom give 
too few factors, although it often gives too 
many. The investigator should look at a 
number of criteria for terminating extraction, 
including examining rotated solutions with 
different numbers of factors, before leaping 
to the conclusion that he has the right num- 
ber. There is no definitive solution to the 
problem of determining the correct number of 
factors. It is up to the investigator to use all 
of the information that he can get to reach a 
conclusion about this and to try to justify his 
conclusion within several converging lines of 
evidence instead of relying on a universal rule 
of thumb. 

A second common factor extraction strategy 
is to use squared multiple correlations in the 
diagonal cells and extract all factors by the 
principal factor method having eigenvalues 
greater than zero. This procedure can be fol- 
lowed by iteration of the communalities until 
they stabilize. The stabilized communalities 
generally will not diverge from the squared 
multiple correlations by this procedure as 
much as the stabilized communalities will dif- 
fer from 1.0 in the previously discussed pro- 
cedure. For this reason, iteration of the com- 
munalities is less necessary in this latter pro- 
cedure, and in fact, it may not give a better 
solution than the noniterated communalities. 
Iteration of the communalities tends to increase
the size of these values, on the average,
capitalizing on chance error to raise the proportion
of variance extracted. I have found
cases in which it appears that the communal- 
ities have been raised to unrealistically high 
values by this iterative process (Comrey, 
1973). It is not unreasonable, however, to
iterate the communalities and examine the re- 
sults in comparison with the uniterated results 
to see which solution can be justified as being 
the more reasonable. The number of factors 
must be kept constant during the iteration 
process, however, or the number of factors will 
gravitate toward the number of variables and 
the communalities will gravitate toward 1.0
instead of stabilizing at smaller values. Using
squared multiple correlations and eigenvalues
greater than zero as a criterion will typically
extract substantially more factors than the
eigenvalue-one procedure. Care must be taken
to eliminate superfluous factors during the
rotation process to avoid distorting the final
solution.

A third strategy is to obtain a solution that does not
require communality estimates. Comrey (1962)
devised a stepwise minimum residual solution
that operates only on the off-diagonal elements
of the correlation matrix. Communalities
are derived as a result of the factor extraction
process rather than being estimated
ahead of time and then influencing the calculation
of factor loadings. The Comrey and
Ahumada procedure has a built-in criterion
for terminating extraction when iterated factor
vectors converge on vectors of opposite
sign in the extraction process. This corresponds
roughly to the point in the principal
factor method of terminating factor extraction
where eigenvalues become negative when
squared multiple correlations have been used
as initial communality estimates. The communalities
that result from this minimum
residual procedure can be used as the point of
departure for an iteration of communalities by
the principal factor method. This may or
may not prove to be defensible, depending on
how much elevation of the communalities
occurs in the iteration process. Iterating the
communalities, starting with the minimum residual
communalities as initial estimates,
gives in most cases about the same results
as would be obtained for that
number of factors using the Harman and
Jones (1966) minres method. Minres is also
a minimum residual solution that reduces
the off-diagonal residuals to a minimum,
but it does this after a specified number of factors
has been extracted, not one factor at a time as in the Comrey
(1962) method. With the minres method, the
number of factors desired must be specified in
advance.

Extracting factors by the principal
factor method with squared multiple correlations
in the diagonal and iterating until communalities
stabilize, extracting factors by the
Harman and Jones (1966) minres method, or
extracting factors by the Comrey method 
(1962) and iterating the communalities all 
give similar results if the number of factors 
is held constant. Extracting the same number 
of factors with the communalities set equal to 
1.0 can give very different results that are 
difficult to defend. If the eigenvalue-one re- 
sults are iterated until the communalities 
stabilize, the amount of distortion will typi- 
cally be lessened although not necessarily 
eliminated entirely. Determining the number 
of factors to extract is still a major problem 
with all of these methods. Ordinarily, it is 
necessary to consider a variety of criteria and 
even rotate various numbers of factors before 
coming to a final conclusion about what is the 
appropriate number of factors. Just accepting 
the number of factors suggested by a particu- 
lar termination criterion can lead to gross dis- 
tortions in the final results. 

Many other procedures for extracting factors
are available, each with its own characteristics
and potential abuses.¹

¹ For further information, consult the factor analysis
textbooks mentioned later in the article.


Rotation of Factors


The unrotated factor matrix is computed in
most methods such that the product of this
matrix by its transpose will approximate the
original correlation matrix within a certain
margin of error. If a given factor matrix will
do this, so will an orthogonal transform of
that matrix, that is, a matrix that can be
reached by orthogonal rotations of the unrotated
factors. The axes do not even have to
be rotated orthogonally, but then the angles
formed by the axes must be considered in reproducing
the correlation matrix. The various
methods of factor extraction obtain the unrotated
matrix by satisfying certain mathematical
criteria, such as extracting the maximum
amount of variance in one factor or
minimizing the sum of squares of residuals
of off-diagonal elements, and so on. In almost
all cases, these mathematical criteria locate
the factor axes in positions that have nothing
to do with psychologically meaningful positions
for the axes. In these cases it is necessary,
therefore, to carry out rotations of the
unrotated factor axes, either orthogonal or
oblique, before the factor axes are appropri- 
ately located to define psychologically mean- 
ingful constructs. Since the number of differ- 
ent positions for these axes is virtually un- 
limited, there is no unique solution to the 
rotation problem in factor analysis. Many dif- 
ferent criteria have been developed to help the 
investigator locate the “correct” factor axis 
positions, but none of these criteria has 
achieved general acceptance. The basic reason
for this is that no one criterion of rotation can 
fit all situations, since the proper location of 
axes in a given matrix is often quite specific 
to that particular problem. A rotational solu- 
tion is nothing more than the investigator’s 
interpretation of the data. It may or may not 
be a good one. 
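
The indeterminacy described above can be checked numerically. The sketch below uses a made-up four-variable, two-factor loading matrix and an arbitrary 30-degree rotation (both are assumptions for illustration only) to show that an orthogonal transform of a factor matrix reproduces exactly the same correlations as the original matrix.

```python
import numpy as np

# Hypothetical unrotated loadings: 4 variables on 2 factors.
A = np.array([[0.7, 0.3],
              [0.6, 0.4],
              [0.2, 0.8],
              [0.1, 0.7]])

# An arbitrary orthogonal transformation (rotation through 30 degrees).
theta = np.deg2rad(30.0)
T = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

B = A @ T  # rotated factor matrix

# Each matrix times its own transpose gives the reproduced correlations
# (with communalities in the diagonal).
reproduced_unrotated = A @ A.T
reproduced_rotated = B @ B.T
same_fit = np.allclose(reproduced_unrotated, reproduced_rotated)
```

Because every such `T` fits the correlations equally well, the fit to the data cannot choose among rotated positions. That choice has to come from other considerations.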

Thurstone (1947) popularized the criterion 
of simple structure for rotation of factor axes, 
and many of the modern computer procedures 
for rotation of axes are merely attempts to 
approximate simple structure through the ap- 
plication of specially designed mathematical 
algorithms to the data. Simple structure usu- 
ally will work reasonably well, as will com- 
puterized approximations to it, if the study is 
properly designed for the use of this criterion. 
Thus, if there are several well-defined factors, 
each measured by several pure-factor mea- 
sures (one-factor variables) that are normally 
distributed and reliably assessed, and if there 
are few if any complex factor measures, simple 
structure will ordinarily work very well in- 
deed. Unfortunately, in practice relatively few 
factor analytic studies carried out actually 
fit this model. In most actual analyses, the 
factors are not so well charted in advance, 
each factor does not have several pure-factor 
measures to define it, and there are usually 
many complex variables. The number of variables
defining each factor may vary considerably, and
there may be few well-defined factors in
the matrix. In the latter case, simple structure is
not apt to be very successful in pointing the
way to the best solution of the rotation problem,
and computerized approximations to it will not be
either. A situation such as this requires all of
the skill and experience that the most knowledgeable
factor analyst can muster to have
any hope of locating the best solution. Since 
such idiosyncratic collections of variables do 
not fit the simple structure paradigm, to 
locate the best factor structure for the data it 
is often necessary for the investigator to fix 
the rotated factor positions in accordance 
with knowledge that he may have about the 
variables. Thus, he might know that one variable
has been found in several past analyses to
be a relatively pure measure of a certain fac- 
tor. In his rotations, he would deliberately 
place one of the factors coincident with this 
variable vector and attempt to rotate it in
such a way that variance for this variable 
would not appear in any substantial way on 
other factors. He might know that two other 
highly correlated variables were complex composites
of factors X and Y. Not having pure
measures of X and Y, positioning of the axes 
would be difficult, but at least the factors 
could be positioned in such a way that a fac- 
tor is not run through these two complex 
variables as an analytic rotation program 
might. In short, a resort to the old “hand-rotation”
procedures using all of the knowledge
that one can muster about the variables
may be necessary to have any chance
of getting a reasonable rotated factor solution 
(Comrey, 1973). This skill has almost become 
a lost art with the advent of the high-speed 
computer. These hand rotations can be per- 
formed without undue labor, however, using 
the computer to do the hard work, so there is 
no need to give up this valuable option just 
because it used to be so laborious.

I have developed a computer program
(Comrey, 1973), for example, that first plots
the factors. These plots are inspected visually
to find the desired rotation, orthogonal or
oblique, and then the instructions are given to
the computer to carry out the rotations and
make new plots. The investigator only has to
inspect the plots and determine which factors
he wishes to rotate by which angles of rotation.
This is more work than applying a computer-programmed
analytic method, of course,
and it requires the application of some judgment,
but it is not laborious, and it is a skill
that can be mastered with some practice.

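
The arithmetic behind one such hand rotation is small. The sketch below is not the Comrey (1973) program. It is a hypothetical helper, `rotate_pair`, that rotates two chosen factor axes orthogonally through an angle the investigator has read off a plot and leaves the remaining factors alone. The loadings and the 45-degree angle are invented for the example.

```python
import numpy as np

def rotate_pair(loadings, i, j, degrees):
    """Rotate factor axes i and j counterclockwise through `degrees`.

    Loadings on the new axes are the old loadings projected onto the
    rotated axes; all other factors are left untouched.
    """
    A = np.array(loadings, dtype=float)
    t = np.deg2rad(degrees)
    c, s = np.cos(t), np.sin(t)
    fi, fj = A[:, i].copy(), A[:, j].copy()
    A[:, i] = c * fi + s * fj
    A[:, j] = -s * fi + c * fj
    return A

# Hypothetical loadings: the first two variables lie midway between
# factors 0 and 1, the last two lie midway on the other side.
unrotated = np.array([[0.5,  0.5, 0.1],
                      [0.6,  0.6, 0.0],
                      [0.5, -0.5, 0.2],
                      [0.4, -0.4, 0.1]])

rotated = rotate_pair(unrotated, 0, 1, 45.0)
```

After this single rotation each variable loads on only one of the two rotated factors. The communalities and the third factor are unchanged. In practice the cycle of plotting, choosing an angle, and rotating would be repeated for other factor pairs.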
Critics of factor analysis will be quick to
point out that such a procedure is subjective
and hence suspect. This is no doubt true. The
investigator is charged with finding the best
possible interpretation of the data that he
can and then justifying his interpretation
with evidence that goes beyond the factor
analytic techniques used to obtain that solution.
Application of a mathematical algorithm
is more objective to be sure, but it does not
alter the onus placed on the investigator to
justify by other lines of evidence that he has,
in fact, found the best structure for the data.
If the mathematical algorithm applied happens
to be inappropriate for the data and happens
to give nonsense for results, it is small
comfort to the investigator to know that the
process was untainted by human judgment. It
is the fond wish of naive investigators that
factor analysis could be an objective machine
that yields results untainted by human
judgment. Unfortunately, such devices do not exist in
the social sciences, nor indeed in the
physical sciences. The reality is that factor
analysis is something like a microscope. It can
help the well-trained investigator
see things that he could not see with the
naked eye and perhaps derive scientific conclusions
of value as a result. The untrained individual
looking through a microscope can see
many things, but he may not be able to interpret
what he sees correctly. The untrained
investigator can get a final rotated
factor matrix from the computer center, but
he is not in a very good position to draw
appropriate conclusions from what he sees.
We would not throw out or blame the
microscope because an untrained individual
misuses it; neither should we blame the
method when factor analysis is applied improperly.

Orthogonal versus oblique axes. The decision
to use orthogonal or oblique rotated
axes should be a conscious one by the investigator
for good cause, not based on just what
happens to be available at the computer center.
Many investigators give no thought to
determining whether they should have an
oblique or orthogonal solution and, if oblique,
how oblique. Conformity to simple structure
criteria (many variables with very low loadings) can be
improved more and more as axes are allowed
to become more and more oblique. Some analytic
procedures require the specification of
a constant that will put more or less of a 
brake on this tendency toward greater oblique- 
ness. Others require the specification of a 
maximum permissible correlation between 
oblique factors. To avoid inappropriate de- 
grees of obliqueness in the solution, whatever 
rotational method is used, the investigator 
should have some idea of what correlations 
between his factors are reasonable. The ac- 
tual angles between factors in his oblique 
solution should be examined with this in mind. 
If the oblique solution rendered by an analytic 
computer method yields factor correlations 
that cannot be justified as reasonable, the in- 
vestigator should be prepared to modify the 
solution. 
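
One way to examine the angles between oblique factors is sketched below. It assumes that the oblique factor axes are available as direction vectors expressed in the space of the orthogonal unrotated factors. With unit-length axes, the correlation between two factors is the cosine of the angle between them. The function names, the example axes, and the ceiling of .40 are hypothetical. What counts as a reasonable factor correlation is a substantive judgment for the investigator, not something the code can supply.

```python
import numpy as np

def factor_correlations(axes):
    """Correlations among oblique factors whose axes are the columns of `axes`."""
    T = np.asarray(axes, dtype=float)
    T = T / np.linalg.norm(T, axis=0)   # scale each axis to unit length
    return T.T @ T                      # cosines of the angles between axes

def angles_in_degrees(phi):
    """Angles between factor axes implied by the factor correlations."""
    return np.degrees(np.arccos(np.clip(phi, -1.0, 1.0)))

def excessive_pairs(phi, max_r):
    """Factor pairs whose correlation exceeds what the investigator can justify."""
    k = phi.shape[0]
    return [(i, j, phi[i, j])
            for i in range(k) for j in range(i + 1, k)
            if abs(phi[i, j]) > max_r]

# Hypothetical oblique solution: factors 0 and 1 are 60 degrees apart,
# factor 2 is orthogonal to both.
axes = np.array([[1.0, np.cos(np.deg2rad(60.0)), 0.0],
                 [0.0, np.sin(np.deg2rad(60.0)), 0.0],
                 [0.0, 0.0,                      1.0]])

phi = factor_correlations(axes)
angles = angles_in_degrees(phi)
flagged = excessive_pairs(phi, max_r=0.40)   # 0.40 is an assumed ceiling
```

Here the 60-degree angle corresponds to a factor correlation of .50, which exceeds the assumed ceiling and would prompt a second look at the solution.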

Rotating different numbers of factors. The 
number of factors rotated in a solution can 
have a profound effect on the results, espe- 
cially if a computerized mathematical algo- 
rithm is applied. A common error is to ex- 
tract too many factors and then rotate them 
all by varimax or some other procedure that 
seeks something approximating a simple 
structure type of solution. The varimax algo- 
rithm will build up minor factors where pos- 
sible at the expense of major factors. Espe- 
cially where the eigenvalue-one procedure is 
used and too many factors are rotated, the 
varimax method may produce one or more 
factors with high loadings for one and only 
one variable. With a different kind of rota- 
tion procedure, this variable might have an 
important loading on one of the major factors
but not when the varimax procedure was
used with too many factors being rotated. The
presence of a rotated factor with one large
loading, say .6 or more, and no other loading
above .35, is almost a certain sign of a distorted
factor solution when common factor
variance is being analyzed.
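
The warning sign just described is easy to screen for mechanically. The sketch below flags any rotated factor that has one loading of .60 or more and no other loading above .35. The function name and the example loading matrix are hypothetical, and the two cutoffs are simply the illustrative values mentioned in the text.

```python
import numpy as np

def singleton_factors(loadings, high=0.60, low=0.35):
    """Indices of factors with one loading >= high and no other loading > low."""
    A = np.abs(np.asarray(loadings, dtype=float))
    flagged = []
    for f in range(A.shape[1]):
        ordered = np.sort(A[:, f])[::-1]          # largest absolute loading first
        if ordered[0] >= high and ordered[1] <= low:
            flagged.append(f)
    return flagged

# Hypothetical rotated solution: the third factor is carried by one variable.
rotated_loadings = np.array([[0.72,  0.10, 0.05],
                             [0.65,  0.20, 0.12],
                             [0.15,  0.70, 0.08],
                             [0.08, -0.66, 0.10],
                             [0.30,  0.25, 0.68]])

suspect = singleton_factors(rotated_loadings)
```

A flagged factor is a prompt to try rotating fewer factors, not proof by itself that the solution is wrong.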

Skillful hand rotations can avoid the prob- 
lems associated with rotating too many fac- 
tors. The extra small factors are simply not 
allowed to distort the proper appearance of 
the major factors. It is analytic rotation of 
these extra factors by mathematical algo- 
rithms that is apt to cause a problem. One 
way of dealing with this problem is to rotate 
several numbers of factors in the region of 
uncertainty by the mathematical algorithm
in question and then compare the solutions to
see which can be justified best as a proper 
interpretation of the data. I have developed a 
procedure called the “tandem criteria” method 
of rotation (Comrey, 1967) that starts out 
by rotating more than enough factors. Vari- 
ance is shunted from the major factors toward 
the smaller factors, but this process is re- 
strained by the correlations that exist among 
the variables. Correlated variables are retained
on the same factors. After this process,
factors that are still too small are dropped. 
Retained factors are rotated by a different al- 
gorithm to provide a more simple structure 
type of solution. It is often necessary to ro- 
tate two or three different numbers of factors 
in the region of uncertainty before a final de- 
cision is made about the “correct” number of 
factors. 

Rotating too few factors, on the other hand, 
forces the amalgamation of factors into com- 
plex composites, obscuring the most useful 
factor structure. Variance is ordinarily ex- 
tracted in successively smaller amounts from 
the first to the last factor, but parts of the 
variance from several factors appear as each 
factor is extracted. If only 10 factors are ro- 
tated when there are really 12 factors there, 
most of the variance for the two extra fac- 
tors will be included in the first 10 unrotated 
factors. This variance must go somewhere, 
so it is superimposed on the first 10 factors, 
distorting their true appearance. Nothing can
be done to correct the effects of rotating too 
few factors, except rerotating with more fac- 
tors. It is better, therefore, to err on the side 
of taking out too many factors rather than 
too few. 

One of the main points of this discussion 
is to suggest that it is often impossible to 
determine the correct number of factors just 
by considering the unrotated factors alone. 
Rotational solutions can often provide impor- 
tant information that will be useful in mak- 
ing this decision. In most empirical analyses, 
particularly with small samples, there is a 
good deal of error in the correlations that 
tends to add surplus variance to the solution.
Guertin and Bailey (1970) have pointed out
that […] with large samples produces greater communalities,
and in many cases this can mean
extra factors. Trying to decide just where the 
wheat ends and the chaff begins is not easy. 
The decision should take advantage of every 
scrap of evidence available to the investigator, 
including the benefit of comparing solutions 
with different numbers of rotated factors. This 
increases the dangers from subjectivity, of 
course, but for a cautious, scientifically ori- 
ented investigator, the benefits outweigh the 
disadvantages in my opinion. 

Confusing factor levels. A persistent prob- 
lem that plagues many factor analytic in- 
vestigations is the inadvertent production of 
low-level factors in the hierarchy of factor 
generality. A factor of this kind is indicated 
by the presence on the factor of two or more 
variables with high loadings that are very 
similar in character. Thus, systolic and dias- 
tolic blood pressure both included in the same 
analysis will produce such a low-level factor, 
or at least distort the nature of some factor, 
depending on the other variables included in 
the analysis. This is due to the fact that these 
two variables are highly correlated, measuring
very similar phenomena. As another example,
consider these two test items: (a) I
feel blue, and (b) I am depressed. Correlating
these items with others and factor analyzing
them will produce a low-level factor, with
these two items showing high loadings. These 
two variables are merely alternate forms of 
the same thing. Any factor analytic solution 
should be inspected for the presence of such 
low-level factors. They cannot represent con- 
structs of general interest and scientific util- 
ity per se, because these factors can be pro- 
duced virtually at will. It is only necessary 
to add another variable to the matrix that is
very similar to an existing one, and then a
factor will be produced defined by these two
variables. 

In my opinion, factors are more apt to be
useful if they stand at the next level up in
the hierarchy of factor generality. The variables
with high loadings on the factor should
not contain two or more variables that could
represent virtual alternate forms of the same
thing. Each variable with a high loading on
the factor should have a logically distinct and
separate identity, measuring something that
is not the same as what is being measured by
any other variable on the factor. The high-loading
variables defining the factor should be
related but not alternate forms of each
other. In my experience, it is not possible to
keep adding factors of this kind in a given
domain. One very quickly comes to the end
of the line. There is, on the other hand, no
end to the number of low-level factors that
can be generated. This fact alone favors
second-level factors versus first-level factors.

At the other extreme, also to be avoided,
are factors that are too high in the hierarchy.
Factors produced by correlations among
highly complex variables, such as several tests
of intelligence, are too broad to have much
utility. When several second-level factors are
found to be persistently intercorrelated, however,
a useful third-level construct may be indicated.
This is the case with general intelligence,
a third-level construct that is generated
by the intercorrelations that exist among
lower level factors of cognitive performance.
Identification of such useful higher order constructs
takes place best through the identification
of second-level factors that are correlated,
not by a direct assault using complex
variables in the factor analysis.

Obtaining a general factor. Occasions arise
when an investigator hopes to demonstrate
the presence of a general factor running
through the variables that he is investigating.
One of the common errors in this instance is
to take the first extracted factor and treat it
as though it were a general factor. This procedure
is unsatisfactory, because the first
factor can easily contain major loadings
for entirely unrelated variables. If the
factor is to be a general factor, the variables
defining it must be correlated with each
other, except in rare special cases involving
certain kinds of complex variables. The investigator
should make sure that the variables
defining the general factor are intercorrelated
to a satisfactory degree before
reaching the conclusion that he does in fact
have a general factor. On the other hand, if
rotations of the factors are carried out by
varimax, for example, or any other method
approximating simple structure, the tendency will
be to disperse the variance as much as possible,
thereby reducing the prominence of any
general factor that might be present. It is
desirable to use a method that will permit a
general factor to appear if in fact the data 
warrant it without forcing it on the data. 
The Criterion I rotational method of the tan- 
dem criteria (Comrey, 1967, 1973) is one 
such method. 
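
The caution about treating the first extracted factor as a general factor can be illustrated with a small artificial case. The correlation matrix below is invented: it contains two pairs of variables that correlate .60 within a pair and only .05 across pairs. For simplicity the sketch extracts the first principal component with unities in the diagonal, which is an assumption of the example rather than a recommendation. The helper names and the .40 loading cutoff are likewise hypothetical.

```python
import numpy as np

def defining_variables(loadings, cutoff=0.40):
    """Indices of variables whose absolute loading reaches the cutoff."""
    return np.flatnonzero(np.abs(np.asarray(loadings, dtype=float)) >= cutoff)

def weakest_intercorrelation(R, variables):
    """Smallest correlation among the given variables (off-diagonal only)."""
    sub = np.asarray(R, dtype=float)[np.ix_(variables, variables)]
    off_diagonal = sub[~np.eye(len(variables), dtype=bool)]
    return off_diagonal.min()

# Two nearly unrelated pairs of variables.
R = np.array([[1.00, 0.60, 0.05, 0.05],
              [0.60, 1.00, 0.05, 0.05],
              [0.05, 0.05, 1.00, 0.60],
              [0.05, 0.05, 0.60, 1.00]])

# First unrotated factor (largest eigenvalue of R).
vals, vecs = np.linalg.eigh(R)
first_factor = vecs[:, -1] * np.sqrt(vals[-1])

defining = defining_variables(first_factor)
weakest = weakest_intercorrelation(R, defining)
```

All four variables load about .65 on the first factor, yet the weakest correlation among them is only .05, so the factor cannot sensibly be called general. Checking the intercorrelations of the defining variables exposes this at once.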


Interpretation of Results 


The usual procedure in interpreting factor 
results is to inspect the variables that have 
high loadings on the factor, look for what 
they have in common, and then name the fac- 
tor in accordance with the common elements. 
Up to this point, the activity is little more 
than factor naming, and if nothing beyond 
this is done, the value of the analysis may be 
rather limited. When a name is given to a 
factor, a hypothesis has been formulated. Un- 
tested hypotheses usually have limited value 
until something has been done to test them. 
Ideally, as has been mentioned earlier, the 
investigator will make plans to carry out ad- 
ditional analyses in which he adds new vari- 
ables that should have major loadings on 
certain specified factors and low loadings on 
other factors if his hypotheses are correct. He 
will perhaps revise other variables in ways 
that predict certain outcomes if his factor in- 
terpretations are correct. These predictions 
will be tested by further investigations. Ex- 
periments may be carried out, with predictions 
being made as to the outcome, in which at- 
tempts are made to alter scores on one factor 
but not another. Results of the experiment 
will confirm or disconfirm the hypothesized 
factor meanings. In other cases, predictions 
may be made about how scores for a certain 
factor will correlate with other variables outside
the matrix. When the investigator carries
out one analysis and does not follow up after
the factor naming to test the accuracy of
his factor interpretations, he must recognize
that much remains to be done before his interpretation
can be regarded as proved.


Writing Up Results 


One of the most commonly violated rules 
of scientific reporting in factor analytic studies
is that the writer fails to give enough information
to permit another investigator to
repeat his work. There should be sufficient
information given about the variables studied 
so that someone else can also construct or 
obtain comparable measures if he so desires. 
The correlational matrix itself should be made 
available to the reader so that he can evaluate 
the adequacy of the solution reported or 
carry out his own preferred type of analysis 
on the data. Future developments may also 
provide more powerful methods of analysis 
that can be applied to the same matrix, pro- 
vided that the matrix is made available. Many 
factor analyses appear in manuscripts sub- 
mitted for publication that fail to contain 
this information, making it exceedingly diffi- 
cult to evaluate the work. True, these matrices 
often cannot be published in the article it- 
self, but at least they can be made available 
through auxiliary publication outlets and cer- 
tainly should be available to journal editors. 

Beyond this minimum, the investigator 
should preferably do much more and often 
fails to do so. He should report enough about 
his sample so that its impact on the results 
can be assessed. He should tell what method 
of carrying out the factor extractions he used, 
what he did about the communalities and why, 
and how he determined the number of factors 
extracted and rotated. He should also make 
available the unrotated factor matrix, using 
auxiliary publication outlets. 

The method of carrying out the factor rota- 
tions should be explained and the reasons 
given why this is an appropriate method for 
the data. Some information should be given 
about why an orthogonal or an oblique solu- 
tion was chosen. Discussion of the rotational 
solution with respect to the problems of de- 
termining the correct number of factors should 
show that the investigator at least considered 
alternate possibilities. The rotated factor 
matrix should also be made available. If there 
is an oblique solution, the structure matrix, 
the pattern matrix, and also the matrix of 
correlations between factors should be made 
available. Again, auxiliary publication outlets 
should be used for these matrices, but the 
journal editors should have access to these 
materials at the time of manuscript review.


Sources of Help


In an article such as this, it is possible to 
point out briefly some common trouble spots 
in factor analytic investigation and reporting, 
but it is obviously not possible to say all 
that needs to be said nor to give adequate 
advice on how to avoid all possible pitfalls.
Several good textbooks are available to help 
the investigator, each with its own strengths, 
weaknesses, and special emphasis. A few of 
the more recent books are mentioned below 
with brief annotations: 

Comrey (1973) assumes a knowledge of 
beginning statistics and high school algebra 
and gives elementary treatment of the mathe- 
matical background of factor analysis, but 
the major emphasis is on practical applica- 
tions. Computer programs are made available. 

Gorsuch (1974) contains a balanced approach
between theory and applications with
a level of difficulty greater than Guertin and 
Bailey (1970) but less than Harman (1976), 
Horst (1965), and Mulaik (1972). It is a 
good introduction to the more difficult books. 

Guertin and Bailey (1970) is essentially 
nonmathematical in its presentation and hence 
tends to be a “how to” cookbook; despite its
weakness on theory, the book contains much
useful information and results of many experiments
with different factor methods.

Harman (1976), published posthumously,
is the latest edition of Harman’s widely used
text and is an excellent source for the mathematical
methods and theory of factor analysis.
It is difficult reading for the person who
is poorly trained in mathematics, however, and
it does not emphasize the practical problems
in applying factor analysis.

Horst (1965) is an encyclopedia of information
about theory, methods, and techniques
in factor analysis. Unfortunately, Horst’s
notation makes the book very difficult to read
for most behavioral scientists.

Mulaik (1972) is a thorough mathematical
treatment of factor analytic techniques.
The major emphasis is on mathematical theory
rather than applications. Unless the reader
is fairly sophisticated mathematically, he
should probably start with some other book.

Rummel (1970) is a good introduction. It
emphasizes applications more than theory and
is similar to the Comrey (1973) and Gorsuch
(1974) texts in level of difficulty and mathematical
preparation required.


Experimental Methods and Outcome Evaluation


Michael J. Mahoney 
Pennsylvania State University 


Within the constraints imposed by philosophy of science and the limitations of 
scientific methodology, experiments are basically attempts to identify causes 
and to evaluate their effects. All experiments involve a comparison of dependent 
variable values in the presence of different values of another variable. This 
comparison is usually accomplished by one of two basic experimental strategies: 
(a) within-subject designs and (b) between-subject designs. The 12 most com- 
mon designs are outlined and discussed in the present article. Much of the dis- 
cussion focuses on the importance of internal, external, and theoretical validity 
in experimentation and data interpretation. To illustrate the most common 
threats to these three forms of validity, various components in a hypothetical 
experimental manuscript are discussed. This discussion is followed by an 
acknowledgment of the continuum of fallibility along which all experiments fall. 
It is argued that in the final analysis, our goals should be to strive toward con- 
ducting the least fallible inquiries, to cautiously interpret our experiments in 
accord with their logical warrant, and to guard against the paralysis of
complacency regarding the adequacy of current research methods.


The methodologist is often stereotyped as 
a person who is quite willing to offer criticism 
but who is generally less adventurous when 
it comes to the actual execution of research. 
Although armchair quarterbacking is a com- 
mon diversion among scientists, the bulk of 
our strategic insights seems to follow rather 
than precede the busy work of experimenta- 
tion. Besides illustrating the old maxim that 
hindsight is more plentiful than foresight, our 
post hoc criticisms also seem to exhibit an- 
other common theme—namely, that we tend 
to be more critical of other people’s research 
than of our own. It is, therefore, with some 
reservations that I comment on experimental 
problems in clinical psychology. I do not pre- 
sume that my own research has been methodo- 
logically flawless, nor do I harbor many illu- 
sions about the limitations of science. The 
perfect experiment has yet to be designed and 
is, in some sense, inconceivable (Weimer,
1977). Even if it were conceivable, however,
it is a safe bet that it would be impossible to
execute. Among other things, the human ele- 
ment in science makes it an inevitably falli- 
ble endeavor (Mahoney, 1976; Mitroff, 
1974). Let us therefore dismiss the notion of 
an ideal experiment and instead devote our 
attention to the continuum of fallible effort 
along which all experiments must fall. In this 
regard, our past mistakes and costly hind- 
sights may well serve to refine our continuing 
attempts to improve the field of clinical psy- 
chology. 

As clinical researchers, it is our goal to 
harness the powers of science in a manner that 
will help us refine human services—particu- 
larly psychotherapy and counseling. This 
might sound like a rather straightforward un- 
dertaking in that it would seem to require 
only a knowledge of scientific methods and 
the opportunity to apply them to problems 
of human distress. As it turns out, however,
few undertakings are quite as ambitious as 
the scientific study of therapeutic interaction. 
This may be due in part to our poor under- 
standing of the nature of science (Lakatos 
& Musgrave, 1970; Weimer, 1977). At this 
point in time, for example, no acceptable cri-
terion has been established for the demarca-
tion of science from nonscience. We must
therefore proceed, if at all, with a humility
about our understanding of what it is that
constitutes “good scientific research” and
allow considerable tolerance for statements
that convey relativity and tentativeness. To-
day’s science may well be tomorrow’s alchemy,
and it is imperative that this contextual fea-
ture be appreciated. With these apologetics
in mind, we can now proceed to a discussion
of contemporary issues that bear on the eval-
uation of research.


Fundamentals of Experimental Method 


In at least one sense, science is a search
for order—an attempt to discern, describe,
and apply systematic covariations between
events. Order is a prerequisite for prediction,
and accurate prediction is a first step toward
control. This is why the assumption of deter-
minism cuts so deeply into the core of sci-
entific methodology (Hook, 1958). Although
reliable covariation is a prerequisite for
reliable prediction, however, it is hardly
sufficient for the inference of a cause-effect
relationship. This, of course, relates to the
oft-quoted difference between correlation and
causation, or what David Hume regarded as
the difference between sequence and conse-
quence. Since covariation, no matter how
reliable, can never demonstrate causation,
correlational inquiries (or “associative” stud-
ies) are often considered nonexperimental in
nature. This is true, of course, if one defines
experiments as requiring an active manipu-
lation of independent variables. It should be
emphasized, however, that associative in-
quiries are hardly uninformative. Although
they can never demonstrate causation,
this is also true of the best conceived
experiment. The difference between the two
is not as dichotomous as some might expect,
with both forms of inquiry varying in the
relative strength of the conclusions they war-
rant. Some correlational studies can dis-
confirm hypothesized relationships, and
in this sense they are capable of corroborat-
ing a hypothesis. There is a tremendous dif-
ference between corroboration and confirma-
tion, however, which we shall explore in a
moment. For the time being it is worth noting
that such sciences as astronomy and meteor- 
ology are almost exclusively correlational, yet 
their progress is unmistakable. Having made 
this brief defense of associative inquiry, I 
shall hereafter restrict my comments to the 
topic at hand—namely, problems in experi- 
mental research as it has been traditionally 
conceived. 

There are, of course, substantial parallels 
and yet important differences among the con- 
cepts of prediction, causation, and explana- 
tion. Since the technical aspects of these terms 
have consumed entire volumes, only the most 
superficial rendering could be offered here. 
I shall therefore defer to Weimer’s (1977) 
more extensive discussions, which bear direct 
and significant implications for the conduct 
of research. In lieu of a more technical di- 
gression, suffice it to say that contemporary 
scientists often require the following in their 
evaluation of a causal relationship: 

1. relative temporal contiguity (together- 
ness in time); 

2. priority (the cause must precede the 
effect); 

3. noncontradiction (no observed instances
of the cause without the effect);

4. factor isolation (the elimination or con-
trol of all possible influences other than the
one being examined); and

5. replicability (the capacity to replicate
the alleged relationship).

The logical validity of these criteria will not
be examined here, but this should not be in-
terpreted as complacency regarding our un-
derstanding of the concept of causation.

Scientific experiments are basically attempts
to identify causes and to evaluate their ef-
fects. This is usually accomplished by making
systematic changes in one or more factors
(the “independent variables”) and looking
for any covarying changes in other factors
(the “dependent variables”). The division of
factors into independent and dependent cate-
gories is often arbitrary, and in some inquiries
auxiliary variables are measured (e.g., to
evaluate whether they can help predict the
prerequisites or range for a cause-effect in-
fluence). Common to all experiments, how-
ever, is the act of comparison. All experiments
involve a comparison of dependent variable
values in the presence of different values of
another variable. In extreme form, the manip- 
ulated (independent) variable may take on 
dichotomous values (e.g., present vs. absent). 
The basic question, of course, is whether dif- 
ferent values of the dependent variable(s) 
are associated with different values of the in- 
dependent variable(s). This comparison is 
usually accomplished by one of two basic ex- 
perimental strategies: (a) within-subject de- 
signs and (b) between-subject designs. 
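
The act of comparison described above can be illustrated with a brief sketch. The Python below is offered only as an illustration under stated assumptions: the function name `compare_conditions`, the condition labels, and the puzzle-time figures are hypothetical and do not come from any study cited here. It groups values of a dependent variable by the value of a dichotomous independent variable and reports the mean for each condition.

```python
from statistics import mean


def compare_conditions(observations):
    """Group dependent-variable values by independent-variable value.

    `observations` is a list of (iv_value, dv_value) pairs. The return
    value maps each independent-variable value to the mean of the
    dependent-variable values observed under it.
    """
    groups = {}
    for iv_value, dv_value in observations:
        groups.setdefault(iv_value, []).append(dv_value)
    return {iv_value: mean(values) for iv_value, values in groups.items()}


# Hypothetical data: (extrinsic reward condition, minutes spent on puzzles).
data = [
    ("present", 4.0),
    ("present", 6.0),
    ("absent", 9.0),
    ("absent", 11.0),
]

means = compare_conditions(data)
difference = means["absent"] - means["present"]
```

A difference between condition means is only the raw material for an inference. Whether it can be attributed to the independent variable depends on the design considerations taken up in the sections that follow.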


Within- and Between-Subjects Designs


As the term implies, within-subject designs 
are ones in which the comparison is made 
within the same subjects across time. That 
is, the dependent variable is measured at least 
after the introduction of the independent vari- 
able, and perhaps before or during experi- 
mental intervention. This may be done with 
a small number of individuals, in which case 
it is called “single subject” or “N = 1” re- 
search. The merits and shortcomings of single- 
subject research have received considerable at-
tention (cf. Greenwald, 1976; Hersen & Bar-
low, 1976; Kazdin, 1973; Sidman, 1960;
Thoresen, in press). Within-subject compari- 
sons can also be performed en masse on a 
single group of individuals, however, so that 
the within-subject design is not necessarily re- 
stricted to studies involving very few subjects. 

Between-subjects designs also attempt to 
compare values of the dependent variable(s) 
in the presence of different values of the in- 
dependent variable(s). In these experiments, 
however, the primary strategy is to look at 
differences between persons who, for example, 
have and have not been exposed to the inde- 
pendent variable(s). Historically, most be- 
tween-subjects designs have involved com- 
parisons between groups rather than individ- 
ual subjects, and thus the between-subjects 
design is often equated with “group research.” 
It should be kept in mind, however, that 
Ronn att the aforementioned comparison. 

There are many varieties of both within-sub-
ject and between-subjects designs. Many of
the former are outlined in Hersen and Barlow
(1976), and between-subjects designs are ex-
tensively described in Campbell and Stanley
(1963) and Paul (1969). The 12 most com-
mon designs will be outlined later, but we
should first discuss three types of experi- 
mental validity. 


Validity: Internal, External, and Theoretical


Few terms in scientific research have been 
applied as widely as the word validity, which, 
to make matters worse, takes on different 
meanings in its various uses. Without listing 
its numerous other uses and varieties, the term 
validity will here refer to “logical warrant” 
or the extent to which a statement or pro- 
cedure is logically consistent with the experi- 
mental intent. Borrowing from Campbell and 
Stanley (1963), internal validity refers to the 
extent to which a given set of procedures al- 
lows one to draw valid conclusions about what 
actually happened in an experiment. When 
an experiment exhibits maximal internal valid- 
ity, one has relatively strong logical warrant 
for deciding (a) whether an “effect” occurred 
and (b) whether that effect can be attrib- 
uted to the independent variable(s) in ques- 
tion. External validity, on the other hand, 
refers to the extent to which a set of experi- 
mental procedures allows one to draw valid 
generalizations to other subjects or situa- 
tions. A cause-effect relationship that has only 
been observed in a very small or unrepre- 
sentative sample of subjects would be said to 
have little external validity in the sense that 
its generalizability is severely restricted. To 
exhibit external validity, an experiment must 
first demonstrate internal validity. One would 
hardly want to generalize an observed rela- 
tionship if the validity of that relationship 
were itself in question. Internal validity is
therefore a prerequisite for external validity. 

A third—and often overlooked—form of 
validity is theoretical validity, which refers to 
the logical bearing of an experiment on some 
hypothesis or theory. Although there are occa-
sional exceptions in the early exploratory
stages of a research program, the vast major-
ity of experiments are conducted to test a
hypothesis or a theoretical prediction. No sin- 
gle experiment is ever crucial in such evalua- 
tions (Lakatos, 1970), but it is still important 
that the experimental procedures be clearly 
relevant to the hypothesis in question. In fact,
many scientific controversies focus on what
would constitute an adequate experimental
test of some hypothesis (Kuhn, 1962). More-
over, researchers may argue about which hy-
pothesis is most relevant to the current state
of the art. In their analysis of this issue,
Mitroff and Featheringham (1974) expand on
the popular notions of Type I and Type II
errors and talk about a Type III error—
namely, the probability of having conducted
the wrong experiment. Ideally, then, an in-
vestigator attempts to

1. design an experiment whose outcome will
have clear logical bearing on some focal hy-
pothesis (theoretical validity);

2. execute that experiment in a manner that
will maximize his or her logical warrant for
concluding whether an effect occurred and to
what factors the effect could be attributed;

3. use procedures that will maximize the
generalizability and replicability of any ob-
served relationship.

Campbell and Stanley (1963) discussed some
of the factors that pose
potential threats to these experimental ideals.
At the risk of exclusion and oversimplifica-
tion, the 10 most common culprits in experi-
mental inadequacy might be considered the
following:
1. selection of a theoretically irrelevant hy-
pothesis or issue;
2. use of a subject sample that is very small
or unrepresentative of the population to which
generalizations are to be drawn;
3. in the case of between-subjects designs,
the absence of random assignment to the vari-
ous experimental conditions;
4. poor specification of the independent
variable(s);
5. inadequate standardization, assessment,
or description of how the independent variable
was implemented;
6. inadequate control for factors other than
those of immediate experimental interest;
7. inadequate replication of the cause-effect
relationship (either within or between sub-
jects);
8. poor choice, specification, or assessment
of relevant dependent variables;
9. inadequate data presentation; and
10. conclusions or interpretations that are
not logically warranted by the experimental
procedures.

Since these experimental flaws are most com- 
monly cited post hoc—as in a journal referee’s 
comments—it might be worthwhile to briefly 
expand on each in the context of a hypo- 
thetical manuscript. 


Anatomy of an Experimental Manuscript 


For purposes of illustration, let us assume 
that we are evaluating a hypothetical study 
on the topic of whether extrinsic rewards can 
actually undermine a person’s interest in some 
activity. This issue has recently become a 
very controversial one (cf. Deci, 1971, 1972;
Lepper, Greene, & Nisbett, 1973; Levine & 
Fasnacht, 1974). Some attribution theorists 
have argued that extrinsic rewards (such as 
tokens) may cause a person to devalue the 
rewarded activity and to “infer that his ac- 
tions were basically motivated by the exter- 
nal contingencies of the situation, rather than 
by any intrinsic interest in the activity itself” 
(Lepper et al., 1973, p. 130). Others have
defended the use of extrinsic rewards in cer- 
tain situations to motivate performance. 


Introduction 

In the introductory section of our hypo- 
thetical manuscript, it would be easy to de- 
fend the timely relevance of this issue. On 
the other hand, an introduction usually in- 
cludes a specification of experimental hypothe- 
ses and a preface to the methods used. It is 
here that theoretical validity is often threat- 
ened in that a hypothesis may be oversimpli- 
fied or illegitimately inferred from a parent 
theory. For example, in our illustration, it is
hardly the case that attribution theory pre- 
dicts a decline in intrinsic interest every time 
extrinsic reinforcement is used. Among other 
things, the individual’s perception of the rein- 
forcer may moderate this alleged effect (cf. 
Steiner, 1970). It would therefore be mislead- 
ing to state the experimental hypothesis (im-
plicitly or explicitly) as “extrinsic rewards (al-
ways) lead to a decrease in intrinsic interest.”


Method 

As one moves into the Method section of
the article, the number of potential threats
to validity increases substantially. For ex-
ample, who are the subjects? There are prac- 
tical constraints on the research populations 
to which most of us have access, and this is 
understandable. We must bear in mind, how- 
ever, that these constraints will take their 
toll in limiting the generalizability of our 
findings. If we study the behavior of a small 
group of preschoolers in a progressive aca- 
demic community, our warrant for generaliza- 
tions is accordingly handicapped. In a tech- 
nical sense, of course, it is impossible to 
experiment with a “universally representative 
sample.” We cannot hope to include a wide 
enough range of subjects to allow us to say 
that they were representative of all persons 
in all ways. For practical purposes one must 
often choose beforehand which factors are 
most relevant to the generalizability of our 
findings. Subsequent replications and exten- 
sions of a study may help clarify its external 
validity. The use of laboratory analogues and 
“nonclinical” populations also bears on ex- 
ternal validity, and current writers appear to 
be divided on the conceptualization and merits 
of such research (cf. Bandura, 1978; Bern- 
stein & Paul, 1971; Kazdin & Rogers, 1978). 
In addition to the importance of a repre- 
sentative sample, most methodologists encour- 
age the use of “random assignment” in be- 
tween-subjects research. Essentially, this is 
intended to remove (or at least minimize) 
biases that might be present if one were to 
use a nonrandom method of deciding which 
subjects will experience which experimental 
manipulation. In our hypothetical study on 
intrinsic interest, for example, one might be 
biased in assigning children to experimental 
versus control groups. If the brighter children 
were overrepresented in one of these groups, 
this might constitute a threat to both inter- 
nal and external validity. By using a random 
assignment procedure, it is hoped that auxil- 
iary variables will be distributed evenly across 
the various experimental groups. With an
infinitely large sample, this would, in fact,
occur, but true random distributions on all
variables are impossible when one is working
with a finite sample. Once again, we face the
reality that our methodological ideals can be
only crudely approximated in actual practice.
This is particularly apparent when one is
working with preformed groups (e.g., class-
rooms), which must be dealt with en masse. 
These groups often present practical impos- 
sibilities for individual random assignment, 
and the conclusions drawn from such en- 
deavors must be appropriately cautious. Al- 
ternatives to random assignment include 
matching (yoking) and blocking (tiering), but 
these introduce complexities in data interpre-
tation (Feldt, 1958).
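
The logic of random assignment, and of blocking as one alternative, can be sketched in a few lines. The Python below is a minimal illustration rather than a prescribed procedure: the function names, the subject labels, the condition names, and the `ability_of` lookup are all hypothetical. The first function shuffles subjects and deals them into conditions in turn; the second does the same within each block so that a known auxiliary variable is spread across conditions by construction.

```python
import random


def random_assignment(subjects, conditions, seed=None):
    """Shuffle subjects, then deal them into conditions in rotation."""
    rng = random.Random(seed)  # a seed makes the assignment reproducible
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    assignment = {condition: [] for condition in conditions}
    for index, subject in enumerate(shuffled):
        assignment[conditions[index % len(conditions)]].append(subject)
    return assignment


def blocked_assignment(subjects, block_of, conditions, seed=None):
    """Randomly assign within each block defined by `block_of(subject)`."""
    rng = random.Random(seed)
    blocks = {}
    for subject in subjects:
        blocks.setdefault(block_of(subject), []).append(subject)
    assignment = {condition: [] for condition in conditions}
    for members in blocks.values():
        rng.shuffle(members)
        for index, subject in enumerate(members):
            assignment[conditions[index % len(conditions)]].append(subject)
    return assignment


# Hypothetical example: 12 children, two conditions, two ability blocks.
children = [f"child_{i:02d}" for i in range(12)]
conditions = ["reward", "control"]
ability_of = {child: ("high" if i < 6 else "low") for i, child in enumerate(children)}

simple = random_assignment(children, conditions, seed=1)
blocked = blocked_assignment(children, ability_of.get, conditions, seed=1)
```

With a finite sample, the simple procedure guarantees only that assignment is unbiased, not that any particular auxiliary variable ends up evenly distributed. The blocked version guarantees balance on the blocking variable alone, which is one reason such alternatives complicate interpretation.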

One of the most common (and frustrating) 
shortcomings of an experimental article is the 
failure to clearly specify the independent vari- 
able(s). It would not be informative, for ex- 
ample, to report that “subjects were rein- 
forced for playing with jigsaw puzzles.” A 
colleague who might want to replicate one’s 
efforts would be hard pressed to determine 
what actually took place on the basis of such 
a cursory and superficial description of pro- 
cedure. What constituted the reinforcer? Who 
administered it? How was it presented? Was 
the experimenter always the same person? 
Good operational specifications of procedure 
are essential not only to facilitate replication 
but also to clearly communicate what it was 
that was manipulated. It is difficult to inter-
pret an experimental outcome if one is unclear
about the independent variable. 

Related to this need for clear operational 
descriptions of procedure is the importance 
of assessing the presence and variability of the 
independent variable(s). This is particularly 
apparent when the presence of the indepen- 
dent variable is not totally in the hands of 
the experimenter. In our hypothetical manu- 
script, for example, there are at least two 
possible implications of the simple procedural 
statement that “subjects were reinforced” — 
namely: (a) Subjects were presented with 
positive stimuli, and (b) the reinforcement 
value of these stimuli was reflected by in-
creases in the target behavior. This second
meaning derives from the common (if prob-
lematic) definition of reinforcement as a pro-
cedure that increases the future likelihood
of specified performances. In our hypothetical
study, suppose the experimenters “reinforced”
children with candy. One might expect little
controversy over whether the children had
“really” been treated in a manner relevant
to the experimental hypothesis. In point of
fact, however, candy might have been a rela-
tively unrewarding stimulus under some cir-
cumstances (e.g., right after a prior class-
room snack or lunch). To insure greater rele-
vance to the hypothesis in question, one would
want to (a) investigate the children’s rein-
forcement history and/or involve them in the
selection of reinforcing stimuli; (b) opera-
tionally specify how, when, and by whom
these were presented; and (c) assess their
effects by evaluating associated performance
changes.

This digression on the assessment of inde-
pendent variables is an important one and
could be illustrated in a wide range of situa-
tions. Its moral would seem to be don’t take
your independent variable for granted. As a
researcher, you may have gone to great pains
to train therapists and/or otherwise prepare
for the administration of an independent vari-
able. These efforts may facilitate—but they
do not insure—the standard and consistent
implementation of that variable in the experi-
ment proper. Logically, it is always safest to
double check and monitor the actual carrying
out of experimental procedures. As the annals
TAR will attest, too many conclusions 

4 ee on the presumed infallibility 
m6), ay 3 ee Cirt 1976; Mahoney, 
Be. iti not | ack integrity or com- 
A exhibit variability in the adminis- 

me even the most simple experiment, 
the ee may figure prominently 

ica snes Hake s data. 

Peden: hae ion and assessment of 
incen of journal are certainly a common 
ade reviewers, but they almost 
hich a Rint ison to the frequency with 
propriate See is criticized for lacking 
Mpossible to ae conditions. Just as it is 
ent, it is theorcfieal PERE 
r all possible inf cally impossible to control 
bles of interest. uences other than the vari- 
ample BRE In our hypothetical study, 
Was rew AT include a control group 
tmpt to eae noncontingently as an at- 
the subjects? To for such factors as mood, 
$0 on, Su oo of the experimenter, 
tt it would z a group might be informative, 
tential Ahad be sufficient in isolating 
f Potential R ae Given the infinite array 

variables in most experiments, one
must usually decide to control for the “most 
likely” influences extraneous to the experi- 
mental variable(s). This obviously shades the 
decision of which variables are most likely to 
need controlling. The researcher is aided (or 
perhaps abetted) here by some historical con- 
sensus on the most likely “artifacts” in ex- 
perimental research. At the risk of oversim- 
plification, they are: (a) maturation (cf. 
Campbell & Stanley, 1963), (b) subject ex- 
pectancy (cf. Wilkins, 1973), (c) experimen- 
ter bias (cf. Barber, 1976; Orne, 1962; 
Rosenthal, 1966; Rosenthal & Rosnow, 1969), 
(d) assessment (i.e., the effects of measure- 
ment alone), (e) participation (i.e., the effects 
of involvement in an experiment), and (f)
differential attrition (i.e., selective loss of sub- 
jects from some experimental conditions). 
The history of methodological refinements 
in the behavioral sciences is basically a his- 
tory of attempts to control for such variables 
as these. The 12 most common experimental 
designs are illustrated in Table 1, which draws 
heavily on the early conceptualization of 
Campbell and Stanley (1963). In the table, 
a O refers to an observation or assessment 
and an X refers to some form of intervention 
or treatment. Reflecting some of my own 
biases, the designs outlined in Table 1 are 
listed in a crude progression of validity (i.e.,
with the later designs exhibiting greater in-
ternal and external validity than the earlier


rant—are among the least popular in the field. 
It is also noteworthy that two of the most
crucial threats to internal validity are among 
the most neglected in behavioral science re- 
search, that is, subject expectancies and ex- 
perimenter bias. Literature ranging from the 
“placebo effect” to psychotherapy outcome 
documents the potential influence of subject 
expectancies in an experiment (cf. Badia, 
Haber, & Runyon, 1970; Kazdin & Wilcoxon, 
1976; Shapiro, 1971). If a person participates 
in an experiment, his or her behavior is likely 
to be different than that of a nonparticipat-
ing control subject. One might say “of course
—that just shows that an effect has occurred.”
Unfortunately, many studies suggest that
simple participation in any experience that is
presented as “treatment” may produce thera-
peutic effects in some individuals (Shapiro,
1971). This is particularly the case when (a)
the subject believes that he or she is receiving
a treatment (or experimental) variable and
(b) the subject harbors positive expectancies
about the probable effects of the variable in
question. Humans tend to be active gener-
ators of hypotheses, and they may often try
to guess an experimenter’s hypotheses as well
as their group assignment. They are then
more likely to be influenced by their own ex-
pectancies and what they perceive as the de-
mands of the situation (cf. Orne, 1962).

Table 1
The 12 Most Common Experimental Designs

Within subject

1. Posttest only
   Symbol: XO
   Description: A person or group experiences a manipulation (X) (e.g., therapy), and the dependent variable is then measured (O).
   Comments: Extremely weak and uninformative design; no strong conclusions can be drawn.

2. Pretest-posttest
   Symbol: OXO
   Description: The dependent variable is measured (O) before and after the experimental manipulation (X).
   Comments: Weak design; it may be concluded that there was (or was not) a change in the dependent variable, but one cannot determine whether this change would have occurred anyway (without the experimental manipulation).

3. Reversal
   Symbol: OXOXO
   Description: Two separate manipulations of the independent variable are each preceded and followed by measurement of the dependent variable.
   Comments: More adequate design in that it can replicate the observed effect of an experimental manipulation; conclusions are limited to the subject or group in question, however, and this design does not rule out the possible influence of factors other than the independent variable; reversal may pose practical and ethical problems in some situations.

4. Equivalent time samples
   Symbol: OXOXOXOXOX
   Description: An extension of the reversal design in which an independent variable is sequentially presented and removed (or otherwise manipulated) with alternating measurements of the dependent variable.
   Comments: Moderately adequate design in the sense of multiple replication and possible control of some time-related factors; limitations include the possibility that the effects of a manipulation may change due to its repeated presentation and withdrawal.

5. Multiple baseline
   Symbol: OXOOO
           OOXOO
           OOOXO
   Description: The timing of an experimental manipulation is systematically varied across different behaviors or situations.
   Comments: Moderately adequate design in that it includes replication and partial control of time-related factors; limitations vary with the specific procedures.

6. Time series
   Symbol: OOOXOOO
   Description: The stability of a dependent variable is measured; deviations from that stability after an experimental manipulation are used to infer causal relationship.
   Comments: Somewhat controversial in terms of adequacy; limitations include failure to rule out factors that change simultaneously with the experimental manipulation and failure to replicate.

7. Changing criterion
   Symbol: OX1 OX2 OX3
   Description: Similar in some respects to equivalent time samples design, except that the value of the independent variable is systematically altered; if changes in the dependent variable reliably covary with these manipulations, a causal relationship is inferred.
   Comments: Moderately adequate design that shares many of the strengths and weaknesses of the equivalent time samples design; in addition, this may be problematic with some patterns of criterion change.

Between subject

8. Multiple baseline
   Symbol: OXOOO
           OOXOO
           OOOXO
   Description: The timing of an experimental manipulation is systematically varied across persons or groups.
   Comments: See multiple baseline design within subjects (5).

9. Control group
   Symbol: OXO
           O O
   Description: Subjects are randomly assigned to an experimental or control condition (usually groups); the independent variable is manipulated only in the experimental condition, but the dependent variable is measured (pre and post) in both.
   Comments: Generally adequate design whose limitations include failure to evaluate the effects of being observed or participation in any experiment.

10. Solomon four group
   Symbol: OXO
           O O
            XO
             O
   Description: Subjects are randomly assigned to four conditions (usually groups), two of which will receive the experimental manipulation—one of these is tested (i.e., the dependent variable is measured) before and after the manipulation; the other is tested only after; in the two control conditions, one is tested pre and post, the other only post.
   Comments: Very adequate design in that it controls for the effects of testing; limitations include failure to evaluate the contribution of participating in any experiment.

11. Attention and control group
   Symbol: OXO
           O O
            XO
             O
           OYO
            YO
   Description: Subjects are randomly assigned to six conditions (usually groups), four of which are identical to those in the Solomon design; the two remaining receive some form of contact or attention designed to control for simple participation in an experiment.
   Comments: Very adequate design; major limitation is failure to control for subject expectancies.

12. Placebo and control group
   Symbol: OXO
           O O
            XO
             O
           OZO
            ZO
   Description: Identical to Design 11 except that the two added conditions receive an experimental manipulation that has an equal degree of credibility or probable effects as the true experimental manipulation; the placebo’s probable effects are evaluated by subjects.
   Comments: Very adequate design; the placebo condition simultaneously controls for participation and expectancy.

Particularly important in research on the 
outcome of a psychological treatment are con- 
trols for participation and subject expectancy. 
This is most efficiently performed by the pla- 
cebo and control group design (Design 12 in 
Table 1). It should be noted, however, that 
a “placebo” (i.e., inert) treatment only con- 
trols for subject expectancies to the extent 
that it is perceived (by the subject) as being 
equally credible and powerful as the experi- 
mental treatment(s). One cannot presume to 
determine these factors without involving the 
subjects themselves. That is, one cannot dic- 
tate the credibility of a placebo or presume 
that subjects will expect to benefit as much 
from it as from one’s experimental treatment. 
These are empirical issues that may require 
pilot studies and that certainly merit assess- 
ment in each application of the placebo. 
None of the designs outlined in Table 1 
specify controls for experimenter bias—the 
artifacts that may be introduced in an experi- 
ment (consciously or otherwise) by the in- 
vestigator. Although experimenter effects are 
seldom granted much concern in clinical re- 
search, the available evidence suggests that 
this complacency may be quite costly (cf. 
Barber, 1976; Mahoney, 1976; Rosenthal, 
1966; Rosenthal & Rosnow, 1969). It is not 
an insult to the integrity or competency of a 
researcher to acknowledge that he or she may 
exhibit subtly human patterns of fallibility 
that may threaten the internal validity of an 
experiment. In some instances, the experi- 
menter may bias things against his or her hy- 
pothesis, but more often than not the prejudice 
seems to favor rather than challenge the hy- 
pothesis in question. It is partly for this rea- 
son that independent replication is so im- 
portant. On the other hand, we now know 
that scientists can independently replicate a
spurious phenomenon dozens of times if they
share the same expectancies (cf. Mahoney,
1976). For this reason, it is important to work
toward maximizing the objectivity of
the investigator(s) by such means as using ex-
perimenters who are “blind” (uninformed) 
about the hypotheses and/or subject condi- 
tions. Multiple raters or persons with contrary 
biases may also help to assess (and hopefully 
minimize) experimenter bias. No single 
method is ever completely effective in eliminat- 
ing this factor, and an inability to control for 
experimenter bias does not mean that a study 
should not be conducted. It does, however, re- 
strict one’s logical warrant for confident data 
interpretations. 
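
One way of keeping experimenters “blind” to subject conditions can be sketched as follows. This Python fragment is an assumption-laden illustration, not a procedure described in the sources cited above: the function names, the condition labels, and the opaque code format are hypothetical. A third party generates a key that maps each true condition to a neutral code, and the experimenter works only from a roster that shows the codes.

```python
import random


def make_blind_codes(conditions, seed=None):
    """Map each true condition to an opaque code in random order."""
    rng = random.Random(seed)
    codes = [f"condition-{i + 1}" for i in range(len(conditions))]
    rng.shuffle(codes)
    return dict(zip(conditions, codes))


def blind_roster(assignment, code_key):
    """Replace each subject's true condition with its opaque code."""
    return {subject: code_key[condition] for subject, condition in assignment.items()}


# Hypothetical conditions and subject assignment.
conditions = ["contingent reward", "noncontingent reward", "no reward"]
assignment = {
    "child_01": "contingent reward",
    "child_02": "no reward",
    "child_03": "noncontingent reward",
    "child_04": "contingent reward",
}

code_key = make_blind_codes(conditions, seed=7)  # held by a third party
roster = blind_roster(assignment, code_key)      # given to the experimenter
```

Coding of this kind conceals only the labels. If the procedures themselves reveal the condition (as a visible reward would), the experimenter is no longer uninformed, and the limits on interpretation noted above still apply.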


Results 


The results section of a manuscript may 
play an important role in its fate—particularly 
if the results run contrary to popular assump- 
tions or the biases of the reviewers (Mahoney, 
1977). The psychology of science is not, how- 
ever, our current concern. More relevant are 
the factors that bear on manuscript evaluation 
and become most salient in the Results sec- 
tion. These include (a) choice of dependent 
variables; (b) specification and assessment of 
dependent variables; (c) data presentation; 
(d) descriptive statistics; (e) inferential sta- 
tistics; and (f) in treatment studies, mainte- 
nance and/or generalization of therapeutic ef-
fects.

In most cases, the choice of dependent vari- 
ables may seem straightforward and uncon- 
troversial. When concern is expressed about 
the outcome measures in an experiment, it 
most often bears on (a) the relevance of the 
variable to the hypothesis, (b) the failure to 
assess other relevant variable(s), and (c) the
assessment methods used. In our hypothetical
manuscript on intrinsic interest, for example,
subjects’ actual performance rates might ap- 
pear to be a straightforward outcome measure. 
If the task involved jigsaw puzzles, however, 
one must decide whether to assess time spent 
in the activity, number of units completed, the 
“vigor” of responding, and so on. Likewise, 
a more comprehensive outcome assessment 
might be facilitated by asking subjects to rate 
their interest in the experimental activity. 
Assessment methods constitute an ambitious 
topic in and of themselves, and they cannot be 
adequately examined here. At the risk of 
seeming perfunctory, I shall offer a handful of 
[...] in this regard. First, one is well 
advised to use multimodal assessment, that 
is, a variety of dependent variables and assessment 
methods that may lend consensual [...] 
to one’s observations. Direct observations 
of behavior, physiological measures, subjective 
self-reports, unobtrusive measures, and 
standardized psychometric instruments may 
contribute to one’s confidence regarding 
the [...] and magnitude of an experimental 
effect. Wherever possible, the assessment instruments 
should be chosen from among those 
that are professionally recognized as having 
demonstrated acceptable reliability (consistency) 
and relevance to the outcome in question. 
When direct behavioral observations are 
used, response categories should be clearly 
defined, and independent interrater reliabilities 
should be reported. More extensive discussion 
of these points and some of the more popular 
instruments are available elsewhere (cf. Cone 
& Hawkins, 1977; Cronbach, 1970; Hersen & 
Bellack, 1976; Meehl, 1973; Mischel, 1968; 
Webb, Campbell, Schwartz, & Sechrest, 1966). 
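
One common index of interrater reliability for categorical response codes is Cohen's kappa, which corrects raw agreement for the agreement expected by chance. The following is a minimal sketch in Python; the function name and the example ratings are hypothetical illustrations, not data or procedures taken from the article.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters' categorical codes."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must code the same, non-empty set of observations")
    n = len(rater_a)
    # Proportion of observations on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1:
        raise ValueError("kappa is undefined when both raters use a single category")
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    # Hypothetical codes for four observation intervals.
    a = ["on-task", "on-task", "off-task", "off-task"]
    b = ["on-task", "off-task", "off-task", "off-task"]
    print(cohens_kappa(a, b))
```

Raw percentage agreement alone can look high when one category dominates, which is why a chance-corrected index is often reported alongside it.
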

Data presentation is another topic that 
could consume a volume in and of itself. Although 
it is usually impossible to publish one’s 
data in their entirety, it is important that 
they be summarized in an objective and comprehensive 
fashion. It is also a scientist’s obligation 
to make his or her raw data available 
to colleagues (cf. Wolin, 1962). The goal 
of data presentation is to be concise without 
sacrificing content. This can usually be accomplished 
by a judicious construction of 
[...] and tables. The tables or text should 
include relevant descriptive statistics that report 
more than one measure of central tendency 
and indicate the range and variance of 
the data in question. 

[...] inferential statistics is 
a controversial point, and many manuscripts 
encounter problems in this area. Some 
of the most common concerns here include 

1. [...] choice of a statistical test 
(parametric or nonparametric, variance or 
covariance analysis, etc.); 

2. [...] of the test to the appropriate 
variables (e.g., posttest scores versus 
[...] scores); 

3. [...] ([...], 1967) against probability pyramiding 
through use of conservative 
significance levels or appropriate post hoc 
comparisons; 

4. failure to attend to issues of power in se- 
lecting sample size and significance level; 

5. failure to consider the possible role of sta- 
tistical regression in the obtained results; and 

6. failure to distinguish between statistical 
significance and clinical significance. 

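
The concern about probability pyramiding can be illustrated numerically. The sketch below assumes that the k significance tests are independent, which real analyses often are not, and it uses the Bonferroni adjustment as one familiar example of a conservative significance level; the function names are hypothetical.

```python
def familywise_error_rate(alpha: float, k: int) -> float:
    """Chance of at least one false positive across k independent tests at level alpha."""
    return 1.0 - (1.0 - alpha) ** k


def bonferroni_alpha(alpha: float, k: int) -> float:
    """Per-test level that keeps the familywise rate at or below alpha."""
    return alpha / k


if __name__ == "__main__":
    for k in (1, 5, 10, 20):
        unadjusted = familywise_error_rate(0.05, k)
        adjusted = familywise_error_rate(bonferroni_alpha(0.05, k), k)
        print(f"{k:>2} tests: unadjusted {unadjusted:.3f}, adjusted {adjusted:.3f}")
```

Under these assumptions, ten tests at the .05 level carry roughly a 40% chance of at least one spurious "significant" result.
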
The amount of emphasis placed on statistical 
inference is substantial, and there is likely to 
be vigorous debate regarding Weimer’s (1977) 
recent demonstration that all contemporary 
inferential statistics are illogically founded. 
The logic of his proof is straightforward and 
may raise the ire of many a science apprentice 
who has been forced to develop traditional sta- 
tistical proficiencies (Mahoney, 1976). I shall 
avoid the temptation to expand on that proof 
and instead suggest that we might do well to 
imitate the physical sciences in their rejection 
of null hypothesis testing as an automatic and 
rational arbiter of experimental results (cf. 
Greenwald, 1975; Meehl, 1967). Reports of 
percentage of outcome variance attributable to 
the independent variable(s) might offer a 
more rational and constructive alternative to 
our present inferential statistics. One should 
bear in mind, however, that this is still a 
minority view and that the vast majority of 
psychologists—and journal referees—hold few 
things more sacred than significance levels. 
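
As an illustration of reporting the percentage of outcome variance attributable to an independent variable, the following sketch computes eta squared (between-groups sum of squares divided by total sum of squares) for a one-way design. This is only one of several possible variance-accounted-for indices, and the function name and scores are hypothetical.

```python
from statistics import mean


def eta_squared(groups):
    """Proportion of total outcome variance associated with group membership."""
    scores = [x for g in groups for x in g]
    grand_mean = mean(scores)
    ss_total = sum((x - grand_mean) ** 2 for x in scores)
    if ss_total == 0:
        raise ValueError("all scores are identical; the proportion is undefined")
    # Each group's squared deviation from the grand mean, weighted by group size.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    return ss_between / ss_total


if __name__ == "__main__":
    control = [1, 2, 3]      # hypothetical outcome scores
    treatment = [4, 5, 6]
    print(f"{100 * eta_squared([control, treatment]):.1f}% of variance")
```

Unlike a significance level, this index does not grow merely because the sample is made larger, which is part of its appeal as a descriptive summary.
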

In clinical outcome evaluations, the gen- 
eralization and maintenance of a therapeutic 
effect often constitute important factors in the 
professional response to a manuscript. Mainte- 
nance is a particularly salient concern, perhaps 
because it has been such an elusive feature in 
our efforts to facilitate human adjustment. 
Short-term therapeutic effects are not difficult 
to produce in some behavior disorders. Short- 
term improvements that can be confidently 
attributed to the treatment in question are 
much less common; enduring and attributable 
therapeutic improvements are a rarity indeed. 
It is, I think, a promising sign that contempo- 
rary journals are now demanding more refined 
methodologies and demonstrations of long- 
term maintenance. 


Discussion 


How one’s data are interpreted depends, in 
part, on how they have been analyzed and 
whether their implications are obvious. In the 
case of the latter, one can occasionally find 
consolation in their having passed the test of 
“interocular trauma,” that is, when the effect 
or outcome is so salient that it “hits one be- 
tween the eyes” (Weimer, 1977). In the ab- 
sence of such a dramatic event, one must draw 
conclusions about (a) whether an effect was 
observed (e.g., therapeutic improvement), (b) 
whether an effect (if present) can be at- 
tributed to the independent variables, (c) 
how the present findings bear on the hypothe- 
sis or theory in question, and (d) how widely 
the present findings can be generalized. These 
conclusions are obviously related to the issues 
of internal, external, and theoretical validity. 
Most researchers are well versed in the de- 
signs that will allow them to confidently as- 
sert that an experimental effect was observed 
(Table 1). There is much more room for in- 
terpretive license (and error) when it comes 
to asserting that the independent variables 
were responsible for such an effect. It is here 
that adequate controls and protections of in- 
ternal validity are most relevant. In our hypo- 
thetical study on intrinsic interest, for ex- 
ample, one could conclude very little from a 
control group design (see Design 9, Table 1) 
due to the plethora of competing variables 
(e.g., access to the target task, boredom, 
etc.). Errors or excessive license in causal at- 
tribution are a common concern of journal re- 
viewers, as are overgeneralizations (cf. Maher, 
1978). Interestingly, another form of logical 
error that is seldom noted may be one of the 
[...] When the [...] 
premise warrants a true conclusion. Consider, 
for example, the hypothesis that “if theory A 
is true, then observation X will occur.” Logi- 
cally, the confirmation of this hypothesis re- 
quires that one be told that the premise is 
true, after which one could rationally infer 
that the conclusion is also true (i.e., that ob- 
servation X will, indeed, occur). This is ob- 
viously of little use to the scientist, since the 
truth value of the premise is most often the 
point at issue. Philosopher Karl Popper has 
noted this for years, and yet scientists persist 
in their claims of confirmation (cf. Popper, 
1959, 1963, 1972). 

When a researcher reports findings that are 
consistent with his or her hypothesis and then 
argues that these data “support” the hypothe- 
sis, a serious logical error has been committed. 
Technically, the error is called affirming the 
consequent. It consists of making the illogical 
inference that a true conclusion implies a true 
premise. In my earlier illustration (in which 
theory A predicted observation X), the oc- 
currence of observation X has absolutely no 
logical bearing on the truth value of the prem- 
ise (theory A). The observation of not X 
would, however, bear clear logical implica- 
tions for the theory. As we all learned in 
elementary algebra, it is only false conclu- 
sions that bear conclusively on a premise 
(i.e., they can falsify it). This was, in fact, 
the cornerstone of Popper’s philosophy of 
falsificationism. Counterintuitive as it may 
seem, negative results and predictive failures 
have far-reaching logical implications, whereas 
positive results (successful predictions) have 
comparatively little information content. This 
is, of course, contrary to the popular practice 
of selectively publishing positive results manu- 
scripts and emphasizing these successes more 
heavily in literature reviews (Meehl, 1967). 
The epistemological costs of this practice are 
themselves quite distressing (cf. Mahoney, 
1976; Smart, 1964). 
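
The asymmetry between confirmation and falsification described above can be checked mechanically with a truth table. The sketch below treats "theory A" and "observation X" as simple true/false propositions, which is a deliberate simplification, and the helper names are hypothetical. An argument form counts as valid only if its conclusion holds in every case where all of its premises hold.

```python
from itertools import product


def implies(p, q):
    """Material conditional: 'if p then q'."""
    return (not p) or q


def is_valid(premises, conclusion):
    """True if the conclusion holds in every truth assignment satisfying all premises."""
    return all(
        conclusion(a, x)
        for a, x in product([True, False], repeat=2)
        if all(premise(a, x) for premise in premises)
    )


def prediction(a, x):
    """If theory A is true, then observation X will occur."""
    return implies(a, x)


# Affirming the consequent: A implies X; X is observed; therefore A.
affirming_the_consequent = is_valid([prediction, lambda a, x: x], lambda a, x: a)

# Falsification (modus tollens): A implies X; X is not observed; therefore not A.
falsification = is_valid([prediction, lambda a, x: not x], lambda a, x: not a)

if __name__ == "__main__":
    print("affirming the consequent valid:", affirming_the_consequent)
    print("falsification valid:", falsification)
```

The first form fails because the case "A false, X true" satisfies both premises while contradicting the conclusion; no such counterexample exists for the second form.
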

So what does one do with “positive results,” 
that is, an experimental outcome that is con- 
sistent with the hypothesis in question? How 
are these to be interpreted? In a word, cau- 
tiously. One can legitimately note their con- 
sistency with the hypothesis or state that they 
corroborate the hypothesis. Loosely speaking, 
corroboration refers to the act of having sur- 
vived falsification (Popper, 1972). To say 
that a hypothesis has been “corroborated” is 
to state that it has been tested and has 
(tentatively) survived. This is very different, 
of course, from implying that it has been 
strengthened. No matter how many times a 
hypothesis survives falsification, it can never 
be legitimately said to have been “confirmed” 
or “supported.” As corroborative instances accumulate, 
most of us experience a psychological 
increase in our confidence regarding the 
hypothesis in question. This subjective phenomenon 
should not, however, be confused 
with the logical warrant of the situation. As 
much as we scientists may like to view ourselves 
as rational creatures, there is a clear 
and significant difference between our rare use 
of logic and more common use of psycho-logic. 


Concluding Remarks 


It has not been my intent to overemphasize 
the shortcomings of science or to criticize our 
attempts to use experimental methods in the 
[...] of therapeutic interaction. We must 
recognize, however, that all of our scientific efforts 
fall along a continuum of fallibility. 
There is no investigation that can be totally 
[...] in its potential informativeness, nor 
will there ever be one that is perfect in its 
[...] of internal, external, and theoretical 
validity. Our goals, then, should be to 
[...] toward conducting the least fallible investigations, 
to cautiously interpret our experiments 
in accordance with their logical warrant, and 
to guard against the paralysis of complacency 
regarding the adequacy of current research 
[...] refinements in both our methods as 
well as our technical knowledge. As illustrated 
in the cases of inferential statistics and confirmation 
claims, these explorations may require 
some psychologically unsettling changes 
from old “accepted” procedures to newer 
ones. As such, they may require a reexamination 
of long-revered assumptions about 
the [...] way to do scientific inquiry. In the 
long run, however, the prospect of progress 
should hopefully overshadow the inconveniences 
they demand. We can, after all, 
hardly expect to “grow” if we are unwilling 
[...] 


Investigation of treatment outcome has been 
and continues to be a major research topic in 
psychotherapy and behavior therapy (e.g., 
Bergin, 1971; Eysenck, 1952; Kazdin & Wilson, 
[...]; Luborsky, Singer, & Luborsky, 
[...]; Meltzoff & Kornreich, 1970; Paul, 
[...]; Rachman, 1971; Sloane, Staples, Cristol, 
Yorkston, & Whipple, 1975). The goals 
of therapy outcome research usually are to determine 
the efficacy of a given treatment, to 
evaluate the relative effectiveness of different 


Evaluating the Generality of Findings 
in Analogue Therapy Research 


Alan E. Kazdin 


Pennsylvania State University 


Analogue research in psychotherapy and behavior therapy has proliferated in 
recent years. The value of analogue research is that it allows analytic and well- 
controlled research to address questions that often are prohibitive or impractical 
to evaluate in clinical situations. The major source of controversy about the 
value of analogue studies is their external validity, that is, the extent to which 
the results can be generalized to the clinical situation. Much of the controversy 
stems from conceptualizing therapy investigations discretely as either analogue 
or clinical research. However, all treatment research is an analogue of the situa- 
tion to which an investigator wishes to generalize. Thus, the main question is 
the extent to which an investigation is an analogue of the clinical situation. An 
investigation can vary from the clinical situation along several dimensions, such 
as the target problem, the clients studied, the manner of client recruitment, and 
others. It is often assumed that the greater the similarity of an intervention to 
the clinical situation, the more likely the findings will be generalizable to the 
clinical situation. The present article questions this assumption and suggests an 
empirical method for evaluating the generality of therapy research to the clinical 
situation. 


treatments, and to assess the components of 
treatment that are responsible for change for 
a particular treatment problem and client 
population. The most direct means of achiev- 
ing these goals is to study the therapy tech- 
niques and populations of interest directly in 
a clinical setting. The requirements for con- 
ducting clinical research appear relatively 
straightforward. Clients who seek treatment 
for a given problem at a treatment facility 
can be assigned to different treatment or con- 
trol groups, depending on the research ques- 
tions and desiderata of the experimental de- 
sign. Treatment can be administered by 
trained therapists and evaluated by assess- 
ment of therapeutic change on multiple mea- 
sures of the clients’ problems. 

Although the basic requirements for out- 
come research might be highlighted in a 
straightforward fashion, meeting these re- 
quirements in practice presents nearly insur- 
mountable obstacles for the researcher. Be- 
cause of the diverse practical and ethical ob- 
stacles of clinical research and the complexity 
of treatment, a great deal of research has been 
conducted in situations analogous to those 
available in the clinic. Research that evalu- 
ates treatment under conditions that only re- 
semble or approximate the clinical situation 
has been referred to as “analogue research.” 
An analogue study usually focuses on a care- 
fully defined research question under well-con- 
trolled conditions. The purpose of this in- 
vestigation is to illuminate a particular process 
or to study an intervention that may be of 
importance in actual treatment. 

An important issue in the psychotherapy 
and behavior therapy literature is the extent 
to which analogue studies contribute to under- 
standing therapeutic processes and outcome 
in clinical settings. It is generally acknowledged 
[...] lingering concern about the 
[...] provided presents several obstacles. The obstacles 
encompass practical difficulties of undertak- 
ing and completing clinical research, as well as 
ethical considerations that militate against ad- 
dressing select research questions. The practi- 
cal problems include diverse aspects of the in- 
vestigation. Initially, in most treatment stud- 
ies, it is difficult to obtain a sufficient number 
of clients with the same or similar problems 
to meet the requirements for an experiment. 
Similarity among the clients’ problems is re- 
quired so that measures of their personality 
or behavior can be compared along the same 
response dimension (e.g., anxiety, depression, 
or another dimension) to evaluate the effects 
of treatment. 

Even if clients with similar problems can be 
included in a given study, often there are con- 
straints in assigning clients to specific forms 
of treatment. For example, in inpatient treat- 
ment, administrative demands sometimes dic- 
tate that the more severe patients be given 
the treatment that is more likely to alter be- 
havior than is some other treatment. In out- 
patient treatment, clients are not easily as- 
signed to different variations of therapy, be- 
cause they may seek and express preference 
for specific treatments. Thus, randomization 
of assignment, an essential ingredient in most 
experimentation, may not be feasible. 

Another problem in working with clinical 
populations is that it may be difficult to con- 
trol or remove the influence of competing fac- 
tors that can affect therapy results. For ex- 
ample, in outpatient psychotherapy, clients 
may receive additional experiences (e.g., en- 
counter group or marathon sessions) or coun- 
sel (e.g., from a physician) while participating 
in a given treatment. In inpatient treatment 
with psychiatric patients, it may be difficult to 
control ongoing treatments (e.g., drugs) and 
other factors (e.g., release from the hospital) 
that can influence treatment or its evaluation. 

Aside from obstacles in selecting clients, it 
is often difficult to obtain therapists who will 
engage in treatment and meet the demands of 
research. Initially, there is little incentive for 
most practicing clinicians to participate in [...] 
experience. Finally, most experienced therapists 
are interested in conducting treatment in 
their own way rather than following the 
regimented or standardized procedures 
that may be needed to evaluate a specific 
[...] 

Ethical considerations also make clinical investigation 
of certain questions very difficult 
([...], 1973). Many important questions 
about therapy require control groups that 
withhold specific aspects of treatment that 
may be crucial for behavior change. To subject 
clients to “control” conditions that have 
a low probability of effecting behavior change 
may violate the professional commitment to 
treatment. Also, the possible ineffectiveness of 
the [...] procedure may lead to relatively 
high levels of attrition. The problem of assigning 
clients to control conditions also is evident 
in cases in which the effects of therapy 
are evaluated over a protracted period against 
a group that never receives treatment. A no-treatment 
or waiting-list control group that 
may never receive treatment or that receives 
treatment after a long delay (after they have 
served their purpose as no-treatment subjects) 
[...] is difficult to implement for clients whose 
[...] are in a crisis state when they seek 
treatment. 

Because of the practical and ethical problems 
highlighted here, therapy research on 
patient populations with trained therapists in 
clinical settings is relatively rare. Much of 
what is known about therapeutic processes 
and behavior change is learned from analogue 
studies. The analogue studies are investigations 
of circumscribed therapeutic processes 
or problems in a well-controlled laboratory 
[...]. There are many advantages of laboratory 
investigations of clinical phenomena 
(Bernstein & Paul, 1971; Levis, 1970; Paul, 
[...]). As a general statement, laboratory investigations 
of therapy allow the investigator 
to control the conditions of experimentation to 
a [...] greater extent than do clinical investigations. 
This careful control allows the 
investigator to minimize sources of variance 
that might obscure an effect of treatment in 
situations in which several parameters are free 
to vary. Thus, the subjects who receive treatment 
can be selected because of their similarity 
in the type of target problem, the severity 
of the problem, and subject and demographic 
variables that might contribute to the variability 
of treatment effects. Aspects of treatment 
can be carefully controlled in analogue 
research that otherwise would vary consid- 
erably in treatment. For example, therapists 
can be selected because of their homogeneity 
with respect to experience, age, training, and 
other factors. Also, the number of sessions, 
duration of treatment, and specific tasks in 
each treatment session can be held constant 
across subjects and groups. All of these char- 
acteristics of laboratory investigations will 
minimize variability among subjects and ther- 
apists and increase the power of the test of a 
given intervention. 
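
The link between reduced variability and increased power can be made concrete with a rough calculation. The sketch below uses a normal approximation for a two-sided, two-group comparison with a known common standard deviation; the function name, the treatment effect, the standard deviations, and the group size are all hypothetical values chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(delta, sigma, n_per_group, alpha=0.05):
    """Approximate power of a two-sided z test comparing two group means."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    # Expected mean difference expressed in standard-error units.
    shift = delta / (sigma * sqrt(2 / n_per_group))
    # Probability that the test statistic falls in either rejection region.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


if __name__ == "__main__":
    # Same hypothetical treatment effect (5 points) and group size (20),
    # with heterogeneous versus homogeneous subjects and therapists.
    for sigma in (10, 5):
        print(f"sigma = {sigma}: power is about {approx_power(5, sigma, 20):.2f}")
```

Under these assumptions, halving the standard deviation while holding the effect and sample size fixed raises power substantially, which is the sense in which homogeneous subjects, therapists, and procedures sharpen the test of an intervention.
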

Laboratory investigations also allow the use 
of different control groups that might not 
otherwise be available. In laboratory research, 
specific ingredients of treatment, or 
treatment itself, can be more readily withheld 
than in the clinical situation. Freedom in pro- 
viding or withholding specific ingredients of 
treatment greatly expands the range of ques- 
tions that can be asked about therapy. This 
perhaps is the greatest advantage of labora- 
tory investigations of therapy. 

The procedures used in laboratory investi- 
gations probably are more easily replicated 
than are those used in clinical investigations. 
In laboratory research, investigators can more 
carefully control and specify the parameters 
of treatment administration than in clinical 
situations. Subsequent investigators who wish 
to replicate the original laboratory study may 
have relatively explicit guidelines to follow 
because many parameters of [...] 
well specified originally. In the clinical situation, 
all sorts of conditions must be allowed 
to vary and may not be easily repeated in 
[...] 

[...] control that laboratory 
research affords relative to clinical research is 
[...] the priorities of these different 
methods. [...] 
made to complete the requirements of the de- 
sign. For example, the treatment can be stan- 
dardized across clients rather than take into 
account individual differences with respect to 
the problem. In contrast, in clinical research 
with clients who have sought treatment, the 
higher priority is their improvement. Rigors 
of the laboratory may have to be sacrificed 
to meet the priority of treatment. For ex- 
ample, the therapeutic material covered during 
the therapy session and duration of treatment 
may be allowed to vary widely so that clients 
receive sufficient treatment to effect the de- 
sired change. 

To meet the rigors of experimentation, 
analogue research can be conducted in cir- 
cumstances that resemble the clinical situa- 
tions in varying degrees. The use of analogue 
studies has been widely discussed (e.g., Bern- 
stein & Paul, 1971; Cooper, Furst, & Bridger, 
1969; Cowen, 1961; Goldstein & Dean, 1966; 
Heller, 1963, 1971; Levis, 1970). These dis- 
cussions have addressed the extent to which 
analogue studies are useful in understanding 
treatment in the actual clinical situation, the 
limitations of analogue research, and areas in 
which particular care may need to be taken 
in substituting experimental rigor for clinical 
relevance. 


Conceptualization of Analogue Research 


As usually discussed in the clinical litera- 
ture, therapy investigations are looked upon 
as analogue or nonanalogue research. Yet, this 
may not be the most fruitful way to view 
clinically relevant research. Dichotomizing 
clinical research in this fashion may obscure 
interpretation of research findings in different 
ways. First, the distinction tends to overlook 
the inherent limitations and the “analogue” 
nature of all clinical research including those 
studies conducted in clinical settings with pa- 
tient populations. Second, and more important 
for present purposes, the categorization does 
not provide clear guidelines to distinguish 
among analogue studies. Presumably, gener- 
ality of the results from all analogue studies 
to the clinical situation is not equal and partially 
depends on specific characteristics of 
the individual [...]. 

Virtually all psychological experimentation 
with human subjects is analogue research in- 
sofar as it constructs a situation in which a 
particular phenomenon can be studied. The 
phenomenon is selected as an approximation 
of the phenomenon in the nonexperimental 
situation. The experimental version of the 
phenomenon may resemble the naturally oc- 
curring phenomenon in varying degrees, but 
in an important sense it is only an analogue.1 
It is assumed, and often supported, that the 
versions of the phenomenon in the experimen- 
tal situation and “real world” are in important 
ways the same and come under identical prin- 
ciples and laws. 

In human experimentation, the differences 
between experimental and nonexperimental 
situations may lead to important differences in 
subject behavior. These differences can readily 
alter the phenomenon that is being studied 
and the generality from the experimental to 
nonexperimental setting. For example, experi- 
ments with humans place subjects in a special 
situation and provide some form of interven- 
tion. Subjects may complete specific response 
measures before and after the intervention to 
assess change in particular responses. How- 
ever, participating in an experiment or a con- 
trived arrangement that is not normally en- 
countered in everyday life makes the experi- 
ment an analogue of the situation to which 
one may wish to generalize the results. In- 
deed, such phenomena as demand character- 
istics of the experimental situation (Orne, 
1969), subject roles (Weber & Cook, 1972), 
and pretest sensitization (Lana, 1969) are ex- 
amples of the influences of experimental ar- 
rangements on behavior. 

Even when the experimental situation is 
not contrived and does not differ from the 
situation in which the client normally behaves, 
other features of human experimentation con- 
tribute to its analogue nature. In an experi- 
ment, psychological measures (e.g., self-report 
inventories, behavioral checklists, physiologi- 
cal responsiveness) are used to assess the rela- 
tionship between an intervention (e.g., treat- 
ment) and behavior change. Yet, it is not the 
response on the psychological measurement 
device per se that is of interest, but rather it is 
the construct that is assumed to be repre- 
sented by the measure. For example, in clini- 
cal research in an actual treatment setting, a 
client’s anxiety might be assessed with various 
devices. It is not the client’s change on these 


1 Not all human psychological research is an ana- 
logue of the situation to which the investigator wishes 
to generalize. For example, research on the influence 
of the experimental situation itself on behavior is a 
direct test of the subject matter of interest. 


measures per se that is of direct interest, at 
least to the client. These measures are of interest 
because they may reflect or relate to 
[...] occurring in the natural environment 
under ordinary circumstances. Insofar as experimental 
research uses measures that are 
assumed to reflect or resemble the responses 
of actual interest in the natural setting, it is 
an analogue of the situation of direct interest. 

Viewing almost all psychological research 
with humans as analogues of situations to 
which one would like to generalize has important 
implications for conceptualizing treatment 
research. Initially, it is essential to keep 
in mind that investigators are interested in 
extrapolating the findings to some area, problem, 
or setting that is not studied directly. 
There is always the possibility that the extrapolation 
is not accurate in accounting for 
the natural phenomenon. The difference between 
the natural and experimental situation 
[...] along these (usually unknown) dimensions 
might lead to different findings for 
a given variable. A second implication is that 
it may not be useful to speak of analogue 
or nonanalogue research. Rather, research 
can be viewed on the basis of the extent to 
which it resembles the situation to which one 
wishes to generalize the experimental findings. 
There are several dimensions along which 
therapy research may vary from the clinical 
situation that will be discussed below. 

Experimental research usually is removed 
in some way from the situation to which one 
wishes to generalize. This by itself does not 
necessarily limit the extent to which results 
can be generalized. The results of studies 
can be widely generalized across dimensions 
that may seem more discrepant than the 
[...] laboratory and clinical investigations 
in the area of therapy. For example, 
[...] findings in the psychology of learning 
have been obtained in experiments with infrahuman 
species. Specific learning paradigms 
and principles have had generality to human 
[...] despite differences between humans 
and infrahumans and the importance of [...] 
uniquely human characteristics. Similarly, 
outside of psychology, infrahuman species 
have been used to study genetics. The 
[...] that have been uncovered have applied to 
hereditary transmission in humans, even 
though the population often studied (e.g., 
fruit fly) bears little resemblance to humans. 
As a general point, research conducted in the 
context of the laboratory under conditions 
that seem very distant from the one to which 
a researcher wishes to generalize may produce 
laws or generate theories of great relevance to 
areas that are not directly studied. Thus, the 
fact that a study does not focus on a problem 
and population of direct interest is not always 
critical. 


Dimensions for Evaluating 
Therapy Analogues 


In psychotherapy and behavior therapy re- 
search, the question of interest is not whether 
a particular study is an analogue. As noted 
earlier, experimentation even in the clinical 
rather than laboratory situation is removed 
from the actual phenomenon and situation to 
which one might wish to generalize. Thus, in 
any given therapy study, the question is to 
what extent the conditions resemble the situa- 
tion to which the investigator wishes to gen- 
eralize. This is a complex question, because 
there are a large number of dimensions along 
which studies may vary. Each study may be 
evaluated in terms of its standing on these 
dimensions to determine the degree to which 
the study approaches the clinical situation. 
Major dimensions that can be used to evalu- 
ate a study and the extent to which it is an 
analogue of the clinical situation include the 
target problem; the population; the manner 
of client recruitment; the therapists; the se- 
lection, set, and setting of treatment; the 
variation of treatment used; and the assess- 
ment procedures used. 

Target problem. A major concern about 
the value of analogue studies, particularly in 
recent years, pertains to the target problem 
(Cooper et al., 1969; Levis, 1970). The issue 
is whether subjects in a laboratory study 
evince the same kinds of problems and the 
same magnitude of severity for a given prob- 
lem as do the clients for whom treatment is 
usually provided. For example, in behavior 
therapy, a vast literature has appeared on the 
treatment of mild or subphobic levels of fear. 
Subjects typically are college students who ex- 
press some level of fear on a questionnaire and 
who refuse to approach a feared stimulus 
(e.g., harmless snake). The level of fear may 
be much less potent than one would encounter 
with a clinical population. Indeed, the per- 
centage of subjects who meet the definition of 
“fearful” in laboratory studies vastly exceeds 
the percentage of debilitating fear for the 
same problem in the general population (cf. 
Bernstein & Paul, 1971). In any case, select- 
ing mildly fearful college students raises the 
question of the generality of treatment effects 
to clinical patients whose fears (or other prob- 
lems) are more intense. Certainly, the concern 
is well taken, because it should be easier to 
alleviate mild problems than it is to alleviate 
severe problems. 
Aside from severity of a given problem, the 
specific behavior studied in therapy research 
may vary in topography and qualitative char- 
acteristics from clinical problems to which the 
results might be generalized. Some problems 
that have been frequently studied in labora- 
tory research may bear little resemblance to 
clinical problems. For example, most behavior 
therapy techniques designed to alleviate anx- 
iety have been evaluated with clients who fear 
small animals. Recent research has revealed 
that fears of small animals in college students 
typically habituate more quickly in response 
to the anxiety-provoking stimulus and are 
more influenced by suggestion than are fears 
in social situations (e.g., heterosexual anxiety, 
speech anxiety) (Borkovec, Stone, O’Brien, & 
Kaloupek, 1974; Borkovec, Wall, & Stone, 
1974; Singerman, Borkovec, & Baron, 1976). 
Results from studies focusing on behaviors 
that do not rapidly habituate and are less 
amenable to demand influences are more likely 
to be generalizable to clinical problems that 
presumably share these characteristics (Bor- 
kovec & O’Brien, 1976). 
The extent to which the target problem in 
a therapy investigation resembles clinical 
problems is a matter of degree. To obtain 
target problems that increasingly resemble the 
clinical problem, a researcher can screen for 
severity of the problem in a nonclinical population
so that the subjects included in treatment
experience some deleterious effects of
the problem in everyday life. Stringent criteria
for subject selection make generalizing
findings to a clinical population much
more plausible. For example, Lang (1968),
who conducted the earliest analogue studies of 
desensitization, indicated that only 1%-2% 
of college students will evince intense fear if 
rigidly screened on behavioral, interview, and 
self-report criteria. When treatment alters intense
fears of college students who are stringently
selected, it is not unreasonable to extend
the findings to a clinical population.

Population. The characteristics of the pop- 
ulation (subjects or clients) that receives 
treatment contribute to the degree to which 
an investigation is an analogue of the clinical 
situation. The population dimension encom- 
passes all features that might distinguish sub- 
jects in an experiment from those for whom 
the results might be generalized. For example, 
in analogue fear studies common in behavior 
therapy, subjects often are college students. 
The study is an analogue of clinical treat- 
ment in the sense that college students as a 
group may differ on several dimensions from 
clinical populations typically treated for anx- 
iety disorders in such characteristics as level 
of education, age, socioeconomic status, and 
occupation (or lack of one). The differences 
may bear directly on the effects of particular 
therapy procedures. For example, the age of
the clients and situational characteristics of 
their lives may bear on the kinds of problems 
that they bring for treatment. The influences 
also may covary with the receptivity of indi- 
viduals to professional treatment and to their 
expectancies for cure. The generality of findings
from an investigation of treatment to a
clinical situation may partially depend on the 
similarity of (nonproblem) characteristics of 
subjects to clients who usually come for 
treatment. 

Manner of recruitment. In most treatment 
settings, clients solicit a therapist on their own 
or are referred by a particular source (e.g.,
physician, relatives). The manner of obtain- 
ing clients in therapy or institutional treat- 
ment differs markedly from the manner in 
which subjects are obtained in an experimental 
investigation of treatment. In many treat- 
ment studies, college students voluntarily par- 
ticipate in treatment because of small rewards 
toward college course credit, completion of a
course option, or interest in learning about a
specific form of therapy. Presumably, the incentives
for patients who directly seek out
treatment through the usual channels differ
drastically from the incentives of college students
who may be less concerned with being
“cured” of a particular problem. The volunteer
patient and the mildly coerced college
student who is not very much interested in
the treatment or problem for which it is applied
probably represent end points on the
continuum of the extent to which a given
therapy study is an analogue of a clinical
treatment, at least on the dimension of manner
of recruiting subjects. Perhaps more toward
the middle would be subjects who are
solicited from newspaper, television, or radio
advertisements that make available treatment
to individuals in a community situation. This
form of recruitment still solicits subjects, unlike
most therapy situations. However, it is
likely to uncover clients who are interested in
treatment even though they would not have
sought treatment without the advertisement.

The manner of recruitment also relates to
the factors that keep a client in therapy. In
clinical treatment, clients usually are free to
terminate therapy and, indeed, readily do so
without penalty. In treatment research in
which attrition can be devastating to the research,
specific contingencies can be invoked
to retain subjects. For college students who
participate in experimental treatment research,
either explicit or implicit contingencies
operate, such as withholding credit toward
course completion until a project is
completed. With clients who are solicited for
treatment, refundable deposits may be collected
and returned later only if treatment
is completed. The threat of losing a
deposit significantly thwarts attrition
in treatment (Hagen, Foreyt, & Durham, 1976).
Clients in clinical treatment usually pay for therapy
and hence exert ultimate control over selection
and termination of therapy. Although
paying for therapy per se has not been found
to influence therapeutic outcome, it does covary
with other variables such as diagnosis
and socioeconomic status that do (Pope, Geller,
& Wilkinson, 1975). In any case, the
contingencies that maintain clients in
treatment may influence the generality of
findings across laboratory and clinical settings.

Therapists. In the clinical situation, an
experienced therapist usually provides treatment.
On the other hand, experienced full-
time therapists are infrequently used in most 
therapy research, although there are out- 
standing exceptions (e.g., Orlinsky & Howard, 
1975; Paul, 1966; Sloane et al., 1975). Typi- 
cally, graduate students with some interest 
or experience in clinical applications serve 
as therapists. The similarity of students to 
therapists who apply therapy techniques in 
clinical situations is a dimension that may 
determine the extent to which the results can 
be generalized from research to practice. 

The therapist dimension is complex, be- 
cause it may include multiple differences in 
the characteristics of individuals who provide 
treatment (see Meltzoff & Kornreich, 1970; 
Truax & Mitchell, 1971). Initially, student 
therapists and professional clinicians may dif- 
fer on such variables as experience, training, 
age, and other characteristics. In addition, 
they may behave differently toward their 
clients. Therapists may engage in certain 
types of conversation, hold greater expecta- 
tions for improvement, and have a stronger 
commitment to patient cure than do students
serving as therapists. The generality of treatment
effects in a therapy study to the clinical
situation may depend in part on who serves 
as therapists. 

Selection, set, and setting of treatment. In 
clinical work, clients seek a particular treat- 
ment that they believe will be effective. In- 
dependently of how well the technique in fact 
helps people, the set with which people ar- 
rive at treatment is that they will receive a 
bona fide treatment designed to alter their 
behavior. In addition, the client may have 
heard about the particular treatment and 
therapist with whom he or she will have con- 
tact. The assignment of clients to treatment 
may be selective, since individuals exert some 
choice over treatment. If treatment is not 
viewed favorably by a client after having at- 
tended some of the sessions, choice can be 
exerted by leaving and attending another 
treatment. In experimental therapy research,
clients often are assigned to treatment and
are not given a choice over the treatment or
therapist. The element of choosing or seeking
out a particular treatment may influence
the generality of results obtained in therapy
research. Indeed, therapy investigations have
indicated that being able to choose treatment 
is highly related to therapeutic outcome, in- 
dependently of the specific technique that is 
actually received (Devine & Fernald, 1973; 
Gordon, 1976). 

The client’s set about treatment, perhaps 
related to choice and selection mentioned 
above, may contribute to the generality of 
results from treatment research to clinical 
applications. The set about receiving an ef- 
fective treatment is likely to be enhanced 
by the aura of the treatment facility or office 
and by the convincing description of therapy 
provided by the therapist (Frank, 1973). 
Interestingly, the set of the client when ar- 
riving at treatment and the manner of pre- 
senting treatment may differ markedly across 
research and clinical settings. In the research 
setting, the client may not have the initial 
set that an effective treatment will be pro- 
vided. Also, the actual setting (e.g., univer- 
sity building rather than a clinic) may not 
foster the set for receiving help with one’s 
problem. In addition, instructions on the part 
of the investigator may indicate that the pro- 
cedure to be used is an “experimental” treat- 
ment. 

In recent years, ethical standards for re- 
search increasingly have stressed that investi- 
gators should explicitly convey the nature of 
the treatment, whether it is experimental in 
nature, and the possible benefits and risks. 
The procurement of informed consent in the 
context of experimental clinical research may 
alter to some extent the manner or context 
in which treatment is provided. In ordinary
therapy, patients usually are not told that 
treatment is experimental but rather are given 
a convincing rationale about its efficacy and 
illustrations of other cases. In contrast, “experimentation”
usually conveys the exploratory
nature and tentative effects of treatment.
Generalizing the results from research to the 
clinical situation depends in part on the simi- 
larity of the manner in which treatment is 
presented and the sets of the individuals who 
participate in treatment.

The dependence of treatment effects on the
manner in which treatment is presented was
demonstrated by Bootzin (Note 1) in a laboratory
study designed to alter fear
of rats in college students. Bootzin compared
live modeling (observing someone perform a 
fearless response, touching and holding the 
rat) versus live modeling plus participation 
(observing plus actually engaging in touch- 
ing and holding of the rat). For one half of 
the subjects in each of these groups, the ex- 
perience was called a “treatment” demon- 
stration; for others, the experience was la- 
beled as a maze learning demonstration (since 
the rats eventually were run through a maze 
at the end of the session). Interestingly, live 
modeling plus participation decreased fear of 
rats (on an assessment battery administered 
outside of the context of this experiment) 
whether or not subjects were told that the 
demonstration was “treatment for fear.” In
contrast, subjects who received live modeling 
showed a reduction of fear only if they served 
in the group that specifically identified the 
procedure as treatment. Thus, the effective- 
ness of modeling depended on being identified 
as a form of treatment, whereas the effective- 
ness of modeling plus participation did not. 

The general point here is that the context 
in which the procedures are presented may 
differ across studies and distinguish the specific
effects of treatments. This creates some
obvious problems for research that often 
cannot present the treatment program with 
the same amount of zeal that might be the 
case in actual clinical practice. 

Variations of treatment. The extent to
which the results of a therapy study can be
generalized to the clinical situation depends
on the degree to which the actual treatment
is varied across research and clinical applications.
Occasionally, changes are made in specific
components of treatment to increase
precision in research, to control variability 
across subjects, or to examine a component 
of treatment that might be difficult to study 
without the variation. Even though some 
changes may pertain to ancillary features of
treatment, others might represent significant
departures.

Several components might vary across the
research and clinical versions of a given treatment.
For example, research applications may
hold constant the number of treatment sessions
across groups; the duration of treatment
sessions; the material, tasks, or topics discussed
within sessions; and so on. These dimensions
usually are not specified in clinical
applications, and they vary freely. Yet, in
research the dimensions may be rigidly specified
to ensure that they do not systematically
vary across treatment conditions or subjects
within a condition.


Some of the components may be more im- 
portant than others in standardizing treat- 
ment to make it well suited to the demands 
of research. For example, portions of treat- 
ment in an experiment may be prerecorded 
and presented by tape to the client. Perhaps, 
more seriously, standardization of an aspect 
of treatment may change a crucial ingredi- 
ent of therapy that would vary treatment 
efficacy. For example, in systematic desensitization,
clients imagine specific scenes to overcome
anxiety. These scenes are hierarchically
arranged and individually tailored to a client’s
problem. In clinical research in which specific
components of treatment are standardized
across groups, individuals may receive
the same hierarchy of items based on the assumption
that it will be germane and appropriately
suited to their individual needs.
Yet, technically, this is a violation of the
original requirement of treatment. Traversing
items that are not designed for a given
client may violate the goal of minimizing anxiety
during the course of treatment and of ensuring
that relaxation is the dominating response
during imagery.

Perhaps an even more extreme departure
from the clinical version is the use of slides
presented to the subject as part of desensitization
rather than having the client imagine
scenes (e.g., Brown, 1973; Wilson, 1973).
While the client is relaxed, the slides are
presented as an analogue version of desensitization
in which imagery is paired with
relaxation. The use of slides represents a
major procedural variation of treatment. Yet,
the variation is consistent with the original
rationale of treatment, which suggested that
anxiety-provoking stimuli, however presented, and an
incompatible response need to be paired.
In general, many sources of
variation exist that could decrease the resemblance
of research and clinical versions
of treatment.

Assessment procedures. The extent to
which a therapy investigation resembles the
clinical situation is influenced by diverse 
aspects of the assessment procedures includ- 
ing the reactivity of assessment, the setting 
in which assessment is conducted, and the 
precise measures used. To begin with, the 
client’s problem behavior usually is measured 
in situations in which the client is fully aware 
of the purposes of assessment. Reactivity or 
awareness of the assessment process itself 
may influence the type of information ob- 
tained about the client’s behavior. Once aware 
of assessment, the client’s responses might 
be influenced by response sets, biases, and 
demands of the situation that cue certain 
types of responses (Bernstein, 1973). The 
goal of therapy is to change behavior in the 
client’s everyday performance that is not un- 
der the scrutiny of a psychologist. Demon- 
strations of behavior change on reactive 
measures that are under the scrutiny of a psy- 
chologist may only be a distant approxima- 
tion of the relevant changes that the client 
wishes to achieve. The extent to which a 
study measures everyday performance in social
situations or approximations of that performance
may determine the generality of
results to clinical situations. 

Part of the artificiality of assessment de- 
rives from the setting in which the effects 
of treatment are evaluated and the specific 
assessment devices used. Consider first the 
setting and the contribution it makes to eval- 
uating behavior change. Several studies have 
demonstrated the effect of the setting in 
which behavior is assessed by having subjects 
perform in the context of a “clinic” versus 
laboratory setting (Bernstein, 1973; Bernstein
& Nietzel, 1973, 1974). The actual
physical setting is not altered.
Rather, the clients are told that assessment
is taking place in the context
of a treatment program or as
part of a program unrelated to
therapy and treatment of behavior. The studies
indicate that clients evince more se[...]tic
behavior in the laboratory
setting. Thus, the set and setting contribute
to the assessment results.

A related assessment problem pertains to 
the precise measures used to assess client behavior.
The measures commonly used to evaluate
therapy effects are paper-and-pencil inventories
of specific psychological traits or
states (e.g., anxiety), global self-evaluation, 
samples of direct behaviors, or physiological 
responses. Although these measures might 
well reflect change, they vary in the extent
to which they reflect the problem for which 
the client may have sought treatment. As 
mentioned earlier, clients do not seek treat- 
ment because of the severity of a problem 
as reflected on psychological measures. Yet, 
it is these measures that are used to evaluate 
treatment. The extent to which a study con- 
tains measures that relate to the situations 
in which the client expresses his or her prob- 
lem is important in generalizing the results 
of the study to the clinical situation. 

The use of direct behavioral samples in 
contrived situations, if clients have com- 
plained of specific problematic behaviors, may 
approximate the clinical situation. For ex- 
ample, drinking of alcoholics has been as- 
sessed in inpatient treatment settings in which 
clients can sit and consume alcohol at a 
simulated bar (e.g., Sobell, Schaefer, & Mills, 
1972). This might appear to be a close approximation
of the clinical problem of drinking
outside of the laboratory. However, it is
still relatively distant from nonlaboratory
situations. Indeed, reports suggest that alcoholics
drink infrequently in treatment facilities
when alcohol is made available (Skoloda,
Alterman, Cornelison, & Gottheil, 1975). Apparently,
many of the psychological cues of
the natural environment (e.g., work demands,
interactions with one’s spouse) rather than
physical cues of the drinking situation may
precipitate drinking (Lawson, Wilson, Briddell,
& Ives, 1976). Thus, a method of assessment
that reflects the behavior even more
directly than a laboratory approximation
would be more representative of clinical
change. Alcohol consumption sampled at random
periods in the client’s natural environment
would provide a more direct measure
of the clinical behavior (e.g., Miller, Hersen,
Eisler, & Watts, 1974). If alcohol consumption
is assessed in the original situation in
which the behavior is problematic, this is not
an approximation of the clinical problem but
a direct reflection of
the problem itself. In
any therapy investigation, it is important to
evaluate the extent to which the response
measures approximate the client’s problem 
and the manner in which assessment is con- 
ducted. 


Degree of Resemblance to Clinical Situations 


The above discussion outlines dimensions 
along which therapy investigations can vary. 
The list of dimensions is not necessarily ex- 
haustive. The importance of the discussion 
does not derive from the specific dimensions 
enumerated. Rather, the value derives from 
viewing treatment investigations as multi- 
dimensional. The generality of the results of 
an experimental investigation of treatment 
depends on where the investigation lies with 
respect to these and similar dimensions vis- 
a-vis the clinical situation. 

Each of the dimensions, and occasionally 
separate aspects of a given dimension, can 
be viewed as a continuum along which stud- 
ies can vary. The continuum reflects the de- 
gree of resemblance of the study on a given 
dimension to the clinical therapy situation 
to which the results are to be generalized. 
The continuum for a given dimension ranges 
from identity with or close resemblance to 
the clinical situation to little or no resem- 
blance to the clinical situation. In more com- 
monly used terms, the continuum might be 
analogous to classifying the study as clinical 
(or applied) versus laboratory (or basic) re- 
search. However, this classification is an over- 
simplification, because it does not treat in- 
dividually the different dimensions along 
which a study can vary and be evaluated.

If each of the dimensions mentioned earlier 
is viewed as a separate continuum along 
which the study might be evaluated, the [...]
of evaluating an analogue study depends on
where it falls on the continuum with respect
to each dimension. Also, the extent to which
the results of the study can be generalized
depends on how the dimension relates to
treatment efficacy. The issue here is how to
decide the generality of the results of a study
that in some way only resembles the clinical
situation.

An explicit assumption in the therapy literature
is that the degree to which a study
resembles a clinical situation (for a given
dimension) indicates the likelihood of its generality
to the clinical situation. Consider the
dimension of severity of the target problem
as one means of evaluating a therapy study.
One would expect that a treatment shown to 
be effective in changing mildly problematic 
behavior (e.g., mild fear in college students) 
might have little generality to a more severe 
degree of similar behavior (clinical phobias). 
This is a very reasonable expectation and in- 
deed might be bolstered with support from 
literature from both psychotherapy and medicine.
For example, individuals who experience
relatively severe insomnia do not respond as
readily to placebo treatment as those who
experience relatively mild or moderate insomnia
(Nicolis & Silvestri, 1967). More generally,
a concern in evaluating treatment is
that less severe behaviors can be readily
changed, whereas severe clinical problems
cannot. This concern is the primary objection
to analogue studies in behavior therapy
in which the efficacy of select techniques occasionally
has been based on the treatment
of relatively mild problems (cf. Bernstein
& Paul, 1971).

As a general rule, beyond considering only
the target behavior dimension, an implicit
assumption often made is that the greater
the resemblance of a treatment investigation
to the clinical situation, the more difficult it
will be to change behavior. Changing behavior
is considered to be increasingly difficult
as research departs from well-controlled
laboratory analogue conditions and
approaches characteristics of the clinical
situation.

For many dimensions, it is possible and
even likely that the assumed relationship is
accurate. Analogue studies that only faintly resemble
the clinical situation could readily
produce changes that are not likely to be
generalized to the clinical situation. However,
for some dimensions, it is quite reasonable
to assume that the results may be quite generalizable
from laboratory to clinical settings,
because the less resemblance of the
study to the clinical situation for a given
dimension, the more difficult it would be to
change behavior. That is, departure from the
clinical situation may make behavior change
more difficult to achieve. In these cases, demonstration
of behavior change in the nonclinical
situation might be a more convincing
demonstration of an effect of treatment than 
would application in the clinical situation. 
For example, consider the therapist dimen- 
sion. If therapists who bear very little re- 
semblance to those who normally practice 
therapy (e.g., show less experience, education, 
and training) effect marked behavior change 
in their clients, the generality of this finding 
to highly trained therapists would be quite 
plausible. One might expect that if therapists 
who were actually untrained high school stu- 
dents, for example, could effectively change 
behavior, the effects would be likely to apply 
at least as well to trained professionals. 
Similarly, consider the treatment dimen- 
sion; if behavior change is shown with a vari- 
ation of treatment that deviates from the 
clinical version, this does not necessarily 
mean that the results might not be general- 
ized to the clinical version of the treatment. 
In many cases, the laboratory version of 
treatment is an important test of clinical 
treatment, because it minimizes the param- 
eters of treatment that are likely to contrib- 
ute to change (e.g., individualization of treat- 
ment, deviations from the treatment rationale 
to handle individual subject problems, etc.). 
Finally, consider the client dimension. 
Clients who normally seek treatment in a 
clinical setting are likely to be experiencing 
some sort of crisis (i.e., defined as severity
of the symptoms or behavioral problem) that 
precipitated seeking help. In this state, clients 
who seek help may be particularly motivated 
for and receptive to treatment and therefore 
hold high expectancies for improvement. This 
may make clients show improvements even 
before or very early in treatment (e.g., Frank, 
Nash, Stone, & Imber, 1963; Goldstein & 
Shipman, 1961). In addition, in the clinical 
situation, clients may be likely to comply 
with therapeutic instructions and perhaps 
even accept the treatment rationale relatively 
uncritically. Although all clients who seek 
treatment are not desperate for relief, sever- 
ity of the problem should increase the incen- 
tive for attending and adhering to therapy. 
At the other end of the continuum might 
be college students recruited for a treatment 
study with inducements of course credit. The 
incentives for obtaining help and for complying
with the requirement of treatment
may be less than for clinical patients. If a 
given treatment effects change in subjects 
who are less motivated and who in fact ad- 
here less rigorously to treatment requirements 
(e.g., do not perform “homework” assignments
as conscientiously) than do clients in
a clinical situation, then the effects of treat- 
ment could be more pronounced in a clinical 
situation. 

The purpose of the present article is merely 
to raise the possibility rather than to assert 
that the analogue situation may occasionally 
provide a more rigid and more conservative 
test of a relationship between treatment and 
therapeutic change than that provided in the 
clinical situation. This is reasonable to entertain,
because laboratory studies may dilute
aspects of treatment that are central to be- 
havior change. In these cases, the commonly 
assumed relationship between behavior change 
and the lack of resemblance of the study to 
the clinical situation is altered. 

A problem in clinical research is that ana- 
logue research has been rejected by many 
on the grounds that by its very nature, it 
provides a weak test of the relationship be- 
tween a variable and change in the clinical 
situation. Yet, this is not necessarily the case. 

The relation between an analogue study 
and generality to clinical situations for a 
given dimension itself is an area of research. 
The importance of a given dimension (e.g.,
population, therapists, manner of recruitment, 
etc.) to the generality of the results needs to 
be evaluated directly. Increasing resemblance 
of a given dimension of a study to the clini- 
cal situation may not necessarily predict the 
extent to which the results apply to the clini- 
cal situation. In advance of studies showing 
this relationship, this assumption across all 
of the dimensions reviewed earlier should not 
be made. 

The generality of results of a given study 
can only be evaluated directly by studying
the variables of interest in the clinical situation
itself. Thus, research always needs to
test out the variables, insofar as possible,
in the actual situation. However, the value
of analogue research in understanding clinical
phenomena can be partially evaluated
prior to direct clinical tests of individual findings.
Research is needed that investigates the
influence of departures from clinical situa- 
tions along various dimensions and the im- 
plications of such departures for generalizing 
results to clinical situations. The effects of 
varying degrees of resemblance to the clinical 
situation for a given dimension need to be 
evaluated on therapy outcome. Comparisons 
of different points on a continuum varying in 
degrees of resemblance can reveal whether 
departure from the clinical situation for a 
given dimension attenuates, enhances, or has 
no effect on treatment. Once these relation- 
ships are elaborated, the generality of find- 
ings from research to clinical situations can 
be more easily evaluated for laboratory-based 
treatment studies. 


Conclusion 


Analogue therapy research has proliferated 
in recent years, especially in the area of be- 
havior therapy. Extensive research has led 
to an empirically based methodology of care- 
fully studying treatment procedures and 
parameters that influence behavior change. 
Yet, many authors have raised questions 
about the relevance of analogue work to the 
clinical situation and have suggested specific 
procedures to increase the similarity of treatment
research to clinical applications (e.g.,
Bernstein & Paul, 1971; Borkovec & O’Brien, 
1976). Research in psychotherapy and be- 
havior therapy can differ from the clinical 
application of treatment along several di- 
mensions such as the target problem, the 
clients and the manner in which they are re- 
cruited, the therapists, the selection of treat- 
ment, the client’s set, and the setting in which 
treatment is conducted, the variations in
treatment, and the assessment procedures used. 

For evaluating the generality of research 
findings to the clinical situation, the main 
question of interest is not whether the inves- 
tigation is an analogue of the clinical situa- 
tion. An investigation invariably will be an 
analogue by the very nature of experimental
research. The alternative questions of interest 
seem to be the extent to which treatment 
research deviates from the clinical situation, 
along what dimensions, and whether the manner
in which there are differences makes research
a relatively strong or weak test of
treatment in relation to the likely results in
a clinical situation.

The primary purpose of the present article
was to suggest a methodology for studying
generality directly. Laboratory analogue research
may differ from clinical treatment
along several dimensions. Each dimension is
a continuum spanning from little or no resemblance
of the study to the clinical situation
to close resemblance or identity with the
clinical situation. Each investigation of treatment
can be classified separately on several
dimensions denoting its resemblance to the
clinical situation. Increasing similarity of an
investigation to the clinical situation for a
given dimension does not necessarily argue
for greater generality of the results from the
research to the clinical setting. Whether the
standing of a study on a given dimension is
relevant to generality of the results needs to
be determined empirically. Laboratory treatment
research that differs from clinical treatment
may not necessarily provide a less
stringent test of treatment. Indeed, for some
dimensions, laboratory research may provide
a more conservative test, and changes found
in the laboratory may be even more likely
to obtain in the clinical setting. However, the
purpose here is not to speculate or make assumptions
about the extent of generality of
the findings of a study. It is hoped that dimensions
that may affect generality will be
subject to empirical scrutiny.


Methodological Considerations in
Treatment Outcome Research on Obesity 
G. Terence Wilson 
reported his findings on the behavioral treat-


an Psychological Association, Inc. 0022-006X/78, 


recommendations for reducing subject 


are advanced. 


non, namely the extraordinary growth of
behavior research and therapy as a whole (cf.
Franks & Wilson, 1973, 1977; Hoon & Lindsley,
1974). The fact that obesity has been a subject
of intensified research interest within the
wider context of the development of behavior
therapy is not surprising. Unlike most other
clinical problems that have been the focus of
treatment outcome research, obesity offers a
definitive yet convenient and objective measure
of outcome efficacy: weight reduction. Instead
of the all-too-familiar difficulties inherent in
the measurement of other clinical disorders,
such as rating anxiety, inferring depression,
judging interpersonal adjustment, or estimating
alcohol consumption, a quantitative measure is
obtained by simply asking the obese client to
step on a scale. Furthermore, unlike most
other behavior disorders, large samples of
obese subjects for research investigations are
readily available. Finally, largely as a result
of historical accident (Stuart’s, 1967,
influential application of the principles and
procedures initially proposed by Ferster,
Nurnberger, and Levitt, 1962), weight reduction
has become an important arena for the
evaluation of behavioral self-control methods. The
happy combination of a handy, “hard” index 
of treatment outcome for large numbers of 
motivated subjects who can usually be re- 
cruited at minimal effort and cost and who 
provide a made-to-measure testing ground 
for one of the hottest areas of behavioral 
modification has proven irresistible to be- 
havioral researchers. 

These circumstances have encouraged be- 
havioral research on obesity. However, the 
ostensible ease with which controlled research 
can be conducted with obese clients has also 
helped spawn countless studies that have 
contributed neither to our knowledge about 
the treatment of obesity in particular nor to 
the development of the principles of behavior 
change in general. Many of these studies 
have been master’s degree theses or doctoral 
dissertations in psychology. In contrast to 

and nonobese populations; however, this article
focuses on treatment outcome considerations.
Although the issues raised in the following
discussion have broad relevance to the
evaluation of all treatment outcome research
on obesity, the particular focus is on
behavioral approaches.


Measurement of Treatment Outcome 
Weight Loss 


As stated above, the availability of an
objective measure of weight loss was a raison
d'être for the application and evaluation of
behavioral principles in the modification of
obesity. However, some studies, mostly those
of a nonbehavioral nature, still fail to report
specific information on weight loss, the custom
being to report general impressions of changes
in subjects’ personal adjustment and emotional
well-being (cf. Leon, 1976). The importance of
gathering multiple measures of subjects’
response to treatment is emphasized below. Yet,
regardless of whatever other outcome measures
are obtained, direct assessment of weight loss
is a sine qua non of adequate evaluation of
treatment of obesity. Neglecting to report
precise data on weight in favor of a discussion
of inferred psychological changes obfuscates
evaluation of treatment efficacy and, as
Bandura (1969) cautioned, can easily result in
the “perpetuation of weak methods on the basis
of extraneous criteria” (p. 540).

Although it is necessary that specific weight
loss be reported, it is not sufficient. Clearly a
40-pound (18 kg) weight loss in a 400-pound
(181.6 kg) individual is not the same as a
40-pound weight loss in a 160-pound (72.7
kg) individual. Following Stunkard and
McLaren-Hume (1959), several studies have
reported results in terms of percentage of
subjects who lose specified amounts of weight,
for example, more than 10 (4.5 kg), 20 (9.0
kg), 30 (13.5 kg), or 40 pounds, respectively.
It is difficult to interpret these figures in the
absence of precise information on subjects’
initial weights. Musante (1976), for example,
reported treatment outcome according to
clients who lost more than 20 pounds or 40
pounds, respectively. Initial weights, however,
ranged as widely as 155-484 pounds (70. ... kg)
in males and 125-321 pounds (56.8 ... kg) in
females. As others have noted (cf. Hall & Hall,
1974; Jeffrey, 1975), simply reporting absolute
weight loss severely impedes meaningful
comparisons among different studies.
Comparative analyses of the outcome of
different studies require the use of a
standardized measure of improvement. Change in
percentage overweight is more informative
than absolute weight loss, since it takes initial
weight into account. Better still is the weight
reduction quotient (cf. Feinstein, 1959), which
measures the amount of weight loss relative
to the amount needed to obtain an ideal
target weight. Specifically, the weight
reduction quotient is a function of Relative
Initial Overweight × Percent of Surplus Weight
Lost:


$$\text{weight reduction index} = \frac{\text{pounds lost}}{\text{pounds overweight}} \times \frac{\text{initial weight}}{\text{ideal weight}} \times 100.$$

This index takes account of height, amount
overweight, weight reduction goals, and absolute
pounds lost. Several studies have used a
truncated version of this index, namely,

$$\frac{\text{pounds lost}}{\text{pounds over ideal weight}} \times 100,$$



which reflects percentage of weight goal
attained (e.g., Ashby & Wilson, 1977; Brownell,
in press; Kingsley & Wilson, 1977; Mahoney,
1974). The complete formula compensates for
the fact that the obese person has to lose more
absolute weight to attain the same percentage
of his or her target weight goal.

Both measures of weight (absolute weight
loss and the reduction quotient) should be
reported in treatment outcome studies. These
two measures have yielded essentially similar
patterns of results in several studies to date
(e.g., Ashby & Wilson, 1977; Green, in press;
Kingsley & Wilson, 1977; Mahoney, 1974).
However, Brownell (in press) found a
discrepancy between the two measures.
Differences among treatment groups that were
statistically significant in terms of absolute
weight loss were not significant in terms of
percentage of weight goal attained.
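
The two indices described above can be illustrated with a short computational sketch. It is offered only as an illustration under stated assumptions: the function and parameter names are hypothetical, weights are taken to be in pounds, and the ideal weights used in the usage note are invented for the example rather than drawn from any norm table.

```python
def pounds_overweight(initial_weight, ideal_weight):
    """Surplus weight: initial weight minus ideal (target) weight."""
    surplus = initial_weight - ideal_weight
    if surplus <= 0:
        # The indices are only defined for clients above their ideal weight.
        raise ValueError("initial weight must exceed ideal weight")
    return surplus


def weight_reduction_index(pounds_lost, initial_weight, ideal_weight):
    """Full index: (pounds lost / pounds overweight)
    x (initial weight / ideal weight) x 100."""
    surplus = pounds_overweight(initial_weight, ideal_weight)
    return (pounds_lost / surplus) * (initial_weight / ideal_weight) * 100


def percent_of_goal_attained(pounds_lost, initial_weight, ideal_weight):
    """Truncated index: (pounds lost / pounds over ideal weight) x 100."""
    surplus = pounds_overweight(initial_weight, ideal_weight)
    return pounds_lost / surplus * 100
```

For instance, with assumed ideal weights of 160 and 120 pounds, a 40-pound loss in a 400-pound client and in a 160-pound client can be compared by calling `weight_reduction_index(40, 400, 160)` and `weight_reduction_index(40, 160, 120)`; the same absolute loss yields very different values on both indices.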

Weight measurements should be based on
specific readings obtained by the investigator
using a reliable scale. Until the validity of
self-reports of weight loss is established,
individuals’ estimates of their own weight
gathered over the telephone (e.g., Hanson,
Borden, Hall, & Hall, 1976) or through the mail
(e.g., Mahoney & Mahoney, 1976) are less
desirable and must be viewed with appropriate
caution.

Treatment outcome measures based on
weight have been criticized as not being true
measures of obesity, that is, excessive body fat
(LeBow, 1977). Overall weight reflects the
person’s total body mass, which includes the
musculature and water composition in addition
to body fat. Body weight might be a good
predictor of body fat in very overweight
individuals, but the relationship is less clear
for minimal or moderate degrees of overweight.
Body weight is an especially poor index of
body fat in young children, the elderly, and
when comparisons are made among different
age groups (Mahoney, Rogers, Straw, &
Mahoney, in press). Since it is body fat that is
presumably the target of obesity treatment
programs, indices other than weight that
specifically measure body fat have been
advocated. ... (usually temporary, however) of
diuretics and laxatives. The use of skin-fold
measures may provide a better predictor of
body fat, provided that precautions are taken
to ensure reliability (Franzini & Grimes, 1976;
Johnson & Stalonas, 1977). For these reasons,
future outcome research might profitably
include specific measures of body fat in
addition to changes in body weight.


1 Ideal weight is usually based on the 1959
Metropolitan Life Insurance Company norms (U.S.
Department of Health, Education and Welfare,
1967). Less frequently used are actuarial data of
height and weight by age in a U.S. Department of
Agriculture report (Hathaway & Foard, 1960).
Seltzer (1965) discusses the limitations of such
height-weight tables.


Objective Measures of Eating Behavior


There are instances in which the direct
measurement of specific eating behaviors or food
consumption patterns rather than changes in
weight is of primary concern. Some examples
and measurement strategies can be briefly 
noted. For example, Diament and Wilson 
(1975) evaluated the efficacy of a particular 
behavioral treatment technique, covert sensi- 
tization, in decreasing consumption of a spe- 
cific target food. Weight loss was too molar a 
dependent variable for this purpose, since it is 
a function of several variables that are not 
specifically affected by covert sensitization. 
Thus covert sensitization might decrease an
individual’s consumption of a particular food
but not result in weight loss, because the
individual increases consumption of other foods
that were not targeted in the treatment. The 
efficacy of that technique would be obscured. 
This would be unfortunate, because if a spe- 
cific method could be shown to be effective in 
reducing specific consumption of a particular 
food, then it might be extended to decrease 
other excessive eating habits or included in 
a broader treatment program that has as its 
aim the modification of obesity as such. On 
the other hand, if the technique is shown to 
be ineffective, then it can be dropped from 
multifaceted treatment programs. Accordingly, 
Diament and Wilson (1975) adapted the 
taste-rating task from Schachter, Goldman, 
and Gordon (1968) as a direct laboratory 
measure of the consumption of specific target 
and nontarget foodstuffs rather than rely 
solely on subjective ratings of specific food 
preferences as an outcome measure of the ef- 
fects of covert sensitization. (Some of the 
methodological considerations governing the 
use of the taste-rating task as an index of 
treatment outcome were raised by Leon &
Roth, 1977; see also Nathan & Briddell’s,
1977, discussion of the assessment of alcohol
consumption.)

The taste-rating task is a laboratory
measure of eating behavior under controlled
conditions. Measurement of food consumption in
the natural environment is achieved by
unobtrusive behavioral observations. Epstein,
Parker, McCoy, and McGee (1976) and Gaul,
Craighead, and Mahoney (1975) provide
illustrative discussion of the use of
systematic behavioral observation as a means of
assessing eating under natural conditions.


Direct Indicators of Cardiovascular Health


Mahoney et al. (in press) have criticized 
the almost exclusive reliance of treatment out- 
come studies on body weight. They suggest 
that more attention be given to factors that 
are directly related to cardiovascular health, 
for example, blood pressure, serum lipids (par- 
ticularly triglycerides), and activity. The im- 
portance of exercise components has often 
been deemphasized in weight reduction pro- 
grams. In lieu of less practical procedures, ac- 
tivity level can be assessed and caloric ex- 
penditure can be estimated on the basis of 
subjects’ self-monitoring. 


Criteria For Evaluating Treatment Outcome 
In Obesity 


In addition to the specific measurement con- 
siderations discussed above, there are broader 
issues that bear on the evaluation of treatment 
programs for obesity. These issues include 
both client-related and efficiency and
cost-related criteria for outcome evaluation.2


Clinical Significance of Treatment Effects


As noted above, the fact that the evaluation 
of behavioral methods rather than the treat- 
ment of obesity per se has frequently been the 
primary purpose of experimental investigations
has resulted in findings that are statistically
but not necessarily clinically significant
(cf. Stunkard & Mahoney, 1976). Yet,
in clinical research a major criterion for 
evaluating therapy is the overall importance 
of the treatment-produced improvement. The 
clinical significance of amount of weight re- 
duction can be considered to be of comparable 
importance to the statistical comparison of 
group differences. The magnitude of weight 
loss needs to be given much more attention 
in the future treatment literature. 


Proportion of Clients Who Improve 


Designs that emphasize statistical comparison
of group differences average the amount
of weight reduction across all subjects within
each treatment group. This tends to obscure
individual differences, a consequence of
particular importance in the treatment of obesity
in which massive interindividual variability
... significant improvement in a greater
proportion of subjects while some subjects show
an increase in weight. For example, Penick et al.
(1971) found considerably greater variability
in weight losses in their subjects treated with
behavior modification than in subjects treated
with traditional psychotherapy. The five most
successful subjects, as well as the single least
improved subject, were in the behavioral
treatment. Since a particular treatment might
be recommended on the basis of the proportion
of treated clients who are likely to show
some specified improvement rather than on
the basis of a group average, reports of
treatment studies should indicate the number of
individuals who lose clinically significant
amounts of weight. Rather than simple
categories of absolute pounds lost (e.g.,
Stunkard & McLaren-Hume, 1959), however, an
index of weight corrected for initial relative
obesity would be preferable. More specifically,
given the recurring interindividual variability
in treatment outcome that has yet to be
adequately explained, it is suggested that
individual data be reported even in the context
of between-group designs (e.g., Aragona,
Cassady, & Drabman, 1975).

2 See Kazdin and Wilson (1978) for a more
detailed discussion of these broader criteria for
evaluating treatment outcome in general.
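
The contrast between a group average and the proportion of clients who improve can be sketched as follows. This is a minimal illustration, not a prescribed analysis: the function name, the dictionary keys, and especially the 50% criterion for a "clinically significant" loss are assumptions introduced here, since the text does not fix a threshold. Scores are taken to be each client's percentage of weight goal attained, with negative values denoting weight gain.

```python
from statistics import mean


def summarize_outcomes(percent_goal_attained, criterion=50.0):
    """Report the group mean alongside individual-level counts.

    percent_goal_attained: one score per client (negative = weight gain).
    criterion: assumed cutoff for a clinically significant loss.
    """
    scores = list(percent_goal_attained)
    if not scores:
        raise ValueError("at least one client score is required")
    n_improved = sum(1 for s in scores if s >= criterion)
    n_gained = sum(1 for s in scores if s < 0)
    return {
        "group_mean": mean(scores),
        "n_improved": n_improved,
        "proportion_improved": n_improved / len(scores),
        "n_gained_weight": n_gained,
        "individual_scores": scores,  # retained so individual data can be reported
    }
```

Two treatment groups with identical means can differ sharply in `proportion_improved` and `n_gained_weight`, which is the information a group average obscures.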


Multiple Measures of Change


In addition to the primary measure of
weight change, treatment studies should include
an evaluation of subjects’ physical, emotional,
and social functioning (Coates, 1977). For
example, a therapeutic method might reduce
weight but result in adverse side effects. To
take an extreme example, even if a surgical
intervention such as the jejunoileal bypass
operation were effective in decreasing weight,
its numerous serious side effects might
contraindicate its use, especially in moderate
obesity (cf. Chlouverakis, 1975). Equally,
approaches that might be effective in decreasing
weight might produce sufficiently different
side effects so as to recommend one over the
other.

In one of the rare comparative studies that 
has addressed this issue empirically, Öst and 
Götestam (1976) reported that a pharmaco- 
logical treatment (fenfluramine) produced 
more frequent and more protracted negative 
side effects than a behavioral treatment pro- 
gram. Stunkard and Rush (1974) have de- 
scribed the occurrence of depression con- 
comitant with weight loss in clients receiving 
traditional psychotherapy. The theory that 
obese adults who lose weight fall below a 
biologically dictated “set point,” and that as 
a result they enter a state of chronic energy 
deficit, specifically predicts adverse emotional 
reactions such as depression and irritability as 
a function of continued weight loss (cf. Nis- 
bett, 1972). Yet the results of behavioral 
treatment programs have indicated no such 
effects. On the contrary, weight loss in be- 
havioral programs has been associated with 
positive consequences for emotional and social 
adjustment (e.g., Stuart, 1967, 1971;
Wollersheim, 1970). Adverse side effects of
behav- ... that produced ... hazardous
procedures such as the use of diuretics,
vomiting, ... (cf. Mann, 1972). ... has also
been reported to increase absences from therapy
sessions (Jeffrey, ... other undesirable
behavior (Mahoney, Moura, & Wade, 1973).

On the other hand, clients who do not lose 
significant amounts of weight might nonethe- 
less show improvement in other areas of life 
functioning. Specific subjective and objective 
measures of the breadth and nature of the 
changes associated with the treatment for 
obesity will shed light on these potentially 
important relationships among different
response systems.


Efficiency and Cost Effectiveness 


Magnitude of weight reduction is not the 
only criterion for evaluating treatments for 
obesity. The efficiency of treatment, including
its duration, ease of administration, and
disseminability, needs to be taken into ac- 
count (cf. Kazdin & Wilson, 1978). Determin- 
ing the cost effectiveness of treatment is par- 
ticularly important given the availability of 
alternative treatment approaches of roughly 
comparable efficacy in producing weight re- 
duction. Jeffrey (1975) has proposed a use- 
ful cost effectiveness index that is a function 
of the mean weight reduction quotient divided 
by mean treatment time per client. The fi- 
nancial cost of treatment is influenced by the 
costs of the professional training of the
individuals who conduct treatment and the
disseminability of the methods. In addition, the
emotional cost experienced by the client in 
participating in a treatment program might be 
assessed. For example, procedures such as 
starvation diets and surgical interventions 
can be contrasted with behavioral methods 
that emphasize gradual weight loss over an 
extended time period by reducing overall 
caloric intake without prescribing any fixed 
diet. Consumer (client) evaluation of treat- 
ment would provide useful information about 
the acceptability of different methods. Some 
treatments might be inherently objectionable 
irrespective of financial cost or even outcome 
efficacy. 
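
The cost effectiveness index attributed above to Jeffrey (1975) can be sketched as a simple ratio. The text says only that the index is "a function of" the mean weight reduction quotient divided by mean treatment time per client, so the plain ratio below, the choice of hours as the time unit, and the function name are all assumptions made for illustration.

```python
from statistics import mean


def cost_effectiveness_index(reduction_quotients, treatment_hours):
    """Mean weight reduction quotient divided by mean treatment time.

    reduction_quotients: one weight reduction quotient per client.
    treatment_hours: treatment time per client (hours assumed here).
    """
    if not reduction_quotients or not treatment_hours:
        raise ValueError("both inputs must contain at least one value")
    mean_time = mean(treatment_hours)
    if mean_time <= 0:
        raise ValueError("mean treatment time must be positive")
    return mean(reduction_quotients) / mean_time
```

Under this reading, two programs producing the same mean quotient differ in the index in inverse proportion to the treatment time they require per client.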


Experimental Design and Research Strategies 


Treatment outcome research on obesity has
consisted predominantly of between-group
designs of one sort or another, although
single-subject methodology has also been used.
Each of these two contrasting research
strategies includes several alternative methods
replete with different methodological
advantages and disadvantages.


Single-Case Experimental Methodology 


Single-case experimental designs focus on
therapeutic change in the individual client
rather than average change across groups of
clients. The defining characteristics of the
designs include the observation of overt
behavior, continuous assessment of change, and
specific criteria for evaluating the reliability
and significance of treatment-produced change
(see Hartmann & Hall, 1976; Hersen & Barlow,
1976; Kazdin, 1973).

There are several reasons why single-sub- 
ject designs seem suited to research on the 
modification of obesity. First, the emphasis on 
the individual subject seems particularly
appropriate given the repeated finding, noted
above, of considerable interindividual
variability in treatment outcome studies on obesity.
Second, weight loss provides an observable 
and easily measured target behavior. And fi- 
nally, single-subject methodology contains a 
therapeutic criterion for evaluating the clinical 
importance of behavior change. Specifically, 
this criterion emphasizes the magnitude of 
weight loss and the extent to which treatment 
brings the individual within acceptable or 
normative levels of body weight. This charac- 
teristic assumes special importance given that 
a major criticism of treatment studies has 
been the failure to demonstrate clinically 
meaningful reductions in weight as opposed to 
statistically significant comparisons among 
different treatment groups.

Some examples of the surprisingly few
single-case experimental designs in the
treatment of obesity can be briefly noted. Mann
(1972) used an ABAB reversal design in
demonstrating that contingency contracting
can result in substantial weight losses. Weight
loss occurred only when the contingency
contracting procedure was in effect. Subjects
lost no weight or even gained weight when it was
withdrawn. Multiple-baseline designs have
also been used. Morganstern (1974) used a
multiple-baseline-across-behaviors design in
showing that a self-administered aversion
procedure decreased consumption of different
target foods only when specifically applied to
each food in sequence. Similarly, Epstein et al.
(1976) described the use of a
multiple-baseline-across-subjects design in
demonstrating the role of instructions in the
regulation of eating in obese and nonobese
children.

A third form of single-case methodology, as
yet to be used systematically in the
modification of obesity, is the changing
criterion design (Hartmann & Hall, 1976). In this
design, following a stable baseline period, the
treatment method is applied in order to reduce
weight to a designated level. When this target
weight is reached, the criterion (target
weight) is lowered further. The criterion is
progressively lowered until the ultimate goal
of weight reduction is attained. A functional
relationship between treatment and weight
change is said to be demonstrated if behavior
(weight loss) repeatedly matches the criterion
as the criterion is changed. Kazdin (1975)
has pointed out that this design is particularly
well-suited to behaviors that have to be
shaped to reach a final goal. Emphasizing as
it does the slow but steady reduction of
weight, the behavioral treatment of obesity
clearly fits this description.
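
The logic of the changing criterion design can be sketched in code. This is a hypothetical illustration rather than an established scoring rule: the function name, the data layout, and the tolerance of one pound for deciding that weight "matches" a criterion are assumptions introduced here.

```python
def matches_changing_criterion(phases, tolerance=1.0):
    """Check whether weight tracks each successively lowered criterion.

    phases: ordered list of (criterion_weight, observed_weights) pairs,
        one pair per treatment phase.
    tolerance: assumed maximum distance (pounds) between the last
        observed weight in a phase and that phase's criterion.
    """
    previous_criterion = None
    for criterion, observed in phases:
        if previous_criterion is not None and criterion >= previous_criterion:
            raise ValueError("each criterion must be lower than the last")
        if not observed:
            return False  # no data, so no match can be demonstrated
        if abs(observed[-1] - criterion) > tolerance:
            return False  # weight did not settle at this phase's criterion
        previous_criterion = criterion
    return True
```

A `True` result corresponds to weight repeatedly matching the criterion as it is lowered; as with the design itself, it does not rule out events that happen to coincide with each change of criterion.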

Single-case experimental designs are not
without their limitations (see Bandura, 1976;
Franks & Wilson, 1977; Hersen & Barlow,
1976). In general, single-subject
demonstrations offer no information on the
generality of the findings to other cases. Nor do
they permit evaluations of the comparative
efficacy of different treatment methods. There
are also specific problems. Reversal designs are
uninterpretable if the pretreatment baseline
does not recover when the therapeutic technique
is withdrawn. A successful reversal design
demonstrates a causal relationship between
treatment technique and target behavior but, by
definition, excludes the investigation of
maintenance of treatment-produced change. The
latter is of major importance in obesity
research (discussed below). Interpretive
problems are encountered with multiple-baseline
designs when behavior change generalizes across
behaviors, settings, or subjects. The changing
criterion design cannot rule out confounding
influences on the behavior change even if weight
changes match the changing criterion. The
introduction of each treatment phase might
always be associated with events other than the
specific treatment method. However, weight
reduction as a target behavior appears to be
relatively free of some of the problems that
often beset single-subject methodology in
applied settings. For example, it is usually
possible to obtain stable baselines of subjects’
weight before intervening with treatment, a
fundamental requirement of an interpretable
single-case experimental design. Clear-cut
treatment effects that are discriminable from
pretreatment (baseline) levels are commonplace.
Furthermore, unlike most other target behaviors,
normative data exist that allow judgments about
what constitutes a clinically significant
weight loss.


Between-Groups Design 


The methodological considerations that 
govern the use of between-groups designs are 
described in detail elsewhere (Campbell & 
Stanley, 1966; Paul, 1969). Suffice it here to 
address some specific issues that pertain di- 
rectly to the evaluation of the treatment of 
obesity. 

The treatment package strategy. Much of 
the outcome research on obesity has followed 
the treatment package strategy (cf. Kazdin & 
Wilson, 1978). In this strategy multiple 
techniques are administered as part of a single 
therapy program that is compared to a no- 
treatment or attention-placebo control condi- 
tion. Assuming the internal validity of the 
study, a significant difference between the 
treatment and control groups can be taken to 
indicate a causal relationship. The purpose of 
this strategy is to achieve maximal weight re- 
duction. Accordingly, as Azrin (1977) pointed 
out, these treatment package programs should 
“include as many component procedures as 
seem necessary to obtain, ideally, a total treat- 
ment success” (p. 144). Exemplifying many 
behavioral evaluations of behavioral programs, 
the Penick et al. (1971) study used a
multicomponent behavioral approach that included
“everything but the kitchen sink” (Stunkard,
1976, p. 219). If this treatment package
proves effective in producing weight loss, then
experimental analyses of its various components
can be undertaken to elucidate the reasons for
treatment success and thereby subsequently
refine and enhance the efficiency and efficacy
of that treatment approach.
In using this research strategy, it is
necessary to ensure that the treatment package
is not so complex and wide ranging that it
becomes difficult to identify and distinguish
among the specific techniques within such a
multifaceted program. Moreover, the total
program must be distinguishable operationally
and conceptually from alternative treatment or
control procedures against which it is compared.
It follows that it is essential to describe
treatment packages in a clear, operational
fashion in order to permit the reader of the
manuscript to replicate the treatment
procedure. All too often
cursory reference is made to “the standard 
behavioral self-control program” or the 
“Stuart treatment program” in the description 
of procedures used to treat obesity. Vague, 
nonspecific descriptions such as these com- 
plicate evaluation and seriously hinder replica- 
tion. Ideally, the investigator might make 
available the detailed treatment program 
manual to readers interested in further in- 
formation about the procedure. 

Constructive and dismantling research strat- 
egies. Broad-spectrum behavioral treatment 
programs have been consistently shown to 
produce greater weight loss than alternative 
approaches in the short term. Since a treat- 
ment effect has been demonstrated, component 
analyses of the active ingredients of these 
programs are appropriate targets of research. 
Experimental analyses of the efficacy of spe- 
cific techniques, both singly and in combina- 
tion, are especially called for in view of cer- 
tain findings. Several studies have demon- 
strated that individual component parts of the 
broader behavioral treatment package can 
produce weight losses equal or superior to 
those effected with the full program (e.g., Bel- 
lack, 1976; Green, in press; McReynolds & 
Paulsen, 1976; Romanczyk, Tracey, Wilson, 
& Thorpe, 1973; see further discussion of this 
issue in Franks & Wilson, 1977). 

There are two fundamental research
strategies for analyzing the component parts of
multifaceted treatment packages. In the
constructive treatment strategy (McFall &
Marston, 1970), the effect of a single, narrowly
circumscribed treatment component is
established, and extra components are added
sequentially to determine if they enhance
treatment effects. The effective components are
retained as the larger treatment program is
constructed. Illustrating the constructive
strategy, Romanczyk et al. (1973) compared the
following treatment groups: (a) no treatment
control (waiting list); (b) self-monitoring
control (weight only); (c) self-monitoring
control (weight and calorie intake); (d)
self-monitoring and symbolic aversion; (e)
self-monitoring, symbolic aversion, and
relaxation; (f) self-monitoring, symbolic
aversion, relaxation, and behavioral management
(stimulus control) instructions; and (g)
self-monitoring, symbolic aversion, relaxation,
behavioral management, and contingency
contracting. This design enabled the
investigators to assess the incremental value of
including different treatment techniques in the
overall treatment package.

In the dismantling strategy (Lang, 1969), 
specific components are systematically elimi- 
nated and the associated decrement in treat- 
ment effect is measured. The relative contribu- 
tion of each component to the total treatment 
package can then be assessed. 

Comparative treatment strategy. A
detailed discussion of the conceptual and
methodological issues involved in comparative
treatment outcome research is beyond the
scope of the present article (but see Kazdin &
Wilson, 1978). Only some of the more pressing
concerns are mentioned here. Well-controlled
comparative outcome research entails
the comparison of clearly specified,
operationally distinguishable methods. Vague
descriptions of therapeutic interventions in
global terms such as “the Stuart behavioral
treatment program,” “group psychotherapy,”
or “insight therapy” are unacceptable. The
specific procedures embraced by these general
labels need to be spelled out. Similar problems
attach to the comparison of treatment methods
with something termed routine treatment. As a
standard for comparison, treatments labeled
as routine vary enormously in nature, ranging
from those that include several active
treatment ingredients to those that are
minimal control conditions.

Of major importance is the need to ensure
that an evaluation of any given treatment
approach represents an adequate test of that
approach. Behavioral methods are often
compared with treatments described as
“psychotherapy.” In most instances the
psychotherapy condition is more accurately
construed as an attention-placebo control group.
Drawing conclusions about the superiority of
behavioral methods over traditional
psychotherapy on the basis of these studies is
inappropriate. The Penick et al. (1971)
comparative outcome study provides a relatively
rare example of the comparison of a behavioral
program with an alternative method that was
adequately representative of traditional therapy
for obesity. Öst and Götestam’s (1976) study in
which a behavioral method was contrasted
with pharmacotherapy is another instance of
a legitimate comparative outcome evaluation.

Different treatments must be distinguishable
on an a priori basis and in their
implementation. If these procedural differences
are blurred, the results become uninterpretable.
A study might be designed to evaluate the effect
of a specific behavioral method such as
self-monitoring of daily caloric intake. If,
however, subjects in the comparison group not
intended to engage in such self-monitoring learn
about this procedure and implement
self-monitoring themselves, the independent
variable will have been compromised. The
probable outcome of no differences between
groups will be uninformative. Bellack, Rozensky,
and Schwartz (1974) compared two forms of
self-monitoring in which subjects monitored
their behavior either before or after they ate.
The two groups were not significantly different
from each other. However, it is unclear
whether subjects in the premonitoring condition
complied with this procedure. Green (in
press) had external observers systematically
sample obese subjects’ adherence to treatment
instructions and found that fewer subjects in a
premonitoring condition adhered to instructions
than subjects in a postmonitoring condition.
Green’s finding that this temporal factor in
self-monitoring did not significantly
influence weight loss is therefore more
interpretable than Bellack et al.’s data.
Treatment procedures cannot be properly
evaluated unless they are implemented, a factor
that has possibly contributed to the great
variability in treatment outcome with behavioral
techniques.
The identification of the effective components
of obesity treatment programs is limited
by the failure to assess subjects’ adherence
to treatment methods independently
of outcome. The intent of most behavioral
programs, for example, has been to alter eating
habits as a means of producing weight
loss, but the independent variables of
treatment (changes in eating habits) have
been inferred, quite inappropriately, from
treatment outcome. Several strategies exist for
assessing subjects’ adherence to the treatment
program. Hagen (1974) and Wollersheim
(1970) used an eating patterns questionnaire
to assess habit change. They reported
significant positive correlations between
weight loss and questionnaire responses.
However, a questionnaire measure admin- 
istered only once at the end of treatment is 
probably the least satisfactory means of de- 
termining subjects’ eating patterns. Using sub- 
jects’ self-monitoring records of daily eating 
behavior during the 1st and 5th weeks of
treatment, Jeffery et al. (1978) found that 
changes in eating patterns were unrelated to 
weight loss. Similarly, Bellack et al. (1974) 
found no relationship between subjects’ self- 
ratings of adherence to treatment guidelines 
and weight loss. Brownell (in press) used a 
similar but more comprehensive approach by 
assessing subjects’ daily self-reports of 38 
different weight-related behaviors over the en- 
tire treatment phase. Correlations among 
these measures and weight loss were nonsig- 
nificant. Continuous assessment offers obvious 
advantages. 

Brownell (in press) had subjects’ spouses 
rate their eating patterns as a means of ob- 
taining estimates of their accuracy. Similarly, 
both Green (in press) and Öst and Götestam
(1976) trained observers to rate subjects’ tar- 
get behavior. The latter reported a significant 
correlation between habit change and weight 
loss. The use of external observers provides a 
reliability check of subjects’ self-reports of 
behavior change. However, the investigator 
should be alert to the possibility that this pro- 
cedure is reactive. Green, for example, found 
that the use of external observers was differ- 
entially reactive across different treatment 
methods. (See Nelson, 1977, for a discussion
of the reliability and reactivity of self-monitoring.)


Control Groups for Treatment Evaluation


The function of control groups is to ensure 
the internal validity of the study by preclud- 
ing the effects of weight change over time 
that are independent of treatment and the 
nonspecific influences of the treatment itself. 
As a rule the nature of the control group that 
is included will depend on the specific research 
strategy and the precise question that is the 
focus of investigation. However, some general
comments are in order.

The most basic control group is the no-treatment
condition. There are now sufficient data
to show that weight does not change signifi- 
cantly as a function of the mere passage of 
time, the effects of repeated assessments, and 
other factors that the no-treatment condition 
is designed to control for. It is no longer a 
necessary control in a between-group study, 
and it is definitely not sufficient. Simply com- 
paring a treatment method with a no-treat- 
ment control group is no longer acceptable. 
Nonspecific* treatment control groups are
necessary if causal relationships between spe- 
cific therapeutic techniques and weight loss 
are to be demonstrated. Nonspecific influences 
can be controlled in one of two ways. In one, 
the study might include at least two treatment 
groups that differ from each other, as in the 
dismantling or comparative strategies, for ex- 
ample, but that incorporate the nonspecific in- 
fluences of the therapeutic process. A differ- 
ence between the groups would be attributed 
to a specific treatment effect rather than a 
nonspecific influence. Bellack (1976), for ex-
ample, showed that a self-reinforcement
method that included self-monitoring was 
more effective than self-monitoring alone in 
producing weight loss, a difference that was 
ascribed to self-reinforcement, since nonspe- 
cific factors were presumably equated across 
conditions.

The other means of controlling for nonspe- 
cific influences involves the inclusion of a 
pseudotherapy or attention-placebo control 
group, exemplified by Wollersheim’s (1970) 
pioneering study. Contrary to the view that 
it has been shown to be relatively unimportant 
(Jeffrey, 1974), this form of control consti- 
tutes a necessary feature of a well-designed 
outcome study. One reason is that more recent 
research has shown that it is more difficult to 
control for nonspecific influences than was 
once supposed. Many attention-placebo control
treatments have been found to be less
credible than the active treatment (Kazdin &
Wilcoxon, 1976). In other words, independent
assessment has indicated that not all attention-placebo
control conditions are successful in
equating nonspecific treatment influences,
such as expectations of therapeutic improvement.
These findings are relevant to research
on obesity. For example, relaxation training
has been used as a control condition for nonspecific
effects in some studies (cf. Hall, Hall,
Hanson, & Borden, 1974; Hanson et al., 
1976). Yet Ashby and Wilson (1977) found 
that obese subjects consistently rated the re- 
laxation training component of their multi- 
faceted behavioral program as being of little 
use. It is not unreasonable to infer that its 
use as a pseudotherapy control condition 
would be less credible than treatment involv- 
ing other behavioral techniques. Studies should 
include independent evaluations of the ef- 
ficacy of attention-placebo control conditions 
in controlling for nonspecific influences (e.g.,
Kingsley & Wilson, 1977). 

Another reason for retaining nonspecific 
control groups in treatment outcome studies 
is the fact that they have resulted in signifi- 
cant weight loss in some cases. Kingsley and 
Wilson (1977), for example, demonstrated 
that a social pressure control group modeled 
after one of Wollersheim’s (1970) control 
groups (which she had found to have little effect
on weight) was effective in producing
weight loss. The effects of this control condi- 
tion were especially evident at long-term 
follow-up, an evaluation few outcome studies 
ever make. The state of the art is far too un- 
developed to forego the necessity of incor- 
porating stringent and verifiable nonspecific 
control groups in treatment outcome studies 
on obesity. 

Simple inclusion of an attention-placebo 
control group is insufficient to control for non- 
specific influences in certain situations. Spe- 
cifically, it does not allow for evaluation of 
a treatment method unconfounded by expectations
of behavior change and the general demand
characteristics of the experimental or
therapeutic setting. One solution to this problem
is the use of countertherapeutic (Diament
& Wilson, 1975; Steinmark & Borkovec, 1974)
or nondemand (Abrams & Wilson, Note 1) instructions.
For example, Diament and Wilson
(1975) evaluated the efficacy of covert sensitization
on a taste-rating laboratory eating
task. Both the covert sensitization and attention-placebo
control groups participated in
two taste-rating tasks, which they were led to
believe were simply baseline measures preparatory
to the “real” treatment. Although both
groups in effect received treatment during the
period between these two assessments, countertherapeutic
demands were communicated
that emphasized that they would not lose
weight until the real treatment began. The
two behavioral measures served as pretreatment
and posttreatment measures of food consumption.

* The term nonspecific is misleading. The different
treatment influences indiscriminately lumped together
under this accommodating rubric are
specific. It is more realistic to propose that although
nonspecific factors remain to be specified, they are
neither intrinsically unspecifiable nor qualitatively
different from other variables involved in planned
behavior change (cf. Wilson & Evans, 1976).


Miscellaneous Methodological Issues


Description of Subjects


Appropriate background and demographic 
data on subjects should be reported. A major 
problem in the obesity treatment literature is 
the absence of any reliable predictors of successful
treatment outcome. Post hoc correlations
among weight loss and diverse life history
and personality variables have been
unrevealing (Bellack, 1975; Stunkard & Mahoney,
1976). In general, the more overweight
the subject, the more weight loss that can be
expected. Accordingly, Jeffery et al. (1978)
caution that initial weight should be controlled
for in the analysis of other predictor variables.
Rather than focusing on personal characteristics
of obese clients, the search for predictor
variables might be more profitable if attention
is directed to how they respond to the initial
treatment program (Jeffery et al., 1978)
or to performance on specific treatment program-related
tasks (Bellack, 1975). As with
behavioral assessment in general, the emphasis
should be on what the subject does in
relation to specific controlling variables rather
than what the subject is like (Mischel, 1968).

The manner in which subjects are recruited
for treatment studies should also be
spelled out.

Attrition. High dropout rates in
obesity treatment studies are not uncommon
(Stunkard & McLaren-Hume, 1959). Careful
reporting of the precise number of subjects in 
each group that drop out of treatment is es- 
sential. An attempt might be made to ascer- 
tain the reasons for attrition and to assess the 
progress of dropout subjects throughout the 
treatment period whenever possible. Data col- 
lected from these subjects might be included 
in statistical analyses. Interpretation of treat- 
ment findings must be guided by the fact that 
subjects who drop out of treatment are almost 
certainly treatment failures in obesity (cf. 
Levitz & Stunkard, 1974) and other (cf. Kent, 
1976; Sobell & Sobell, 1976) treatment out- 
come studies. 
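
The reporting recommended above (dropouts counted group by group, and outcome figures that do not silently exclude them) can be sketched in a few lines. The Python below is only an illustration under stated assumptions: the function name `attrition_report`, the record layout, and the convention of scoring a dropout as zero weight loss (that is, as a treatment failure) are hypothetical choices made for this example, not procedures taken from the studies cited.

```python
from collections import defaultdict


def attrition_report(records):
    """Summarize attrition and weight loss per treatment group.

    records: iterable of (group, weight_loss) pairs, where weight_loss is
    None for a subject who dropped out (hypothetical layout for this sketch).
    """
    by_group = defaultdict(list)
    for group, loss in records:
        by_group[group].append(loss)

    report = {}
    for group, losses in by_group.items():
        completers = [x for x in losses if x is not None]
        enrolled = len(losses)
        report[group] = {
            "enrolled": enrolled,
            "dropouts": enrolled - len(completers),
            # Mean loss among subjects who finished treatment.
            "mean_loss_completers": (
                sum(completers) / len(completers) if completers else None
            ),
            # Assumption: each dropout is scored as zero loss (a failure).
            "mean_loss_all_enrolled": sum(completers) / enrolled,
        }
    return report


records = [
    ("behavioral", 10.0), ("behavioral", None), ("behavioral", 8.0),
    ("control", 2.0), ("control", 4.0),
]
for group, summary in attrition_report(records).items():
    print(group, summary)
```

Reporting both means side by side shows how far a completers-only figure departs from one that treats dropouts as failures.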


Description of Therapists 


In its emphasis on specific treatment tech- 
niques, therapy outcome research on obesity 
has tended to gloss over the potential con- 
tribution to weight loss of the therapists who 
administer the treatment program. Although 
the therapists in behavioral weight reduction 
studies have ranged from undergraduates with 
no therapeutic experience (e.g., Jeffrey, 1974) 
to graduate students in clinical psychology 
(e.g., Romanczyk et al., 1973) to experienced 
professionals (e.g., Levitz & Stunkard, 1974), 
this factor is often overlooked. It is not un- 
common for manuscripts to provide virtually 
no information about therapists. Yet a clear 
statement of the therapists’ sex, their degree 
status, and their therapeutic experience in
general and familiarity with the treatment of 
obesity in particular is imperative. The im- 
portance of this information is underscored by 
the finding that professionally trained thera- 
pists were significantly more effective than 
nonprofessionals (leaders of lay self-help 
groups) in reducing client attrition during 
therapy and in effecting weight loss (Levitz 
& Stunkard, 1974). Similarly, Jeffery et al. 
(1978) reported that therapists who were 
more experienced in the conduct of obesity 
treatment groups achieved results significantly 
superior to those of less experienced therapists. 
The influence of therapist variables on treat- 
ment outcome is not surprising, and has been 
observed in the application of even highly 
structured behavioral techniques to clinical
disorders other than obesity (cf. Alexander,
Barton, Schiavo, & Parsons, 1976; Johnston, 
Lancashire, Mathews, Munby, Shaw, & Gel- 
der, 1976; Wilson & Evans, 1976). 
The potential influence of therapist varia- 
bles can be controlled and evaluated statisti- 
cally by using more than one therapist in a 
design in which each therapist administers 
comparable amounts of therapy to all groups. 
If a single therapist administers all treatment 
conditions, it is impossible to disentangle the 
relative contributions of therapist versus treat-
ment method. Another source of Therapist ×
Treatment confounding occurs if different
therapists separately administer different treat- 
ment methods to different groups (e.g., Penick 
et al., 1971). In the latter study, experienced 
professionals conducted the traditional psy- 
chotherapy treatment, whereas therapists who 
had had no experience in behavior therapy, 
and little therapeutic experience in general, 
administered the behavioral method. This
strategy explicitly “stacks the deck” against 
the treatment administered by the naive ther- 
apists. To the extent that this particular treat- 
ment proves to be superior, irrespective of 
therapists’ expertise, it provides a powerful 
demonstration of the efficacy of that method. 
However, if the superiority is not clear-cut, or 
if the treatment administered by the inexperi- 
enced therapists proves to be less effective, 
then the results cannot be interpreted un- 
equivocally. Thus Penick et al. (1971) found 
that behavior therapy produced greater weight 
losses, although the differences were not con- 
sistently statistically significant. Especially in 
view of Levitz and Stunkard’s (1974) find- 
ings, it is entirely plausible that had the be- 
havioral method been administered by experi- 
enced therapists, its superiority would have 
been more pronounced and its greater variability
in outcome reduced.
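
The design point made above, that every therapist should administer every treatment, can be made concrete with a small sketch. It is offered only as an illustration: the function names `crossed_design` and `is_fully_crossed` and the therapist labels are hypothetical, and the check covers only who administers what, not whether the amounts of therapy are comparable.

```python
from itertools import product


def crossed_design(therapists, treatments):
    """Pair every therapist with every treatment condition."""
    return list(product(therapists, treatments))


def is_fully_crossed(assignments):
    """Return True only if two or more therapists each give every treatment."""
    pairs = set(assignments)
    therapists = {therapist for therapist, _ in pairs}
    treatments = {treatment for _, treatment in pairs}
    if len(therapists) < 2:
        # A single therapist cannot be separated from the treatment effect.
        return False
    return pairs == set(product(therapists, treatments))


design = crossed_design(["T1", "T2"], ["behavioral", "psychotherapy"])
print(design)
print(is_fully_crossed(design))

# Different therapists for different treatments: Therapist x Treatment confound.
nested = [("T1", "behavioral"), ("T2", "psychotherapy")]
print(is_fully_crossed(nested))
```

A check of this kind flags both of the confounded arrangements described in the text: a single therapist for all conditions, and separate therapists for separate treatments.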


Other Procedural Details 


Descriptions of treatment programs should
include the therapeutic rationale and treat-
ment instructions, the scheduling of sessions,
policies regarding absences, and arrangements
for any make-up sessions. The time of year
during which a study is conducted should also
be stated in view of reports that subjects tend
to lose more weight in the spring and summer 
than in the fall and winter (cf. Ashby & Wil- 
son, 1977; Jeffery et al., 1978; Mahoney 
& Mahoney, 1976). 


Follow-up Evaluation 


A major shortcoming in research on the 
treatment of obesity is the relative lack of 
long-term follow-up evaluations of therapeutic 
efficacy (cf. Wilson, in press). The signifi- 
cance of this striking deficiency is highlighted 
by the fact that obesity is a clinical disorder 
that has been characterized by consistently 
high relapse rates; that is, clients who lose 
weight during treatment usually regain it (cf. 
Stunkard, 1976; Stunkard & Mahoney, 1976). 
Yet although the importance of long-term fol- 
low-ups is consistently stressed, the over- 
whelming majority of manuscripts that are 
submitted for publication nonetheless fail to 
include appropriate follow-ups. It is as im- 
portant to enquire into the reasons for this 
discrepancy as it is imperative that future 
treatment outcome research include long-term 
follow-ups. 

Long-term follow-ups require the invest- 
ment of considerable time and effort. Whether 
they can be completed successfully is often 
uncertain and unpredictable. Consider then 
the fact that a great deal of the research on 
obesity is conducted by graduate students 
and faculty in primarily academic settings. 
Clearly, long-term research, given these characteristics,
is not the ideal stuff of which doctoral
dissertations are made. Under the current
system of doctoral dissertation research
training, most students would not have the 
time to conduct a long-term follow-up them- 
selves; the realities of the situation appear to 
reward instead more manageable, more predictable,
and time-limited studies. The importance
of treatment outcome (especially
long-term follow-up), as Azrin (1977) has
suggested, becomes secondary. The contingencies
governing the research behavior of
faculty members seeking promotion and tenure
in the increasingly competitive job market
are not dissimilar. Outcome research emphasizing
long-term follow-up represents something
of a gamble; it is not optimal for
boosting publication frequencies and enhancing
curricula vitae. In sum, if we are to reverse
these contingencies (and, after all, behavior
is at least partly a function of its
consequences), we must encourage dissertation
research that is relevant to treatment
outcome. A student’s participation in one
facet (e.g., the long-term follow-up) of a
broader research program might be viewed as
a legitimate dissertation topic. This can, and
indeed must, occur without relaxing rigorous
standards of scholarship and methodological
sophistication. Our interpretation of research
productivity of faculty must extend beyond 
the number of completed studies and incor- 
porate an emphasis on the nature of the 
research. 

Outcome evaluation of psychological treat- 
ments for obesity should distinguish among 
the initial treatment-produced weight loss, its 
generalization to the natural environment, 
and its maintenance over time (Bandura, 
1969). Different factors might govern each 
of these processes, and it is now clear that
generalization and maintenance will be ensured
only to the degree that specific strategies
toward that end are used (e.g., Kingsley
& Wilson, 1977). The methodological strictures
for the investigation and evaluation of
treatment apply with equal force to maintenance
or the follow-up phase of the study.
All procedures followed should be explicitly
described. Weight measures should be taken
directly rather than relying on subjects’ self-reports
over the telephone or via mail. If different
maintenance strategies are being compared,
it is essential to ensure that they are
actually implemented and that they are conceptually
and operationally distinct.

In some instances subjects seek additional
or alternative sources of treatment during the
follow-up period. This creates problems of
interpretation (cf. Kazdin & Wilson, 1978)
that cannot be avoided. It would be both futile
and [...] to attempt to dissuade
subjects from seeking extra assistance for
their weight problems. However, the nature and extent
of these other therapeutic experiences
should be assessed and reported.


Attrition Rates


Dropouts are more common during follow- 
up than the initial treatment phase and just 
as damaging to the interpretation of results. 
External validity is jeopardized, since attri- 
tion reduces generalizability; internal valid- 
ity is compromised when there is differential 
attrition across treatment groups. Accordingly,
every effort must be made to minimize
attrition rates. Several procedures for accom- 
plishing this crucial goal may be mentioned 
briefly (but see more detailed discussions of 
this issue in Hagen, Foreyt, & Durham, 
1976; Sobell & Sobell, 1976).

1. Maintenance procedures and long-term 
follow-up should be an integral part of the 
design of a treatment outcome study and 
should be presented to subjects as such. In 
committing themselves to the treatment program,
subjects also explicitly commit themselves
to the follow-up. It is not something
to be tacked on following treatment almost 
as an afterthought. 

2. Investigators might remain in frequent 
contact with subjects. Regular phone calls or 
contact via the mail are recommended. Related
to frequency of contact is the nature of
that contact. Personal contact that represents 
an extension of the initial treatment phase is 
indicated. Since this continuing contact could 
well be reactive, that is, function as a type 
of maintenance strategy in its own right (Sobell
& Sobell, 1976), precise procedural description
is necessary.

3. A refundable deposit that subjects
forego if they miss too many sessions or drop
out significantly reduces subject attrition
(e.g., Hagen et al., 1976).

Christensen (1976) has argued that current 
follow-up methods might provide weight measures
that are unrepresentative of subjects’
eating habits. Specifically, subjects might resort
to drastic short-term means (e.g., fasting
or diuretics) of reducing weight immediately
prior to a follow-up weigh-in evaluation
(cf. Mann, 1972). Christensen suggested that 
fixed follow-up dates be replaced by randomly 
scheduled weigh-ins. However, some subjects 
might object to this method. In any event, 
frequent contact during follow-up that 
stresses the importance of the personal relationship
between subject and investigator
would provide a detailed, representative assessment
of subjects’ eating habits and weight
patterns.
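
Christensen's suggestion of randomly scheduled weigh-ins can be illustrated with a brief sketch. It is not a description of any published procedure: the function name `random_weigh_in_days`, the choice of drawing days uniformly without replacement, and the fixed seed used for reproducibility are all assumptions made for this example.

```python
import random


def random_weigh_in_days(follow_up_days, n_weigh_ins, seed=None):
    """Draw distinct weigh-in days (1..follow_up_days) uniformly at random.

    Subjects are not told the dates in advance, so short-term measures such
    as fasting just before a known weigh-in are harder to arrange.
    """
    if not 0 <= n_weigh_ins <= follow_up_days:
        raise ValueError("n_weigh_ins must be between 0 and follow_up_days")
    rng = random.Random(seed)  # seed is only for a reproducible illustration
    return sorted(rng.sample(range(1, follow_up_days + 1), n_weigh_ins))


# Four unannounced weigh-ins over a one-year follow-up.
print(random_weigh_in_days(365, 4, seed=1))
```

In practice an investigator would presumably also want to constrain the spacing between visits; that refinement is left out of this sketch.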


Reference Note 


1. Abrams, D. B., & Wilson, G. T. Self-monitoring, 
reactivity and smoking behavior. Unpublished 
manuscript, Rutgers—The State University, 1977.


Smoking-Cessation Research


Richard M. McFall

University of Wisconsin—Madison


Methodological problems associated with treatment research on cigarette smoking
are explored, and possible solutions are discussed. The main problems considered
are the selection and retention of subjects, the measurement of smoking,
the design of treatment studies, and the interpretation and generalizability of
experimental results.


The Surgeon General’s Report (U.S. Public
Health Service, 1964) on the health hazards
of tobacco stimulated a wave of smoking research
that has persisted into the present with
little sign of abating. In fact, there has been
so much published theory and research on
smoking that it has spawned a secondary publishing
enterprise consisting of periodic literature
reviews (e.g., Bernstein, 1969; Lichtenstein
& Danaher, 1976), scholarly books and
conference reports (e.g., Borgatta & Evans,
1968; Hunt, 1970), and a monthly bibliographic
bulletin (Smoking and Health Bulletin,
published by the National Clearinghouse
for Smoking and Health, Atlanta, Georgia).

Psychological research on smoking generally
falls into one of four categories: studies
of the causes of smoking, studies of its effects,
studies of its prevention, and studies
of various treatment approaches. Thus far the
largest share of attention seems to have been
given to the last category, that is, to the
search for effective methods of helping cigarette
smokers reduce or eliminate their smoking
behavior.

Despite the continuing interest in smoking-cessation
research, investigators have shown
little awareness of the special
methodological problems inherent in such research
or in the ways to avoid or overcome
these problems. Methodological articles (e.g.,
Bernstein, 1969) seem to have been ignored
by some investigators. The present article is
an attempt to draw attention, once again, to
the critical methodological issues. In preparing
the article, I have drawn illustrative material
both from published manuscripts and
from manuscripts that were submitted for
publication but were found unacceptable.
Since the purpose of the article is to promote
better future research, rather than to criticize
previous investigations, the sources of
most examples will not be referenced.

Requests for reprints should be sent to Richard
M. McFall, Department of Psychology, W. J. Brogden
Psychology Building, University of Wisconsin,
West Johnson Street, Madison, Wisconsin 53706.

Copyright 1978 by the American Psychological Association


Subject Issues: Problems of Generalizability 


Who are the subjects? Smoking research 
typically is not conducted with a randomly 
selected sample of subjects from the entire 
population of smokers.¹ Similarly, smoking-cessation
studies typically do not draw their
subjects randomly from among all persons
who may wish to quit smoking. Rather, the
subjects in smoking studies are nearly always
volunteers or recruits, whose relation to the
parent populations of smokers or aspiring
quitters is virtually impossible to determine.
This fact immediately limits the generalizations
that can be drawn from the results of
specific smoking studies. For example, when
volunteer subjects are used to study the personality
correlates of smoking, one cannot
assume that the findings are relevant to all
smokers; it is possible that the findings may
be more a function of the subjects’ willingness
to volunteer than their smoking habits.
Similarly, if a smoking-cessation program
either succeeds or fails with a particular sample
of volunteers, there is no guarantee that
the same effect would be achieved with smokers
who chose, for whatever reason, not to
volunteer. In fact, one might argue that the
volunteer subjects in smoking-cessation programs
are not representative of smokers in
general.

¹ Of course, the use of actual smokers as subjects
in smoking-cessation studies allows greater freedom
to generalize about the results than if analogue subjects
had been used, as is the case in some other
areas of behavior modification research.

Perhaps the typical volunteer is a member 
of a subgroup of smokers who are seeking 
an externally imposed, somewhat magical, 
solution to their problem. Furthermore, as the 
more promising candidates from among this 
subgroup manage to quit smoking, the re- 
maining subjects may represent a distilled, 
hard-core group of treatment-resistant smok- 
ers. Only a small percentage of the volunteer 
subjects in most smoking-cessation studies 
manage to quit smoking and to remain absti- 
nent for at least 6 months. Seldom do 20% 
achieve this measure of success! Based on 
such results, some investigators have con- 
cluded, rather pessimistically, that smoking- 
cessation techniques generally are not effec- 
tive. It is interesting to note, however, that 
in the meantime many individuals seem to 
have given up smoking on their own—without 
volunteering as subjects in formal treatment 
programs. Thus, over the course of numerous 
smoking-cessation studies, the characteristics 
of the volunteer subgroup may be changing. 

As if the inherent limitations of working 
with nonrandom samples were not enough, 
some investigators further limit the generality
of their results by failing to report in
detail (a) how they recruited their subjects
and (b) the essential characteristics of their
resulting sample. It would be valuable if investigators
reported whether volunteer subjects
were recruited from introductory psychology
classes, by newspaper ads, or by
referrals from physicians; it also would help
[...]. At the very least, investigators should always
provide descriptive statistics on their sample’s
age, sexual composition, occupational and
educational status, living arrangements, current
smoking behavior, and smoking history.
The smoking history should include information
concerning the pattern and form of tobacco
consumption over the years, the chronicity
of the habit, the history of any prior
attempts to quit, and any significant health
problems. 

Another important question about the sub- 
jects in treatment studies is, what are their 
personal goals? So-called volunteers in smok- 
ing programs actually may be responding to 
cultural, social, medical, or family pressures; 
these subjects may be very different from 
those who volunteer without such coercive 
pressures. Even among genuine volunteers, 
however, there may be important differences 
in personal goals: Some may be committed 
to achieving total abstinence, whereas others 
may be satisfied with a significant reduction 
in their smoking. Few smoking studies have 
asked subjects beforehand to indicate their 
reasons for volunteering or to state their per- 
sonal treatment objective. 

Subject mortality. The problem of “sub- 
ject mortality” (see Campbell & Stanley, 
1963) is one of the biggest methodological 
problems, or sources of invalidity, in smok- 
ing-cessation research. If subjects are ran- 
domly assigned to different treatment condi- 
tions but some subjects drop out of the 
experiment prematurely, this loss seriously un- 
dercuts the investigator’s ability to interpret 
and generalize from the experimental results, 
There is no way to rule out the possibility 
that the subject loss has been nonrandom, 
thereby rendering the treatment groups no 
longer comparable. Asking subjects why they 
dropped out of the study is no remedy; even 
when subjects give reasons that seem unre 
lated to the experiment, these explanations 
may be little more than polite excuses or ra- 
tionalizations. The fact that equal numbers 
of subjects may have dropped out of the dif- 
ferent treatment groups does not mean that 
the mortality problem has been avoided; it 
may be that different kinds of subjects 
dropped out of the different groups, thus mak- 
ing them no longer comparable in compos 
tion, although they remain comparable in size. 
Replacing dropouts with new subjects ae 
Not solve the problem either; the replace 
may be very different from the original sub- 


~~ eeT— a 


ect, thus altering the group composition. And 
re simply are no acceptable post hoc meth- 
is for statistically correcting for subject 
mortality by artificially matching or equating 
Tups. The only valid solution is to retain 
il original subjects. 

Unfortunately, volunteers for smoking-ces-
sation programs are notoriously unfaithful
subjects. Without using some kind of special
inducement to stay in a study or constraint
against dropping out, it has been virtually
impossible to retain an adequate proportion
of the original sample in most smoking studies.
One fairly effective method of inducing
subject fidelity has been to require an “earn-
est deposit” before admitting a subject to a
smoking-cessation program. For example,
some experimenters have collected $20 or $25
from each subject at the time of admission
and have promised to return the full amount
at the end of the study, contingent on the
subject’s faithful participation. The deposit
money was to be refunded regardless of the
subject’s treatment outcome, so long as the
subject attended treatment sessions and pro-
vided the requested smoking data. Other in-
vestigators have used variations of this
method. In one variation, the smoker forfeits
a prorated portion of the deposit for each fail-
ure to attend or provide data.
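
The prorated variation can be sketched as a simple calculation. This is only one possible reading: the text does not say how the forfeited portion was computed, so the equal-share rule and the function name `deposit_refund` below are assumptions made for illustration.

```python
def deposit_refund(deposit, required_events, missed_events):
    """Refund owed under an assumed equal-share proration rule.

    Each required session or data report is treated as carrying an
    equal share of the deposit; each miss forfeits one share. The
    refund does not depend on treatment outcome.
    """
    if required_events <= 0:
        raise ValueError("required_events must be positive")
    if not 0 <= missed_events <= required_events:
        raise ValueError("missed_events must be between 0 and required_events")
    share = deposit / required_events
    return round(deposit - share * missed_events, 2)


# A $20 deposit, 10 required sessions or reports, 3 of them missed.
print(deposit_refund(20.0, 10, 3))
```

Because the rule looks only at attendance and data provision, a subject who fails to quit but participates faithfully still receives the full deposit, which matches the contingency described above.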

In addition to using monetary incentives,
the concerned investigator will want to do
everything within reason to assure that sub-
jects do not drop out. There is a danger, how-
ever, of going overboard in an effort to con-
trol subject mortality. For example, if an ex-
perimenter were to call each subject before
each treatment session, provide transporta-
tion to and from sessions, send thank you
notes, and serve hot cocoa and cookies, sub-
ject mortality indeed might be reduced, but the
generality of the experimental results would also
be limited to treatments that included the ele-
ments of phone calls, transportation, thank
you notes, and refreshments. Since such pro-
cedural elements are confounded with the
treatment, they must be considered an inte-
gral part of the treatment. Thus, when design-
ing methods to control the drop-out problem,
investigators should choose methods that can be
used by other therapists operating in

Measurement Issues: Problems of Reliability
and Validity 


What is “smoking”? If the objective in 
smoking-cessation research is to reduce or 
eliminate “smoking behavior,” then it is es- 
sential to define this target behavior precisely. 
If we cannot define it precisely, we cannot 
measure it reliably, and this means that we 
cannot possibly determine whether our inter- 
ventions have had any effect on it. 

Specifying the target behavior in measura- 
ble terms is not as easy as it may seem. What 
is the best unit of measure for smoking? For 
example, should we count the packs consumed, 
the cigarettes consumed, the puffs taken, the 
volume of smoke inhaled, or the amount of 
nicotine and tar ingested? Should we assess 
these monthly, weekly, daily, hourly, or by 
the minute? Should we take into consideration 
the various stimulus situations in which the 
behavior occurs? 

The most common measurement unit is the 
number of cigarettes consumed per day—with 
no systematic classification of smoking situa- 
tions. But there is nothing sacred about this 
particular unit; in fact, it ignores a number 
of potentially important variables, such as the 
number of puffs taken, the amount of smoke 
inhaled, or, for that matter, the fact that cer- 
tain brands of cigarette are significantly 
longer than others. What if a subject lights 
a cigarette, smokes it halfway, extinguishes 
it, and relights it later—does that count as 
one or two? Despite its limitations, investiga- 
tors should consider using this common unit 
of measurement in future studies—perhaps 
along with other units—simply because it pro- 
vides a standard basis for cross-study com- 
parisons. Whatever units are chosen, they 
must be specified in sufficient detail to per- 
mit other investigators to use precisely the 
same unit. 


What method of measurement is best? 


Once the investigator has decided on a mea- 
surement unit, for example, counting the 
number of cigarettes consumed per day, then 
it is necessary to devise a suitable method of 
actually gathering the desired data. This prac- 
tical requirement is another major source of 
difficulty in smoking research. The following 
is a partial list of measurement options avail-
able to the investigator, along with a brief
discussion of possible advantages and disad- 
vantages of each. 

1. Laboratory methods. The most accurate
method of measuring actual smoking behavior 
is to observe it under controlled conditions, 
such as in the laboratory. To achieve accuracy 
and control, however, the investigator inevita- 
bly sacrifices representativeness. That is, 
smoking behavior observed in the laboratory 
may bear little resemblance to unobserved, 
nonlaboratory smoking behavior. The appro- 
priateness of using lab measures depends en- 
tirely on the specific experimental question. 
For example, if one were assessing the rela- 
tionship between anxiety and smoking, it 
would be appropriate to begin doing so in 
the lab by systematically manipulating levels 
of stress while measuring the smoking behav- 
ior. However, if one were interested in the 
therapeutic value of a particular intervention, 
then changes in laboratory behavior would 
not be regarded as meaningful or persuasive 
evidence of therapeutic change. Smoking- 
cessation studies may include laboratory mea- 
sures as part of a larger group of dependent 
measures, but there remains the problem of 
devising suitable extralaboratory criterion 
measures. 

2. Self-report methods. This is the most 
commonly used assessment method in smok- 
ing studies. Subjects are enlisted as collabora- 
tors in the data-collection process; they are 
asked to monitor, record, and report on their 
own smoking behavior. Of course, the ad-
vantage of this method is that no one is in a
better position to observe a person’s smoking
behavior, across all situations and at all
times, than that person herself/himself. One
disadvantage is that the person’s self-reported
data may be biased, inaccurate, or falsified,
and thus there remains the need for a suita-
ble independent measure of the subject’s
smoking behavior. Another possible disadvan-
tage of self-report measures is that they
sometimes can be reactive; that is, when sub-
jects self-monitor their behavior, this may
significantly affect the behavior being moni-
tored in some manner (McFall, 1970). For
example, it is common for subjects in smok-
ing-cessation programs who are asked to self-
monitor their smoking frequency during a
baseline period to report that the monitoring
makes it difficult for them to continue smok- 
ing “normally.” Nevertheless, because sub- 
jects do have unique access to their own be- 
havior, it seems that the advantages of the 
self-report method usually outweigh its dis- 
advantages—at least in smoking-cessation re- 
search—and that it will continue to be the 
principal data-collection method in such re- 
search. The problems with the method simply 
will have to be controlled or minimized as 
much as possible (e.g., see Nelson, 1977, for 
discussion of self-monitoring effects and their 
control). 

3. Unobtrusive naturalistic measurement. 
Webb, Campbell, Schwartz, and Sechrest 
(1966) have suggested a variety of ap- 
proaches that investigators might use in their 
effort to obtain useful naturalistic data with- 
out being so obtrusive as to contaminate the 
data. Translating their general suggestions 
into assessment methods for smoking behavior 
will require creativity and inventiveness, but 
one illustrative possibility is presented here 
to help stimulate the reader’s own imagination: 

Smoking ordinarily results in the accumu- 
lation of residual evidence in the form of 
cigarette butts. By monitoring a sample of 
the likely deposit sites of butts—for example, 
ashtrays in the office, home, or auto—an in- 
vestigator might get a reasonably good pic- 
ture of within-subject changes in smoking 
patterns over time. The resulting data should 
provide an indirect check on the accuracy of 
a subject’s self-report. 

There are at least three problems with un- 
obtrusive naturalistic methods. First, there 
are ethical and legal problems with poking 
around in another person’s personal space, 
such as their auto, home, or office, without 
their informed consent; but to obtain full 
consent would surely do away with the un- 
obtrusiveness of the measurement. Second, the 
availability of naturalistic smoking data will
vary from subject to subject, depending on
the particular environmental settings that 
each subject regularly frequents; thus, the 
samples obtained for different subjects may 
not be sufficiently comparable to permit an 
analysis of group data. Third, the collection 
of unobtrusive naturalistic data may be pro-
hibitively difficult or expensive.

4. Collaborator reports. If the subject’s
self-reports are suspect and if the investigator
cannot arrange to observe firsthand the sub-
ject’s naturalistic smoking behavior, then a
compromise solution may be possible. Perhaps
a third party, someone living or working
closely with the subject, could be enlisted
as an observer of the subject’s smoking behav-
ior. This assessment method has been used
with increasing frequency in recent years. In-
vestigators typically have asked subjects to
provide names of persons who would be in
a position to observe their smoking and who
could be contacted periodically for reports.
One problem with relying on such collab-
orator reports is that the persons providing
the data ordinarily are close friends of the
subjects and thus are not necessarily any
more objective reporters than the subjects
themselves. It has not been uncommon, for
example, for subject and collaborator reports
to be extraordinarily highly correlated (e.g.,
as much as .95). Such agreement cannot be auto-
matically accepted as evidence that the sub-
jects and collaborators have provided valid
data; the high correlation may reflect little
more than collusion between the subjects and
collaborators. Unfortunately, there is no sim-
ple method of assessing the validity of col-
laborator-reported data, which means that
this measurement method cannot stand alone
as a validity check on subjects’ self-reports.²

5. Correlates of smoking behavior. For
years investigators have searched for a relia-
ble correlate of smoking behavior that they
could use as a sensitive indirect measure in
their smoking studies. Nicotinic acid stains
on fingers, respiratory flow volume, spu-
tum samples, and blood assays are among
the measures that have been con-
sidered at various times with only limited
success. Recently, however, a promising
correlate of smoking behavior,
carbon monoxide levels in samples of expired
air, has been identified and used successfully
in smoking studies (Danaher, Lichtenstein, &
Sullivan, in press; Lando, 1975). Brockway
has reported that thiocyanate may
provide yet another objective measure of
cigarette smoking. More work along these
lines seems as though it might be helpful.

In summary, no single measure of smoking
behavior is adequate. Until the absolute or
ultimate measure has been discovered, inves- 
tigators must rely on a network of measures, 
each of which can serve to cover the weak- 
nesses or blind spots of the others. In any 
event, a more convincing argument for valid- 
ity can be made when there is concurrence 
among several independently derived mea- 
sures. 
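
The requirement that concurring measures be independently derived can be made concrete with the collaborator-report problem discussed above. The sketch below uses invented numbers: a "true" smoking rate that no one observes directly, a subject report that understates it by half, and a collaborator report that simply tracks what the subject says. All values and the helper `pearson_r` are illustrative assumptions.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Hypothetical cigarettes per day for five subjects.
true_rate = [30, 24, 20, 16, 10]
subject_report = [15, 12, 10, 8, 5]         # understates by half
collaborator_report = [16, 13, 11, 9, 6]    # echoes the subject's figure

r = pearson_r(subject_report, collaborator_report)
print("subject vs. collaborator r:", round(r, 3))
print("mean reported:", mean(subject_report), "mean true:", mean(true_rate))
```

The two reports agree almost perfectly, yet both miss the true rate by a wide margin. Agreement between measures that share a source of bias says little about validity.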

Assessing change. Smoking-cessation stud- 
ies typically are composed of four assessment 
periods: (a) a baseline period, during which 
subjects’ pretreatment smoking behavior is 
recorded; (b) a treatment period, which can 
be broken down into several subperiods cor- 
responding to different phases of treatment 
or to units of time; (c) the end of treatment; 
and (d) a follow-up period, ideally covering 
a minimum of 6 months to 1 year. By assess- 
ing changes in smoking behavior over these 
periods, it is possible to evaluate the effects 
of different interventions. However, to the 
extent that any of the assessment periods are 
not adequately designed and controlled, the 
meaningfulness of the results will be seriously 
limited. Some of the most common design 
problems are discussed below. 

The essential requirement of the baseline 
period is that it provide a solid anchor against 
which to weigh the magnitude and signifi- 
cance of changes in any subsequent periods. 
Thus, the most serious mistake that investi- 
gators can make during the baseline period 
is to fail to assure that their measure of pre- 
treatment smoking behavior has stabilized 
before they initiate the treatment period. This 
flaw is not always apparent from post hoc 
inspection of the published data. It has be- 
come common practice to report only one 
mean smoking frequency for
the entire baseline period; this
practice is unfortunate. An absolute mini-
mum of three data points is necessary if one
wants to estimate the stability of baseline be-
havior. A single data point may obscure un-
derlying trends in the data that might sub-
stantially affect how one would interpret the
experimental results.

² It may be useful to distinguish between differ-
ences in accuracy as a function of the type of data
being collected. It probably would be easier for a
collaborator to report accurately on a subject’s ab-
stinence than on the subject’s smoking rate.
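
The baseline-stability point can be sketched numerically. The two hypothetical baseline series below have the same mean, so a single reported mean would make them look identical, but a least-squares slope over the daily values shows that one is stable and the other is already declining before treatment begins. The data and the function name `baseline_slope` are assumptions for illustration only.

```python
from statistics import mean


def baseline_slope(values):
    """Least-squares slope of values against observation index.

    Requires at least three data points, in line with the minimum
    suggested in the text for judging baseline stability.
    """
    n = len(values)
    if n < 3:
        raise ValueError("need at least three baseline data points")
    xs = range(n)
    mx = (n - 1) / 2
    my = sum(values) / n
    num = sum((x - mx) * (y - my) for x, y in zip(xs, values))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den


# Hypothetical cigarettes per day over a five-day baseline.
stable = [20, 21, 19, 20, 20]
declining = [26, 23, 20, 17, 14]

print("means:", mean(stable), mean(declining))
print("slopes:", baseline_slope(stable), baseline_slope(declining))
```

With a declining baseline like the second series, a lower rate at the end of treatment could reflect nothing more than the continuation of a pretreatment trend.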

Another common methodological problem 
encountered in the baseline period is that ex- 
perimental groups sometimes are found to 
differ prior to treatment. This difference need
not be statistically significant to be con-
sidered serious. And when initial group dif-
ferences do exist, there is little that can be
done to correct the problem once the experi- 
ment has been carried out. Thus, before pro- 
ceeding to introduce any differential treat- 
ments, investigators routinely should 

lated to meaningful outcome variables. Such
analyses are potentially valuable, however, to
the individual investigator who needs to un-
derstand what went wrong or what might ac-
count for the failure to obtain significant out-
come differences. Without such process in-
formation, it is difficult to rise above one’s
failures and to design better treatments.

The bottom-line question in smoking-cessa-
tion research is the outcome question: Did
the treatment work? This must be answered
within two outcome time frames: immediate
and long-term. The end-of-treatment assess-
ment provides an immediate measure of treat-
ment effects; it also marks the transition
from treatment to follow-up periods. When
compared to the baseline measure, it provides
a summary assessment of change over the
treatment period. It also represents a bench-
mark against which to compare subsequent
assessments and to evaluate the long-term
maintenance of changes. The end-of-treat-
ment assessment must contain the same mea-
sures used in the baseline period and in the
follow-up period; otherwise, a valid assess-
ment of change is not possible. That is, subtle
variations in measurement procedures from

it would be questionable practice to assess
pretreatment to posttreatment change by us-
ing self-monitored data at the baseline period

meticulous about carrying out the baseline,
posttreatment periods, only
follow-up.

important factor in eliminating the prob-
lem is the persistence and determination of
the investigator in pursuing the study of each
and every subject. When an investigator re-
ports that some subjects “could not be lo-
cated” for the follow-up assessment, one can-
not help but wonder to what lengths the in-
vestigator actually went to locate the missing
subjects.


The follow-up measure of smoking should
be comparable to the one used in the baseline
and end-of-treatment assessments. It has not
been uncommon for investigators to rely on
global self-reports of smoking,
obtained from subjects via either
conversations or preaddressed post-
cards, as their primary follow-up measure.
Such a casual approach to assessment would
be unacceptable in the other experimental
periods; it is equally unacceptable in the ex-
perimental period from which ultimate con-
clusions concerning the treatment outcomes
are drawn.
As has been pointed out elsewhere (McFall
& Hammen, 1971), virtually any plausible
smoking-cessation treatment that one can
imagine is capable of producing a significant
reduction in smoking behavior.
However, few treatments have managed to
produce sustained reductions exceeding those
produced by placebo treatments or minimal-
treatment control conditions. An assessment
of changes in smoking behavior across the
experimental periods, therefore, is bound
to yield a statistically significant main effect
for periods, but is unlikely to yield significant
treatment differences. Only the dis-
covery of significant treatment differences is
notable enough, at this stage in the history
of smoking-cessation research, to warrant
publication or general dissemination of the
results. There are two exceptions to this rule:
one is when an established inter-
vention that had come to be expected to
yield significant results fails to do so; another
is when a truly novel treatment,
derived from a reasonably prominent theory,
fails to have an effect. In general, the pub-
lication value and newsworthiness of a finding
is directly related to its unexpectedness.

of results. If the results of
smoking-cessation experiments are to
be compared and integrated, they must be

ments included in that particular comparison.
The results have little theoretical significance; 
they reveal little about the reasons why things 
turned out as they did. 

2. Dismantling strategy. Once an effective 
change technique has been found, it can be 
examined more closely in subsequent studies 
in which it is systematically dismantled to 
see how its various components contributed 
to the overall treatment effect. This strategy 
has been used relatively infrequently in smok- 
ing-cessation research, to date, because few 
treatments have proved themselves sufficiently 
promising to warrant such an internal anal- 
ysis. One exception has been the “rapid smok- 
ing with warm, smoky air” technique reported 
by Lichtenstein and his colleagues (Schmahl, 
Lichtenstein, & Harris, 1972). The experi- 
ence of these investigators, however, illus- 
trates one of the potential frustrations of the 
dismantling strategy. When Lichtenstein, 
Harris, Birchler, Wahl, and Schmahl (1973)
compared the relative effectiveness of the 
rapid-smoking component, the warm, smoky 
air component, and the combined components, 
they found that all three treatment conditions 
were comparably effective. That is, each com- 
ponent alone yielded the same level of effect 
as the two components combined. In this in-
stance, the dismantling strategy failed to re-
veal very much about the mechanics of the
treatment effects.

3. Constructive strategy. This approach,
like the preceding one, assumes that a fairly
effective intervention has been found. Using
the established intervention as a solid treat-
ment base and as a

theory. Ideally, such research would lead to
the most substantial advances, both scientific 
and technological. Unfortunately, there are 
few promising theories available at present 
to guide or stimulate such high-level research 
in the area of smoking behavior. History has 
shown that good theory is unlikely to come 
from the grand speculations of armchair
thinkers; it usually arises out of attempts to
integrate and organize empirical observations.
Thus, there is reason to hope that smoking 
research eventually will manage to bootstrap 
its way from horse race studies, through dis- 
mantling and constructive studies, to theo- 
retically grounded research. 

An alternative research strategy deserves 
mentioning. It is a relatively unexplored, more 
indirect approach to the task of discovering 
an effective treatment. Investigators might 
learn a great deal if they took time out from 
their treatment studies to look closely at the 
various methods that have been successfully 
used by the multitude of former smokers who 
have quit on their own. There may be effec-
tive “folk methods” that could teach us a
great deal about smoking behavior and its 
treatment. 

Interpreting the results. An investigator’s 
choice of a research strategy has important 
implications for the subsequent interpreta- 
tion of the research results. There seems to 
be an inherent antagonism between the two 
research objectives of relevance and replica- 
bility. On the one hand, designs that foster 
relevance, such as those used in clinical trials 
or in horse race studies, tend to be so non- 
specific and uncontrolled that they cannot 
be easily replicated. On the other hand, de- 
signs that foster replicability tend to be so
tightly controlled and specific that they have
limited immediate relevance to the clinical
treatment setting.

At the former extreme would be a clinical
study in which clients are treated over an ex-
tended period by a combination of proce-
dures that can be described in only the most
general terms. The clients who quit might
represent genuine successes, but it would be
virtually impossible to specify and reproduce
all of the factors that contributed to their
quitting.

At the opposite extreme would be labora-
tory analogue experiments in which unmo-
tivated subjects are led to believe that their
smoking behavior is being studied for reasons
unrelated to smoking reduction, in which they
undergo a brief experimental manipula-
tion, and in which changes in within-lab
smoking behavior are the dependent varia-
bles. The results of such an analogue study
could not reasonably be interpreted as bearing
directly on the clinical treatment of smoking.

Campbell and Stanley (1963) have pre-
sented with great eloquence and clarity a de-
tailed list of factors that must be experi-
mentally controlled before an investigator
attempts to interpret the results of any ex-
perimental or quasi-experimental study. A
review of their presentation is beyond the
scope of this article. However, if there are
readers who have never read the Camp-
bell and Stanley monograph, it really should
be considered required reading before at-
tempting to design any smoking studies. In-
deed, readers who have read it but have not
reviewed it recently are strongly urged to
do so.

Generally, it is helpful to remind ourselves
that the aim of experimental research is the
elimination of plausible rival alternative hy-
potheses. Viewed from this perspective, ex-
periments never prove that particular hypo-
theses or conceptions are true; rather, at best,
a hypothesis gains in stature to the extent that
it manages to survive rigorous experiments
that were designed to disconfirm it, whereas
competing hypotheses are disconfirmed by
the experiments. In smoking-cessation re-
search, for example, four of the most com-
mon competing hypotheses are:

1. Perhaps no systematic change even oc-
curred.
2. If change does occur, perhaps the ex-
perimental treatment does not produce a
greater change than do competing treatments.
3. If the experimental treatment produces
the greatest change, perhaps the change is not
durable.
4. If the experimental treatment is asso-
ciated with the greatest and most durable
change, perhaps such changes can be ex-
plained more simply by uncontrolled factors
apart from those that are part of the experi-
mental treatment per se.

Only after these basic “null” hypotheses
and “nonspecific factors” hypotheses have
been eliminated, and a treatment has been
established as unquestionably effective, there 
remains the task of sorting through all of the 
various possible explanations for the mechan- 
ics of how the treatment works. Again, this 
is a process of eliminating alternatives until 
only a few are left standing. (The ideal of 
having only one alternative remaining is sel- 
dom if ever achieved.) 

The interpretation of experimental results 
must always be presented with reference to 
the particular rival hypotheses that either 
were discredited or failed to be disconfirmed. 
Since no single experiment is likely to pare 
the list of alternative rival hypotheses down 
to a single survivor, all prominent surviving 
competitors should be recognized in the dis- 
cussion and interpretation of a particular 
study. Suggestions for future studies designed 
to test the survivors are always welcome. 

Negative side effects. Recent events in the 
area of smoking-cessation research have 
helped to emphasize the need to be sensitive 
to possible unanticipated negative side effects 
of our experimental treatments. Lichtenstein’s 
rapid smoking technique has been one of the 
few treatments to achieve reasonably con- 
vincing and consistent effects—for example, 
abstinence rates after 6 months of 57% or 
more (Lichtenstein et al., 1973). However,
this treatment recently was found to pose a

applied to patients with impair

1974; Miller,
1977). Fortu-
of the rapid-


smoking technique have been sensitive to the 
dangers and have 


Glasgow, 1977). Proponents of other treat- 
ments should show a similar sensitivity to the 


search, there remains the problem of translat- 
ing such an experimental treatment into a 
valid procedure for widespread use with the 
general public. Campbell and Stanley’s
(1963) discussion of external validity issues 
explores this problem thoroughly; however, 
some investigators apparently need to be re-
minded of some of the most critical issues.
For example, a recently published self-help 
book purports to offer the consumer an ex- 
perimentally validated method for kicking 
the smoking habit. That the particu-
lar method may have been shown to be
effective when administered in a controlled
setting to selected subjects by trained thera-
pists is not a sufficient basis for “going pub-
lic” with the method in the form of a popular
self-help book. Before going public in this
way, the authors should demonstrate empiri-
cally that the method is effective with sub-
jects who buy the book and self-administer
the treatment (see Glasgow & Rosen, 1978).


Conclusion 


The methodological vagaries and pitfalls 
of conducting smoking-cessation experiments 
have been outlined and discussed. Where pos- 
sible, suggestions for improved research were 
also offered. Generally, the standards for good 
design in smoking research are not different 
from the standards for research in other clini- 
cal areas. The very nature of smoking behav-
ior, however, poses certain difficulties in at-
tempts to achieve those design standards.
Specifically, measurement problems have been
a chronic weakness in smoking studies.

Despite the extensive efforts of numerous 
investigators over many years, cigarette smok- 
ing continues to be a major health problem. 
An efficient, safe, effective treatment for 
smoking behavior remains an elusive goal, 
although progress toward this end seems to 
have been made in recent years. Hopefully, 
the present article will help hasten the day 
when the goal is realized and when it will be
possible to help large numbers of smokers
kick the habit.

Common Methodological Problems in Research
on the Addictions 


Peter E. Nathan and David Lansky 
Rutgers—The State University 


Among the common methodological problems in research on the addictions re-
viewed in this article are (a) selective, incomplete, or biased reviews of the body
of prior research from which a study has arisen; (b) reliance on inadequate or
incomplete diagnostic criteria in choosing subjects for study; (c) choice of in-
appropriate comparison groups for treatment outcome research; (d) use of in-
adequate alcoholic analogues when alcoholic subjects are unavailable; (e) failure
to adequately account for treatment dropouts in analysis of treatment outcome
data; (f) unwarranted choice of single-subject over group designs in addictions
research; (g) failure to ensure that comparably trained, equiv-
alently committed therapists provide both experimental and control treatments
in treatment outcome studies; (h) failure to ensure that patients in both experi-
mental and control treatments receive treatments as therapist- and time-inten-
sive; (i) failure to follow patients for adequate lengths of time posttreatment;
(j) failure to provide for adequate, multidimensional treatment outcome mea-
sures tapping a full range of patient behavior; (k) failure to exercise restraint,
scientific modesty, and criticality in reporting results of one’s own research; and
(l) failure to recognize important differences between statistical and clinical
significance.


This article identifies common problems in
research on the addictions. It also offers sug-
gestions for remediating these methodological
problems. The addictions considered in the
article include alcoholism and the drug depen-
dencies. Because the literature on alcoholism
is much more extensive than that on drug ad-
diction, most of the common errors reviewed
here derive from the alcoholism literature.
Despite this focus, there is enough commonal-
ity in methods of research on alcohol-
ism and drug dependence that criticism in
one research area almost always has relevance
for the other.

The first problems to be considered are
those arising from inadequate, incomplete, or

methodological
tion, research procedure,
are detailed. Consideration of
presentation of results and their interpreta-
tion and discussion concludes the article.


Literature Review and Statement of 
the Problem 


be considered gratuitous
out the im-
biased re-
the impact of re-
search findings, it is important, nonetheless,
to recognize that the alcoholism researcher
is particularly liable

called “enlightened historical se-
lectivity,” this tendency comes easily to the
addictions researcher who is forced to choose
one position, from several, on such contro-
versial matters as etiology (Sociocultural,
genetic, metabolic/physiological, and behav-
ioral views are foremost.) and treatment.
(Dynamic, behavioral, medical, and peer-
group Alcoholics Anonymous treatments are
available.) This breadth of available perspec- 
tive has, in turn, been largely responsible for 
the proliferation and perpetuation of invalid 
“common truths” and unsubstantiated “old 
wives’ tales” in this field. 

Selective or biased reviews of the alcohol- 
ism literature have supported a variety of 
such “truths” through the years, including 
that alcoholism is a medical disorder (and 
that as a result, abstinence is the only appro- 
priate treatment goal for alcoholics), that all 
alcoholics experience loss of control over their 
drinking, that “dry” alcoholics are the best 
therapists for other alcoholics, that alcoholics 
drink to reduce prevailing high levels of ten- 
sion and anxiety, that alcoholics are oral- 
dependent individuals who have not come to 
terms with their needs for nurturance, and 
so on.

One area of alcoholism and drug addiction 
research—having to do with the etiology of 
alcohol and drug dependence—has been es- 
pecially subject to the stultifying influence 
of selective biases on review and interpreta- 
tion of research results. Since significant fed- 
eral funding for alcoholism and drug research 
became available about a decade ago, unitary 
models of dependence have been largely dis- 
proven. Specifically, there do not appear to 
be characteristic personality patterns that
differentiate drug abusers from nonabusers, 
there is clearly not a single route to alcohol 
or drug dependence, and alcoholics and non- 
alcoholics do not appear to metabolize ethanol 
in discernibly different ways at comparable 
levels of ingestion. Instead, the mechanism of
dependence, the
sonality structure

in more constructive
is the behavioral viewpoint on
the addictions, well-stated by psychologists
Miller and Eisler (1975):

Within a social-learning framework alcohol and drug
abuse are viewed as socially acquired, learned be-
havior patterns maintained by numerous antecedent
cues and consequent reinforcers that may be of a
psychological, sociological, or physiological nature.
Such factors as reduction in anxiety, increased social
recognition and peer approval, enhanced ability to
exhibit more varied, spontaneous social behavior, or
the avoidance of physiological withdrawal symp-
toms may maintain substance abuse. (p. 5)


From another perspective, that of tongue-
in-cheek, Mark Keller (1972), alcoholism’s
long-time Dean of Letters, says the following 
about unidimensional theories of alcoholism: 


A splurge of reports, in the 1940's, of biochemical 
characteristics purporting to differentiate alcoholics 
from nonalcoholics stimulated me to review a volu- 
minous related literature, implicating physical, social 
and psychological demarcators as well. The only
conclusion I could derive, from the entirety of the 
reportage, took a form that became known, among 
colleagues, as Keller’s Law: The investigation of any 
trait in alcoholics will show that they have either 
more or less of it. Accordingly, I then predicted that 
if sexadactyly should be investigated, alcoholics will 
yield either more or fewer six-toed and six-fingered 
people than a control population. (p. 1147) 


Subjects 

Diagnostic criteria. The familiar problems 
that plague diagnostic decision makers called 
on to make diagnostic distinctions among 
psychiatric patients (Chapman & Chapman, 
1977; Goldberg, 1968; Nathan, 1967) also 
confront the researcher who must select alcoholics
from a general population and, as important, assure
that the subjects chosen are representative of an
identifiable portion of the universe of alcoholics.
It might appear that choosing alcoholics must be
easier than selecting schizophrenics or neurotics,
because all alcoholics share a single stigma: they
drink too much. But the task is made more difficult
than it might seem by lack of consistency, in research
reports, in detailing this hallmark of the disorder.
For example, descriptions of alcoholics recently
studied by psychologists range from “twenty Ss
[who were] psychiatric patients with a primary
diagnosis of alcoholism” (Levine & Zigler, 1976,
p. 141), and “alcoholic Ss [subjects] [who] were
volunteers from the patient population of the
Alcoholism Treatment Program at the VA hospital”
(O'Leary, Radford, Chaney, & Schau, 1977, p. 580) to


subjects [who] were three male veteran inpatient
volunteers, ages 40, [...], and 55 years. They reported
histories of chronic drinking of 15, 20, and 26 years,
respectively. Each had experienced blackouts and
delirium tremens as a result of drinking and had
been hospitalized 9-20 times for alcohol-related
problems. The subjects were each classified as binge
drinkers consuming up to 140 ounces of beer and
[...]-10 ounces of liquor per day when drinking.
Drinking episodes ranged from 5-30 days followed by a
[...]-day (on the average) period of sobriety. (Doleys,
Ciminero, Wallach, & Davidson, 1977, p. 207)


Leaving aside the question of which of these
studies most carefully ensured that its subjects
were in fact alcoholics, it is clear that
only the latter article provides enough information
to permit assessment of the extent to
which the subjects studied were comparable
to alcoholics studied elsewhere.

The greatest possible detail about drinking
pattern must be conveyed in a research
report. Above all, researchers, and their audience,
must become unwilling to consider men
and women as alcoholics simply because they
received that diagnosis from someone somewhere
at some time. Instead, hard signs of
physical dependence, psychological dependence,
and tolerance ought to be in evidence,
along with indications that alcohol has been
a problem in living and that problems in
[...] relationships, vocational adjustment,
and interpersonal relationships have followed
[...] alcohol consumption.

Although use of a broad range of assessment
procedures is an essential aspect of
alcohol and drug treatment research, the
ultimate utility of these measures can only
be judged by reference to their respective
reliability and validity estimates. It is, therefore,
incumbent on the responsible researcher
to assess and report the adequacy of these
measures as applied to the subject population
studied. Unfortunately, it is rare that
the reliability and validity of assessment procedures
are reported in alcohol and drug treatment
studies.

It is also important that age, sex, and
socioeconomic status, at least, be summarized
for all subject populations. As we have already
observed, alcoholism does not wipe out
the effect on behavior of subject variables
such as demography, education, and social
class.

Even more formidable assessment problems 
confront the drug researcher, since only very 
recently have attempts been made to measure 
quantity and frequency of drug ingestion 
(e.g., Mirin, Meyer, McNamee, & McDougle, 
1976; Rawlins, Randall, Meyer, McNamee, 
& Mirin, 1976). As a consequence, diagnosis 
of drug dependence is from whatever signs 
of physical dependence are available, whereas 
differentiation among varieties of drug abusers 
must be confined to descriptive typologies 
(e.g., heroin addict, amphetamine addict,
polydrug abuser, and so on). However, Car- 
lin and Stauss (1977) recently explored the 
possibility of categorizing drug addicts along 
functional dimensions, including a streetwise/
straight typology (e.g., legality of primary
means of support, conventionality of dress 
and grooming, ability to buy drugs from street 
drug dealers, etc.) and a self-medication/ 
recreational drug-use typology. The major 
point of their article is that use of these de- 
scriptors, in conjunction with the more com- 
mon descriptive typology, provides more in- 
formation about subjects than the descriptive 
typology alone. 

Control and comparison groups. Another 
common subject selection problem is choice 
of comparison groups when one wishes to con- 
trast alcohol or drug abusers with nonabusers. 
If subjects are inpatients, must the compari- 
son group also be hospitalized? And if it is, 
should it be composed of hospitalized psychi- 
atric patients, because, presumably, alcoholics 
and drug abusers are often victims of psychiatric
disorder as well? By the same token,
if subjects are outpatients, can comparison
subjects be chosen from any non-drug-abusing
group or must they be psychiatric outpatients?
In the same vein, how necessary is
it to match experimental and control groups 
by age, sex, socioeconomic status (SES), level 
of education, and treatment motivation (if 
the latter is relevant to the study)? Our view
of this complex matter is that alcoholics or
drug addicts whose addictions are accompanied
by concurrent psychiatric disorder
must be compared to age-, sex-, and SES-matched
nonaddicted individuals whose psychiatric
diagnoses approximate those of the
alcoholics. We do not assume, of course, that
all drug-dependent individuals, or even most
of them, are psychiatrically disordered just
because they abuse drugs or alcohol.

1 For detailed information on unobtrusive approaches
to the assessment of alcoholism and its concomitants,
the reader can consult Briddell and Nathan
(1976), Miller (1976), and Nathan and Briddell
(1977).

Though an obvious and well-accepted con- 
trol procedure, it is probably worth noting 
here, in passing, that groups of patients to be 
compared on the basis of treatment outcome 
must have been chosen from the same patient 
population to begin with, then assigned to 
treatment groups in a way guaranteed to en- 
sure comparability. Of equal importance, in 
this context, is selection of a proper compari- 
son treatment for alcoholics or drug abusers 
undergoing a nonstandard treatment whose 
efficacy is being assessed. In this regard, it is 
probably worth noting that “no treatment” 
or waiting list controls may be unethical when 
one is dealing with severely disabled addict 
populations. As important, they may also be 
unfeasible, since alternative treatment sources 
are often freely available to the motivated 
alcoholic or drug abuser. Most important, one
must ensure that the control treatment that 
matched patients do receive is as active, long 
lasting, and time intensive as the new treat- 
ment being evaluated. This important issue 
is discussed in greater detail later.

A related control issue, in this context, is 
the treatment motivation of experimental and 
control subjects of comparative treatment 
studies. Failure to assess subjects’ treatment 
motivation in treatment outcome studies in- 
volving small numbers of subjects, a common 
failing, prevents one from knowing whether
apparent differences in treatment efficacy, if
found, reflect real differences in the power of
[...] treatment motivated, [...] undertake this
necessary assessment because doing so inevitably
reduces the size of the pool of potential subjects
and because the [...] is really not so simple.
Above all, subjects may not tell the truth about
their treatment motivation, especially when financial,
vocational, or judicial contingencies have
brought them to treatment. But despite this 
problem, it is absolutely necessary to attempt 
to assess treatment motivation and to report 
results of that assessment in any study in 
which motivation for treatment could play a 
role in outcome. 

Alcoholic analogues. Despite their omni- 
presence, alcoholics are not easy to locate for 
study. They are also not likely to be highly 
motivated to participate in psychological research
(whose payoff to them may be unclear),
are not always reliable in keeping appointments
when they do agree to be subjects,
and, when they are not drinking, they are 
apt to prefer working to participating in re- 
search. Because research designed to examine 
alcohol’s effects on alcoholics can only include 
alcoholics in good physical and psychological 
health despite their chronic alcoholism, those 
who design such research find themselves with 
an even smaller pool of alcoholics ready, will- 
ing, and able to be research subjects. 

For these reasons, researchers may choose 
to study alcoholic analogue subjects, who may 
be “problem drinkers” of one sort or another 
or, commonly, heavy-drinking college students.
Though choice of such subjects is understandable,
generalization from their behavior
to that of alcoholics must be done
with great care—if at all. For this reason, it 
is almost always preferable to study alcoholics 
than their analogues; it is practically impossible
to control for all of the differences that
separate alcoholics and nonalcoholics, some
of which remain unknown. Among the most
obvious of these differences are age, educational
level, cognitive functioning, and rate
of ethanol metabolism.

Above all, one must be chary about concluding
that what is characteristic of the behavior
of a group of problem or heavy drinkers
is likely also to characterize the behavior
of alcoholics, only more so! In all likelihood,
such a conclusion is not justified.

The problem of treatment dropouts. Although
the dropout problem for all kinds of
psychological treatment is a serious one, it is
especially so for alcoholics and drug addicts,
both of whom are poorly motivated for treatment;
variable in meeting vocational, familial,
and personal obligations; and, often, in poor
control of their behavior. A recent study of
more than 14,000 chronic alcoholics treated
at 44 federally funded alcoholism treatment
centers throughout the country (Armor, Polich,
& Stambul, 1976) revealed that fewer than
[...] of these clients continued in treatment
for as long as 3 consecutive months. Moreover,
of the original sample of 14,000, 6-month
follow-up data on only 2,371 clients
were reported. Both figures suggest something
of the scope of the attrition problem facing
the addictions treatment outcome researcher.

Given, then, that substantial numbers of
alcoholics and drug addicts drop from treatment
in its midst and that as many more
who complete treatment cannot be located
for follow-up, how is one to compare two or
more treatment methods on both the short
and long term? To begin with, one must include
in any statistical analysis of differences
among treatment groups subjects who dropped
from treatment; these subjects should be considered
treatment failures regardless of the
rationalizations some may have given for the
decision to terminate. The unintentional deception
that can result if this rule is not followed
is illustrated by an article published
[...] years ago comparing groups of alcoholics
receiving two “active” treatments, both
involving the administration of electric
shock, and two “inactive” placebo treatments.
In comparing the four groups in terms of
average length of abstinence immediately following
treatment, the study’s authors concluded
that one of the active treatments was
associated with significantly longer periods
of abstinence than the other three treatments.
Examination of the data on which this conclusion
was drawn, however, reveals that 13
active treatment subjects dropped from treatment
in its midst, whereas only 1 placebo
treatment subject did so; this fact was not
taken into account in the significance testing
of posttreatment abstinence rates, which only
compared subjects who had completed treatment.
Since many of the active treatment
subjects dropped from treatment because they
found electric shock aversive, to conclude
from the abstinence rates of subjects who
were [...] enough to endure shock and
remain in treatment that the active treatment
was more effective than the placebo is deceptive,
however unintentional.
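
A minimal sketch in Python may make the arithmetic of this rule concrete. The dropout counts (13 and 1) come from the example just described; the arm sizes and the numbers of abstinent completers are hypothetical, and the proportion abstinent is used here as a simple stand-in for the average length of abstinence reported in that study.

```python
from dataclasses import dataclass


@dataclass
class TreatmentArm:
    name: str
    enrolled: int   # all subjects assigned to the arm (hypothetical)
    dropouts: int   # subjects who left treatment in its midst
    abstinent: int  # completers abstinent at assessment (hypothetical)

    def completers_only_rate(self) -> float:
        """Success rate when dropouts are silently excluded."""
        return self.abstinent / (self.enrolled - self.dropouts)

    def dropouts_as_failures_rate(self) -> float:
        """Success rate when every dropout counts as a treatment failure."""
        return self.abstinent / self.enrolled


active = TreatmentArm("active (shock)", enrolled=30, dropouts=13, abstinent=10)
placebo = TreatmentArm("placebo", enrolled=30, dropouts=1, abstinent=12)

for arm in (active, placebo):
    print(
        f"{arm.name}: completers only {arm.completers_only_rate():.2f}, "
        f"dropouts as failures {arm.dropouts_as_failures_rate():.2f}"
    )
```

With these invented figures the active arm looks better among completers (about 0.59 versus 0.41) but worse once dropouts are counted as failures (about 0.33 versus 0.40), which is the reversal the text warns about.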

Although most readers doubtless agree that 
patients who drop from treatment must be 
considered treatment failures for comparative 
purposes, it is more difficult to find agreement 
on how to handle treated subjects who cannot 
be located for follow-up. Is one to conclude 
that all subjects who cannot be found at a 
follow-up interval have returned to drinking 
or drug taking, have remained “dry” or 
“clean,” or are using alcohol and drugs in 
about the same proportions as subjects who 
could be located? Though all three stances 
are defensible, the most conservative approach 
to this problem is to consider all patients 
unlocated at follow-up to be treatment fail- 
ures (a position recommended by the Ameri- 
can Medical Association, 1956; Pattison, So- 
bell, & Sobell, 1977; Sobell, 1978). 
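
The three stances can be set side by side as a simple sensitivity check. The Python sketch below is illustrative only: the function name and the example counts are hypothetical and are not drawn from any study cited here.

```python
def follow_up_estimates(treated: int, located: int, located_successes: int) -> dict:
    """Success rate at follow-up under three assumptions about unlocated subjects."""
    if not (0 <= located_successes <= located <= treated) or located == 0:
        raise ValueError("need 0 <= located_successes <= located <= treated, located > 0")
    unlocated = treated - located
    return {
        # Most conservative: every unlocated subject is a treatment failure.
        "unlocated_as_failures": located_successes / treated,
        # Most optimistic: every unlocated subject remained "dry" or "clean".
        "unlocated_as_successes": (located_successes + unlocated) / treated,
        # Unlocated subjects succeed in the same proportion as located ones.
        "unlocated_like_located": located_successes / located,
    }


# Hypothetical cohort: 100 treated, 60 located at follow-up, 30 of those abstinent.
estimates = follow_up_estimates(treated=100, located=60, located_successes=30)
for assumption, rate in estimates.items():
    print(f"{assumption}: {rate:.2f}")
```

Reporting all three figures shows how far the conclusion depends on the assumption chosen; the first corresponds to the conservative position recommended in the text.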

Before having to accept this admittedly un- 
satisfactory solution to the problem, though, 
one ought to take every step necessary to en- 
sure that as many patients as possible are 
available for follow-up assessment. Ways to 
achieve this desirable goal are summarized 
later in this article in the section entitled 
Follow-up. 

Who to study? It is inevitable that most 
of the subjects in alcoholism and drug treat- 
ment research will be male skid-row or “blue 
collar” alcoholics and ghetto-dwelling drug 
addicts. The reasons are that these individuals
are often unemployed and, hence, available
for intensive, long-term study; even if
employed, they are more likely to be clients
of publicly funded clinics whose clientele
is more available to researchers than are the 
patients of private psychologists and psy- 
chiatrists. Another is that this population of 
alcoholics and drug abusers is more likely 
than any other to reach the attention of the 
judicial system, and subsequently to be re- 
manded to treatment by the courts; such 
court referrals constitute an important source 
of subjects for treatment evaluation studies. 
This is especially important, since it is clear
that data on treatment outcome for court-referred
drug addicts, whatever their socioeconomic
status, cannot readily be generalized
to drug abusers referred for different
reasons. Unfortunately, few alcohol or drug
treatment reports include explicit descriptions
of referral sources.

As a result of these and other similar fac- 
tors, most of our data on drinking patterns, 
patterns of drug use, and response to treat- 
ment are from this group of subjects, even 
though it is not representative of the total 
universe of drug abusers and alcoholics. It has 
been estimated, for example, that fewer than 
5% of all alcoholics are of the skid-row vari- 
ety (Armor et al., 1976). 

With these observations in mind, we offer 
two remedial suggestions: (a) Every effort 
should be made to study socially and eco- 
nomically advantaged alcoholics and drug 
abusers as well as more readily available 
groups. Sources of such persons include au- 
tonomous Alcoholics Anonymous groups; pri- 
vate clinicians who might agree to cooperate 
with researchers they know and trust; and 
private hospital administrators, whose inter- 
est in research collaboration might reflect 
their wish to gain professional credibility by 
associating with a university’s research ef- 
forts. (b) If the usual groups of skid-row 
alcoholics and destitute heroin addicts are 
the only populations available for study, the 
researcher must make clear that generaliza- 
tions from his/her data can only be to com- 
parable groups of intellectually and voca- 
tionally limited individuals, that few alcohol- 
ics and drug-dependent individuals are as 
impoverished in so many ways as are these 
overstudied individuals.


Procedure: Comparative Treatment Studies 


Group versus single-subject designs. The 
same design dilemma faces the alcoholism or 
drug treatment researcher as confronts any 
other clinical researcher who must choose be- 
tween group and single-subject designs. Group 
designs permit more comfortable generalization
of findings from [...] other, because they [...]
of differences in the [...] procedure and another
among substantial [...] baseline” implicit in the
single-subject ABA design that is necessary to
establish that it was the treatment, rather than
time, motivation, or expectancy, that brought about
the observed changes in behavior.


Because so much of the research on alcoholism
treatment by psychologists during the
past 5 years has been behavioral, much of 
the literature reports single-subject designs, 
By contrast, the older literature reported re- 
sults predominantly from group designs. It 
is possible, in fact, to retain the virtues of 
both designs by ensuring that group designs 
provide for thorough and reliable pretreatment
and posttreatment assessment of behavior,
carefully matched experimental and control
treatment subjects, and suitably prolonged
follow-ups.

Several recent studies comparing alcoholism 
treatment methods have achieved this desir- 
able blending of the strengths of single-subject 
and group designs, among them comparisons 
of abstinence-oriented and controlled-drink- 
ing-oriented behavior modification programs 
(Sobell & Sobell, 1976), behavioral family 
counseling, electrical aversion, covert sensitiza- 
tion, and systematic desensitization (Hedberg 
& Campbell, 1974), and two broad-spectrum 
behavioral treatment packages (Vogler, Comp- 
ton, & Weissbach, 1975). All three projects 
provided for comprehensive pretreatment and 
posttreatment assessment of drinking behav- 
ior and vocational, familial, and interpersonal 
effectiveness; follow-up periods extending to
or beyond a year, accompanied by procedures 
to minimize subject attrition during the fol- 
low-up period; treatment groups matched for 
relevant demographic and treatment motivation
variables; and careful attention to comparability
of “active” and “placebo” treatments.
Unfortunately, the blending of constructive
design elements contained in these
studies is not generally characteristic of the
field.

Who are the therapists? It is important to
ensure that the treatments contrasted in comparative
treatment studies be provided by
comparably trained, equally committed therapists.
To compare treatment outcomes when
one treatment is given by experienced therapists
and the other by graduate students, or
when one is administered by clinicians
committed to the treatment they are providing
while the other is offered by men and women
unconvinced that what they are doing has
value, represents poor research design. Yet
many researchers have made precisely this
design error when comparing one or another
combination of innovative therapeutic approaches
with what is euphemistically termed
standard hospital milieu therapy. The latter,
which may include alcohol or drug education,
group therapy, occasional personal counseling,
and the opportunity to participate in an
Alcoholics Anonymous or Synanon group, is
usually provided by undertrained, underpaid
state hospital workers whose enthusiasm for
their work, insight into its value, and level
of clinical training are rarely equal to those
of the clinicians administering the new treatment
approach. When the new therapeutic package
turns out to be more efficacious than the
standard one in such comparative studies, one
cannot be certain that it was actually more
active or that it was provided by committed
therapists whose enthusiasm for their work
was infectious.

We recommend that no new therapeutic
package be compared to “standard hospital
treatment milieu,” so often a euphemism for
virtually no treatment at all. Instead, the
separate components of novel treatment packages
might more profitably be compared to
each other, as well as to the package as a
whole, in this way permitting assessment of
each component’s contribution to the overall
[...] effectiveness. For those researchers
wishing to compare their efforts to some
[...] success rate for alcoholism treatment,
two choices are among those possible:
Armor et al.’s (1976) recent survey data,
indicating that 70% of the clients treated at
typical alcoholism clinics across the country
show short-term improvement in drinking
[...], make for rigorous comparisons; [...]
earlier conclusion that the success rate of
Alcoholics Anonymous, often the alcoholic's
[...] treatment resource, is only between
[...] and 35% makes for much more
[...] ones. Our own view of the matter
is that abstinence beyond a year by 40%
or more of patients treated by any technique
surpasses current expectations.

Treatment variables. To justify meaningful
comparisons among different treatments,
it is axiomatic (a) that all patients in all 
treatment groups must have received about 
the same number of hours of individual and 
group treatment and (b) that the intervals 
between treatment sessions must have been 
comparable. It is necessary to ensure com- 
parability of treatments in this way because 
there is a considerable clinical treatment lit- 
erature that attests to the therapeutic impact 
of the relationship between therapist and 
patient, an impact independent of the thera- 
peutic methods that the therapist chooses to 
use (Smith & Glass, 1977). 

In practice, this straightforward control 
procedure can present problems. Comparing 
an experimental treatment to “standard hos- 
pital milieu” treatment, for example, might 
prove impossible if this control is taken seriously.
Most experimental treatment programs
provide individual or group treatment—or 
both—administered by highly trained and ex- 
perienced clinical researchers and their stu- 
dents, whereas standard treatment, custom- 
arily administered by hospital workers whose 
training and motivation may be inferior, is 
also likely to involve much less 1:1 contact 
and small group treatment. In similar fash- 
ion, comparison of methadone maintenance to 
psychotherapy is frequently confounded, since 
methadone programs usually involve the drug 
addict in less treatment time and less intimate 
therapist contact than do most psychothera- 
peutic treatments. 

A related matter has to do with the scope 
of treatment issues addressed—the range of 
problem behaviors confronted. Consider how 
different the scope of treatment was for ex- 
perimental and control subjects in a recently 
reported, widely cited behavioral treatment 
package for alcoholics: 


Experimental Subjects: Procedures included subjects
being videotaped while intoxicated under experimental
conditions, providing subjects when sober
with videotape self-confrontation of their own
drunken behaviors, shaping of appropriate controlled
drinking or nondrinking behaviors . . . the availability
of alcoholic beverages throughout treatment,
and behavior change training sessions. [The latter]
is a summary phrase to describe sessions which
concentrated upon determining setting events for each
subject’s drinking, training the subject to generate
a series of possible alternative responses to those
situations, to evaluate each of the delineated
alternatives for potential short- and long-term consequences,
and then to exercise the response which could be
expected to incur the fewest self-destructive long-term
consequences. Behavior change training sessions
consisted of discussion, role playing, assertiveness
training, role reversal or other appropriate
behavioral techniques.


Control Subjects: Control subjects received conven- 
tional treatment procedures which could include 
group therapy, chemotherapy, Alcoholics Anonymous, 
physiotherapy and other traditional services. (Sobell
& Sobell, 1973, p. 601)


Although the range and diversity of poten- 
tial treatments available to control subjects 
in this study could have been as great as 
those provided to experimental subjects, one 
is left with the distinct impression that the 
scope of treatment offered to experimental 
subjects—the specific problem areas that the 
treatments were designed to confront—was 
both better targeted and more comprehensive 
than that available to control subjects. In
other words, the treatment package offered 
to experimental subjects was not only differ- 
ent from but better than that provided con- 
trol subjects. 

One resolves this common design problem
by confronting the fundamental difficulty of
premature comparison of the efficacy of comprehensive
treatment packages, by resigning
oneself to the necessity to compare the components
of a treatment [...] before attempting to
[...] ethanol, by drinking during the
posttreatment period with the
same enthusiasm for alcohol that they had
shown pretreatment, it was concluded that
electrical aversion, either by itself or within
[...] treatment package, did not
have the therapeutic power [...]
assumed it to have.

Follow-up. Although it is obvious that the
longer the follow-up interval posttreatment,
the more sure one can be of data on treatment
efficacy, no one knows how short the
follow-up period can be and still reflect ultimate
treatment efficacy. Nonetheless, most
alcohol and drug researchers question any
follow-up interval that fails to extend to 1
year or more posttreatment. They believe
that such an interval is inadequate to assess
the long-term effects of treatment for alcoholism
or drug dependence, given that many
alcoholics and drug abusers spontaneously
modify or cease drug or alcohol ingestion on
their own for periods extending beyond a
year. We believe that a 2-year follow-up of
treatment for alcoholism or drug dependence
is necessary to provide a suitably comprehensive
view of the power of the treatment to
effect lasting change.

A 2-year follow-up period, however, presents
formidable subject attrition problems.
Given the transient nature of the existence
of many alcoholics and drug addicts, how
does one keep contact over that time with
men and women whose residences, jobs, and
lives change so much and so often? Sobell
(1978), a clinical researcher with great experience
in maintaining contact with alcoholic
clients over lengthy follow-up intervals, suggests
the following set of coordinated steps
to minimize attrition of subjects during a
follow-up period: (a) Allow enough time and
develop enough persistence to locate as many
follow-up subjects as possible; do not settle
for a majority of subjects or even for most
of them. (b) Explain to subjects at the end
of treatment why follow-up contacts are
scheduled, when they can expect to be contacted
for follow-up, the kinds of information
to be requested, and how this information
will be used. (c) Identify as many collateral
sources of information about subjects as possible
during treatment. (d) Maintain continuity
of contact with patients during the
follow-up interval by keeping in touch every
few weeks, even if follow-up information is
not required until the 6-month or 1-year marks.
(e) Be prepared to consult official records
(e.g., jail, hospital, welfare, driver [...],
etc.) to locate lost subjects. Parenthetically,
maintaining this kind of frequent contact
for follow-up purposes also serves as an
important source of low-cost “continuing
care” that may help maintain therapeutic gains
first achieved during treatment. In this instance,
then, good research design and good
patient care go hand in hand.

PROBLEMS IN RESEARCH ON THE ADDICTIONS

A frequently ignored follow-up issue of
relevance to both alcohol and drug treatment
studies is that a treatment program may be
highly effective in attaining desired goals
while patients are actively involved in the
program, only to appear to fail when patients
return to nonsupportive or destructive environments.
Unfortunately, a follow-up assessment
3 or 6 months posttreatment will not
reveal this Treatment × Environment interaction
(Götestam, Melin, & Öst, 1976). Under
these circumstances, assessment of outcome
immediately after treatment has ended
and then again during a follow-up period
much more clearly delineates dynamics of
improvement and determinants of relapse.

Outcome measures. A variety of direct and
self-report measures of drinking behavior
have been developed. The Cahalan quantity-frequency
index (Cahalan & Cisin, 1968), the
Michigan Alcoholism Screening Test (MAST;
see Selzer, Vinokur, & Wilson, 1977), and
Marlatt’s behaviorally oriented Drinking History
questionnaire (Marlatt, 1975) are the
most commonly used self-report measures of
[...]. Although use of such measures opens
the investigator to criticism on grounds of
the relative unreliability of self-report data,
several studies have recently reported that
alcoholics’ and drug addicts’ self-reports on
drinking and drug ingestion may be far more
accurate than previously believed (Cox &
[...], 1974; Sobell, 1978; Homer & Ross,
[...]). Regardless of these data, however,
we prefer direct measures of drinking behavior,
including blood alcohol determinations
at unannounced intervals (e.g., Miller,
1975), a taste test or other drinking analogue
[...] (e.g., Marlatt, Demming, & Reid,
1973), or ad libitum drinking opportunities
(Schaefer, Sobell, & Mills, 1971). The
latter two methods, however, cannot be used
[...], since they provide the
[...] alcohol ingestion; this assessment
[...] is considered in greater detail
by Nathan and Lansky (1978). Whatever
outcome measures are ultimately selected for
[...], the point discussed earlier in regard to
the issues of reliability and validity is relevant
here as well: These measures are relatively
uninterpretable unless reliability and
validity estimates for the population under
study are reported.

There are other measures of treatment out- 
come that relate less directly to drinking or 
drug ingestion. One of the most thorough 
tests of these pretreatment and posttreatment 
measures was provided by Armor et al.’s 
(1976) recent national study of 44 alcohol- 
ism treatment centers scattered around the 
country. A complex of vocational, marital, 
and social indicators of relevance to alcohol- 
ism were tapped; direct reports from em- 
ployers, clinicians, relatives, and others in a 
position to comment on job stability and 
marital adjustment were elicited. In some in- 
stances, these indices of behavior change 
showed more dramatic improvement following 
treatment than did direct measures of change 
in alcohol consumption. 

In similar fashion, treatment goals for drug 
abusers extend far beyond mere reduction of 
the frequency of drug abuse; goals are often 
set to include improved employment status, 
widened spheres of non-drug-related social 
contacts, and improved physical health (An- 
derson & Nutter, 1975; McCabe, Kurland, & 
Sullivan, 1975). As a result, pretreatment and 
posttreatment measures must tap information 
that directly reflects the status of these so-
cially desirous variables, frequently a most
difficult task.

Whichever combination of direct and in- 
direct indicators of change-in-life function- 
ing is chosen, it is important that follow-up 
be circumscribed enough to permit 
reliable recall of the data requested. To ask
[...] to request such recall for a [...] is
both more reasonable and, likely, will prove
more reliable.


Procedure: Other Studies


Nature of the analogue in analogue studies.
One of the most common thrusts of alcoholism
research that is not treatment oriented
involves the use of alcoholic analogues. Such
research might describe “typical” drinking
patterns of chronic alcoholics, college
students, or middle-class women. It might
investigate the effects of alcohol on
psychological, social, or cognitive
functioning. Or the research could inquire
into the impact of one or another
environmental variable, designed to induce
stress or anger, for example, on consequent
drinking. All of this research could be
undertaken analogically. Typical drinking in
the real world can be inferred from patterns
of drinking in the laboratory; under certain
circumstances, the effects of large amounts
of alcohol on cognitive functioning can be
predicted from the effects of small doses; in
some cases, the impact of stress on drinking
can be studied by first creating artificial
stress in the laboratory (an environmental
condition analogous to real-life drinking
stress) and then measuring consequent
drinking behavior.

Reasons for undertaking analogue research 
instead of research in the real world are many. 
Alcoholics and drug addicts are hard to find, 
even harder to study. Giving alcohol in large 
doses to anyone, especially to an alcoholic, 
is difficult; giving hard drugs to anyone, 
especially to a drug addict, is virtually im- 
possible. For these and other reasons, ana- 
logue research on the addictions stands be- 
tween no research and the researcher. But 
there are important caveats to observe in
undertaking analogue research in this field.
They include, above all, assuring oneself
(and one’s professional audience) that the
analogue one has chosen to use is not
stretched so far as to bear only vague
resemblance to the phenomenon in the real
world. For instance, the laboratory drinking
behavior of college sophomores cannot be
considered representative of the drinking of
adults in neighborhood taverns or at home.
Similarly, the absence of an impact on
intelligence test performance of moderate
blood alcohol levels during a single test
session has little relevance to possible
cognitive deficits suffered by long-term
heavy drinkers. Finally, stress induced in a
laboratory might not be at all comparable to
stress in the real world, especially when the
latter stress derives from things that matter
and the former stress does not.

The obvious solution to these potential 
problems is to avoid analogue studies, when 
possible. But when it is not possible to do 
so, as is so often the case, what then? In that 
event, one must document, to the extent pos- 
sible, the nature of the relationship between 
the analogue and the real world. The burden 
of proof of the relevance of an analogue to 
the natural environment is on the investiga- 
tor, not his or her audience! 

Experimenter bias. By now virtually every 
psychologist knows of the pitfalls of experi- 
menter bias, known to many as the “Rosen- 
thal effect” (Rosenthal, 1966). In essence, 
experimenter bias refers to the unintentional 
bias that experimenters bring to their inter- 
actions with experimental subjects that can, 
in some cases, affect data. 

Although it is doubtful that experimenter 
bias plays a more serious role in research on 
alcoholism and drug dependence than it does 
elsewhere, it can—and does—play a role that 
must be anticipated. Examples include the 
following: (a) In studies of free ad libitum 
drinking, especially those taking place in 
controlled laboratory settings, it is entirely 
possible for staff unintentionally to bias sub- 
jects toward either more or less drinking. This 
possibility is so real that we actively seek to
reduce interaction between drinking subjects
and staff to an absolute minimum in our own
laboratory drinking studies. (b) In studies
in which the effects of stressors on drinking
or of alcohol on response to stressors is the
focus, the investigator must ensure that the
effects of the stressor derive from its real
impact, not from his/her unintentional
conveyance of expectations about those
effects. In such studies, postexperimental
questionnaires must affirm that the subject
did perceive the stress stimulus as a
stressor. (c) In studies in which alcohol’s
effect on other behaviors is being examined
(e.g., its impact on projective responses,
interpersonal facility, or psychomotor
behavior), it is just as important to ensure
that hypothesized effects are not translated
into observed effects because experimenters
betray their expectations to drinking
subjects. To this end, careful inquiry at the
conclusion of the study into subjects’
perception of its intent is frequently most
enlightening.

The demography of alcoholism and drug 
dependence. As emphasized above, it is
crucially important to control for variables
other than variety of alcoholism or drug
dependence in comparative treatment studies.
Similar controls (or, at the least, awareness
of the importance of such variables as age,
sex, race, and socioeconomic status when
experimental controls are impossible) are
also necessary in studies exploring the impact
of alcoholism on behavior. To fail to account
for these variables is to run the risk of
attributing to alcoholism or drug abuse
responsibility for a particular behavior, a
Rorschach percept, a Minnesota Multiphasic
Personality Inventory (MMPI) or Wechsler
Adult Intelligence Scale response pattern, or
a treatment outcome, when, in fact, these
behaviors may derive instead from the complex
of ethnic, educational, or socioeconomic
factors associated with alcoholism or drug
dependence.


Results 


On reporting results. It is tempting, when
reporting the results of a study, to be
optimistic about their significance, even a
bit grandiose about their staying power.
Human nature being what it is, it is hard to
be overly critical of the natural human
tendency to see in one’s own efforts what one
might not see in those of others.

On the other hand, although overestimation
of the significance of a new nodule on the
[...] of a personality theory can be
tolerated, misleading, immodest, or
unrealistic reporting of the results of
comparative treatment [...] goes beyond bad
form. When disseminated to persons
unequipped to separate an inflated claim from
a barely significant probability value, such
reporting could result in the application of
unproven methods.

The point we wish to make here is a simple
one. It is incumbent on the serious
researcher to report data parsimoniously,
modestly, and completely, even when data in
their [...] are more confusing and less [...]
interpretable than data that have [...]
pruned. Although improper [...] of data is a
snare for every researcher, it is
particularly so in fields such as
alcoholism and drug dependence. In these 
fields, little is known; the stakes are high for 
persons who develop viable theories of etiol- 
ogy or useful therapeutic approaches; and 
rapid, independent confirmation or discon- 
firmation of new findings is difficult. As a 
consequence, a series of revolutionary findings 
have swept the field, including several genetic 
theories of etiology, a variety of new, “sure 
fire” diagnostic methods, novel behavioral 
treatment procedures that do not require 
abstinence, and innovative prevention pro- 
grams guaranteed to reduce prevalence of 
these disorders. Unfortunately, like similar 
claims made of new discoveries about the 
schizophrenias in the late 1950s and early 
1960s, the light of day (and independent
replication) has either diminished or
disproved most of these epochal findings.

Although Job was not necessarily speaking 
of psychological science when he concluded 
that there is nothing new under the sun, it 
might be wisest to heed his words until, or 
unless, independent confirmation supports 
your Nobel-prize-winning discovery, espe- 
cially if you work in alcoholism or drug de- 
pendence! 

Clinical versus statistical significance. It 
is as possible to overinterpret statistically 
significant as marginally or wholly insignifi- 
cant differences among groups, because sta- 
tistically significant differences may fail to 
achieve clinical significance. Although not 
conclusive, the following list includes three 
instances: (a) Psychological test patterns that 
differentiate groups at .05 or .01 levels rarely
prove helpful in differentiating for diagnostic 
purposes. For example, alcoholics and drug 
abusers often differ from nonpsychiatric con- 
trols or unselected psychiatric patients by 
scoring higher on the Depression, Psycho- 
pathic Deviate, and Mania subscales of the 
MMPI; to draw similar diagnostic distinc- 
tions on the basis of symptoms of depression, 
mania, and psychopathic behavior alone, how- 
ever, is rarely sufficient. (b) Statistically sig- 
nificant differences in the efficacy of one 
treatment approach over another, no matter 
what the level of difference in efficacy or how 
it was judged, are inadequate bases for se- 
lection of treatment unless or until cross-
validation of the promising approach, with
different therapists and different patient 
groups and in different settings, confirms the 
general utility of the technique. Electrical 
aversion, a promising behavioral technique 
that made excellent theoretical sense and 
showed exciting promise when first applied, 
has since proven itself to be of little value as 
a therapeutic technique on extensive cross- 
validation. Methadone maintenance may be 
in the midst of suffering the same ignominious 
fate. (c) Studies demonstrating the signifi- 
cant impact of a specific etiologic factor on 
a specific disorder do not, perforce, prove 
that every patient possessed of that factor 
will develop the resultant disorder or that 
every patient carrying that diagnostic label 
can point to that etiologic factor. In the case
of alcoholism, for example, recent research 
by Goodwin and Guze (1974) strongly sug- 
gests the role of genetic factors in the etiology 
of alcoholism. Significantly more Danish chil- 
dren of alcoholic parents given up for adop- 
tion developed alcoholism than did children 
of nonalcoholic parents. But despite this sig- 
nificant difference in outcome, the contribu- 
tion of the genetic factor was only partial: 
Some children of nonalcoholic natural parents 
became alcoholics as adults; most children of 
alcoholic parents successfully avoided
alcoholism.
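
Point (a) can be made concrete with a small numerical sketch. Under assumptions chosen purely for illustration (two groups of 100, normally distributed scale scores with a common standard deviation of 10, and group means of 62 and 58; none of these figures come from the studies cited), the group difference is significant beyond the .01 level, yet a cutoff placed midway between the means classifies individuals only slightly better than chance.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z(mean_a, mean_b, sd, n_per_group):
    """z statistic for a mean difference (equal n, common known sd)."""
    return (mean_a - mean_b) / (sd * sqrt(2 / n_per_group))


def midpoint_accuracy(mean_a, mean_b, sd):
    """Proportion of individuals correctly classified by a cutoff midway
    between the two means, assuming normal scores, a common sd, and
    equal base rates in the two groups."""
    d = abs(mean_a - mean_b) / sd  # standardized mean difference
    return NormalDist().cdf(d / 2)


# Hypothetical scale scores (illustrative values only).
alcoholic_mean, control_mean, sd, n = 62.0, 58.0, 10.0, 100

z = two_sample_z(alcoholic_mean, control_mean, sd, n)
p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))
accuracy = midpoint_accuracy(alcoholic_mean, control_mean, sd)

print(f"z = {z:.2f}, p = {p_two_sided:.4f}, accuracy = {accuracy:.2f}")
```

The sketch treats the standard deviation as known and the base rates as equal, which real diagnostic settings seldom satisfy; unequal base rates would typically make the practical picture worse, not better.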

Do statistically significant differences among 
groups ever have significance clinically? In 
our experience, such differences have clinical 
significance only when all or most of one 
group shows one [...] another does so; [...]
research designs permitting identification of
the “active ingredients” [...] regimens
capable of modifying a given target behavior
every time it [...]

Discussion 


Issues of generalization. Throughout this
article, we have urged the reader to adopt a
conservative, essentially modest, approach to
his/her data and to claims or conclusions
deriving therefrom. For the same reasons, we
wish to suggest a spirit of modesty in the
Discussion section of a research paper. Two
essential features of this attitude are worth
emphasizing: (a) Discussion of results might
well begin with a section that details limits 
on generalization from the data reported. If 
only males over the age of 40 were studied 
and prior data suggest that the problem under 
investigation is not limited to that group, it 
is well to point out that conclusions from the 
research may refer only to older males, per- 
haps even older males from the geographic or 
sociocultural group studied (if those variables 
might also affect generalizability). (b) Even 
in the absence of specific methodological con- 
straints on generalizability, it is wisest and 
most sound to limit the extent to which one 
lays claim for the widespread or universal 
relevance of one’s findings. A host of varia- 
bles specific to one’s subject sample, proce- 
dure, or data analysis, unknown or undetected 
during the research, could later come back to
haunt one. The classic example of such an
embarrassment, of course, is the unfortunate
team of biochemists who reported discovery
of a “schizophrenic” blood portion, only to
have to report some years later that their
sample of schizophrenics had consumed so
much coffee that the coffee’s metabolic re- 
siduals had stamped their subjects’ plasma as 
abnormal! 

Bias in discussion. Beyond overblown 
claims of the merits of a new therapeutic ap- 
proach or unwarranted generalizations beyond 
sample populations, the most common prob- 
lem arising from discussion of alcoholism and 
drug dependence data comes when the author 
does not discuss data in terms of prior
research on both sides of his/her position.

Although this approach to science seems
unthinkable, it is surprisingly easy to adopt
a position, accept it fully, then view as well
done and careful only work that supports that
position. To this end, one of the most
cherished, widely held views by alcoholism
workers has long been that the only [...]
goal for alcoholics can be abstinence. This
position has continued to keep its
supporters, of whom there are many, from
more recent data suggesting that
conventional, abstinence-oriented treatment
for alcoholism is relatively ineffective
(Pattison et al., 1977) and that some
alcoholics do return to a controlled pattern
of drinking either after treatment or in its
absence (Pomerleau, Pertschuk, Adkins, &
Brady, in press; Sobell & Sobell, 1976).
Nonetheless, one reads startlingly ad hominem
criticisms of researchers who have
acknowledged the possibility that the new
data on controlled drinking might justify a
change in thinking. A similar lack of
objectivity characterizes those who hold to
the belief that alcoholism is a medical
disease, wholly or largely, despite evidence
to the effect that social-learning
mechanisms, sociocultural influences, and
psychological phenomena also play important
etiologic roles in the disorder.


Reference Note 


1. Homer, A. L., & Ross, S. M. The reliability and
validity of interview data from drug and alcohol
Methodological Issues in Research
with Correctional Populations


N. Dickon Reppucci and W. Glenn Clingempeel
University of Virginia


Methodological issues confronting psychologists who conduct research with
correctional populations are examined. The article is divided into five major
sections: (a) the intrusion of values; (b) problems with trait-derived
methodologies in correctional population research, including inadequate
construct validity, ignoring environmental influences, and ignoring subjects;
(c) naturalness and the problem of external validity; (d) recidivism; and
(e) ethics. General recommendations are offered at the conclusion of each
section.


Requests for reprints should be sent to N. Dickon
Reppucci, Department of Psychology, Gilmer Hall,
University of Virginia, Charlottesville, Virginia 22901.

Copyright 1978 by the American Psychological Association, Inc. 0022-006X,


The purpose of the present article is to
discuss “methodological issues” confronting
psychologists who conduct research with
correctional populations. In the selection of
issues for discussion, a necessarily
value-laden [...] issues, although not
entirely excluding [...] is based on the
assumption that [...] latter problems are
generally given center [...] in
methodological critiques of psychological
research and that another similar [...] would
be less informative than the [...] attempt.
The accuracy of this judgment is left to the
reader.

In attempting to delimit this critique,
several constraints and qualifications should
be underscored:

1. The problems discussed are not neces- 
sarily unique to research with correctional 
populations. Many, if not all, issues could just 
as easily be raised with regard to other in- 
voluntarily incarcerated populations, for ex- 
ample, mental patients. Indeed, some of the 
issues are salient for much of psychological 
research, in general, and for personality re- 
search with “clinical” populations, in par- 
ticular. 

2. Since there are recent excellent reviews
of methodological problems of behavior
modification and delinquency studies (e.g.,
Davidson & Seidman, 1974; Emery & Marholin,
1977) and of large-scale program evaluation
research in correctional settings (e.g.,
Adams, 1974; Glaser, 1974; Sarri & Selo,
1974), these areas are not considered.

3. Only research actually conducted with 
subjects residing in correctional settings was 
examined. This constraint allowed exclusion 
of those studies focusing on experimental in- 
terventions with offenders subsequent to their 
release from a correctional facility. 

4. The data for deriving and elucidating 
the issues discussed in this article are in no 
way all inclusive, being largely based on only 
two sources: (a) a scrutiny of unpublished 
articles reviewed by the first author over the 
past 5 years in his role as a consulting edi- 
tor for the Journal of Consulting and Clinical 
Psychology (JCCP) and (b) an intensive
examination of all relevant studies published
in JCCP and the Journal of Abnormal
Psychology (JAP) during the last 10½ years
(January 1967-August 1977) (see Table 1).1

5. Attention is focused on what we call
the personality discrimination research
paradigm, since, as is clear from Table 1,
over 60% of the published studies examined
were subsumed under this category. (The
majority of unpublished studies submitted to
JCCP also fell in this category.) In brief,
this paradigm uses population subtypes (e.g.,
delinquent [...] dependent measures (e.g.,
impulsivity, delay of gratification,
sensation seeking, time perspective) those
that discriminate between the groups.


Table 1

Studies Using Correctional Populations Published in the Journal of Consulting and Clinical
Psychology (JCCP) and the Journal of Abnormal Psychology (JAP) During the Period
January 1967-August 1977

Type

    Personality discrimination
        Correctional subgroup model a
        Noncorrectional control model b
    Treatment evaluation
    Adjustment prediction
    Test construction
    Miscellaneous c

JCCP

    Blackburn, 1969; Borkovec, 1970; Erikson & Roberts, 1971;
    Ganzer & Sarason, 1973; McGuire & Megargee, 1974;
    Roberts, Erikson, Riddle, & Bacon, 1974;
    Moss, Hosford, Anderson, & Petracca, 1977

    Bixenstine & Buterbaugh, 1967; Stein, Sarbin, & Kulik, 1968;
    Kulik, Stein, & Sarbin, 1968a; Kulik, Stein, & Sarbin, 1968b;
    Landau, 1976

    Schwitzgebel, 1969; Sowles & Gill, 1970; Jesness, 1975

    Cowden & Pacht, 1967

    Armentrout & Rouzer, 1970; Grisso, 1975;
    McCandless, Persons, & Roberts, 1972; Gunn & Gristwood, 1975

JAP

    Hare, 1968; Mosher, Mortimer, & Grebel, 1968;
    Oliver & Mosher, 1968; Cohen, Seghorn, & Calmas, 1969;
    Skrzypek, 1969; Roberts & Erikson, 1969; Orris, 1969;
    Schmid, 1970; Stewart, 1972; Kercher & Walker, 1973;
    Davids & Falkof, 1975

    Stewart & Resnick, 1970; Schlichter & Ratliff, 1971;
    Sutker, 1971; Hetherington, Stouwie, & Ridberg, 1971;
    Jurkovic & Prentice, 1977

    Truax, Wargo, & Volksdorf, 1970; Redfering, 1973

    Megargee, Cook, & Mendelsohn, 1967; Carlson, 1972

    Roberts & Erikson, 1968; Ruma & Mosher, 1967;
    Persons & Marks, 1970; Rotenberg & Sarbin, 1971

Total: JCCP, 20; JAP, 24

a Subcategories of correctional populations (e.g., psychopathic and neurotic delinquents) are compared.
b A correctional population is compared with a noncorrectional population.
c Contains all studies that could not be unequivocally classified in the other four categories.

This article is divided into five major
sections. The first section examines the
intrusion of certain value assumptions on the
research enterprise and the methodological
stagnation that we believe has resulted from
these intrusions. The second section
confronts the shortcomings fostered by
trait-derived methodologies, including the
problem of inadequate construct validity, the
problem of ignoring environmental influences,
and the problem of ignoring subjects. The
third section focuses on external validity
problems stemming from the failure to
incorporate naturalistic dimensions in
research designs. The fourth section
considers some of the difficulties with the
much [...] recidivism measure. The final
section offers a few comments on the ethics
of research with correctional populations.
Each section is concluded with one or more
recommendations.


1 These two journals were selected for intensive
examination because they are the two APA-sponsored
journals most likely to publish research articles
containing correctional populations and therefore
of most potential interest to readers of this
journal.


The Intrusion of Values 


To suggest that values pervade every stage
of the research enterprise from choice of
[...] and selection of research design to
[...] and interpretation of data is only [...]
what numerous others have said [...] (e.g.,
Allport, 1940, 1943; Hudson, [...]; Sarason,
1975). Yet the failure to consider the
potential impact of implicit value
assumptions on the research process is a
ubiquitous shortcoming endemic to much of
social science research (Caplan & Nelson,
1973). In this section, attention is focused
on a few specific conceptual-methodological
restrictions engendered by values pervading
research with correctional populations.

From the review of studies with correctional
populations (see Table 1), two major value
premises emerged: (a) the assumption of
offender deficit and (b) the assumption of
discriminating traits. Although not
exhaustive [...] influences, these two
assumptions [...] almost all of the research
that was [...]. The assumption of offender
deficit is based on the belief that something
must be psychologically wrong with any person
who has committed a crime (Brodsky, 1973).
This assumption appears to have contributed
to much research whose goal is the search for
psychopathology or personality disorder in
prisoners. [...] there is an overemphasis on
[...], psychopathology, and other negative
characteristics. Unfortunately, this emphasis
[...] much behavioral science research with
minority and/or “clinical” groups (Caplan &
Nelson, 1973; Jourard, 1968; West & [...]).
Recently, Caplan and Nelson
(1973) demonstrated this preoccupation in
psychological research with black Americans. 
They examined all studies of black Americans 
appearing in the first 6 months of the 1970 
Psychological Abstracts. In 82% of the stud- 
ies that they classified, the nature of the re- 
search suggested personal deficits as the cause 
of difficulties for black Americans. They
concluded that


the picture that emerges is one of psychologists in- 
vesting disproportionate amounts of time, funds, and 
energy in studies that lend themselves, directly or by 
implication, to interpreting the difficulties of black 
Americans in terms of personal shortcomings.
(Caplan & Nelson, 1973, p. 204)


The major point to be drawn from their 
analysis is the analogous one that the offender 
deficit assumption constricts the research 
questions asked and the methods used. Two
major omissions, for example, of much of the
psychological research on criminal behavior
are (a) a focus on prisoner strengths, for
example, studies of helping behavior or other
positive characteristics, and (b) attention
to the potential impact of situational and
environmental factors in precipitating the
criminal act. These omissions are viewed as a
by-product of the attribution of personal
deficits.
If the causes of delinquency are defined in
negatively valenced, person-centered terms 
(e.g., impulsivity, inability to delay gratifica- 
tion), then the inclusion of positively valenced 
terms and situational or environmental factors 
in the research design is neglected. 

The second pervading value premise, de- 
rived from nomothetic personality theory, is 
the assumption of discriminating traits. This
assumption suggests that a set of trait
dimensions is applicable to all persons and
that individual differences are to be
identified with different locations on these
dimensions. Moreover, on specific traits
(e.g., impulsivity, aggressiveness),
correctional subjects are presumed to be
distant enough on the same trait continuum to
enable discrimination from noncorrectional
subjects. A corollary assumption
is that these discriminating characteristics are
relatively enduring and cross-situationally con- 
sistent. It is implied that the rank order of 
individuals on any trait should remain the 
same in all situations. This assumption has 
given rise to methods (as in the personality 
discrimination model) that ignore person-
situation interactions and the advantages of
idiographic strategies for personality assess- 
ment and prediction. Elaboration of these 
omissions is discussed elsewhere in this article 
(see The Problems of Ignoring Environments 
and Ignoring Subjects). At this point, it is
necessary only to emphasize that in our opin- 
ion, conceptions of the personality of criminal 
offenders as internal behavior dispositions 
have resulted in a constriction of methods, a 
correspondingly reduced predictive utility, 
and incomplete conceptions of criminal
behavior.


Recommendation 


Investigators should recognize that the 
value assumptions of offender deficit and dis- 
criminating traits underlie much of the psy- 
chological research with correctional popula- 
tions. Future research should focus on 
strengths as well as weaknesses and the de- 
velopment of experimental designs that do 
not neglect environmental influences and 
person-situation interactions.


Problems with Trait-Derived Methodologies 
in Correctional Population Research 


Recently, [...] have argued [...] theory
differences are inextricably linked to
differences in [...] As pointed out in the
previous section, trait [...] modes [...] is
a testimony to the currently functional
status of this preoccupation. Despite its
popularity in correctional research, the
assumptions of the trait model have been
frequently assailed. For example, an
abundance of empirical evidence tending to
refute the cross-situational consistency
assumption has emerged from three sources
(Endler & Magnusson, 1976): (a) research
(partitioning the variance into person,
situation, and person-situation interaction
components) that demonstrates that
person-situation interactions account for
more of the behavioral variance than persons
alone and situations alone (Bowers, 1973;
Endler & Hunt, 1966; Endler & Magnusson,
1976; Mariotti & Paul, 1975; Moos, 1968,
1969; Rausch, Farbman, & Llewellyn, 1960);
(b) correlational research strategies in
which correlations of same traits in
different situations are found to be
moderately low (Hartshorne & May, 1928;
Magnusson, Heffer, & Nyman, 1968; Rushton,
1976); and (c) studies (simultaneously
including personality and situational
variables as independent measures in the
design) that reveal interaction effects
rather than main effects as the true state of
affairs (Baron, Cowan, Ganz, & McDonald,
1974; Cronbach & Snow, 1975). From a review
of these and other relevant studies, various
investigators have concluded that there is a
paucity of empirical support for the
assumption of cross-situational consistencies
of behavior (Argyle & Little, 1972; Endler,
1973, 1975a, 1975b; Endler & Magnusson,
1976; Mischel, 1973, 1977). With the
acceptance of this conclusion, the task in
this section is to focus specifically on the
limitations of trait-derived methodologies as
they have been used in personality research
with correctional populations. Three of these
problems, construct validity, ignoring
environments, and ignoring subjects, are
examined in the following discussion.
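
The variance-partitioning strategy described under (a) can be sketched for the simplest case of one score per person in each situation. The function and the 3 x 3 data below are hypothetical; with a single observation per cell the interaction term cannot be separated from error, so the residual here stands for both. The data were constructed so that rank order reverses across situations, the opposite of what the cross-situational consistency assumption requires.

```python
def partition_variance(scores):
    """Split the total sum of squares of a persons x situations table.

    scores[p][s] is the score of person p in situation s. Returns the
    proportion of the total sum of squares attributable to persons, to
    situations, and to the residual (interaction confounded with error).
    """
    n_p, n_s = len(scores), len(scores[0])
    grand = sum(map(sum, scores)) / (n_p * n_s)
    person_means = [sum(row) / n_s for row in scores]
    situation_means = [
        sum(scores[p][s] for p in range(n_p)) / n_p for s in range(n_s)
    ]
    ss_person = n_s * sum((m - grand) ** 2 for m in person_means)
    ss_situation = n_p * sum((m - grand) ** 2 for m in situation_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    if ss_total == 0:
        raise ValueError("scores are constant; nothing to partition")
    ss_residual = ss_total - ss_person - ss_situation
    return {
        "person": ss_person / ss_total,
        "situation": ss_situation / ss_total,
        "interaction": ss_residual / ss_total,
    }


# Hypothetical ratings of three persons in three situations.
ratings = [
    [8, 2, 5],
    [2, 5, 8],
    [5, 8, 2],
]
shares = partition_variance(ratings)
print(shares)
```

In this deliberately extreme table every person and every situation has the same mean, so all of the variation falls in the residual term; real data would show a mixture of the three components.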


The Problem of Construct Validity 


One salient problem that emerges from ne 
trait-derived personality discrimination stu : 
ies is the lack of attention given to ane 
validity issues (Cronbach & Meehl, 195 i 
Messick (1975) defined construct validity 


é a a a 
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the process of marshaling evidence in the 
form of theoretically relevant empirical rela- 
fins to support the inference that an ob- 
sved response consistency has a particular 
meaning” (p. 955). Campbell (1960) outlined 
iwo requirements of construct validity, which 
he labeled trait and mnomological validity. 

ait validity refers to the requirement that 
the domain of observables that purportedly 
measure a particular construct must demon- 
trate internal consistency when put to the 
tmpirical test. This means that measures of 
fhe same construct must correlate highly with 
ne another in individual difference studies 
md/or be similarly affected by experimental 
lteatments. Nomological validity refers to the 
fequirement that the measures of one con- 
ruct behave in accordance with the network 
tf relations to other constructs as derived 
fom a formal theoretical system. 

The trait validity requirement suggests that 
two or more measures of a personality dimen- 
ion (e.g., impulsivity) should be substan- 
lilly interrelated. For example, if institution- 
‘lized delinquents were to exhibit more im- 
þulsivity than nondelinquents on the Barratt 
Impulsivity Scale (Barratt, 1959), then evi- 
tence for construct validity would be obtained 
it this greater impulsivity also manifested it- 
lf on a second measure (e.g., the Hirschfield 
Inpulsivity Scale; Hirschfield, 1965). In ad- 

tion, both measures of impulsivity should 

have similarly with regard to discriminat- 
Ng a delinquent from a nondelinquent sample. 
Construct validity could be questioned if the 
poe sample scored significantly differ- 
from the nondelinquent sample on the 
Sr Scale but no dieng 
verse findi i w 

‘dd ay ing emerged on the Hir: 

Although it is important that measures of
the same construct exhibit convergence,
Campbell and Fiske (1959), in a classic, albeit
often ignored, article, maintained that
construct validation requires evidence of
discriminant as well as convergent validity.
Evidence for discriminant validity is provided
when measures of one construct are found to
have weak interrelations with measures of
distinctly different constructs.

Following the discriminant validity requirement,
evidence for construct validity would
be weakened if personality measures of one
dimension (e.g., impulsivity) correlated too 
highly with personality measures of another 
(e.g., nurturance). In the articles examined, 
the concept of discriminant validity was never 
mentioned. Clearly, a problem with the re- 
search that seeks to determine the extent to 
which correctional subjects differ from non- 
correctional subjects on some construct di- 
mension is the failure to use multiple mea- 
sures of both the focal construct and different 
constructs. As such, there tends to be little 
evidence for either convergent or discriminant 
validity in most studies of correctional 
populations. 

A second construct validity problem of the 
personality discrimination studies hinges on 
the failure to take into account that each 
measure of a construct represents a trait and 
a method unit. The fact that two measures 
of impulsivity are interrelated may reflect 
only that they were measured by the same 
method (e.g., self-report) and not that they 
“tap” the same construct. Moreover, the re- 
liance on such methods as self-report ques- 
tionnaires and rating scales in the personality 
discrimination studies suggests the possibility 
that the observed consistencies may be more 
in response to method constraints than true 
differences on the construct. It is possible, for 
example, that an “oppositional” or “devia- 
tion” response set in which correctional popu- 
lations choose responses that they believe 
differ from expectations could account for 
consistent performances on self-report questionnaires.
Campbell and Fiske (1959) proposed the
multitrait-multimethod matrix as a solution to
both the convergent-discriminant validity
problem and the problem of method-trait
confounds. In this triangulation method, two or
more traits are each measured by two or more
methods, and all of the resulting measures are
intercorrelated. This procedure would suggest,
for example, that the two traits, impulsivity
and hostility, be measured by at least two
methods (e.g., self-report and behavioral
observations). Evidence for convergent validity
would be obtained if the correlation of the
same trait (e.g., impulsivity) by different
methods (e.g., self-report and behavioral
observation) was high. Evidence for discriminant
validity would be obtained if (a) the
same trait/different method (e.g., impulsivity
by self-report with impulsivity by behavioral
observations) correlations were higher than
different trait/different method (e.g., impulsivity
by self-report with nurturance by behavioral
observations) correlations, and, most
importantly, (b) the same trait/different
method correlations were also higher
than different trait/same method (e.g., impulsivity
by self-report with nurturance by
self-report) correlations. In effect, evidence
for construct validity is obtained when measures
with only trait variance in common
exhibit higher correlations with each other
than do measures with only method variance
in common and measures with neither trait
nor method variance in common.
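
The two comparisons described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the measure names, the helper functions (`classify_pairs`, `check_criteria`, `make_corr`), and every correlation value are hypothetical and are not taken from Campbell and Fiske (1959) or from any study reviewed here. The sketch sorts each pairwise correlation into one of the three groups named in the text and then asks whether the weakest same trait/different method correlation exceeds the strongest correlation in each of the other two groups.

```python
from itertools import combinations

# Each measure is a (trait, method) pair; names and values are hypothetical.
MEASURES = [
    ("impulsivity", "self_report"),
    ("impulsivity", "observation"),
    ("nurturance", "self_report"),
    ("nurturance", "observation"),
]


def classify_pairs(measures, corr):
    """Sort every pairwise correlation into the three comparison groups."""
    groups = {
        "same_trait_diff_method": [],
        "diff_trait_diff_method": [],
        "diff_trait_same_method": [],
    }
    for a, b in combinations(measures, 2):
        r = corr[frozenset((a, b))]
        same_trait = a[0] == b[0]
        same_method = a[1] == b[1]
        if same_trait and not same_method:
            groups["same_trait_diff_method"].append(r)
        elif not same_trait and not same_method:
            groups["diff_trait_diff_method"].append(r)
        elif not same_trait and same_method:
            groups["diff_trait_same_method"].append(r)
    return groups


def check_criteria(groups):
    """Apply criteria (a) and (b) using the weakest convergent correlation."""
    weakest = min(groups["same_trait_diff_method"])
    return {
        "a": weakest > max(groups["diff_trait_diff_method"]),
        "b": weakest > max(groups["diff_trait_same_method"]),
    }


def make_corr(convergent, hetero_hetero, hetero_mono):
    """Build a hypothetical correlation lookup for the four measures above."""
    imp_sr, imp_ob, nur_sr, nur_ob = MEASURES
    return {
        frozenset((imp_sr, imp_ob)): convergent,
        frozenset((nur_sr, nur_ob)): convergent,
        frozenset((imp_sr, nur_ob)): hetero_hetero,
        frozenset((imp_ob, nur_sr)): hetero_hetero,
        frozenset((imp_sr, nur_sr)): hetero_mono,
        frozenset((imp_ob, nur_ob)): hetero_mono,
    }


# Case 1: trait variance dominates, so both criteria are met.
trait_dominant = check_criteria(classify_pairs(MEASURES, make_corr(0.60, 0.10, 0.30)))
# Case 2: shared method (e.g., self-report) inflates correlations, so (b) fails.
method_dominant = check_criteria(classify_pairs(MEASURES, make_corr(0.40, 0.10, 0.70)))
print(trait_dominant)
print(method_dominant)
```

In the second hypothetical case, two self-report measures of different traits correlate more strongly than two measures of the same trait obtained by different methods, which is the method-trait confound discussed above.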

In summary, multitrait-multimethod tech- 
niques have seldom been used in research 
with correctional populations. As a result the 
construct validity of those measures in cur- 
rent use for discriminating correctional from 
noncorrectional populations remains uncertain 
at best. Although recognizing that evidence 
for construct validity is obtained when mea- 
sures of a particular construct consistently 
discriminate between population groups pre- 
sumed different on that construct, we contend 
that certain facets of the construct validation 
process have been paid relatively little at- 
tention. It is perhaps worth noting that in 
one study (Saunders, Reppucci, & Sarata, 
1973) that attempted to marshal evidence for 
the construct validation of impulsivity as a 
trait characterizing delinquents, the investigators
concluded that there was no evidence for
the hypothesis that delinquent behavior is
related to impulsivity or that existing measures
of impulsivity tap the same dimension.
Although the process of construct validation
is a time-consuming venture, requiring multiple
studies and the gradual accumulation
of circumstantial evidence around a construct,
its importance should not be underestimated.
Personality labels that are supposedly
descriptive of inherent traits of correctional
populations often contribute to decisions re- 
garding incarceration and to the development 
of intervention programs. 


Recommendation 


Increased attention in personality research
with correctional populations should be given
to the construct validation process. As a
minimum requirement, research designs should 
include multiple measures of each attribute 
or construct under investigation with those 
measures reflecting at least two different meth- 
ods. Correlations among measures should be 
obtained. 


The Problem of Ignoring Environments 


The main criticism of the trait model has
been its empirical neglect of situational factors
as they affect an individual’s behavior
(Endler & Magnusson, 1976). Consistent
with this contention, the frequent use of
trait-derived methodologies in correctional
population research has resulted in the ignoring
of possible interactive influences of environmental
and situational factors in research
designs. Significant predictive and generalizability
limitations have been the consequence.
This “problem of ignoring environments” has
manifested itself in at least three forms in the
personality discrimination studies and one
form in the adjustment prediction studies.

First, with an assumption of cross-situational
consistency ostensibly prevailing, personality
discrimination studies have monitored
a narrow universe of response modes in
a narrow universe of situations. In a typical
study, for example, subjects consisting of
either matched groups of correctional population
subtypes (e.g., neurotic and psychopathic
delinquents) or a correctional group
and a noninstitutional nonoffender control
(e.g., high school students) are administered
individually or in small groups an array of
self-report questionnaires. Any differences
that emerge between the groups are viewed
as relatively stable and situation-free personality
characteristics. The possible differential
influences of situational factors are not
considered. This holding of situations constant
within and between studies has sustained
a bias toward proving the trait notion
(therefore increasing the likelihood of positive
results).

Although incarcerated delinquents may, in
fact, prefer immediate over delayed gratification
more so than nonincarcerated delinquents
as assessed by responses to hypothetical
questions, the relative position of these
groups on the delay of gratification continuum
may shift as situational factors change (e.g.,
as the delinquents are released). In fact,
Mischel (1973) has pointed out that to predict
a subject’s delay of gratification behavior,
one may need to know such moderator
variables as how old he/she is, the experimenter’s
sex, the particular objects for which
the subject is waiting, and the consequences
of not waiting. Given the possible influences
of similar situational variables and their potential
interaction with personality variables,
the predictive utility or generalizability of
good discriminators may be significantly
impaired.² What is neglected in trait-derived
methodologies is the attempt to answer the
question, How are persons who end up in
prisons different from each other and from
noncorrectional populations with regard to
specific personality characteristics as measured
across specific situations? Clearly, then,
to determine Personality X Situation interactions,
there is a need to incorporate both
situational and personality variables as
independent measures in research designs.

A second form of ignoring environments
centers on the inadequate description of the
experimental situation itself. The typical
personality discrimination study cursorily
mentions that an experimenter (whose characteristics
are not described) administered a
series of questionnaires (or tasks) in one or
more settings (which are also usually not
described) to two or more groups of subjects.
Both internal and external validity problems
arise. The internal validity problem is
that the microsetting of the experiment,
including the personal characteristics of the
experimenter, may interact differently with
different groups and significantly influence any
observed group differences. The external validity
problem is that inadequate descriptions
make generalizability across experimenters,
situations, and tasks a most hazardous business.
One area of inadequate description focuses
on the experimenter. Although various experimenter
characteristics have been shown to
have effects on the behavior of
subjects in experimental situations (e.g., Barber
& Silver, 1968; Rosenthal, 1963, 1968),
McGuigan’s (1963) characterization of the
experimenter as “the neglected stimulus object”
certainly holds for the studies reviewed
for this article. The fact that such basic ex- 
perimenter characteristics as sex, age, and 
attractiveness may differentially affect the 
performance of incarcerated versus nonincar- 
cerated males and/or females is simply not 
taken into account. In only 1 of the 28 pub- 
lished personality discrimination studies was 
there any description of the experimenter. 
This prevailed despite the fact that in 19 
of these studies, subjects were evaluated in- 
dividually—a situation that permits poten- 
tially greater experimenter impact. Moreover, 
although 18 of these studies had multiple 
authors, which presumably means a high
likelihood of multiple experimenters (Mc- 
Guigan, 1963), no studies attempted to in- 
vestigate experimenter characteristics as an 
independent variable. This state of affairs 
seems increasingly dubious in view of a recent 
wave of investigations demonstrating that dif- 
ferent experimenters obtain significantly dif- 
ferent results with different subjects (Ru- 
menik, Capasso, & Hendrick, 1977; Silverman, 
1974; Silverman, Shulman, & Wiesenthal, 
1972). 

Neglecting the effect of experimenter characteristics
engenders an internal validity problem;
namely, the same or different experimenters
may affect the performance of subjects
in different groups in different ways.
Stewart and Resnick’s (1970) verbal conditioning
study illustrates how this difficulty
might manifest itself. These authors compared
33 incarcerated delinquent males with a control
group of 33 boys from a nearby high
school on a task measuring conditionability
of aggressive and dependency verbs. All subjects
were tested individually by an experimenter
(or experimenters) who was (were)
not described. In accordance with predictions,
the delinquent group rejected the dependency
verbs more so than did the nondelinquent
group. The authors concluded that delinquents
found it more difficult to express dependency
behavior than nondelinquents. Although
not necessarily an unreasonable conclusion
given a recognition of generalizability
limitations, it is nonetheless quite possible
that undescribed experimenter characteristics
were a (or the) major contribution to
performance differences between the two
groups. For example, depending on the
experimenter’s age, sex, and perceived status,
he or she may have engendered less (or more)
cooperation from the incarcerated delinquent
group than from the high school group. Thus,
differences between the groups may have resulted
from the type of experimenter(s) used
rather than from personality differences
between the boys. The point is that we do
not know, because no information regarding
the effect of experimenter characteristics was
provided.

² A particular measure may reveal a significant
difference between groups (and thus be a good
discriminator) and yet [...]. Therefore, the
discrimination-to-prediction [...] occur after
empirical investigations [...] the predictive
utility of good discriminators.

Experimenter neglect also constitutes an
external validity (generalizability of findings)
problem. Levine (1974) captured the crux of
this concern:


The generalization possible from any experiment is 
limited to the set of conditions that includes a defi- 
nition of that universe of experimenters and their 
behavior, of which the experimenters used in the 
experiment are a sample taken according to a well- 
defined principle of sampling. (p. 663) 


Beyond the ignoring of possible experi- 
menter effects, there tends to be both inade- 
quate observation and description of the 
microculture of the experimental situation. 
The physical characteristics of the testing 


of experi- 
develop studies that compare different subject
types across different situations and that use
more naturalistic, nonreactive measures. While
realizing the impossibility of incorporating a
multitude of potential moderator variables
into any single experimental design, we suggest
that a more ethological perspective, with
a concomitant sharpening of observational
and descriptive skills, may be useful in the
understanding of inconsistencies and the possible
influences of higher order interaction
effects.

The primary concern is that researchers 
avoid misinterpreting Morgan’s canon. As 
Cronbach (1975) stated: 


From Occam to Lloyd Morgan, the canon has re- 
ferred to parsimony in theorizing, not in observing. 
The theorist performs a dramatist’s function; if a
plot with a few characters will tell the story, it is
more satisfying than one with a crowded stage. But
the observer should be a journalist, not a dramatist.
To suppress a variation that might not recur is bad 
observing. (p. 124) 


The ignoring of the impact of correctional 
institutions on the behavior of offenders sig- 
nifies a third form of environmental neglect. 
This neglect spawns two kinds of problems 
for personality research with correctional 
populations. The first problem centers on the 
failure in studies which search for discriminating
personality characteristics of criminal
offenders to extricate the effects of institutionalization
from the effects attributable to
offender characteristics. The subset of per- 
sonality discrimination studies that compares 
institutional and noninstitutional populations 
illustrates this problem. In only 1 (Landau, 
1976) of the 10 published studies was there 
any attempt to control for the impact of in- 
stitutionalization (independent of subject 
characteristics) on performance of the dependent
measures.

Thus, differences that have emerged between
correctional and noncorrectional populations
may be more a function of the impact
of institutionalization than of differences
between the groups on particular personality
characteristics. Certainly commentaries on
the possible profound effects of institutional
environments on behavior (e.g., [...],
1961; Zimbardo, 1973) alert us to this likelihood.
Landau (1976), in a study of time
orientation, recently attempted to remedy this
problem. He included subject groups of
institutionalized delinquents, noninstitutionalized
delinquents (delinquents on probation),
institutionalized nondelinquents (soldiers),
and noninstitutionalized nondelinquents (students
in a vocational training school). He
also controlled for length of institutionalization
for the incarcerated delinquents and the
soldiers. The results provided strong evidence
for the independent effect of institutionalization
on various aspects of past, present, and
future time perspective. Although (as might
be expected) the army proved to be a milder
form of institutionalization than prison, the
effect of institutionalization was nevertheless
substantial. Landau (1976) concluded that


this study provides additional evidence of the strong
and measurable effect of institutionalization on various
aspects of time orientation. This should be considered
as a significant example of the possible effect
that salient situational factors may have on
characteristics and behaviors that are too frequently
considered as stable and situation-free personality
traits. (p. 757)


The second problem associated with the
ignoring of institutionalization effects is the
problem of inducing comparability of research
findings conducted in different correctional
institutions. The problem is accentuated by
empirical evidence suggesting that correctional
institutions differ on a number of dimensions
(e.g., the treatment-security continuum)
and that these differences are reflected
in their effects on inmates (Moos, 1975;
Street, Vinter, & Perrow, 1966). For example,
Street et al.’s (1966) comparative evaluation
of six correctional institutions revealed
that institutional differences in treatment
goals and philosophies were significantly related
to inmate attitudes toward the total
institution, staff, other inmates, and themselves.
Institutions with different orientations (e.g.,
punishment and obedience, reeducation and
development, treatment) clearly had different
effects on their inmates. Moreover, although
some of the variance could be accounted for
by characteristics of the inmate populations,
the independent effects of the organizations
and their different orientations were pronounced.
Additional evidence of correctional
institution diversity comes from Moos’s
(1975) cluster analytic derivation of six distinct
institution types vis-à-vis the Correctional
Institutions Environmental Scale
(CIES). The major point is that regardless 
of whether the CIES, structural-organizational 
assessments, and/or alternative institutional 
description methods are used, institutional 
description strategies of some form are re- 
quired if researchers earnestly seek compara- 
bility of research findings obtained in different 
correctional institutions. 

Furthermore, it should be emphasized that 
correctional institutions are not homogeneous 
settings but contain a diversity of subsettings 
or situations, some of which are common 
across institutions, whereas others may be 
unique to a specific institution. Therefore, 
future researchers should elucidate and di- 
mensionalize the crucial situations within and 
between correctional institutions. With re- 
gard to methods, Magnusson and Ekeham- 
mar’s (1975) suggestion that we study situa- 
tion perception (the meaning that an individ- 
ual assigns to a situation), situation reaction 
(based on individual’s responses to situa- 
tions), and the relationship between these 
dimensions seems particularly worthy of ad- 
herence. The Person X Situation interaction 
studies could then take the form of more 
naturalistic studies of person subtypes across 
actual situations within the institution. An
important caveat is in order. The study of 
Person X Situation interactions within cor- 
rectional institutions will not likely give us 
much information about Person X Situation 
interactions outside of the institutional set- 
ting. Obviously, relative to the extrainstitu- 
tional environment, in prisons the number of 
available situations as well as the freedom to 
choose among and capacity to change such 
situations is markedly delimited. Thus, there 
should be no expectation that research with 
incarcerated offenders will shed much light 
on the question of which specific Person X
Situation interactions outside the prison will
be likely to result in the commission of which
specific law-violative behaviors.

The major form of ignoring the environment
in the adjustment prediction studies³
has been the lack of attention paid to characteristics
of the postinstitutional environment
to which the correctional subject returns.
Typically, the array of predictors of recidivism
and postinstitutional adjustment has
consisted of demographic, personality, and
institutional adjustment measures. The availability
of formal support systems (e.g., jobs,
recreational facilities, financial resources,
educational opportunities) and informal social
networks (e.g., family, extended kin, friends),
both of which may have a significant impact
on the process of postinstitutional adjustment
(see, e.g., McArthur, 1974), are seldom included
in the prediction equation. Recently,
Mischel (1977) has argued in a corresponding
situation with mental patients that “accurate
predictions of posthospital adjustment
require knowledge of the environment in
which the ex-patient will be living . . . rather
than any measured person variables or
in-hospital behavior” (p. 251).

³ Adjustment prediction studies contain an array
of personality, demographic, and institutional
performance variables in an attempt to determine the
best predictors (through multiple regression analysis)
of such criterion variables as postrelease adjustment
and [...].

Perhaps the best example of the inability
of the typical psychological and demographic
data to predict the future behavior of prisoners
and others is found in the studies of
the prediction of violent behavior (Megargee,
1970; Monahan, 1976; Wenk, Robison, &
Smith, 1972). Not only has univariate analysis
proved woefully inadequate, but there is
also evidence to suggest that combining a
multitude of person-centered predictors using
sophisticated multivariate analyses is insufficient
(Monahan, 1976). Most studies indicate
a false positive prediction rate of over
80%, and the most accurate prediction of true
positives (46%) still mispredicted 54% of
the subjects (Monahan, 1976). In addition,
Heller and Monahan (1977) pointed out that
“the studies that report the highest percentage
of true positives are generally those with
the weakest methodologies” (p. 137). With
the prediction difficulties encumbering
person-centered approaches to violence prediction,
recommendations have surfaced that ecological
evaluations (e.g., predicting which situations
will elicit violent behavior) replace the
more common individual-oriented strategies.
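
The arithmetic behind these percentages can be made explicit with a short Python sketch. It assumes, as the passage appears to, that both figures are shares of the subjects who were predicted to be violent; the function name `predictive_outcome` and the counts are hypothetical and were chosen only to reproduce the percentages quoted above, not taken from Monahan (1976).

```python
def predictive_outcome(true_positives, false_positives):
    """Return the shares of correct and incorrect 'violent' predictions."""
    predicted_violent = true_positives + false_positives
    if predicted_violent == 0:
        raise ValueError("no subjects were predicted to be violent")
    return {
        "true_positive_share": true_positives / predicted_violent,
        "false_positive_share": false_positives / predicted_violent,
    }


# Hypothetical counts chosen only to reproduce the 46% / 54% split cited above.
best_case = predictive_outcome(true_positives=46, false_positives=54)

# Hypothetical counts for a study at the 80% false positive mark.
typical_case = predictive_outcome(true_positives=20, false_positives=80)

print(best_case)
print(typical_case)
```

Under this reading, even the most accurate study would have been wrong about more than half of the people it labeled as likely to be violent.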


Recommendations 


1. Psychologists should adopt a cautious 
approach to inferring behavioral consistency 
based on responses to a specific task under a 
specific set of experimental conditions. Adequate
descriptions of the experimental situation
should be provided. When possible, multiple
situations should be incorporated into
the research design as independent variables. 
Data analysis should account for Person X 
Situation interactions. 

2. Adequate descriptions of the experi- 
menter(s) should be provided. When possi- 
ble, multiple experimenters should be incor- 
porated into the research design as indepen- 
dent variables. 

3. The factor of institutionalization should 
be controlled for in the experimental design. 
In addition, institutional description strategies
(e.g., structural-organizational assessments)
should be used to facilitate comparisons
of correctional institutions and their
differential effects. Future research should 
begin to dimensionalize the important situa- 
tional variables within and between correc- 
tional institutions and to incorporate these 
variables in Person X Situation studies. Fi- 
nally, psychologists should proceed with ex- 
treme caution, if at all, in drawing conclusions 
about extrainstitutional Person X Situation 
interactions from interaction studies conducted
within the correctional environment.

4. Ecological evaluations that incorporate
measures of the postinstitutional environment
of prisoners, such as existing support systems
and social networks, should become a standard
part of adjustment prediction studies.
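
As an illustration of what accounting for a Person X Situation interaction involves in the simplest 2 X 2 case, the following Python sketch compares the group difference in one situation with the group difference in another. It is a minimal sketch under stated assumptions: the group and situation labels, the scores, and the function names (`group_gap_by_situation`, `interaction_contrast`) are all hypothetical, and a real analysis would add a significance test (e.g., a factorial analysis of variance) rather than inspect cell means alone.

```python
from statistics import mean

# Hypothetical scores on some measure, keyed by (group, situation).
cells = {
    ("delinquent", "institution"): [8, 6, 7],
    ("nondelinquent", "institution"): [4, 5, 3],
    ("delinquent", "community"): [5, 4, 6],
    ("nondelinquent", "community"): [4, 5, 6],
}


def group_gap_by_situation(cells, group_a, group_b):
    """Mean difference between two groups, computed within each situation."""
    situations = sorted({situation for _, situation in cells})
    return {
        s: mean(cells[(group_a, s)]) - mean(cells[(group_b, s)])
        for s in situations
    }


def interaction_contrast(gaps, situation_1, situation_2):
    """Difference of differences: nonzero values suggest an interaction."""
    return gaps[situation_1] - gaps[situation_2]


gaps = group_gap_by_situation(cells, "delinquent", "nondelinquent")
contrast = interaction_contrast(gaps, "institution", "community")
print(gaps)
print(contrast)
```

With these made-up scores, the groups differ inside the institution but not in the community, so a study that sampled only the institutional situation would report a group difference that does not generalize.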




The Problem of Ignoring Subjects 


A third pervasive problem of trait-derived
methodologies is the problem of ignoring subjects.
Partially as a result of the preoccupation
with the nomothetic assumption that a
set of trait dimensions is universally applicable
and that all individuals can be identified
with different locations on these dimensions,
the correctional subject (as most others) has
been neglected as an expert on his/her own
behavior.

In not one of the studies listed in Table 1
were the subjects asked about their perceptions
of, attitudes toward, or behavior in the
research in which they were participants. Yet,
subjects, in most cases, are not passive recipients,
even if cooperative; rather, they
will develop hypotheses about the nature and
purposes of the investigation. These hypotheses
may be interpreted as demand characteristics
that influence the subject’s behavior
and responses.

One potentially problematic demand characteristic
is the general attitude of the correctional
subject toward research and the
concomitant motivation for participation. In
comparison with free-world citizens, the attitudes
of prisoners toward participation in
research have been overwhelmingly positive
(Wilson & Donnerstein, 1976). Correctional
subjects may believe that participation in research
is an indication of “good behavior” and
therefore linked to getting out. Another possibility
is that prisoners are motivated to participate
in research by their boredom and the
generally negative aspects of prison living
(Brodsky, Note 2).

A second potential demand characteristic
problem has been referred to as multiple treatment
interference (Campbell & Stanley, 1963).
The essence of this difficulty is that the effect
of prior treatments and research participation
is not erasable. This refers to multiple treatments
in the same experiment as well as to
research experience in other experiments.
Brody (Note 3) provided an example of this
problem. In an attempt to manipulate the
[...] of a delayed and immediate reward for
an institutionalized delinquent and a noninstitutionalized
“normal” sample, the delinquents,
in contrast to previous research findings,
consistently chose the delayed reward
across all experimental conditions. After
interviewing some of his subjects, Brody attributed
this result to the fact that similar research
had been carried out in that institution only
a short time earlier and that the subjects were
no longer naive to the manipulations.

It may well be that the best way to alleviate
the possible effect of these and other
demand characteristics would be to enlist the
subject as an expert on his/her own behavior,
both in the experimental situation and
in general. Recently, Bem and Allen (1974) 
demonstrated the wisdom of enlisting the sub- 
ject’s self-knowledge to increase predictive 
power. They hypothesized that individuals 
who identified themselves as consistent on a 
particular trait dimension would behave more 
consistently cross-situationally than those who 
identified themselves as highly variable. Their 
results supported the hypothesis, demonstrating
consistency for some of the people some
of the time. They then argued that a shift to
idiographic rather than nomothetic assumptions
about individual differences would improve
predictive utility and that more attention
should be paid to both persons and situations.
This shift is supported by Jones and
Nisbett (1971), who demonstrated that when
explaining the behavior of others, people tend 
to invoke consistent personality dispositions, 
but when explaining their own behavior, they 
consider specific situations. 

In summary, we agree with Mischel (1977) 
in arguing for a move toward more idiographic 
functional analyses: 

The moral, for me, is that it would be wise to allow 
our “subjects” to slip out of their roles as passive 
“assessees” or “testees” and to enroll them, at least
sometimes, as active colleagues who are the best experts
on themselves and are eminently qualified to
participate in the development of descriptions and
predictions, not to mention decisions, about themselves.
(p. 249)

However, we also agree with Allport (1937) 
that idiographic and nomothetic methods are 
“overlapping and contributing to one another” 
(p. 22). Thus it would be unproductive to 
scuttle one approach in favor of the other. 


Recommendation 


Psychologists should take the role of the 
subject as expert seriously, since subjects 
know more than anyone else about situational 
influences on their own behavioral consistency. 
At the minimum, a postexperimental inquiry
should be used as a standard experimental
procedure to gather data on the operation of
demand characteristics; at the maximum, the
subject should be enrolled as “informant” and
active participant in the exploration of his/her
behaviors.


Naturalness and the Problem of
External Validity


External validity concerns the extent to 
which the findings of a study are generaliza- 
ble beyond the specific domains of that study. 
The relevant questions are directed at generalizability
across subjects, experimental conditions,
time, settings, and behaviors. Intuitively,
external validity is limited to the extent
that the set of conditions under which the
results were obtained differs from the set of
conditions to which one wants to generalize.
Moreover, if generalization to real-world con- 
ditions is desired, then the closer the experi- 
mental conditions resemble real-world condi- 
tions, the greater is the potential external 
validity. Focusing on these assertions, the 
artificiality and concomitant generalizability 
limitations of personality-oriented research 
with correctional populations are viewed as 
a central methodological problem. 

Tunnel (1977) has recently conceptualized 
three independent dimensions of naturalness— 
setting, behavior, and treatment—that he ar- 
gues should be incorporated into research de- 
signs to increase generalizability of experi- 
mental results. Tunnel defined these dimen- 
sions as follows: (a) natural setting = “a 
context outside the lab to which a person is 
naturally exposed” (p. 427); (b) natural be- 
havior = “one that is not established or 
maintained for the sole or primary purpose of 
conducting research; the behavior is part of 
the person’s existing repertoire” (p. 426); and 
(c) natural treatment = “a natural, discrete 
event, temporally bounded, that would have 
occurred without the researcher’s presence” 
(p. 427). 

Even though research with prisoners has 
quite often been conducted in natural settings, 
Tunnel’s (1977) natural behavior and natu- 
ral treatment dimensions have rarely been 
used in research designs. For example, of the
28 personality discrimination studies listed in
Table 1, only 3 used natural behaviors as dependent
measures [...] to true-false or agree-disagree
personality items. Clearly, in these tasks, neither
behaviors nor treatment bears much resemblance to
real-life conditions. As Mischel (1977) so 
aptly put it: 

In the conditions of real life, the psychological “stimuli”
that people encounter are neither questionnaire
items, nor experimental instructions, nor inanimate
events, but involve people and reciprocal relationships.
(p. 248)


As a consequence of ignoring these natural- 
istic dimensions, the results of correctional 
population research are severely limited in 
their generalizability across behaviors and to 
real-world situations. Again, Mischel’s (1977)
comment accurately reflects our position: 


The future of personality measurement will be 
brighter if we can move beyond our favorite pencil- 
and-paper and laboratory measures to include direct 
observation as well as unobtrusive nonreactive mea- 
sures to study lives where they are really lived and 
not merely where the researcher finds it convenient 
to look at them. (p. 248) 


An article recently reviewed for JCCP il- 
lustrates the artificiality problem and the 
concomitant difficulty of generalizing across
experimental conditions. The goal of the re-
search was to elucidate components of model-
ing techniques and juvenile offender types
potentially important in the design of treat-
ment programs. As independent variables, two
offender types (immature inadequate and so-
cialized subcultural) and two model status
dimensions (peer and staff) were studied, with
imitation of self-reward criteria on a pursuit
rotor task as the dependent variable. Con-
sistent with the predictions, interaction effects
were obtained in which offender types were
found to be differentially susceptible to status
of model influences. The authors concluded
that “the results of this present study have
confirmed the efficacy of the application of
modeling techniques to the treatment of juve-
nile offenders.” Although this study was rela-
tively free from internal validity problems,
the authors’ conclusion seemed unjustifiable
from the standpoint of external validity.
Given that a Model Status × Offender Type
interaction was demonstrated within the mi-
croculture of this research, the potential im-
portance of these influences to the treatment
process brings into focus questions of increased
complexity. For example, the nature of Model
Status × Offender Type interactions may be
quite dependent on broader social contextual
influence in the natural environments of juve-
nile offenders. Certainly a particular setting
(e.g., a correctional facility) may have con-
siderable influence on what type of offender
will imitate what type of model. Even though
it is recognized that this criticism is rather
pervasive and certainly not unique to this
specific study, the authors provided no indi-
cation of an awareness of the complex nature
of these issues.

A second external validity problem high-
lighted by this study centered on the choice
of dependent measures. The imitation of self-
reward criteria on a pursuit rotor task is quite
distant from treatment goals relevant to juve-
nile offenders. Yet the investigators, as is
often the case, suggested the extension of
their method to treatment on the basis of the
experimental results. Even with a more face-
valid dependent measure (e.g., a treatment-
relevant activity such as an increase in posi-
tive peer interactions), demonstrated general-
izability across situations should be a goal.

Psychological research with correctional
populations has unfortunately concentrated
on artificial rather than naturalistic paradigms
and measures. As a result, external validity
issues have been largely neglected. As with
most psychological investigations, the focus
has been on internal validity issues. This un-
derscores the internal versus external validity
dilemma, in which more restrictive controls to
diminish internal validity problems have
often led to increased exter-
nal validity problems. Since the ultimate aim
of research with correctional populations is
generalizability to natural behaviors and situ-
ations, more attention should be paid to ex-
ternal validity issues even if it means initial
reduced confidence in experimentally derived
relationships. To be able to record accurately
performance on a pursuit rotor task provides
little or no information regarding
naturally occurring behavior. A shift in focus
toward natural behaviors and treatments
could function as a catalyst for the de-
velopment of improved methods for dealing
with internal validity problems under condi-
tions of naturalness.


Recommendation


In an effort to increase external validity, 
psychologists should incorporate both natural 
treatments and natural behaviors, as well as 
natural settings, into research designs. The 
use of direct observation in natural settings 
and the concomitant search for and use of
nonreactive, unobtrusive measures should be
encouraged. 


Recidivism: A Measure in Need 
of Refinement 


Although there has been much written in
the criminology/sociology literature (e.g.,
Adams, 1974; Empey & Erickson, 1972; 
Kirby, 1954; Lerman, 1968) regarding prob- 
lems inherent in measures of recidivism (an 
index of renewed criminal behavior following 
contact with the criminal justice system), we 
were unable to locate any discussion of the 
limitations of this measure in a psychological 
journal. Yet, recidivism, defined in its most 
usual form as reincarceration following release 
from a correctional institution, was used as 
both an independent variable in several of 
the personality discrimination studies and a 
dependent measure in the adjustment predic- 
tion and treatment evaluation studies. In the 
former, subjects were classified as recidivists 
or nonrecidivists, after which their perform- 
ance on various dependent measures of 
personality (obtained during their previous 
incarceration) was assessed in an effort to de- 
termine which measures discriminated be- 
tween the groups. In the latter types of stud- 
ies, a host of demographic and personality 
variables were used as predictors of recidi-
vism. There was, however, no indication that
the investigators had any misgivings about 
or were concerned with the methodological 
problems of the recidivism measure itself. The 
following is a brief discussion of several is- 
sues that should concern psychological re- 
searchers who use this measure. 

A major problem for recidivism measures 
in both types of studies is the criminal be- 
havior/system discretion confound. In essence, 
recidivism, however defined, involves discre- 
tionary judgments of criminal justice system 
personnel that cannot be extricated from the
criminal behavior dimension that this measure
is ideally designed to tap. From the decision 
to arrest by the police to the decision to re- 
incarcerate by the judiciary or parole official, 
several discretionary judgments are made. At 
the point of reincarceration, the summative 
impact of these judgments maximizes the like- 
lihood that differential biases have operated 
in determining who becomes a recidivist. 
Some offenders (e.g., nonwhite lower-class 
urban youth) may be arrested and returned 
to a correctional facility for relatively minor 
infractions, whereas others (e.g., white mid- 
dle-class suburban youth) may commit the 
same or more serious crimes and remain out 
on parole. Moreover, although recidivism 
rates are sensitive to policy shifts at all levels, 
the sensitivity increases the further one be- 
comes entangled in the criminal justice sys- 
tem process, with the point of reincarceration 
allowing for the maximum policy and admin- 
istrative impact. Sellin (1962), a well-known 
criminologist, has commented that 


the difficulty with statistics drawn from later stages 
in the administrative process is that they may show 
changes or fluctuations which are not due to changes 
in criminality but to variations in the policies or the 
efficiencies of administrative agencies. (p. 64)


One of the best examples of problems en- 
gendered by the criminal behavior/system dis- 
cretion confound comes from the California 
Youth Authority’s investigation of the com- 
munity treatment program (Warren, Neto, 
Palmer, & Turner, 1966). First-commitment 
youths were randomly assigned either to ex- 
perimental services in their own communities 
or to a control group situation that involved 
residence in an institution away from home. 
The results indicated that in general, a sig- 
nificantly higher percentage of control than 
experimental youths had engaged in parole 
violations (e.g., parole was officially revoked,
recommitment occurred, or unfavorable dis-
charge from the Youth Authority was given).

Although these results may have supported
the contention that the community-based ser-
vices were superior, Lerman’s (1968) reanal-
ysis of the data suggested that the parole
violation rates were quite misleading about
the behavior of the two groups. For example,
despite the higher parole violation rates for
the control youths, the experimentals engaged
in more known delinquent offenses per boy
than did the controls. Moreover, a considera-
tion of seriousness levels revealed that the
experimentals exhibited higher delinquent be-
havior rates for low- and medium-seriousness
offenses, with both groups about equal with
regard to high-seriousness offenses. This find-
ing, however, was confounded by evidence
that the parole officers of the experimental
boys were much more likely to know about
their boys’ offenses than the parole officers of
the controls. Thus, the delinquent behavioral
output may have been the same for the two
groups, but the rate of being noticed was sig-
nificantly higher for the experimental boys
(Lerman, 1968) and the rate of reincarcera-
tion was higher for the controls. The experi-
mental youths thus had lower parole violation
rates but higher rates of recorded delinquent
offenses.

This apparent, albeit spurious, contradic-
tion can be explained in terms of differential
reactions of the experimental and control or-
ganizations to recorded offenses. Although the
parole officers of the experimental boys were
much more likely to notice offenses, they were
also much less likely to handle these cases
via the parole violation method except with
high-seriousness offenses. The lower parole
violation rates obtained by the experimental
group may therefore be due to their parole
officers’ disinclination to handle medium- and
low-seriousness offenses by parole violation.
Thus, the measure of recidivism may actually
be telling us more about the behavior of the
parole officers than that of the boys.
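
The arithmetic behind this apparent contradiction can be sketched in a few lines of Python. The sketch is illustrative only: the class ParolePolicy, the function expected_counts, and every rate below are hypothetical and are not taken from Lerman's (1968) reanalysis. It assumes that both groups commit offenses at the same underlying rate and differ only in how often an offense is noticed and how often a recorded offense is handled as a parole violation.

```python
from dataclasses import dataclass


@dataclass
class ParolePolicy:
    """Hypothetical description of how one parole organization reacts to offenses."""
    detection_rate: float   # chance that a committed offense is noticed and recorded
    revocation_rate: float  # chance that a recorded offense is handled as a parole violation


def expected_counts(offenses_per_youth: float, n_youths: int, policy: ParolePolicy):
    """Return (expected recorded offenses, expected offenses handled as violations)."""
    committed = offenses_per_youth * n_youths
    recorded = committed * policy.detection_rate
    violations = recorded * policy.revocation_rate
    return recorded, violations


# Invented numbers: identical behavior, different organizational reactions.
experimental = ParolePolicy(detection_rate=0.8, revocation_rate=0.2)
control = ParolePolicy(detection_rate=0.4, revocation_rate=0.7)

exp_recorded, exp_violations = expected_counts(2.0, 100, experimental)
ctl_recorded, ctl_violations = expected_counts(2.0, 100, control)

print("experimental:", exp_recorded, "recorded,", exp_violations, "handled as violations")
print("control:     ", ctl_recorded, "recorded,", ctl_violations, "handled as violations")
```

Under these assumed rates the experimental group shows more recorded offenses but fewer parole violations, although the underlying behavior of the two groups is identical by construction.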

As elucidated in this example, the policies
(and concomitant process) for both detection
and handling of offenses may differ across spe-
cific organizations and with regard to particu-
lar offenders. This difficulty combined with
the necessarily judgmental nature of these
tasks may engender serious problems for stud-
ies that depend on recidivism measures.

The criminal behavior/system discretion
confound can also be examined from the per-
spective of Type I and Type II errors and
their varying probabilities throughout the
criminal justice system process. Although it
is unfortunate for researchers concerned with
the reliability of recidivism data, individuals
are understandably reluctant to admit it when
they commit a crime. Therefore, recidivism
is known only when a person is arrested, con-
victed, sentenced, and/or institutionalized.
Throughout this process of formal contact
with the criminal justice system, both kinds
of classification errors are possible. A Type
I error, or false positive, occurs when a per-
son is classified as a recidivist when he/she
did not, in fact, engage in criminal behavior.
A Type II error, or false negative, occurs
when a person is classified as a nonrecidivist
when he/she did, in fact, commit another of-
fense following release from prison. The rela-
tive probability of each error type will vary
with the stage of the criminal justice system
process. At the earliest point in the process,
the point of rearrest by the police, there is
much greater probability of Type I errors
than at later stages. For example, when the
process reaches reincarceration, Type I errors
are minimized by the abundance of legal safe-
guards along the way (unless the person is
on parole and therefore can often be re-
incarcerated without any judicial review).
These safeguards are, of course, designed to
prevent a person from erroneously being sent
back to prison. Although the probability of
Type I errors decreases as one moves further
into the administrative process, the probabil-
ity of Type II errors correspondingly in-
creases. Thus, the Type II error presents the
major problem for studies that define recidi-
vism as reincarceration.
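
The trade-off between the two error types can be made concrete with a small Python sketch. It is a hedged illustration, not an analysis of any study reviewed here: the function classification_errors and all counts are invented, and the sketch assumes that the true offense status of every released person is known, which is exactly what real recidivism data cannot supply.

```python
def classification_errors(true_positive, false_positive, true_negative, false_negative):
    """Return (Type I rate, Type II rate) for a recidivist/nonrecidivist classification.

    Type I rate: share of people who did NOT reoffend but are labeled recidivists.
    Type II rate: share of people who DID reoffend but are labeled nonrecidivists.
    """
    non_offenders = false_positive + true_negative
    offenders = true_positive + false_negative
    type_i = false_positive / non_offenders if non_offenders else 0.0
    type_ii = false_negative / offenders if offenders else 0.0
    return type_i, type_ii


# Invented counts for 100 released persons, 50 of whom actually reoffended.
stages = {
    "rearrest": dict(true_positive=35, false_positive=10,
                     true_negative=40, false_negative=15),
    "reincarceration": dict(true_positive=15, false_positive=1,
                            true_negative=49, false_negative=35),
}

for stage, counts in stages.items():
    type_i, type_ii = classification_errors(**counts)
    print(f"{stage}: Type I = {type_i:.2f}, Type II = {type_ii:.2f}")
```

With these assumed counts, moving the criterion from rearrest to reincarceration lowers the Type I rate and raises the Type II rate, which is the pattern described above.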

Type II errors present at least two kinds
of problems. In each case the diffi-
culty lies with the determination of the non-
recidivist group. First, due to the impossibility
of detecting every offender and the inconsis-
tencies of crime detection systems, many per-
sons classified as nonrecidivists may, in fact,
have committed a crime. In the evaluation of
the community treatment program, for ex-
ample, the fact that the experimental boys
had more recorded offenses than the controls
may have been due to a greater likelihood
of being noticed by their parole officers rather
than to any real difference in delinquent be-
havioral output. Detection system differences
rather than behavioral differences may thus
play a role in the recidivist-nonrecidivist
classification process. Second, with recidivism
defined as reincarceration, all parole viola-
tions and other forms of criminal justice sys-
tem contact short of reincarceration may still
result in a person being classified as a non- 
recidivist. Fortunately, most of the studies 
examined defined a nonrecidivist as a person 
having no contact with the criminal justice 
system. A problem still remains, however. Al- 
though the no-contact and reincarceration 
criteria for nonrecidivist and recidivist may 
assure maximum differences between groups 
(and maximum differences in criminal justice 
system reentanglement), all gradations of law-
violative behavior and criminal justice system
contact in between these extremes are ignored.
Some investigators (e.g., Roberts, Erikson,
Riddle, & Bacon, 1974) improved their meth- 
odology by including all of these gradations 
into a third, some contact, group. Finer dif- 
ferentiations (e.g., by frequency and type of 
offense) with more than three groups could 
yield still more accurate and complete in- 
formation. 

Two additional problems, both related to 
institutional bookkeeping, warrant attention. 
First, individuals are often counted as non- 
recidivists who end up in other institutional 
programs (e.g., mental hospitals), die, or
move out of state where they may engage 
in criminal behavior and even be reincarcer- 
ated. The point is that these individuals, be- 
cause they have not been reincarcerated in 
a correctional facility in the same state, are 
often classified as nonrecidivist. Clearly this 
designation is erroneous. Second, although 
most investigators suggested that they used 
no contact with the criminal justice system 
as a criterion for nonrecidivism classification, 
in no study was there any indication of how 
this information was gathered. The problem 
is that most investigators usually consult only 
one source to gain this information under the 
assumption that there is a coordinated record 
keeping system between all aspects of the 
criminal justice apparatus within a given 
state. This is a faulty assumption. Not only 
does such a coordinated system usually not 
exist, but also depending on which source of 
information one uses (e.g., police records, 
correctional department files, parole officers’
reports, court records), one is likely to find 
different numbers and types of recidivism in-
formation on any given individual. Thus,
conclusions about the extent and nature
of recidivism may vary depending on source
of data. Although these problems may be 
more serious when recidivism is used as a 
measure of treatment program success, re- 
searchers who use this measure in other types 
of studies should be aware of the fact that 
the measure is not as straightforward and in-
ternally valid as is generally assumed.


Recommendation 


Recidivism should be operationally defined 
whenever it is used, and source of data should 
be provided. When used as a measure of post- 
institutional adjustment, recidivism should be 
supplemented by other measures of adjust- 
ment in the areas of education, employment, 
and family life. When possible, recidivism 
measures should be based on multiple grada- 
tions of such factors as extent of contact 
with the criminal justice system, frequency 
of law-violative behavior, and type and seri- 
ousness of offense. 
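
One way to read this recommendation is as a request for an explicit, graded classification rule. The following Python sketch shows what such a rule might look like. All names (RecidivismLevel, FollowUpRecord, classify) and the three-level scheme are assumptions made for illustration; they follow the no-contact, some-contact, and reincarceration grouping discussed above rather than any published coding system, and a finer scheme would add frequency and seriousness of offense.

```python
from collections import Counter
from dataclasses import dataclass, field
from enum import Enum


class RecidivismLevel(Enum):
    NO_CONTACT = 0
    SOME_CONTACT = 1
    REINCARCERATED = 2


@dataclass
class FollowUpRecord:
    """Hypothetical follow-up record for one released person."""
    arrests: int = 0
    parole_violations: int = 0
    reincarcerated: bool = False
    # Data sources consulted, reported so readers can judge coverage.
    sources: list = field(default_factory=list)


def classify(record: FollowUpRecord) -> RecidivismLevel:
    """Apply an explicit operational definition with three gradations."""
    if record.reincarcerated:
        return RecidivismLevel.REINCARCERATED
    if record.arrests > 0 or record.parole_violations > 0:
        return RecidivismLevel.SOME_CONTACT
    return RecidivismLevel.NO_CONTACT


# Invented records for three released persons.
sample = [
    FollowUpRecord(sources=["police records", "court records"]),
    FollowUpRecord(parole_violations=1, sources=["parole officers' reports"]),
    FollowUpRecord(arrests=2, reincarcerated=True, sources=["correctional department files"]),
]
tally = Counter(classify(r).name for r in sample)
print(dict(tally))
```

Recording the sources alongside each case keeps the operational definition and the source of data together, so that both can be reported.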


Ethics of Research with Correctional 
Populations 


This is an area that has attracted much 
professional attention recently but one that 
some investigators may not wish to consider 
as a methodological issue. We, however, con- 
sider it to be so integrally related that a brief 
discussion is warranted within the present 
context.

The National Commission for the Protec- 
tion of Human Subjects in Biomedical and 
Behavioral Research (1977) has recently pro- 
vided recommendations regarding prison re- 
search that have considerable merit. The 
Commission recognized three broad categories: 

1. Research conducted with the goal of im-
proving institutional and program effective-
ness, which includes psychological treatments
having “the intent or reasonable probability
of improving the health or well-being of the
individual prisoner”;

2. Research related to prisoners but not
having a goal of benefiting them (e.g.,
prisoner personality), such as “studies of the
possible causes, effects and processes of
incarceration and studies of prisons as in-
stitutional structures or of prisoners as incar-
cerated persons” (p. 3080); and

N. DICKON REPPUCCI AND W. GLENN CLINGEMPEEL

3. Research using prisoners as subjects be-
cause they are available rather than because
of their status as prisoners (e.g., psychophar-
maceutical testing).

The Commission recommended that a hu- 
man subjects review committee, composed of 
individuals of diverse racial and cultural back- 
grounds, including prisoners, prisoner advo- 
cates, clergy, community representatives, be- 
havioral scientists, and medical personnel, 
should approve all research. This committee 
would consider the risks involved in the re- 
search, the provisions for obtaining informed 
consent, and safeguards to protect confiden- 
tiality and other concerns. Furthermore, pa- 
role boards could not take into account pris- 
oners’ participation in research. Research in 
the second category, that is, nonbeneficial but 
related to prisoners, must additionally “pre-
sent minimal or no risk and no more than
mere inconvenience to the subjects” (National
Commission for the Protection, 1977, p.
3080). Finally, research on prisoners because
of their availability could only be conducted
if three additional requirements were met:
(a) The research must fill an “important so-
cial and scientific need,” (b) it must “satisfy
conditions of equity,” and (c) it must be 
distinguished by “a high degree of voluntari- 
ness” on the part of the prisoner. 

These recommendations deserve close at- 
tention and adherence. As Brodsky (Note 2) 
recently stated in his report to the APA Task 
Force on Psychology and the Criminal Jus-
tice System:

In spite of the fact that there may be some observa-
tional and nonreactive investigations which would
seem to have minimal impact, such as questionnaire
studies, the potential for both misuse and harm to
individuals exists in any study. In these as well as
the more active manipulations of treatment condi-
tions or offenders, it is important that consent forms,
institutional reviews and other elements of protec-
tion for human subjects be included. (p. 57)


Recommendation 


Some acknowledgment that ethical safe-
guards have been used (e.g., informed consent
was obtained) should be included in the
Method section of all research submitted for
publication in APA journals.


Conclusion


In concluding this article it is perhaps 
worth stating that the issues discussed are 
neither exhaustive of the methodological 
problems confronting psychologists doing re- 
search with correctional populations nor are 
they complete discussions in and of them- 
selves. Moreover, the emphasis on personality 
discrimination research may provide the 
reader with a distorted view of the type of
research that is conducted with correctional
populations. Although this may be true, it
is certainly no distortion to suggest that this
is the predominant type of research that is
published in JCCP and JAP. Finally, we do
not expect that all of the issues discussed can
be satisfactorily resolved in any single re-
search project. Nevertheless, we firmly believe
that most of these issues have received little
sustained attention in past psychological re-
search with correctional populations and that
this state of affairs should not continue.


in Marital and Child Treatment Research

Common methodological errors in child and marital treatment research are dis-
cussed, and suggestions are made to help investigators avoid such errors. The
following areas are covered: selection of subjects and therapists, scope and
source of dependent measures, treatment specification, experimental design, and
data analysis and interpretation. Some of the most salient errors include (a) un-
substantiated diagnoses or client labels; (b) very few therapists per treatment
condition; (c) restricted outcome criteria and the lack of reliable, valid depen-
dent measures; (d) failure to provide treatment manuals and to check empir-
ically whether the treatments were actually implemented; and (e) experimental
designs that fail to address issues such as maturation, expectation, nonspecific
relationship factors, and practical significance.


Our purpose is to discuss methodological 
problems that are common to most child and 
marital treatment research. Child and marital 
therapy are addressed as separate activities, 
although we recognize that childhood disor- 
ders are often influenced by marital discord 
and vice versa. That is, even though the 
problems of children and marriage are often 
not independent, we are reviewing the diffi-
culties in research when either the child or
the marital interaction is the focus of treat-
ment. Such specific programs have proven suc-
cessful in treating marital (Gurman & Knis-
kern, 1978) and childhood problems (K. D.
O'Leary & Wilson, 1975). We recognize that
individuals with other treatment orientations
might view child and marital problems only
within a family context and would treat the
whole family (children, parents, and some-
times grandparents) simultaneously. Family
therapy research will not be specifically ad-
dressed here; readers interested in this meth-
odology should consult Wells, Dilkes, and
Burckhardt (1976).

Marital discord has highly diverse etiol-
ogies, and distressed couples present a wide
variety of problems including role conflicts, 
jealousy, individual pathology, physical abuse, 
and sexual dissatisfaction. Despite the varied 
presenting problems, most marital researchers 
and therapists have the commonly accepted 
goals of enhancing communication (D. H. 
Olson, 1970) and increasing satisfaction. The 
outcome research involving these general goals
will be the focus of this article. Those inter-
ested in the research methodology of treat-
ment for specific sexual dysfunctions should 
consult LoPiccolo (1978). 

Two of the most commonly diagnosed child- 
hood problems are conduct disorders (un- 
socialized aggressive reaction) and hyper- 
kinesis (Cerreto & Tuma, 1977), and we will 
illustrate methodological errors in child treat- 
ment research from these areas. Both prob- 
lems are the result of complex psychological, 
social, and biological factors, and the goals of 
treatment are multiple, including increases in 
academic productivity, frustration tolerance, 
attention span, positive parent-child inter- 
actions, and self-esteem. Consequently, the 
childhood problems covered represent areas 
of concern to most child treatment research- 
ers. For those interested in other specific 
childhood problems, such as autism, enuresis,
adolescent suicide, and delinquency, Annual
Progress in Child Psychiatry and Child De- 
velopment (Chess & Thomas, 1977) and 
Handbook of Treatment of Mental Disorders 
in Childhood and Adolescence (Wolman, 
Egan, & Ross, 1978) are recommended as 
sources of discussions on substantive and 
methodological problems. 

We will address common methodological 
problems by noting frequent errors of omis- 
sion and commission and by making sugges- 
tions that can help investigators avoid such 
errors. The methodological errors will be dis- 
cussed in the order of their occurrence as one 
reads a manuscript (American Psychological 
Association, 1974). More specifically, we will
discuss (a) subjects, (b) therapists, (c) de- 
pendent measures, (d) treatment specifica- 
tion, (e) experimental design, and (f) data 
analysis and interpretation of results. The 
suggestions are offered as guidelines, but it is 
recognized that in outcome research with 
child and marital populations, meeting all of 
the suggested recommendations may be im- 
possible. The extent to which they are met, 
however, will clearly increase both the inter-
nal and external validity of the research.


Subjects 
Small Sample Size 


Controlled research in child, and particu-
larly marital, therapy is plagued by the prob-
lem of small subject samples. Enough subjects
should be included in each experimental group
to increase power or the ability to detect
differences. However, the smaller the sample,
the greater the probability of sampling error
and artifacts that would limit the generaliz-
ability of the findings.
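
The dependence of power on sample size can be illustrated with a rough calculation. The Python sketch below is an assumption-laden approximation rather than a procedure taken from the studies discussed here: the function approx_power is hypothetical, it uses the normal approximation for a two-sided comparison of two equal-sized groups, and it expresses the treatment effect as a standardized mean difference.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(effect_size: float, n_per_group: int, alpha: float = 0.05) -> float:
    """Approximate power of a two-sided, two-group comparison of means.

    Normal approximation with equal group sizes; effect_size is the
    standardized difference between the two group means.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = effect_size * sqrt(n_per_group / 2)
    # Probability of exceeding the critical value in either tail.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


# Power for an assumed medium-sized effect at several group sizes.
for n in (10, 20, 40, 64):
    print(n, round(approx_power(0.5, n), 2))
```

Under these assumptions, power rises steadily with the number of subjects per group, so a design with very few subjects per condition has little chance of detecting a moderate treatment difference.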


YAVIS Samples


Williams (1956) coined the acronym 
YAVIS (young, attractive, verbal, intelligent, 
successful) in his discussion of the typical
psychotherapy patient. This problem of re-
stricted patient populations is particularly
salient in the controlled research on marital
therapy. Almost all of the published studies 
contain young, well-educated, middle-class 
couples. Because age and/or number of years 
married can affect the outcome of different 
approaches to marital therapy (Turkewitz & 
O'Leary, Note 1), a sample should be se- 
lected that includes a wide age range or one’s 
conclusions must be restricted to the particu- 
lar age group involved in the study. Further 
research is necessary to evaluate marital treat- 
ment programs for both older clients and 
clients from lower socioeconomic backgrounds. 


Unspecified Subject Characteristics


A very common error of omission in the
presentation of research on child and marital
therapy is inadequate sample description. In
addition to age, educational level, occupation,
socioeconomic status, and information re-
garding previous therapy, authors should
specify the fee paid, if any, for treatment
in the research program. In marital therapy
outcome studies, the mean and range of years
married, the number of children, and the
number of previous marriages should also be
reported. In child research, it is advisable to
provide information on the parents’ marital
status and the involvement of mothers and/or
fathers in treatment. Depending on the par-
ticular child problem under study, a thor-
ough assessment and description of the sam-
ple may involve a measure of intellectual
functioning, academic performance, or marital
satisfaction of the parents. A complete de-
scription of subject characteristics is neces-
sary so that other researchers can attempt
replications of the research and so that prac-
ticing clinicians can judge if the treatment
program described is likely to have similar
effects for the population with whom they
are working.


Unsubstantiated Labels and Diagnoses 


In addition to specifying demographic char-
acteristics, researchers executing outcome
research typically describe their subjects as hav-
ing a particular clinical problem, for example,
marital distress, adjustment reaction of child-
hood, or hyperactivity. However, outcome
research is often hampered by very general
definitions of these labels. To allow for gen-
eralizations regarding the treatment and rep-
lications of studies, specific data substantiat-
ing the diagnoses must be provided. For ex-
ample, in marital research, couples referred 
by divorce courts are probably more severely 
distressed than those referred by clergy. De- 
gree of distress could also depend on geo- 
graphical location and community sanctions 
regarding divorce. Obtaining a numerical rat-
ing of distress from a valid assessment in-
strument that has normative data provides a
description of the sample that can be under-
stood by other professionals and increases the
reliability of the diagnosis of marital distress.
The Locke-Wallace Marriage Relationship
Inventory (Kimmel & Van der Veen, 1974;
Locke & Wallace, 1959) is a useful question-
naire for these purposes, as discriminant valid-
ity (Locke & Wallace, 1959), construct valid-
ity (Weiss, Hops, & Patterson, 1973), and
test-retest reliability (Kimmel & Van der
Veen, 1974) have been demonstrated for vari-
ous forms of this inventory.

Similarly, the need for specific, defined cri-
teria is particularly salient in the research on
hyperactivity. Children may be diagnosed as
hyperactive by a pediatrician, clinical psy-
chologist, psychiatrist, or school psychologist.
Additionally, children may receive the label
because they are taking Ritalin, have been
referred by a teacher, or have parents who
need help in handling them at home. These
various sources of labels yield highly heter-
ogeneous groups of children described as hy-
peractive. In this regard, the prevalence rates
of hyperactivity range from 4% to 20% of
elementary school samples (Ross & Ross,
1977). Even though teachers are probably
the best information source regarding a child’s
activity level, caution should be observed in 
basing a diagnosis of hyperactivity on a single 
teacher’s referral. Teachers occasionally in- 
flate their assessment of a child’s problems 
because they want to obtain consultation 
from the researcher (S. O’Leary, Note 2). 
Further, mental health practitioners may not 
be able to diagnose hyperactivity reliably 
(Kenny et al., 1971). A recommended assess- 
ment measure for hyperactive children is the 
Conners Teacher Rating Scale (Conners, 
1969). However, such a measure should be 
supplemented by parent ratings (Routh, 
Schroeder, & Otuama, 1974). The general 
principle is that labeling or diagnosing the 
subject population requires well-validated, re- 
liable measures to substantiate the descrip- 
tive label. 


Inadequate and Unspecified Selection 
Process 


An investigator conducting outcome re- 
search should define the problem under study 
and include only those subjects who meet a 
priori criteria. The use of a priori criteria in- 
creases the likelihood that the sample ob- 
tained has clinically significant problems and 
will be comparable to those in other investi- 
gations. (This argument assumes that some 
other investigators also use commonly ac- 
cepted criteria for clinical problems.) 

We recognize that it is often difficult to 
obtain a sufficiently large sample if one uses 
stringent a priori criteria regarding severity 
of the problem. If this is not possible, a more 
general criterion such as referral by a teacher, 
parent, pediatrician, mental health clinic, or 
divorce court should be accompanied by a 
specific description of the referral process and 
descriptive measures like the ones discussed 
in the previous section. If more than one re- 
ferral source is used, the percentage of sub- 
jects obtained from each different source 
should be noted, as different sources can pro- 
duce clinically different samples, for example, 
teacher versus pediatrician and clinic versus 
court referrals. 
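
As an illustration of the reporting suggested above, the following minimal sketch tabulates the percentage of subjects obtained from each referral source. The function name and the source labels are hypothetical and are not drawn from any study cited here.

```python
from collections import Counter


def referral_breakdown(sources):
    """Return the percentage of subjects obtained from each referral source.

    `sources` holds one label per subject (e.g., "teacher"); the labels
    are illustrative assumptions, not a fixed coding scheme.
    """
    if not sources:
        raise ValueError("at least one subject is required")
    counts = Counter(sources)
    total = len(sources)
    # Percentages are rounded to one decimal place for reporting.
    return {source: round(100 * n / total, 1) for source, n in counts.items()}


if __name__ == "__main__":
    sample = ["teacher"] * 6 + ["pediatrician"] * 3 + ["clinic"] * 1
    for source, pct in referral_breakdown(sample).items():
        print(f"{source}: {pct}%")
```

The same tabulation could be applied to exclusions, so that the percentage of potential subjects excluded under each criterion is reported alongside the referral breakdown.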

In addition to a severity criterion, many 
researchers exclude clients who would not be 
appropriate for the treatment under study. 
Examples of these exclusionary criteria in 
short-term marital therapy are evidence of 
severe psychopathology in either spouse (e.g., 
psychosis or chronic alcoholism), concurrent 
therapy, spouses living separately, and the 
refusal of either spouse to cooperate in the 
treatment program. In child treatment pro- 
grams, exclusion has been based on neuro- 
logical impairment, childhood psychosis, in- 
ability of both parents to attend therapy 
sessions, severe marital discord, or uncoopera- 
tive teachers. Although these exclusionary 
criteria may be valid for certain treatments, 
they are often not specified. For example, re- 
searchers rarely state how they determined 
the presence of neurological involvement or 
severe psychopathology. As excluding subjects 
with these problems limits claims of general- 
izability of the results, it is important to de- 
scribe the decision-making process and specify 
the percentage of potential subjects who were 
excluded. 


Therapists 
Small Therapist Samples 


One of the most common methodological 
errors made in marital and child outcome re- 
search is that the number of therapists in- 
volved is too small to allow for appropriate 
generalizations regarding treatment. Thera- 
pist factors have been repeatedly shown to 
influence outcome in family therapy (Gurman 
& Kniskern, 1978), and it is very likely that 
these factors also have an impact on child 
and marital treatment. Given the importance 
of therapist variables, it is necessary to have 
as many therapists as practically feasible so 
that one can study whether one’s treatment 
program can be successfully implemented by 
therapists of varying styles. Keeping in con- 
sideration the need for training and close 
supervision in studies of particular treatment 
programs, a minimum of 3 


Bias Introduced by Motivational Factors 


Teacher incentives. In child treatment re- 
search, motivational factors need to be con- 
sidered when teachers are asked to implement 
therapeutic programs in their classrooms. If 
a teacher receives graduate credits, free time, 
additional classroom materials, or teaching 
assistants for conducting an intervention, 
these potential reinforcers can positively bias 
the consistency and quality of the teacher’s 
efforts. We would not advise against using 
such incentives, but it should be noted that 
clinicians in nonresearch settings cannot offer 
the teacher incentives and consequently might 
not find the intervention to be as successful 
as it appeared in the research program. 

Authors as therapists. A serious methodo- 
logical problem regarding therapist selection 
involves the experimenter(s) as therapist(s). 
The emotional investment and enthusiasm of 
the author/therapist may increase placebo 
effects and spuriously inflate the success of 
the treatment under evaluation. These fac- 
tors would hold even if there were two or 
three therapists, if all were authors. 


Confounds Resulting from Therapist 
Characteristics or Theoretical Biases 


The general issue of motivation and en- 
thusiasm for a particular treatment approach 
becomes even more critical when two or more 
treatments are being compared. Many of the 
comparative studies in the marital and child 
treatment areas involve two or more therapists 
who implement both treatments under study. 
This methodology of crossing therapists with 
treatments has clear advantages and disad- 
vantages. One advantage is that differential 
treatment outcomes can more readily be at- 
tributed to the treatments rather than to the 
skill of the therapists (since the therapists 
are the same for both treatment programs). 
One serious disadvantage, however, is the 
possibility of systematic bias if all the thera- 
pists involved have a theoretical orientation 
that favors one of the treatments. For ex- 
ample, if several behavior therapists were to 
execute a study comparing behavioral marital 
therapy with a systems approach and if these 
behavior therapists were to conduct both 
treatment programs, a bias would be intro- 
duced by greater enthusiasm for the behav- 
ioral intervention, greater knowledge of the 
specific behavioral procedures, and more 
clinical experience in treating clients with 
behavioral methods. 

Rather than crossing therapists with treat- 
ments, different therapists can execute each 
program under study. If this methodology is 
used, large numbers of therapists must be 
used to avoid confounding treatment effects 
and differential therapist skills. The difficulty 
of obtaining a sufficient number of therapists 
of differing orientations has often led investi- 
gators to cross therapists with treatments. 
Because this method is frequently used, the 
following recommendations are made to re- 
duce possible biases: Include therapists of 
varying theoretical orientations; present co- 
gent rationales and theoretical explanations 
for all treatments under study; offer extensive 
training programs, including readings, role 
playing, and pilot cases; and provide close 
supervision. Even with these precautions, any 
therapist biases should be assessed and re- 
ported. For example, one can obtain the 
therapists’ predictions regarding the effec- 
tiveness of the different approaches and their 
reports on how comfortable they were in con- 
ducting the treatments. The assessment and 
control of the biases introduced by therapist 
factors are important, because the small num- 
ber of therapists often involved in treatment 
research makes it difficult, if not impossible, 
to study the influence of therapist character- 
istics statistically. 


Inadequate Description of Therapists 


Information should be included on the 
therapists’ (and/or teachers’) training, edu- 
cational, and professional background; sex; 
age; previous experience with marital or child 
treatment; and incentives for participation 
(salaried vs. volunteer). In studies of marital 
or child therapy, the marital status of the 
therapists and whether or not they have chil- 
dren may be of interest. 


Dependent Measures 


Multiple Outcome Criteria 


A very serious error in child and marital 
treatment research is the lack of multiple 
outcome criteria (Johnson & Christensen, 
1975). In both the child and marital areas, 
many investigators have relied exclusively on 
self-report measures. In contrast, some of the 
child treatment studies—especially evaluations 
of behavior therapy programs—have included 
only naturalistic observations. As self-report 
and observational measures both have 
strengths and weaknesses (K. D. O'Leary & 
Johnson, in press) and sometimes can lead to 
different conclusions (e.g., Harrell & Guerney, 
1976; Liberman, Levine, Wheeler, Sanders, 
& Wallace, 1976), neither data source should 
be considered sufficient. Each provides unique 
information that is necessary in any compre- 
hensive treatment evaluation. 

Some authors have argued for a tripartite 
assessment model including self-report, obser- 
vational, and physiological measures (Lang, 
1971). Such a model has been quite useful 
in evaluating fears and anxiety, but the model 
is not especially relevant to the evaluation of 
child and marital treatment. In child treat- 
ment, academic achievement tests, mechanical 
measures (such as stabilometer chairs to as- 
sess hyperactivity), and sociometric evalua- 
tions are useful adjuncts to self-report and 
observational measures. In marital treatment, 
adjunctive measures such as days absent from 
work, job productivity, and physical well- 
being could be considered. 

Regardless of the type of assessment data 
obtained, a clinical interview is a critical and 
almost universal mode of assessment. How- 
ever, the material collected in an interview, 
the meaning assigned to that material (e.g., 
a personality structure or a behavior rein- 
forced by significant others), and the extent 
of standardization in the interview will de- 
pend on one’s theoretical orientation. Under 
certain conditions a clinical interview can be 
reliable, and the interview allows for maximal 
flexibility in developing new hypotheses re- 
garding a client’s problem (K. D. O'Leary & 
Johnson, in press). 


Comprehensive Assessment 


In addition to the assessment modality that 
is used, there are two major issues to be con- 
sidered when choosing dependent measures: 

1. Scope of the assessment. For example, 
in a child treatment study, possible factors 
of interest include classroom academic and/ 
or social behavior, self-esteem, a child’s feel- 
ings toward his/her parents or teachers, and 
parent-child interactions at home. In marital 
therapy outcome research, one can consider 
assessment of individual functioning, sexual 
satisfaction, role conflicts, social life, expecta- 
tions of and feelings toward one’s spouse, and 
a consumer evaluation of the program. 

2. Source of the data. Strupp and Hadley 
(1977) argued that psychotherapy outcomes 
should be evaluated from three different per- 
spectives—(a) society or significant others 
(e.g., relatives, teachers, employers), (b) the 
client, and (c) the therapist. In particular, we 
emphasize the strong need for data from the 
child’s perspective in evaluations of child 
therapy. 

The choice of dependent measures and the 
data sources will be dictated by one’s theo- 
retical orientation and the particular ques- 
tions of interest. It is beyond the scope of 
this article to discuss fully all of the factors 
that should be assessed in the study of vari- 
ous child or marital problems. (The reader is 
referred to Gurman and Kniskern, 1978, for 
a detailed discussion of this issue.) As self- 
report and observational measures are the 
most commonly used dependent measures in 
the evaluation of child and marital treatment, 
they will be discussed in detail. 


Self-report 


One of the most frequently cited problems 
of self-report measures is the influence of 
demand characteristics (e.g., cues that con- 
vey the experimental hypotheses and the im- 
plicit or explicit expectations to the clients). 
Demand characteristics can be minimized by 
keeping the specific criteria for inclusion in 
a treatment study unclear to 
having the therapist absent 


Social desirability, or the tendency to re- 
spouses answered “true” to impossible re- 
sponses (e.g., “There is never a moment I 
don’t feel head over heels in love with my 
spouse”). It is moot whether these socially 
desirable responses reflect active distortions 
or honest tendencies of spouses to exaggerate 
positive qualities of their mates, and it has 
also been argued that a large percentage of 
socially desirable responses may not decrease 
the validity of a marital satisfaction test 
(Murstein & Beck, 1972). A full discussion 
of social desirability is not warranted here. 
However, when social desirability may be a 
critical factor (e.g., cases of child abuse), an 
investigator can consult the personality re- 
search literature for methods to control for 
it, such as transforming raw scores using a 
social desirability correction factor (e.g., 
Minnesota Multiphasic Personality Inven- 
tory) and developing special multiple-choice 
questionnaire formats. 

A third problem of self-report measures 
that is often overlooked is that clients must 
have a certain language proficiency to com- 
plete questionnaires; and with few exceptions 
(e.g., Bienvenu, 1970), authors do not specify 
the degree of literacy required. When treat- 
ing clients with limited educational back- 
grounds (a group much neglected in the mari- 
tal research area), investigators should use a 
reading survey instrument to determine the 
language proficiency required to complete 
their questionnaire (e.g., Fry, 1968). 

Finally, a common error is to use self-report 
measures that do not have established relia- 
bility and validity, which renders the degree 
of reported change very difficult, if not im- 
possible, to interpret. Valid questionnaires are 
available in the marital area for assessing 
communication (e.g., Navran, 1967) and sat- 
isfaction (e.g., Locke & Wallace, 1959). For 
child treatment studies, there are numerous 
teacher and parent rating scales that have 
demonstrated reliability and validity (Achen- 
bach, 1978; K. D. O'Leary & Johnson, in 
press; Quay, in press). On the other hand, 
valid self-report questionnaires that assess 
self-esteem and attitudes of children toward 
parents and school are needed (cf. Cooper- 
smith, 1967). 

Even though all of the problems associated 
with self-report measures cannot be elimi- 
nated, we strongly recommend their use be- 
cause they are focused, convenient, and eco- 
nomical. Further, certain areas cannot be 
assessed by observational measures (e.g., per- 
ception of feelings toward one’s spouse, ex- 
pectations regarding marriage, attitudes of 
children toward teachers, and self-esteem). 


Observational Measures 


The problems discussed above in regard to 
self-report measures indicate a clear need for 
direct observation of the behaviors of interest. 
Direct, or naturalistic, observation has be- 
come a hallmark of research in behavior ther- 
apy, and certain journals will often not accept 
manuscripts that do not include such data 
(e.g., Journal of Applied Behavior Analysis). 
Direct observations usually provide less biased 
data than self-reports, and they can be used 
to obtain information not easily provided by 
clients (e.g., data on facial expressions or 
other nonverbal aspects of communication of 
which a spouse or parent may not be aware). 
Finally, the collection of audiotapes and/or 
videotapes allows for the possible reanalysis 
of data by different investigators who wish 
to compare and contrast treatments. 

In the child treatment area, observational 
coding systems have been used extensively in 
classrooms (Kent & Foster, 1977). Classroom 
observations have proven reliable, valid, and 
sensitive to treatment changes; are largely non- 
reactive (S. G. O'Leary & K. D. O'Leary); 
and are capable of distinguishing 
normal from clinical populations (Abikoff, 
Gittelman-Klein, & Klein, 1977). In homes or 
simulated home settings, direct observations 
of children have also proven reliable, valid, 
and sensitive to treatment changes (Patterson, 
1974), but sometimes they are reactive; that 
is, the presence of an observer in a home can 
alter the nature of parent-child interactions 
(White, 1977). The problem of reactivity is 
one that is often overlooked and 
may be especially problematic with adoles- 
cents or when only a few observations are made 
(K. D. O'Leary & Johnson, in press). 

Marital interaction codes have been devel- 
oped that discriminate between distressed and 
nondistressed couples (Rausch, Barry, Hertel, 
& Swain, 1974; Gottman, Notarius, & Mark- 
man, Note 3). However, with few exceptions, 
complex coding systems have not been used 
as pre-post dependent measures (e.g., Weiss, 
Hops, & Patterson, 1973). The use of such 
observations is strongly recommended, and 
several methodological issues should be con- 
sidered. In marital treatment research the 
observational context is generally a labora- 
tory or interview setting. Many investigators 
use variations of Strodtbeck’s revealed differ- 
ence technique (Strodtbeck, 1951), in which 
the experimenter identifies differences of opin- 
ion between the spouses and then asks them 
to resolve these differences and reach mu- 
tually agreed-on conclusions. Typically, the 
couple is left alone when discussing the con- 
flicts, and their interaction is videotaped or 
audiotaped. When using this procedure, dis- 
cussions may occur that evoke very hostile 
feelings. Thus, it may be ill-advised to assess 
certain populations (e.g., physically violent 
couples) in this manner. In all cases, the 
investigator should conduct at least a brief 
clinical interview following the task to insure 
that the spouses do not leave the assessment 
interview angrier with each other than when 
they arrived. 

The content of the conflicts discussed may 
have a critical influence on the data obtained. 
For example, some evidence indicates that 
with older couples a high degree of induced 
conflict is necessary to discriminate the com- 
munication patterns of distressed and nondis- 
tressed couples (Gottman, Notarius, Gonso, 
& Markman, 1976). In the Gottman et al. 
study, the difference between the high and 
low conflict situations involved the degree to 
which the topic discussed was relevant to the 
marital relationship. One implication of this 
finding is that topics relevant to the couples’ 
problems produce more realistic samples of 
communication. 

A method used to insure a highly relevant 
discussion has been to ask the couples to dis- 
cuss their own marital problems. However, 
we feel that this procedure has a serious meth- 
odological problem in that a change in the 
interactions may reflect either an ameliora- 
tion in the particular presenting complaints 
or an alteration in the general communication 
style of the couple. Presumably, general 
changes in communication and problem-solv- 
ing skills should be measured on topics rele- 
vant to, but not identical with, the couple’s 
presenting problems if predictions are to be 
made about how the couple will handle con- 
flicts that arise in the future. 

One additional methodological issue in the 
marital area involves selecting the kinds of 
behaviors to observe and the means to assess 
them. Both the affective and content domains 
of communication should be assessed to dis- 
criminate between distressed and nondis- 
tressed couples (Gottman et al., 1976). Al- 
though it is not clear that videotapes offer 
incremental validity over audiotapes, the af- 
fective quality of an exchange ought to be 
more easily assessed on videotapes (e.g., 
sneers, eye contact, body position). However, 
since video recordings are costly, potentially 
reactive, and unavailable in most nonresearch 
clinics, reliable, discriminating codes for rat- 
ing audiotapes should be developed. 

At present, observational methodology in 
the child treatment area, particularly in class- 
rooms, is more sophisticated than that in 
marital therapy research. Further develop- 
ment of behavioral codes of marital interac- 
tions and parent-child interactions in the 
home is mandated. Observational methodol- 
ogy is costly and time-consuming, but obtain- 
ing independent observations that supplement 
self-report data is usually crucial. 


Treatment Specification 


Therapy Manuals 


Failure to specify therapeutic procedures 
in detail is one of the most common and yet 
most serious problems in psychotherapy re- 
search. This error is largely an error of omis- 
sion, and to rectify the problem some journal 
editors (e.g., Cognitive Therapy and Re- 
search, Journal of Applied Behavior Anal- 
ysis) have recently required that treatment 
manuals be made available for interested 
readers through the author, the National 
Auxiliary Publications Service, or some other 
central information source. Without such 
manuals or other training materials (e.g., 
video or audio cassettes), replication of treat- 
ment studies is almost impossible. 


Training Programs 


In addition to the description of the actual 
treatment, a manuscript should include the 
following information on the therapists’ prep- 
aration: the length and type of training, that 
is, pilot cases, role playing, reviewing tapes 
and books, manuals, and other reading ma- 
terials used. The amount and frequency of 
therapy supervision should also be specified. 
Since complete specification of the training 
program would be impossible in most jour- 
nals, details regarding the training programs 
should be explicated in therapy manuals. 


Assessment of the Independent Variable(s) 


To state unequivocally that a treatment is 
effective, some measure that the treatment 
was actually implemented must be obtained. 
In psychological research, several levels of 
analysis can be used to assess the indepen- 
dent variable. On one level, an investigator 
can determine whether the therapists acted 
in accord with the therapeutic regimen. At a 
second level, with regard to certain types of 
therapy (e.g., behavior therapy), a determina- 
tion can be made of whether the clients ac- 
tually implemented the procedures recom- 
mended by the therapists. For example, in a 
communication program for parents and 
children, one should assess whether the thera- 
pists gave the appropriate suggestions as out- 
lined in the therapy manual. It can also be 
determined whether or not the parents fol- 
lowed the designated communication exercises 
with their children at home. 

Checks on the independent variables in 
treatment research with children and marital 
partners are seldom reported or are incom- 
plete. Both therapists’ and clients’ self-reports, 
as well as observations of therapy programs 
or analyses of therapy tapes, are strongly ad- 
vocated. Having a detailed behavioral assess- 
ment of the independent treatment variable 
prompts investigators to provide accurate 
treatment descriptions, including the percen- 
tage of time therapists spent in various treat- 
ment activities. Most importantly, when 
comparisons between different treatment pro- 
grams are made, an investigator should pro- 
vide reliable, objective data to document the 
actual procedural differences that distin- 
guished the various treatment groups. 


Experimental Design 


Aspects of therapist-client interactions that 
should be controlled for include attention to 
a client’s problem, expectations about thera- 
peutic outcome, and other nonspecific rela- 
tionship factors. Traditionally, solutions for 
such problems have been placebo or atten- 
tional control groups. We consider placebo 
groups to be ethically and methodologically 
problematic, and a number of alternatives 
have been suggested to control for therapist 
attention and client expectations (K. D. 
O'Leary & Borkovec, in press). A few of these 
choices are (a) the use of the best available 
alternatives (Jacobson & Baucom, Note 4), 
that is, comparing one form of treatment with 
another commonly accepted treatment for a 
particular problem; (b) component control 
comparisons in which procedural elements 
from a total treatment package are evaluated; 
(c) evaluation of the same treatment program 
under two different conditions: a neutral ex- 
pectancy control group that receives a neutral 
expectation set and a second group that re- 
ceives usual or exaggerated positive expecta- 
tion sets; and (d) counterdemand manipula- 
tions in which clients are told that they should 
not expect improvement until after a fixed 
number of sessions. A full discussion of con- 
trols for expectations, therapist attention, 
and relationship factors is clearly beyond the 
scope of this article, but such factors must be 
addressed by any serious researcher. 

With problems that change with maturation 
or remit spontaneously, waiting list controls 
or baseline periods may be especially useful. 
Knowing whether an intervention is 
more effective than none at all is absolutely 
necessary. In child treatment evaluations, 
waiting list controls are critical (Kent), 
and although many people would think 
that marital cases are generally unlikely to 
improve over short periods without treat- 
ment (Gurman, 1975), the inclusion of wait- 
ing list control groups can result in tempered 
conclusions about the effectiveness of an inter- 
vention (K. D. O'Leary & Turkewitz, 1978). 
If ethical concerns do not allow for in- 
clusion of a standard waiting list control 
group, the clients on the waiting list can be 
monitored, and those who evidence deteriora- 
tion can be placed in the treatment group 
(Stuart, 1973). 

When comparing different treatment pro- 
grams, some assessment should be made re- 
garding the clients’ acceptance of the thera- 
peutic rationale (Kazdin & Wilcoxon, 1976). 
In addition, clients should be asked to eval- 
uate the therapists as well as the therapeutic 
procedures (Kazdin, 1977; Kent & O'Leary, 
1976; Wolf, 1978). 

Single-subject or within-subject designs 
have been used by both child and marital 
researchers, and they have been especially 
useful in evaluating the effectiveness of treat- 
ment procedures when (a) the effect can 
ethically be reversed (reversal designs) or 
(b) generalization of effects across behaviors 
is small (multiple baseline designs). When 
either of these conditions is unlikely, group 
designs should be used. (For a full discussion 
of single subject methodology, consult Hersen 
& Barlow, 1976.) 

In addition to deciding on adequate control 
groups and research designs, plans should be 
made to collect follow-up data to determine 
whether the treatment effects persist when the 
natural environment is not reprogrammed by 
the therapist. An assessment of the posttreat- 
ment environment is advisable—especially 
with children—to ascertain the conditions un- 
der which behavior is or is not maintained 
(Bijou, 1974). 

Data Analysis and Interpretation 


A common error in analyzing data from 
dyadic interactions in the child and marital 
areas is to include both partners from the 
dyad in the degrees of freedom. Since the 
members of the dyad influence one another, 
they should not be regarded as independent 
units in analyses of variance. For example, 
in examining the simple pre-post effects of 
an intervention on 10 couples, analyses using 
data from individual subjects should be par- 
titioned as follows: pre-post, male versus 
female, the interaction between male-female 
and pre-post, and couple differences. Alterna- 
tively, the average of the husband’s and wife’s 
scores could be treated as a single unit. 
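
The second alternative above, treating each couple's averaged score as a single unit, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the function name and data layout are hypothetical, and a paired t statistic on couple means is used in place of the full analysis of variance. The point it shows is that the degrees of freedom follow the number of couples, not the number of individuals.

```python
import math
from statistics import mean, stdev


def couple_level_change(pre, post):
    """Paired pre-post test that treats each couple as one unit.

    `pre` and `post` are lists of (husband_score, wife_score) tuples,
    one tuple per couple, in the same couple order (assumed layout).
    """
    if len(pre) != len(post):
        raise ValueError("pre and post must list the same couples")
    if len(pre) < 2:
        raise ValueError("at least two couples are required")
    # Average the two partners first, then take one change score per couple.
    diffs = [mean(after) - mean(before) for before, after in zip(pre, post)]
    n = len(diffs)
    sd = stdev(diffs)
    if sd == 0:
        raise ValueError("change scores have zero variance; t is undefined")
    t = mean(diffs) / (sd / math.sqrt(n))
    # Degrees of freedom are based on couples (n - 1), not on 2n individuals.
    return {"n_couples": n, "df": n - 1, "mean_change": mean(diffs), "t": t}


if __name__ == "__main__":
    pre = [(80, 90), (70, 60), (100, 100)]
    post = [(90, 100), (80, 80), (105, 105)]
    print(couple_level_change(pre, post))
```

With 10 couples this approach yields 9 degrees of freedom, whereas entering all 20 spouses as independent subjects would wrongly yield 19.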

If a therapist conducts group therapy, as 
is often the case in marital research (Gur- 
man & Kniskern, 1978), generalizations about 
therapeutic effects must be limited to a group 
context and the number and types of thera- 
pist(s) used. Further, the nonindependence 
of subjects must be considered in determining 
the degrees of freedom for testing group ef- 
fects. As a general rule, the number of therapy 
groups in each experimental condition should 
be large enough to detect differences that can- 
not be attributed to idiosyncratic group 
characteristics. 

If several dependent measures are used and 
the correlations between the measures are 
high, a multivariate analysis of variance that 
takes the intercorrelations into account should 
be considered (C. L. Olson, 1976). 

Practicing clinicians who are the major 
consumers of clinical research are concerned 
about individual variability (range and stan- 
dard deviation) in response to treatment. 
Consequently, estimates of the strength of 
association (omega-squared) should be re- 
ported wherever possible to allow the clini- 
cian to ascertain the impact of the treatment 
(Hays, 1963). That is, it is useful to know 
the percentage of variance in a dependent 
measure accounted for by the treatment. 
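
As a hedged illustration of the strength-of-association estimate mentioned above, the sketch below computes omega-squared for a one-way, between-groups design using the standard textbook formula, (SS_between - (k - 1) MS_within) / (SS_total + MS_within). The function name and the example scores are hypothetical.

```python
from statistics import mean


def omega_squared(groups):
    """Estimate omega-squared for a one-way between-groups design.

    `groups` is a list of lists of scores, one inner list per
    treatment condition (assumed layout).
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total - k < 1:
        raise ValueError("need at least two groups and within-group df >= 1")
    grand_mean = sum(sum(g) for g in groups) / n_total
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    ms_within = ss_within / (n_total - k)
    ss_total = ss_between + ss_within
    # Proportion of variance in the dependent measure tied to treatment.
    return (ss_between - (k - 1) * ms_within) / (ss_total + ms_within)


if __name__ == "__main__":
    treated = [4, 5, 6]
    control = [1, 2, 3]
    print(round(omega_squared([control, treated]), 3))
```

Because the estimate can fall below zero when the treatment effect is negligible, such values are conventionally reported as zero.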

Subject attrition and incomplete data can 
present serious problems in interpretation. 
In a review of psychotherapy literature, 
Baekeland and Lundwall (1975) found that 
clients who dropped out of treatment were 
more likely to be from lower socioeconomic 
groups and to have little affiliation with 
others. As such factors may relate to outcome, 
it is essential to report the percentage of drop- 
outs. Additionally, one should attempt to ob- 
tain posttermination data from dropouts; we 
have found that some clients are willing to 
complete the assessment even if they are un- 
willing to continue in therapy. There are 
occasions when clients participate in the entire 
therapy program but do not complete the 
posttreatment or follow-up assessment ma- 
terials. In an evaluation of a child clinic, Olt- 
manns, Broderick, and O'Leary (1977) found 
a significant relationship between difficulty in 
obtaining outcome data and responsiveness 
to treatment. Thus, both therapy dropouts 
and missing data points should be considered 
as potential sources of positive bias. 


Conclusion 


Salient methodological errors in conducting 
child and marital treatment research include 
inadequate client and therapist selection and 
description, unsubstantiated labels and diag- 
noses, lack of multiple outcome criteria, use 
of dependent measures of unspecified reliabil- 
ity and validity, and failure to provide treat- 
ment manuals and to check whether the spe- 
cific treatments were in fact implemented. 
Experimental design issues are complex and 
relate to the substantive questions of interest. 
Very common problems, however, involve the 
need to control for expectation, therapist at- 
tention, and relationship factors, and the fail- 
ure to conduct follow-up evaluations. Finally, 
common errors in data analysis and inter- 
pretation have been noted. Throughout the 
text suggestions have been made to help mini- 
mize the methodological errors discussed. 


Psychopathology of Childhood: 
Research Problems and Issues 


Thomas M. Achenbach 
National Institute of Mental Health 
Bethesda, Maryland 


It is argued that research on child psychopathology would benefit from reducing 
the influence of adult treatment models and from applying a developmental per- 
spective to clinical research on children. Specific methodological problems are 
discussed, including the assessment of subject characteristics; the use of repli- 
cable and generalizable diagnostic classifications; the effects of situational spe- 
cificity and developmental variance on measures of children’s behavior; the need 
to avoid pathological biases in judging children; relationships among correlation, 
causation, and prediction in a developmental context; problems in measuring 
change; and the effects of age, cohort, and time of measurement, as well as 
fallacies in drawing longitudinal conclusions from cross-sectional data. Outstand- 
ing research needs are also identified, including the need to devise and use well- 
standardized measures; the need to evaluate interactions between subject and 
treatment variables in outcome research; the need for long-term follow-ups of 
children identified as being at risk; the need for cumulative programmatic re- 
search; and the need to link research more closely to service systems. 


Research on child psychopathology has tra- 
ditionally been treated like a stepchild for 
whom no one assumes full responsibility. De- 
spite clinical emphasis on the childhood roots 
of adult disorders, psychopathology has been 
studied far more intensively in adults than in 
children. Furthermore, research on psycho- 
pathology in children has been unduly shaped 
by a mental health system in which clients 
must adopt a patientlike role. This system 
may not be entirely unjustified for adults, but 
it engenders assumptions that are certainly in- 
appropriate for children. 

First, children rarely have realistic concep- 
tions of psychopathology and mental health 
services. Second, both the judgment that a 
child needs help and the initiative required 
to obtain it originate with adults rather than 
with the child. Third, children have much less 
freedom to alter their circumstances than do 
adults, and adult-centered views of treatment 
as primarily a relationship between therapist 
and patient neglect children’s overwhelming 
dependence on their families. Fourth, unlike 
adults who have reached plateaus in their so- 
cial, educational, cognitive, and physical 
growth, children must be viewed in relation 
to their progress along these dimensions. 
Rather than judging clients’ problems in 
terms of interference with current capabilities 
—as may be appropriate with adults—it is 
necessary to judge children’s problems in 
terms of their interference with future devel- 
opment. Unless children can progress in 
knowledge, skills, and ways of relating to 
others, they are unlikely to achieve a success- 
ful adaptation in the long run, no matter how 
satisfactory an outcome may appear in the 
short run. 

To help troubled children, we need a far 
greater investment of resources, energy, and 
brain power in objectively assessing their 
needs and in developing and evaluating sys- 
tematic approaches to meeting these needs. 
Despite a flood of rhetoric and legislation on 
behalf of children, there is little evidence of 
cumulative progress in alleviating children’s 
mental health problems. In my view, it is the 
responsibility of mental health professionals 
not only to advocate funds and programs for 
children but to begin demonstrating more ac- 
countability for the services rendered. The 
lack of evidence for the efficacy of most men- 
tal health services to children (cf. Achenbach, 
1974; Levitt, 1971; Shepherd, Oppenheim, 
& Mitchell, 1971) behooves mental health 
professionals to assume greater initiative in 
determining how disturbed children can in 
fact be helped. Reform is vitally needed to 
improve the mental health services available 
to children. However, meaningful reforms can 
only originate from within the mental health 
professions through better validation and use 
of whatever is good in the present system and 
development of new approaches to replace 
whatever cannot be firmly validated. 

In hope of bridging the gap between the 
pressing needs of the mental health service 
emphasis on treatment of adults stems no doubt 
from the fact that adults have long been the 
primary paying customers for mental health 
services. An additional factor may be that 
treatment of adults offers a glamorous and 
prestigious role more in keeping with the pro- 
fessional status and personal insights sought 
by those who enter clinical training. As a 
result, there has been insufficient incentive 
to develop an independent tradition of child 
clinical training and research. Instead, most 
research on child behavior has been left to 
academic developmental psychologists who are 
not attuned to the functioning and needs of 
the mental health service system. Widening 
the gulf still further between the “two cul- 
tures” of researchers and practitioners is the 
current preoccupation of clinical psychologists 
with winning recognition as independent ser- 
vice providers. 
Several changes of orientation might help 
to overcome these obstacles to research on 
child psychopathology. An especially desira- 
ble change would be to make child clinical 
training a primary specialty rather than al- 
lowing it to remain secondary to adult train- 
ing. This could be done in a number of ways. 
One way would be to make child training a 
separate and distinct track coequal with adult 
training in the larger clinical programs. A 
second way would be to give more emphasis 
to the differences between approaches appro- 
priate to children and adults in general clini- 
cal training programs, supplemented with in- 
struction in normal child development. A 
third way would be to encourage more train- 
ing programs to make child clinical training a 
primary and official emphasis, rather than 
continuing to have so many child practitioners 
trained in general clinical programs geared 
almost exclusively to treatment of adults. 
Another desirable change would be in career 
roles for psychologists whose talents lie in 
research. A major portion of biomedical re- 
search is carried out by full-time researchers 
who function as members of teams or labora- 
tories specializing in specific organic diseases 
and abnormalities. By contrast, the dominant 
model for researchers in psychopathology has 
been that of the teacher cum researcher cum 
clinician. Attempts to simultaneously fill 
three roles may have interfered with program- 
matic research in the best of times, but 
shrinking resources, the disappearance of 
teaching positions, and increasing demands on 
those who do teach mean that alternatives to 
the primarily academic and primarily service 
career models must be fostered if behavioral 
research on psychopathology is to progress. 

A further desirable change would be to 
link the study of psychopathology in children 
more closely to the study of normal develop- 
ment. This would entail delineating the spe- 
cific ways in which disorders disrupt the typi- 
cal course of cognitive, emotional, behavioral, 
and social development and how the outcomes 
of various disorders affect long-term adapta- 
tion. When considered in a developmental 
context, some disorders that seem ominous 
may not in fact be debilitating unless they 
prevent children from receiving socialization 
experiences necessary for adaptation to their 
culture. Other disorders that seem less omi- 
nous might have severe consequences if they 
interfere with the cumulative socialization 
needed for long-term adaptation. A develop- 
mental perspective on child psychopathology 
would emphasize (a) the transactional rela- 
tionships between children’s characteristics as 
individuals and the social contexts in which 
they function and (b) the different implica- 
tions that various combinations of child and 
environment have at different developmental 
levels. Although linking childhood character- 
istics with adult psychopathology should re- 
main an important goal, research on child 
psychopathology would benefit from greater 
reliance on knowledge of normal development 
than on models of adult psychopathology. 


Methodological Problems 


The emphasis here will be on problems evi- 
dent in manuscripts submitted to the Journal 
of Consulting and Clinical Psychology, but 
they are representative of weaknesses in work 
appearing elsewhere as well. Although some 
of the problems are quite general, I will dis- 
cuss them as they arise in the context of 
research on child psychopathology. 


Subject Characteristics 


The choice of subjects in too many studies is 
determined more by convenience than by 
the requirements of sound research design. 
The design of every study should take care- 
ful account of such variables as the develop- 
mental level, intelligence, sex, race, socio- 
economic status (SES), and clinical status of 
subjects. 

Developmental level. Even small differ- 
ences in developmental level can have large 
effects on subjects’ capabilities, the ways in 
which they construe situations, the kinds of 
experiences they have had, and the behavior 


(CA) is not an 
mental level if other relevant aspects of de- 
be based as much on their cognitive 
as on their CA. A frequent error is to 
vior of children having nor- 

that of retarded children of the 
and then to interpret differences as 
defects inherent in mental retarda- 
. Since retarded children—by definition— 

lower MAs than children of the 
CA, attributing differences to 
tion per se is tautological at best. Worse than 
being tautological, however, it may be pat- 
ently wrong to attribute behavioral differences 
to the IQ differences, because developing 
more slowly than the norm for one’s culture 
has many consequences besides the effects of 
whatever causes the slow development. These 
consequences include stigmatic labels, a high 
rate of failure when CA-appropriate tasks are 
attempted, ridicule, deprivation of positive 
adult attention, and atypical educational regi- 
mens. All of these factors may affect learning, 
personality, and motivational variables not 
directly affected by slow cognitive develop- 
ment per se (cf. Zigler, 1971). 

Researchers must also be sensitive to the 
possible effects of cognitive level and atypical 
rates of development in children who do not 
happen to be called retarded. For example, 
characteristics other than low IQ earn chil- 
dren labels such as psychotic, delinquent, or 
disadvantaged. However, the cognitive func- 
tioning of such children is often below the 
norm for their CA. When comparisons with 
normal children reveal inferiorities, they 
should not be attributed to psychosis, de- 
linquency, or environmental disadvantages 
unless the effects of cognitive level and the 
concomitants of slow intellectual develop- 
ment can be ruled out through comparisons 
with children matched for cognitive level 
(e.g., retarded children and/or younger nor- 
mal children). 

Demographic variables. Experiential dif- 
ferences related to sex, race, and SES are as 
likely to affect behavior as are the experien- 
tial differences related to cognitive level. 
Demographic characteristics of child subjects 
may also affect the behavior of adult experi- 
menters toward the subjects and vice versa, 
resulting in interactions between the effects of 
child and experimenter characteristics (Back 
& Dana, 1977; Marwit & Neumann, 1974). 
Despite the influence of sex, race, and 
SES differences, some otherwise Main 
studies describe their samples only as “chil- 
dren,” with no mention at all of these varia- 
bles, much less any indication that their ef- 
fects were tested or controlled for (e.g., Kent 
& O'Leary, 1976). In other studies, the anal- 
yses of demographic variables are confounded 
in such a way as to obscure possible interac- 
tions among them (e.g., Gesten, 1976). Al- 
though limitations on subject pools and the 
different incidence of particular disorders in 
various groups often preclude exhaustive test- 
ing of the effects of demographic variables, 
their distributions within subject samples 
should at least be described in detail, and 
analyses involving them should be sensitive 
to possible interactions. When the effects of 
such variables cannot be tested, investigators 
should explicitly note that their findings are 
limited to populations like those represented 
by their samples. 

Sources of subjects. The settings from 
which children are obtained can exert major 
influences on behavior. The different experi- 
ential histories, motivational structures, and 
expectations of subjects obtained through 
public schools, clinics, residential institutions, 
and courts can produce major differences in 
behavior despite similarities in developmental 
level, demographic characteristics, and diag- 
nosis. It has been amply demonstrated, for 
example, that the deprivation of adult atten- 
tion experienced by institutionalized children 
can affect their behavior on experimental tasks 
(cf. Zigler, 1971). In addition, because chil- 
dren do not always distinguish among occu- 
pational roles within a setting, their attitudes 
toward familiar adults are likely to general- 
ize to research personnel. It is therefore es- 
sential that sources of subjects be clearly 
described and that the setting be treated as 
an important variable that can affect results. 

Although control over subject and setting 
variables is usually sought through standard- 
ization of instructions and procedures, the 
variety of settings in which disturbed children 
typically must be studied may preclude stan- 
dardization of all relevant variables. Further- 
more, the fragmentation of services for chil- 
dren can create selective biases in samples 
obtained from each setting. Because subject, 
setting, and sampling variables can seldom 
be fully controlled, an alternative is to repli- 
cate findings with subjects obtained from a 
variety of settings. Findings that are replicated 
despite differences in settings have the great- 
est breadth of application, but it is also im- 
portant to identify the limitations that specific 
settings impose on findings. 


Diagnostic Classification 


Perennial controversies over diagnosis re- 
flect the immaturity of psychopathology as 
a field of study. Even more than adult psy- 
chopathology, child psychopathology lacks a 
coherent conceptual framework for describing 
and discriminating among disorders, much less 
for supporting inferences about the etiology, 
course, or appropriate treatments for specific 
disorders. Not only is there no accepted diag- 
nostic system, but there are fundamental dis- 
agreements about whether diagnosis is a 
legitimate enterprise at all (cf. Hobbs, 1975). 
Critics justly contend that diagnostic labels 
may stigmatize children without gaining them 
the benefits of appropriate services. Research- 
ers must be sensitive to the whole child and 
combat indiscriminate summary labels for in- 
dividual children. Yet there is no way to 
accumulate and transmit knowledge without 
conceptual categories. Findings cannot be 
generalized unless researchers group and de- 
scribe subjects according to categories that 
can be reliably used by others. 
In the absence of a generally accepted 
taxonomy, diagnostic classification schemes 
must be chosen to fit particular populations 
and research aims. The portrayal of child- 
hood disorders in the official clinical system, 
embodied in the second edition 
of the American Psychiatric Association’s 
(1968) Diagnostic and Statistical Manual of 
Mental Disorders (DSM II), represents an 
improvement over the first edition of the DSM 
(American Psychiatric Association, 1952) in 
being more differentiated and based at least 
partly on empirical studies. However, no op- 
erational definitions are provided for its cate- 
gories, and the categories do not reflect de- 
velopmental differences. Furthermore, research 
indicates that interjudge reliability averages 
about 60% for broad categories, such 
as psychotic versus neurotic, and considerably 
less for specific subcategories (cf. Achenbach 
& Edelbrock, in press). The forthcoming 
DSM III is intended to remedy at least some 
of these deficiencies, but current drafts in- 
spire no hope of a panacea. 

An alternative system has been proposed by 
the Group for the Advancement of Psychia- 
try (1966). It is designed specifically for 
childhood disorders, is far more differentiated 
than the DSM II, and reflects developmental 
considerations. However, like the DSM, its 
categories consist of narrative mixtures of 
description and inference formulated by com- 
mittees of psychiatrists rather than being op- 
erationally defined on the basis of empirical 
research. Interjudge reliability is no better 
than for the adult categories of the DSM 
(Freeman, 1971). 

A third alternative is to use categories de- 
rived through multivariate analyses of behav- 
ior checklists. Despite differences in check- 
lists, subject populations, types of raters, and 
methods of analysis, there is considerable 
convergence in the behavioral syndromes iden- 
tified in various studies (cf. Achenbach & 
Edelbrock, in press). Much remains to be 
done by way of translating these syndromes 
into categories of individuals or disorders, 
but several of the checklists have been suf- 
ficiently validated to provide a basis for de- 
scribing and categorizing disturbed children 
for research purposes. Even if investigators 
prefer a different approach to diagnostic clas- 
sification, the generalizability of their work 
will be greatly enhanced by scoring subjects 
on at least one well-validated checklist. 

In addition to identifying behavioral char- 
acteristics as objectively as possible, it is im- 
portant to identify any organic conditions and 
medications that may be relevant. Even in 
studies focused on nonorganic characteristics, 
differences in the distribution of organic con- 
ditions among subject groups can affect find- 
ings. This is just as true for disorders in which 
organic etiologies have not been proven as 
for disorders in which organic etiologies are 
known. As inconclusive as organic diagnoses 
may be for a particular study, they almost al- 
ways have potential implications for the find- 
ings. For example, in a comparison of black 
children and white children, it was found that 
low-IQ black subjects obtained higher Ror- 
schach perceptual integration scores than did 
low-IQ white subjects (Gerstein, Brodzinsky, 
& Reiskind, 1976). This was interpreted as 
indicating that standard IQ tests may under- 
estimate the cognitive capacities of black 
children. However, the only diagnostic in- 
formation given was that “all groups were 
heterogeneous with respect to diagnostic cate- 
gory” (Gerstein et al., p. 761). A low-IQ 
clinic population is likely to contain children 
with neurological dysfunctions, and, as Ger- 
stein et al. noted, such dysfunctions have been 
found to be related to poor integrative per- 
formance on the Rorschach. It is therefore 
possible that differences in the distribution 
of neurological dysfunctions in the black sam- 
ples and white samples could account for the 
racial differences in disparities observed be- 
tween IQ and Rorschach performance. The 
value of the study could have been greatly 
enhanced if organic dysfunctions in the sam- 
ples had been controlled for, their effects 
tested, or at least reported. 


Measures of Children’s Behavior 


A great many measures of children’s behav- 
ior have been devised, but few have been ade- 
quately standardized and validated for study- 
ing psychopathology. Because of the time, 
sample sizes, and effort required for standard- 
ization, investigators are often forced to choose 
between devising a new measure without ade- 
quate standardization or employing a mea- 
sure that was standardized for populations or 
uses not entirely appropriate for the investi- 
gator’s purposes. The result is both a pro- 
liferation of measures having unknown prop- 
erties and an accumulation of apparent fail- 
ures of standardized measures to deliver what 
they promised. The latter problem may be 
due in part to the failure of authors of stan- 
dardized measures to emphasize the limita- 
tions on the populations and situations for 
which their measures are appropriate. How- 
ever, because authors cannot foresee all pos- 
sible uses and abuses of their measures, the 
measures should not be blamed if they fail to 
fulfill purposes for which they were not vali- 
dated. Careful attempts to broaden the ap- 
plication of a measure are warranted as long 
as poor results are not interpreted to mean 
that the measure is invalid for its intended 
purpose. But, lacking an armamentarium of 
well-validated research instruments, we have 
a collective responsibility to insure that exist- 
ing instruments are properly used rather than 
unfairly undermined through inappropriate 
use. 

Whether measures have previously been 
standardized or are constructed for a particu- 
lar study, it is always necessary to pilot test 
them with subjects and conditions like those 
to be studied in the research. The variability 
in children’s expectations about research sit- 
uations, their attentional and cognitive limita- 
tions, and other variables that affect their 
responses to research procedures make it es- 
sential to obtain a child’s eye view of the 
research situation in order to eliminate sources 
of anxiety, misunderstanding, response sets, 
and demand characteristics (cf. Achenbach, 
1978b, chap. 7). Careful observation and in- 
terviewing of pilot subjects are therefore 
necessary precursors of all research with chil- 
dren. Once procedures have been finalized, 
they should be documented and reported in 
sufficient detail to enable others to replicate 
them with no inadvertent variations that 
might affect findings. 

Situational specificity of behavior. What- 
ever position one takes in the debate over the 
relative dominance of person or situation vari- 
ance in behavior (cf. Mischel, 1977), it is 
obvious that children’s behavior varies tre- 
mendously with the situation. In ratings of 
normal nursery school children, for example, 
Rose, Blank, and Spalter (1975) found very 
little consistency from one situation to an- 
other even within a nursery school environ- 
ment, despite high reliability among observers. 
This does not necessarily mean that no stable 
subject variance was detectable, as significant 
correlations were found between reratings of 
children in the same situations within the nur- 
sery school 4 months apart. 

In addition to situational effects, the type 
of observer and his/her relationship to the 
subjects inevitably affect the behavior re- 
ported. The greater the disparity between the 
situations in which observers view children 
and the more the observers differ in their 
relationships to the children, the poorer the 
agreement in observations (cf. Achenbach & 
Edelbrock, in press). Because there is typi- 
cally no single criterion situation against 
which to validate observations, it is important 
to obtain multiple measures from observers 
who differ in their relationships with the sub- 
jects. Despite their potential biases, parents 
are usually the key informants regarding be- 
havior of clinical concern, because their per- 
ceptions determine what will be done about 
their children. Furthermore, parents’ reports 
of behavior problems have been found to be 
much more complete than those of teachers, 
school observers, home observers, or clinic in- 
take workers (Novick, Rosenfeld, Bloch, & 
Dawson, 1966). Interparent agreement and 
the agreement of teachers and clinicians with 
parents have also been found to exceed agree- 
ment between clinicians, as well as agreement 
between clinicians and teachers (Miller, 
1964). Furthermore, clinical trainees’ judg- 
ments of children’s pathology have been 
found to be influenced more by parents’ re- 
ports than by direct observations of children 
in a clinical setting (McCoy, 1976). When- 
ever possible, studies of child psychopathol- 
ogy should therefore take parent reports into 
account. 

In addition to parents, alternative sources 
include other family members, teachers, clini- 
cians, peers, and self-reports. The weight 
given to reports from these sources should 
depend on the cross-validated accuracy with 
which they predict the varia- 
bles of interest. For example, peer reports 
may be among the best predictors of behav- 
ior in group social situations, teacher reports 
may be the best predictors of academic func- 
tioning, and clinician reports may be the best 
predictors of behavior during psychotherapy 
sessions. When reports from various sources 
disagree, choices must be made about the 
relative importance of situations to which 
findings are to be applied. 
Developmental versus trait variance. A 
goal of most behavioral research 
is to detect relationships among various 


ls eee 
seeks to identify enduring individual differ- 
ences among people, whether or not these 
differences are viewed as traits. An often 
overlooked complication of the search for in- 
dividual differences in children is that devel- 
opmental differences account for significant 
variance in almost every measurable behavior. 
One consequence is that measurements re- 
peated on the same subjects more than a short 
time apart are likely to differ as a function 
of development, even if the subjects show 
stability with respect to their rank ordering 
within their cohort. A second consequence is 
that unless all subjects in a sample are at the 
same developmental level with respect to the 
behavior in question, individual differences in 
the behavior may in fact reflect differences in 
developmental level rather than traitlike 
characteristics. A third consequence is that 
covariation among several measures may 
merely reflect the variance that they all share 
with development rather than an independent 
trait. 
As an example, impulsivity-reflectivity has 
been regarded as a dimension of cognitive 
style that correlates with a variety of behav- 
ior in normal and clinical samples. However, 
it is also known to correlate with CA and 
MA, indicating that at least some of the 
variance in measures of impulsivity-reflectiv- 
ity is shared with indices of development. To 
determine whether impulsivity-reflectivity 
represents significant trait variance over and 
above the variance it shares with cognitive 
development, Achenbach and Weisz (1975) 
administered the Stanford-Binet and a test 
of impulsivity-reflectivity to preschoolers on 
two occasions 6 months apart. It was found 
that impulsivity-reflectivity correlated signifi- 
cantly higher with MA than with IQ on each 
occasion and across occasions. This indicated 
that cognitive developmental level as mea- 
sured by MA, rather than deviation from 
normative age groups as measured by IQ, was 
the appropriate cognitive measure against 
which to assess variance in impulsivity-re- 
flectivity. 

Although there was a significant correlation 
between pretest and posttest impulsivity-re- 
flectivity scores, regression of posttest scores 
on both MA and pretest scores showed that 
considerably more of the variance in posttest 
scores could be accounted for by MA than by 
pretest impulsivity-reflectivity. Furthermore, 
a significant correlation between impulsivity- 
reflectivity and hypothesis usage—a behavior 
assumed central to cognitive style (Kagan & 
Kogan, 1970)—disappeared when MA was 
partialed out. It thus appears that general 
cognitive development may account for much 
of the variance in impulsivity-reflectivity that 
has been ascribed to an independent trait. A 
similar lack of independence from general cog- 
nitive development has been reported for field 
independence (Weisz, O’Neill, & O'Neill, 
1975) and for measures of moral development 
(Taylor & Achenbach, 1975). Before drawing 
conclusions about variables that correlate 
with development, it is therefore necessary to 
demonstrate that they represent reliable 
variance over and above the variance that 
they share with general development. Other- 
wise, we risk a proliferation of “traits” that 
can be more parsimoniously measured and 
conceptualized in terms of general indices of 
development, such as MA and CA. 
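
The logic of partialing out a developmental index can be illustrated with a 
small simulation. The sketch below is not drawn from any of the studies cited; 
the variable names, sample size, and effect sizes are hypothetical. Two 
measures are generated so that each depends on MA but not on the other, and 
their zero-order correlation is compared with their first-order partial 
correlation with MA held constant. 

```python
import numpy as np


def partial_corr(x, y, z):
    """First-order partial correlation of x and y with z held constant."""
    r_xy = np.corrcoef(x, y)[0, 1]
    r_xz = np.corrcoef(x, z)[0, 1]
    r_yz = np.corrcoef(y, z)[0, 1]
    return (r_xy - r_xz * r_yz) / np.sqrt((1 - r_xz**2) * (1 - r_yz**2))


# Hypothetical data: both measures depend on MA and on independent noise only.
rng = np.random.default_rng(0)
n = 2000
ma = rng.normal(60, 10, n)                        # mental age in months (assumed)
reflectivity = 0.08 * ma + rng.normal(0, 1, n)    # stand-in cognitive style score
hypothesis_use = 0.08 * ma + rng.normal(0, 1, n)  # stand-in hypothesis usage score

zero_order = np.corrcoef(reflectivity, hypothesis_use)[0, 1]
partial = partial_corr(reflectivity, hypothesis_use, ma)

print(f"zero-order r = {zero_order:.2f}")
print(f"partial r with MA held constant = {partial:.2f}")
```

Under these assumptions the zero-order correlation is substantial, whereas the 
partial correlation is close to zero, because the two measures share nothing 
except their common dependence on MA. 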


Avoiding Pathological Biases 


It is widely recognized that when people 
are asked to recall the developmental histories 
of adults whom the informants know to be 
disturbed, the reports are likely to be biased 
in the direction of excessive pathology. Path- 
ological biases of this sort are not restricted 
to retrospective reports by untrained inform- 
ants, however. Lacking objective criteria for 
discriminating normality from pathology, 
mental health workers may be overly sensitive 
to signs of pathology. This is especially true 
where children are concerned, because they 
may become anxious, constricted, impulsive, 
or withdrawn when brought to mental health 
settings, which they view as mysterious, 
threatening, or punitive. Clinical settings are 
thus likely to highlight signs of pathology, 
and the effect of pathological biases on clini- 
cal judgments of even the most normal chil- 
dren has been well documented (McCoy, 
1976). Research on disturbed children should 
therefore be grounded firmly on comparisons 
with normal children matched to clinical sam- 
ples for conditions of observation as well as 
for such variables as developmental level, 
SES, race, and sex. 


Correlation, Causation, and Prediction 


Causal relationships can rarely be inferred 
from covariation between variables unless 
controlled experimental manipulation of the 
independent variable is followed 
by corresponding changes in the dependent 
variable. Experimental designs offer the most 
powerful means for testing causality, and they 
are certainly the most appropriate way to test 
the effects of variables subject to experimental 
manipulation, such as treatment conditions. 
However, true experimental designs are rarely 
feasible in research on the etiology and de- 
velopmental course of psychopathology in 
humans. As a result, much research on psy- 
chopathology is correlational. Despite the 
maxim that “correlation does not imply causa- 
tion,” it is surprising how often statistical 
covariation is concluded to demonstrate causa- 
tion without the benefit of experimental 
manipulation or convergent findings that rule 
out alternative explanations for the covaria- 
tion. For example, correlations between child 
behavior and parent socialization practices 
have often been interpreted as demonstrating 
the effects of the socialization practices on 
children. Yet, as Bell and Harper (1977) have 
shown, most of these correlations are ambigu- 
ous with respect to causation, because they 
could just as plausibly reflect the effects of 
children on their parents’ behavior, genetic 
similarities between parents and children, or 
correlations between the extrafamilial influ- 
ences that operate on children and on their 
parents. 

Another pitfall of correlational research on 
children is the tendency to overinterpret cor- 
relations in terms of prediction. One source 
of this tendency is the statistical terminology 
of multiple regression and discriminant anal- 
ysis in which the independent variables are 
referred to as “predictors” of the dependent 
variable. The summaries of covariation among 
variables provided by these statistical methods 
should not be confused with the prediction of 
behavior across time, however. Unless “pre- 
dictor” variables are in fact measured earlier 
than the “outcome” variable and indepen- 
dently of it, they are not predictors in a 
temporal sense. 

Another source of the tendency to overin- 
terpret correlations in terms of prediction is 
the traditional longitudinal research strategy. 
A goal shared by research on development 
and psychopathology is to identify relation- 
ships between variables measured at different 
points in people’s lives, and longitudinal stud- 
ies are the most obvious way to do this. How- 
ever, significant longitudinal correlations in 
a particular sample do not necessarily mean 
that the earlier variables are predictors of 
the later variables. Before longitudinal find- 
ings are interpreted, it is necessary to deter- 
mine whether the number of statistically sig- 
nificant relationships substantially exceeds 
that expected by chance. This is an important 
step in any research, but longitudinal research 
offers such inordinate temptations for post 
hoc analysis that extra care must be taken to 
control for chance. 
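
As a rough illustration of this check, the sketch below compares an observed 
count of significant relationships with the count expected by chance alone. 
The numbers of tests and of significant results are hypothetical, and the 
binomial calculation assumes that the tests are independent, which is seldom 
strictly true of correlations computed within a single longitudinal sample, so 
the result is at best an approximation. 

```python
from math import comb


def expected_by_chance(n_tests, alpha=0.05):
    """Number of 'significant' results expected if every null hypothesis holds."""
    return n_tests * alpha


def prob_at_least(n_sig, n_tests, alpha=0.05):
    """Binomial probability of n_sig or more significant results by chance,
    assuming n_tests independent tests, each at level alpha."""
    return sum(
        comb(n_tests, k) * alpha**k * (1 - alpha) ** (n_tests - k)
        for k in range(n_sig, n_tests + 1)
    )


# Hypothetical longitudinal study: 200 correlations tested, 14 significant.
n_tests = 200
n_sig = 14

expected = expected_by_chance(n_tests)
p_tail = prob_at_least(n_sig, n_tests)

print(f"expected by chance: {expected:.1f}")
print(f"P(at least {n_sig} significant by chance): {p_tail:.3f}")
```

In this hypothetical case the observed count only modestly exceeds the chance 
expectation, so the significant relationships would be better treated as 
hypotheses for further test than as findings. 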

Besides controlling for the effects of chance 
on the distribution of statistically significant 
relationships, it is important to consider the 
limitations on generalization of longitudinal 
findings. Initial selection factors and attri- 
tion inevitably undermine the representative- 
ness of longitudinal samples. Moreover, as 
will be discussed in detail later, peculiarities 
of particular cohorts and the periods in which 
they are studied can affect the relationships 
obtained. Unless they are specifically pre- 
dicted by a priori hypotheses, covariations 
between earlier and later measurements ina 
particular sample are best used to generate 
hypotheses subject to further test. In rare 
tases, further tests of the hypotheses may be 
made with other data from the same samples, 
| or path analysis may be used to choose among 
competing interpretations (cf. Achenbach, 
1978b, chap. 6; Kerlinger & Pedhazur, 1973). 
More typically, new samples or designs will 
be needed to replicate and triangulate the re- 
lationships obtained in the initial samples. 
Without replication and further test, signifi- 
cant covariation between earlier and later 
measurements in a single sample is not a suf- 


ficient basis for inferring “prediction,” much 


less causation. 


Problems in Measuring Change 


At first glance, the measurement of stability
and change in behavior appears to be a simple
task. However, it is fraught with perils for
which there are few simple solutions. I will
first consider regression effects and then the
problems encountered in using change scores.

Regression effects. Any score can be regarded
as a sum of a true score for the variable in
question and the errors of measurement that
cause the obtained score to deviate from the
true score. Sources of error include
momentary fluctuations in the phenomenon 
being measured, as well as biases and incon- 
sistencies in the measuring procedure. Like 
most multidetermined phenomena, errors of 
measurement are assumed to be normally dis- 
tributed around each of the true scores that 
would be obtained if there were no errors of 
measurement. If errors are normally distrib- 
uted around a true score, extremely large er- 
rors will be much rarer than small errors. 
Therefore, if a large error is made on one oc-
casion, the most probable outcome of a sub-
sequent measurement is a score that is closer 
to the true score than the first score was. If 
the first score was extremely high, then the 
subsequent score will be lower. On the other 
hand, if the first score was extremely low, the 
subsequent score will be higher. 

To illustrate the way in which regression 
effects can complicate research, suppose that 
100 subjects all have a true score of 50 on 
variable X. Due to errors of measurement, the 
actual scores obtained by these subjects on 
a particular occasion may be normally dis- 
tributed from 40 to 60, with a mean of 50. 
Suppose that the same 100 subjects are mea- 
sured again 6 months later, that their true 
scores have all remained at 50, and that their 
obtained scores are again normally distributed 
from 40 to 60, with a mean of 50. Unless
the error of measurement is perfectly reliable 
from the first to the second occasion, a plot 
of the relations between each subject’s first 
and second scores will show that subjects who 
scored lowest on the first occasion now score 
higher. Conversely, subjects who scored high- 
est on the first occasion now score lower. 

Next, suppose that we had performed an 
experiment to compare the effects of a par- 
ticular treatment on subjects who initially 
scored between 40 and 45 on variable X with 
the effects on those who scored between 55 
and 60. Or suppose we could not introduce 
an experimental manipulation but simply
wished to compare the developmental course 
of subjects who had initially scored low and 
high on variable X. In either case, we would 
find the Time 2 scores of both groups to be 
much closer to the overall mean of 50 than 
they were at Time 1, merely because the
classification of subjects was based on
measurement errors that were replaced for those
particular subjects by less extreme measurement
errors at Time 2. If instead of comparing low 
and high scorers we had studied the effects 
of a particular treatment on low scorers, we 
would have found their Time 2 scores to ap- 
proximate the mean for the entire sample and 
might erroneously conclude that the treat- 
ment was responsible for the increase in their 
scores. 
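
The illustration above can be sketched as a small
simulation. The Python code below is a minimal sketch under
stated assumptions: every subject has a true score of 50, the
errors on the two occasions are independent and normally
distributed, and an error standard deviation of 3.3 is chosen
so that nearly all obtained scores fall between 40 and 60. A
sample much larger than 100 is used only to make the group
means stable. The function name, the cutoffs of 45 and 55,
and the seed are hypothetical choices.

```python
import random
import statistics


def simulate_regression(n=10_000, true_score=50.0, error_sd=3.3, seed=0):
    """Measure the same subjects twice with independent errors.

    Returns the Time 1 and Time 2 means of subjects classified as
    low (<= 45) or high (>= 55) scorers on the basis of Time 1.
    """
    rng = random.Random(seed)
    time1 = [true_score + rng.gauss(0, error_sd) for _ in range(n)]
    time2 = [true_score + rng.gauss(0, error_sd) for _ in range(n)]
    pairs = list(zip(time1, time2))

    def group_means(selected):
        first = statistics.mean([a for a, _ in selected])
        second = statistics.mean([b for _, b in selected])
        return first, second

    low = [p for p in pairs if p[0] <= 45]
    high = [p for p in pairs if p[0] >= 55]
    return {"low": group_means(low), "high": group_means(high)}


if __name__ == "__main__":
    for label, (t1, t2) in simulate_regression().items():
        print(f"{label:>4} scorers: Time 1 mean {t1:.1f}, Time 2 mean {t2:.1f}")
```

No treatment is applied anywhere in the simulation, yet the
low scorers rise and the high scorers fall toward 50, which is
the pattern that could be mistaken for a treatment effect.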

Errors of measurement are not the only
source of regression effects. Behavioral
variables are influenced by so many determinants
that even true scores are subject to regression 
effects from one occasion to another. This is 
because the lowest true scores on variable X 
are obtained by subjects who have received 
the most extreme combination of X-depressing 
influences. By contrast, the highest true scores
on variable X are obtained by subjects who 
have received the most extreme combinations 
of X-enhancing influences. Because extreme 
combinations of influences on X are likely to 
be due at least in part to chance, the true 
scores of extreme scorers are likely to regress 
toward the population mean on subsequent 
occasions if X is a variable whose true value 
changes. Thus, children who display many 
behavior problems on one occasion are likely 
to display fewer problems on a subsequent 
occasion due to regression effects. Since most
behavioral variables are subject to change
(especially variables measured during the
course of development), regression in true
scores must be expected in addition to
regression due to errors of measurement.

Change scores. It would seem logical to 
compute the difference between an initial 
score and a later score to measure develop- 
mental change over time, as well as change 
in response to experimental manipulations. 
However, the size of each change score de- 
pends not only on the effects of intervening 
development and manipulations but also on 
[...] variable are likely to [...] scores are
likely to [...] change scores. Even though
regression effects may differentially affect
change scores of individual high and low
scorers, this would not bias a comparison of
groups receiving two different treatments,
provided that the groups receiving the
different treatments had the same distribution 
of initial scores. However, if one group in- 
itially scored higher than the other, regres- 
sion effects could enhance the change scores 
of one group while depressing the change 
scores of the other group. Even if change 
scores are not used, other considerations argue
against testing the effects of independent vari- 
ables with groups differing in initial scores 
on the dependent variable. Yet, because 
change scores mask both the initial size of 
scores and regression effects, they complicate 
the problem still further.

The most general solution to problems of
measuring change is to compare groups that are
initially well matched on all relevant
variables, including their scores on the
dependent variable. Random assignment,
stratified randomization, and formation of
matched blocks within which treatments are
randomly assigned provide the most
straightforward ways to obtain comparability.
When any form of matching is used, however,
care must be taken to avoid biased selection
from populations whose distributions differ.
For example, parents of lower-SES boys tend to
report more delinquent behavior problems than
parents of upper-SES boys (Achenbach, 1978a).
In order to match lower-SES boys to upper-SES
boys on delinquent behavior, it would be
necessary to select lower-SES boys whose
behavior is less delinquent than the average of
their population and/or upper-SES boys whose
behavior is more delinquent than the average of
their population. However, matching of this
sort is fallacious, because, besides being
unrepresentative of their SES groups, the boys
would be likely to show divergence toward the
means of their respective populations on
subsequent occasions due to regression effects.
Detailed treatment of statistical approaches to
the analysis of change is beyond the scope of
this article, but advice on compromise
solutions to measuring change has been provided
by Campbell and Stanley (1963), Cronbach and
Furby (1970), Harris (1963), McCall and
Appelbaum (1973), and Wilson (1975).
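
The fallacy of matching subjects drawn from populations
whose distributions differ can be sketched with a second
simulation. The Python code below is an illustration under
assumed values, not an analysis of any real data: two
hypothetical populations have mean true scores of 45 and 55,
true scores and errors of measurement each have a standard
deviation of 5, and subjects are "matched" by keeping only
those whose first obtained score falls between 48 and 52. All
names and numbers are assumptions made for the example.

```python
import random
import statistics


def simulate_matching(n=20_000, low_mean=45.0, high_mean=55.0,
                      true_sd=5.0, error_sd=5.0, band=(48.0, 52.0), seed=0):
    """Match two populations on a first score, then remeasure.

    Returns (Time 1 mean, Time 2 mean) of the matched subjects
    from each population; true scores do not change over time.
    """
    rng = random.Random(seed)

    def matched_group(population_mean):
        first_scores, second_scores = [], []
        for _ in range(n):
            true = rng.gauss(population_mean, true_sd)
            first = true + rng.gauss(0, error_sd)
            if band[0] <= first <= band[1]:  # keep only "matched" subjects
                first_scores.append(first)
                second_scores.append(true + rng.gauss(0, error_sd))
        return statistics.mean(first_scores), statistics.mean(second_scores)

    return {"low_population": matched_group(low_mean),
            "high_population": matched_group(high_mean)}


if __name__ == "__main__":
    for label, (t1, t2) in simulate_matching().items():
        print(f"{label}: Time 1 mean {t1:.1f}, Time 2 mean {t2:.1f}")
```

The two matched groups start with nearly identical means but
drift apart on remeasurement, each toward the mean of its own
population, although nothing has happened to either group.
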
Effects of Age, Cohort, and Time of
Assessment


In addition to general problems in measuring
change, assessment of behavior that can change
with age is complicated by the fact that age
differences may be confounded with
characteristics peculiar to the cohorts studied
and with the effects of cultural conditions
prevailing at the time of assessment.
Confounding of this sort is most evident when
longitudinal conclusions are drawn from
cross-sectional data. After considering sources
of error in this practice, I will outline
various research strategies for separating the
effects of age, cohort, and time of assessment.
(See Achenbach, 1978b, chaps. 4 and 7, for a
more detailed presentation.)

Drawing longitudinal inferences from
cross-sectional data. Because longitudinal
research is so slow and expensive, it is
tempting to use cross-sectional data to infer
age changes in behavior. It might seem
reasonable to assume that if 6-year-olds in a
population behave differently from 16-year-olds
in the same population, then the 6-year-olds
will change until, when they reach 16, they will
resemble the current 16-year-olds. Depending on
the behavior in question and the ages spanned,
this can be a very risky assumption.
Differences in the birth order, SES, cultural
experience, health, and schooling of the two
cohorts can have powerful effects on their
developmental course. Differences in all of
these variables and many more can occur even
when cohorts appear to belong to the same
population because they live in the same
locality and attend the same schools. The
recent decline in the birthrate, for example,
means that children who are currently 6 years
old will have fewer siblings when they are 16
than the current 16-year-olds. Although the
effects of such trends are difficult to measure
precisely, the myriad of uncontrolled variables
acting on each cohort makes it risky to assume
that cross-sectional age differences reflect
the same changes in behavior as occur when a
single cohort is studied longitudinally.

One of the clearest contradictions between
cross-sectional and longitudinal findings has
arisen from the study of IQ test performance at
various ages. The norms for the
Wechsler Adult Intelligence Scale (WAIS;
Wechsler, 1955), for example, were based on 
cross-sectional samples that showed progres- 
sive declines in performance from early to 
late adulthood. By contrast, longitudinal data 
on samples of adults retested at intervals of 
12 years have shown significant increases in 
WAIS performance (Kangas & Bradway, 
1971). Likewise, a cross-sectional study of 
the Binet IQs of southern black children 
showed progressively lower scores from the 
younger to the older cohorts (Kennedy, Van 
De Riet, & White, 1963). However, retesting 
of the same children 5 years later showed that 
their IQs had remained stable (Kennedy, 
1969). In both cases, the declines in IQ im- 
plied by the cross-sectional data were due to 
differences in cohorts other than their ages: 
In the WAIS samples, the older cohorts had 
grown up when average educational attain- 
ment was less than for the younger cohorts. 
In the southern black sample, the cohorts 
were from 5 to 16 years old but included only 
children who were in elementary school. This 
means that the youngest cohorts contained 
children who were bright enough to begin 
school early, but the older cohorts contained 
progressively larger proportions of students 
who had not been promoted beyond elemen- 
tary school at the customary age. 

Jensen's approach. In a widely publicized 
alternative to the ordinary cross-sectional de- 
sign, Jensen (1977) has made cross-sectional 
comparisons between siblings to test the hy- 
pothesis of a cumulative deficit in the IQs of 
southern black children. He has done this by 
computing the differences between the IQs of 
black siblings to determine whether older 
black children consistently have lower IQs 
than their younger siblings. He then com- 
pared these sibling differences in IQ to those 
of white children in the same schools. Because
the younger and older cohorts were from the
same families within each racial group, Jensen
hoped to control for the possible effects of
cohort differences in gene pool, SES, and
other factors that might arise at various ages.
Consistent with the cumulative deficit hy- 
pothesis, Jensen found that younger blacks 
generally had higher IQs than their older sib-
lings. The sibling differences were less con-
sistent for whites, indicating no general trend
toward cumulative deficit. Jensen concluded
from this cross-sectional finding that the IQs 
of blacks but not whites were declining with 
age. 

Unfortunately, Jensen’s alternative leaves 
so many possible variables uncontrolled that 
it does not improve much on standard cross- 
sectional designs. Without passing judgment 
on the cumulative deficit hypothesis per se, 
it is instructive to consider weaknesses of the 
Jensen study that argue against drawing 
longitudinal conclusions from even this re- 
vised version of a cross-sectional design. First, 
Jensen at one point describes his sample as 
including “all of the white and black children 
enrolled in the public schools of a small rural 
town” (p. 185), but then he says that “the 
present data . . . include only subjects rang- 
ing in age from 6 to 16 years” (p. 187). Since 
the school system probably enrolled children 
younger than 6 and older than 16, Jensen’s 
unreported procedure for excluding subjects 
could have exerted a bias like that apparent 
in the Kennedy et al. (1963) cross-sectional 
study. The inclusion of siblings would not 
entirely eliminate such a bias. There is also 
no report of alternative schooling opportun- 
ities for either whites or blacks that might 
cause differential attrition from the public 
schools at different ages. Nor is there any re- 
port of how sibling status was ascertained or 
of how the distribution of half-siblings and 
stepsiblings might differ between the races.
All of these factors could influence the results 
of cross-sectional sibling comparisons. 

The primary comparisons between the races
involved analyses by family size such that the
mean IQ differences were computed between each
child and each of his/her younger siblings in
families having 2, 3, 4, and 5 children,
respectively. However, few white families had
more than 3 children. In fact, all four
comparisons for 5-child families involved only
2–4 white sibling pairs. Besides the obvious
statistical fallacies of reporting and testing
means with such small samples and with repeated
representation of the same children in these
samples, the different distribution of family
sizes in the black sample could interact with
birth rank in ways that are not controlled by
cross-sectional comparison of siblings. For
example, more of the older black children in
the sibling comparisons for each family size
were likely to be lower in birth rank simply
because the black families were larger on the
average. Since neither the birth orders nor the
ages were reported for children of each race
having each family size, these factors cannot
be ruled out.

The failure of Jensen’s (1977) sibling-based
cross-sectional approach to control for so many
variables does not by itself argue for specific
alternative explanations for the differences
between IQs of black children and their younger
siblings. However, because Jensen’s approach
leaves so many variables uncontrolled, the
findings do not enable us to choose among a
variety of possible explanations. As an
example, one possible rival interpretation of
Jensen’s findings is that recent improvements
in living conditions for black families have
produced higher IQs in their most recently born
children than in their children born before
conditions improved. Nothing in Jensen’s data
can rule out alternative interpretations of
this sort, and neither conventional
cross-sectional designs nor Jensen’s approach
is likely to discriminate between age changes
and stable differences between age cohorts.

Separating the effects of age, cohort, and
time of assessment. Despite the relative
advantages of longitudinal over cross-sectional
designs for identifying changes with age,
longitudinal designs can also produce
misleading results if there is selective
attrition of subjects or if only one cohort is
studied. Research on a single cohort may
obscure the fact that what appears to be an
age-related change in behavior is either
peculiar to that cohort or is manifested by
other cohorts at that point in history,
regardless of their age. In a 2-year
longitudinal study of adolescents, for example,
Nesselroade and Baltes (1974) found that
cultural changes from the beginning to the end
of the study had more influence on personality
than did age. The cohorts ranged from 13 to 16
years in initial age and 15 to 18 in final age,
but all four cohorts showed declines in
superego strength, social-emotional anxiety,
and achievement. As a result, the youngest
subjects resembled the oldest subjects at the
end of the study rather than resembling the
oldest subjects as they had been at the
beginning of the study. If a single cohort had
been studied longitudinally, it might have been
erroneously inferred that the changes in
personality were a function of age per se,
whereas comparison of the cohorts showed that
the changes occurred simultaneously in all
four cohorts.

Because age, cohort, and cultural-historical
effects may be confounded in conventional
cross-sectional and longitudinal designs, it is
important to separate these variables as
explicitly as possible in any studies of
development, normal or abnormal (cf. Schaie,
1965). As an aid to separating these variables
conceptually, the relevant relationships
between them are illustrated in Table 1. Table
1 shows that a cross-sectional study comparing
5-, 7-, and 9-year-olds in 1978 would require
children from cohorts born in 1973, 1971, and
1969, respectively. A threat to the internal
validity of this purely cross-sectional design
is that differences between the 5-, 7-, and
9-year-olds might be attributable to
characteristics other than age. A threat to
external validity is that any similarities or
differences among cohorts might not be
generalizable to earlier or later points in
time because of cultural-historical changes.

As illustrated in Table 1, a longitudinal
study of the 1975 birth cohort at ages 5, 7,
and 9 would require assessing the 1975 cohort
in 1980, 1982, and 1984. The internal validity
of this design is vulnerable to the possibility
that differences found from 1980 to 1982 and
1984 might be due to cultural-historical
changes rather than to aging per se. The
external validity may be limited by
peculiarities of the cohort that distinguish it
from cohorts born earlier or later.

A third design, the time-lag design, is
sometimes used to identify cultural-historical
effects. As illustrated in Table 1, historical
changes in the behavior of 11-year-olds could
be identified by assessing subjects from the
1969, 1971, 1973, and 1975 cohorts when they
reach age 11 in 1980, 1982, 1984, and 1986,
respectively. However, this design confounds
possible differences in the cohorts with
differences in year of assessment.
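
The three basic designs can be read as different slices
through one grid in which year of assessment equals birth
cohort plus age. The following Python sketch, offered only as
an illustration, rebuilds the slices described for Table 1.
The function names are hypothetical; the cohorts and ages are
those used in Table 1.

```python
COHORTS = [1969, 1971, 1973, 1975]  # birth cohorts in Table 1
AGES = [5, 7, 9, 11]                # ages in Table 1


def assessment_year(cohort: int, age: int) -> int:
    """Year in which a birth cohort reaches a given age."""
    return cohort + age


def cross_sectional(year: int) -> dict:
    """Fixed year of assessment: maps each cohort to its age that year."""
    return {c: year - c for c in COHORTS if (year - c) in AGES}


def longitudinal(cohort: int) -> dict:
    """Fixed cohort: maps each age to the year of assessment."""
    return {age: assessment_year(cohort, age) for age in AGES}


def time_lag(age: int) -> dict:
    """Fixed age: maps each cohort to the year it reaches that age."""
    return {c: assessment_year(c, age) for c in COHORTS}


if __name__ == "__main__":
    print("cross-sectional, 1978:", cross_sectional(1978))
    print("longitudinal, 1975 cohort:", longitudinal(1975))
    print("time lag, age 11:", time_lag(11))
```

Because any two of age, cohort, and year determine the third,
each design holds one of them constant and leaves the other
two confounded with each other.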


Table 1
Interrelationships Among Time and Age
Variables Involved in Developmental Analyses

Birth                 Age
cohort      5       7       9       11
1969      1974    1976    1978ᵃ   1980ᵇ
1971      1976    1978ᵃ   1980    1982ᵇ
1973      1978ᵃ   1980    1982    1984ᵇ
1975      1980ᶜ   1982ᶜ   1984ᶜ   1986ᵇ

Note. Figures in the table are the years in which each
birth cohort would be studied at each age listed.
(Adapted from Achenbach, 1978b.)
ᵃ Cross-sectional.
ᵇ Time lag.
ᶜ Longitudinal.


To avoid confounding age, cohort, and time of
assessment, several designs have been developed
to combine aspects of the cross-sectional,
longitudinal, and time-lag strategies. The
Nesselroade and Baltes (1974) study of
adolescent personality combined these
strategies in a longitudinal sequential design.
By assessing several birth year cohorts over
the same longitudinal period, it is possible to
do cross-sectional comparisons of cohorts at
any point in the study, longitudinal
comparisons of each cohort, and time-lag
comparisons among cohorts as they reach a
particular age in successive years, as
illustrated in Table 2. It can thus be
determined whether changes in behavior are
attributable to cultural-historical changes,
age changes, or an interaction between the two.


Table 2
Longitudinal Sequential, Cross-sectional
Sequential, and Time-Lag Sequential Designs

Birth                     Age
cohort      5      6      7      8      9

Longitudinal sequentialᵃ
1971                    1978   1979   1980
1972             1978   1979   1980
1973      1978   1979   1980

Cross-sectional sequentialᵇ
1971                    1978   1979   1980
1972             1978   1979   1980
1973      1978   1979   1980

Time-lag sequentialᶜ
1971                    1978   1979   1980
1972                    1979   1980   1981
1973                    1980   1981   1982

Note. Adapted from Achenbach (1978b).
ᵃ A single sample from each of three cohorts is
assessed on three occasions from 1978 to 1980.
ᵇ Three samples are drawn from each of three cohorts
on three occasions from 1978 to 1980.
ᶜ Three samples from each of three cohorts are
compared on three occasions from 1978 to 1980, 1979
to 1981, and 1980 to 1982, respectively.


Another design, the cross-sectional sequential
design, combines cross-sectional and
longitudinal strategies by making
cross-sectional comparisons of samples from
several cohorts at several different points in
time. As illustrated in Table 2, the pattern of
analyses is similar to that in the longitudinal
sequential design, except that instead of
longitudinally following one sample from each
cohort, new samples are drawn from each cohort
at each time of assessment. Drawing new samples
avoids biases affecting longitudinal studies
due to attrition, the effects of repeated
testing of the same subjects, and initial
selection for expected availability. On the
other hand, because new samples are drawn for
each assessment, changes in individuals over
time cannot be traced, and congruence among
successive samples may be reduced by
unrecognized sampling fluctuations and changes
in the composition of the cohorts from which
the samples are drawn.

In an additional design, known as the time-lag
sequential design, two or more samples are
assessed from each cohort as they reach
different ages in different years. As
illustrated in Table 2, such a design could be
used to study the behavior of the 1971, 1972,
and 1973 birth cohorts as they reach the ages
of 7, 8, and 9 in 1978-1982, respectively. This
design permits separation of effects due to
time of assessment from effects due [...]
assessment without [...] samples. However, like
cross-sectional sequential designs, the
time-lag sequential design is [...]
same cohort. The chief advantage of this de-


Outstanding Research Needs


There is no doubt that research on child
psychopathology is in a rudimentary stage and
that it faces many challenges. Perhaps the
greatest challenge is to accept the task of
forging a science suitable to the problems
despite the primitiveness of our knowledge base
and the influence of myth, fad, and custom.
Other challenges can be summarized in terms of
general needs as follows: the need to view
children as continually changing with respect
to biological, cognitive, social, educational,
and emotional development; the need to assess
behavior in relation to development that is
normal and adaptive within the child’s culture;
the need to evaluate current behavior from a
longitudinal perspective on what has gone
before and what is likely to follow; the need
to see children’s behavior in a family context
over which children have little control; the
need to develop alternatives to mental health
services that require the recipients to assume
the role of patient; and the need to abide by
ethical constraints without abdicating our
responsibilities to seek knowledge that will
benefit troubled children. A thorough
discussion of the latter issue is beyond the
scope of this article, but it is clear that the
pendulum swing from inadequate ethical
guidelines to minute regulation of everything
labeled research presents formidable obstacles
to the study of child behavior (cf. Achenbach,
1978b, chap. 10). However, lacking validated
means for helping troubled children, we must
consider not only the ethics of specific
research procedures but, to paraphrase Haywood
(1977), the ethics of failing to do the
research needed to improve our ways of helping
children. In addition to the general needs
facing research on child psychopathology, a
number of needs specific to its current stage
of development are outlined in the following
sections.


Standardized Measures 


The handicaps stemming from our lack of
well-validated and standardized measures have
already been alluded to, but insufficient
effort is currently devoted to remedying this
situation. Although highly sophisticated
instrumentation requires well-refined and
validated theory, it is equally true that
theory is unlikely to progress without reliable
descriptions of the phenomena in question. The
development of standardized descriptive
instruments for behavior is a slow, arduous,
and unglamorous enterprise that is effectively
discouraged by our graduate training programs,
reward systems for researchers and teachers,
professional journals, and funding agencies. In
the face of such obstacles, is it any wonder
that our standardized instruments are limited
primarily to IQ and achievement tests, which
owe their existence to subsidies from the
educational system?

If it is acknowledged that research on child
psychopathology is indeed in a rudimentary
stage, it should be clear that development of
standardized procedures for assessing
pathologies and competencies and for measuring
change in them should command a high priority.
The more diverse the sources of data (parents,
teachers, peers, clinicians, self-reports) and
the greater the focus on important life
situations, the more useful such procedures are
likely to be. It is only when researchers and
clinicians can rely on a common body of
baseline measures that we are likely to see
significant coordinated advances in knowledge.


Type of Subject × Type of Treatment
Designs


A vast quantity of research has accumulated on
the outcome of adult treatment. However, the
value of this research has been limited by the
heterogeneity of its subject samples, without
adequate regard for differences among the
subjects. The much smaller quantity of research
on child treatment shares the same flaw,
despite the fact that when subject
characteristics have been examined, their
effects have been found to account for more
variance in outcome than differences in
treatment alone (e.g., Love & Kaswan, 1974;
Miller, Barrett, Hampe, & Noble, 1972). Because
it should be obvious that no single treatment
modality will be better than all other
modalities for all children, regardless of
their constellation of age, sex, IQ, SES,
family condition, and so forth, we should not
have to repeat the history of adult
psychotherapy research before we recognize that
outcome studies will be most informative if
they take account of subject variables and
especially of Subject × Treatment interactions.

Subject × Treatment designs are, of course,
even more difficult to implement than ordi- 
nary controlled studies of treatment outcome, 
which are themselves formidable. However, 
because the effects of most interventions are 
likely to be subtle and multidetermined, it is 
essential that interventions not be viewed in- 
dependently of the children with whom they 
are used. Any outcome study that provides 
systematic comparison of results for children 
differing in demographic, diagnostic, and other 
characteristics will be far more valuable than 
studies that do not. 


Long-term Follow-ups 


Although by no means infallible, the most 
comprehensive approach to determining the 
significance of potential predisposing factors 
and manifestations of psychopathology in 
childhood is through long-term follow-ups. It 
has become evident that even major adult 
psychiatric disorders such as schizophrenia 
may be best understood through longitudinal 
study of children who are at high risk for 
the adult disorder (Mednick, Schulsinger, 
Higgins, & Bell, 1974). Because of the
relatively low base rate for severe
psychopathology in the general population,
long-term study of subjects known to be at risk
is more efficient than traditional longitudinal
studies of broad samples of children. However,
studies of broad samples can also be
informative with respect to psychopathology if
they are sufficiently focused on the context
and course of disorders that a large percentage
of children manifest at least temporarily
(Thomas, Chess, & Birch, 1968). As mentioned
earlier, relationships identified in single
cohorts studied longitudinally provide an
inadequate basis for definitive inferences, but
the complexity of human behavioral development
and the rudimentary state of our field make
longitudinal data a unique source of hypotheses
for further test, even if the effects of age,
cohort, and time of assessment cannot be fully
separated.

In addition to basic longitudinal research,
it is important to build a longitudinal dimen- 
sion into studies of treatment outcome. Un- 
less adequate follow-up periods are used, it 
cannot be known how interventions will affect 
the behavioral, social, emotional, cognitive, 
and educational development of children. Suc- 
cessful adaptation depends not merely on the 
removal of troublesome behavior but also on 
the continued acquisition of progressively 
more advanced coping strategies. Studies that 
include follow-ups of even 1 or 2 years often 
show results quite different from shorter fol-
low-ups of the same children (e.g., Hampe,
Noble, Miller, & Barrett, 1973). Arduous 
though they may be, follow-ups over periods 
of at least a year should become a routine 
part of clinical services, as well as of research 
on these services. 


Cumulative Programmatic Research 


The volume of publications related to child 
psychopathology far outweighs its actual im- 
pact. Although no field shows a linear pro- 
gression of knowledge, especially during its 
formative stages, the literature on child psy- 
chopathology reflects two problems that may 
be remediable. One is the preponderance of 
one-shot studies whose contribution to knowl- 
edge is limited by the uniqueness of their pro- 
cedures, measures, and subject samples. Since 
almost no study provides definitive answers
by itself, a study is of little value if it
cannot be linked to others for purposes of
replication, generalization, cross-validation,
and triangulation of findings.

The preponderance of one-shot studies is 
not surprising in view of the obstacles to pro- 
grammatic research encountered by the grad- 
uate students, academicians, and Practitioners 
who have been the chief Contributors to our 
literature. However, unless research can be 
better coordinated with respect to procedures,
[...] subject samples, and unless [...]


THOMAS M. ACHENBACH 


teaching and/or clinical responsibilities. In
addition, graduate students and others who 
are limited to doing one-shot studies should 
be encouraged by their advisors and by jour- 
nal policies to link their work as closely as 
possible to other research through the use of 
standardized measures, procedures, and sub- 
ject samples that can be generalized and rep- 
licated. By their sanctions against closely 
linked series of studies, graduate committees 
and journal editors may otherwise be per- 
petuating fragmentation where convergence 
is needed. 

A second impediment to cumulative prog- 
ress is the tendency for practitioners and 
researchers alike to identify with a single 
theoretical viewpoint. This has resulted in 
what might be called a “horizontal” progres- 
sion within each school of thought, rather 
than a “vertical” progression, whereby ideas 
from various sources are tested and revised or 
discarded according to the empirical findings 
they generate. As productive as research with- 
in a single paradigm may sometimes be, no 
single theory or level of analysis is yet so
powerful for representing child psychopath- 
ology that it deserves exclusive allegiance. In 
the foreseeable future, cumulative knowledge 
is more likely to emerge from integration of 
multiple perspectives than from dogmatic ad- 
herence to a single one. 


Linking Research to Service Systems 


The ultimate aim of research on child psy- 
chopathology is to aid children by preventing 
maladaptive development and by ameliorating 
it when it does occur. Children receiving
treatment through the current mental health
system represent only the tip of a very large
iceberg that encompasses the educational,
legal, medical, and welfare systems. When direct
mental health services are available to
children at all, they are often sought only after
the other systems have failed. These systems
are ill equipped to identify and [...]
maladaptive deviations in behavioral development,
but they are representative of the
social context in which children develop. It is
within this context of mixed motives and messages,
competing agendas, maldistribution of
resources, and bureaucracy that the fruits of
research must ultimately be consumed.


None of the systems is designed to foster
research, much less to implement its findings.
Even though the mental health system should
be the most responsive to research on child
psychopathology, it raises barriers of its own.
It is clear, for example, that the personal
philosophies of mental health practitioners
determine the treatments received by clients
to a far greater extent than in organic medicine.
The concomitantly greater difficulty of
changing one’s approach to treatment of
psychopathology than of trying out a new
medication for an organic disorder probably
accounts for the resistance of many practitioners
to participating in research and to adopting
new approaches, no matter what research
shows about their customary approach. On
the other hand, research related to treatment
of children has rarely offered practitioners
well-validated alternatives to their customary
approaches. Is it therefore surprising that
little effective cross-fertilization occurs
between research and practice?

To promote cross-fertilization as well as to
increase the ecological validity and potential
application of their work, researchers must
gear their work as closely as possible to the
needs of practitioners and must provide
practitioners with usable results in a constructive
manner. It is only through implementation
in practice that researchers can obtain feedback
on the effectiveness of their efforts.
Sex Roles and Psychological Well-Being:
Perspectives on Methodology


Judith Worell
University of Kentucky 


The purpose of this article is to delineate some sources of problematic
methodology in recent research designed to relate current measures of sex-role
orientation to indices of psychological well-being. Practices and procedures in
sex-role research are examined in relation to orthogonal scales of sex-role
orientation that provide independent measures of masculinity, femininity, and a
newer assessment of androgyny. Directions for increased conceptual and
methodological clarity include theoretical and psychometric definitions of
androgyny, the relationship of sex-role typing to other aspects of interpersonal
functioning, and varying procedures in sex-role and gender distinction,
population sampling, and construct validation. Issues are raised concerning the
generality of sex-role measures and the desirability of direct behavioral
validation criteria.


New formulations of psychological health
and well-being have renewed an interest in the
measurement of masculinity and femininity
(Bem, 1974; Berzins, Welling, & Wetter,
1978; Constantinople, 1973; Heilbrun, 1976;
Spence, Helmreich, & Stapp, 1975). In contrast
to earlier conceptions of sex-typed personality
descriptions that relied on a single
bipolar dimension, recent approaches to assessment
of masculinity and femininity view
these as independent, orthogonal dimensions.
When treated in this manner, characteristics
of masculinity and femininity can be measured
in varying amounts in the same individual.
Persons are considered to be sex typed
to the extent that they endorse a relatively
high degree of one set of characteristics in
preference to the other. The most conceptually
seductive outcome of the orthogonal model of
sex typing is androgyny, whereby the person
endorses relatively equal numbers of masculine
and feminine traits (Bem, 1974). The
flurry of new scales to measure androgyny has
been followed by a storm of research designed
to relate responses on these scales to “something
else.”

The aim of this article is to examine the 
methodological characteristics of recent re- 
search on psychological sex roles, with the in- 
tent of clarifying some current practices and 
procedures in relation to commonly accepted 
criteria of research design and analysis. Since 
earlier reviews have considered at length and 
in detail the psychometric properties of both 
traditional and recent sex-role scales (Con- 
stantinople, 1973; Kelly & Worell, 1977; 
Worell, Note 1), issues in scale construction 
will be minimized here. The focus of the pres- 
ent remarks will be on the implications of 
current sex-role personality measurement and 
research for conceptions of psychological 
health and personal well-being. 

The theoretical link between sex typing and 
adjustment has been altered dramatically 
since the introduction of the androgyny model.
Traditional formulations of sex typing sug- 
gest that adoption of the sex roles appropriate 
to one’s male or female gender is develop- 
mentally desirable. Deviations from culturally 
sanctioned sex-role behavior were considered 
maladaptive and undesirable (Kagan, 1964; 
Kohlberg, 1966; Mussen, 1969). In contrast, 
the androgyny model of sex-role organization
suggests that a relative balance of sex-typed 
characteristics may lead to the most advan- 
tageous outcomes. It is important to examine 
the recent research strategies in relation to 
the differing assumptions that underlie the 
selection and interpretation of dependent vari- 
ables. Androgyny has been compared to sex 
typing in relation to one or more of the fol- 
lowing outcomes: (a) adaptive, flexible, and 
effective interpersonal behavior; (b) self- 
esteem or positive self-evaluation; (c) free- 
dom from obvious pathology; and (d) broad 
life-style coping variables. Each of these ad- 
justment criteria will be discussed in the con- 
text of current sex-role research. Since this 
discussion is not intended as a review of the 
literature, only selected illustrative examples 
will be included.

Bem (1974, 1975, 1976) conceives of the 
androgynous person as adaptive, flexible, and 
effective in particular interpersonal contexts.
Accordingly, the androgynous person can be 
both instrumental (assertive, competent, 
forceful, independent) and expressive (nur- 
turant, warm, supportive, compassionate), de- 
pending on the demands of the situation. The 
outcome of an androgynous orientation is a 
high degree of alternative options for attain- 
ing interpersonal reinforcement in situations 
requiring culturally sex-typed behaviors. On 
the Bem Sex-Role Inventory (BSRI), an- 
drogyny was originally scored as the relative 
balance, determined by a negligible t ratio,
between masculine and feminine sex-typed 
characteristics (Bem, 1974). In contrast, sex- 
typed persons scored significantly higher on 
either their masculine or feminine sex-role 
traits. Bem considers the sex-typed individual 
to be more constricted and behaviorally limited
in situations in which sex-inappropriate,
or cross-sex-typed, behavior is required.
Individuals who are highly masculine or feminine
sex typed may inhibit or suppress these cross-sex
behaviors to avoid self- or other-dis- [...]

In a series of construct validation studies,
Bem and her associates pro- [...]
males showed lowered supportive, playful, and
expressive behaviors across several situations,
and sex-typed females failed to maintain
independence under external pressure to conform
(Bem, 1975; Bem & Lenney, 1976;
Bem, Martyna, & Watson, 1976). These apparent
deficits in the behavioral repertoire of
sex-typed, in comparison to androgynous, persons
lend support to a flexible-restrictive
interpretation. If psychological health and
well-being are defined in terms of the availability
of behaviors for achieving interpersonal
satisfaction, then according to Bem, there may be
many instances in which sex-typed persons are
clearly at a disadvantage.

A second important dimension of psychological
well-being entered into the issues surrounding
the measurement of androgyny. Using
the Personal Attributes Questionnaire
(PAQ) of sex-role endorsement, Spence et al.
(1975) found that both male-valued and
female-valued scores contributed to a measure
of self-esteem. These authors suggested that
masculinity and femininity may contribute in
an additive way to an individual’s positive
self-evaluation. Consequently, their definition
of androgyny included absolute strength as
well as the relative balance of masculinity
and femininity scores. The decision to measure
absolute strengths of each sex-role dimension
resulted in a fourfold index, determined by a
median split of the combined male and female
PAQ scores. Here, the androgynous persons
are those who score above the median on
both masculinity and femininity. A new category,
consisting of those who score below the
median on both scales, is termed undifferentiated.
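
The fourfold median-split classification described above can be sketched in a few lines of Python. This is an illustrative sketch only and not the scoring procedure published with the PAQ: the function name `fourfold_groups`, the group labels, and the handling of scores that fall exactly at the median (counted here as not above it) are assumptions introduced for the example.

```python
from statistics import median


def fourfold_groups(scores):
    """Assign each respondent to one of four sex-role groups.

    `scores` is a list of (masculinity, femininity) pairs pooled across
    male and female respondents. Cut points are the pooled medians.
    Assumption: a score equal to the median counts as "not above" it.
    """
    m_median = median(m for m, _ in scores)
    f_median = median(f for _, f in scores)

    labels = []
    for m, f in scores:
        high_m = m > m_median
        high_f = f > f_median
        if high_m and high_f:
            labels.append("androgynous")
        elif high_m:
            labels.append("masculine")
        elif high_f:
            labels.append("feminine")
        else:
            labels.append("undifferentiated")
    return labels
```

Because the medians are taken from the combined distribution, the same cut points apply to every respondent regardless of gender.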

The utility of the fourfold scoring system
for conceptions of positive well-being was
demonstrated in the correlations between the
PAQ and a measure of self-esteem. When
androgyny was defined in terms of both response
strength and balance, androgynous persons
were the highest in self-esteem, followed by
those who scored high masculine–low feminine,
or sex-typed masculine. When Spence et
al. rescored these data according to the
balance method, androgynous persons scored
at only a moderate level of self-esteem. The
implications of these findings for positive
psychological functioning have led to the adoption
of a four-quadrant scoring system by
Bem (1976, 1977), as well as by developers
of two more recent sex-role inventories
(Berzins et al., 1978; Heilbrun, 1976). The scoring
issue is not a closed one; some researchers
continue to use and defend a balance, rather
than an absolute strength, definition of androgyny
(Jones, Chernovetz, & Hansson,
1978; Wiggins & Holzmuller, 1978). As currently
conceptualized, however, most researchers
have preferred to include the absolute
strength of masculine and feminine characteristics
as an indicator of positive and effective
interpersonal functioning. From a response
repertoire approach to adaptive situational
behavior, the person who endorses more positive
self-characteristics appears to have higher
self-esteem and should be functional
interpersonally in a wide variety of culturally
sex-typed situations.

A third indication of effective psychological 
functioning is the extent to which individuals 
tan remain relatively free from obvious path- 
ology or self-defeating patterns of behavior. 
Sex-role measures have been variously related 
to such adjustment indices as anxiety, self- 
criticism, dependency, helplessness, depres- 
sion, problem drinking, neurosis, introversion, 
and requests for personal counseling (Baggio 
& Neilson, 1976; Deutsch & Gilbert, 1976; 
Gump, 1972; Heilbrun, 1968; Jones et al., 
1978). Since androgynous persons are as- 
sumed to be more adaptive, they should have 
better coping skills and be relatively free of 
problem behaviors. One concern related to this 
aspect of adequate psychological functioning 
is the extent to which individuals provide 
themselves with negative self-communications 
and negative self-evaluations (Bandura, 1969; 
[...], 1974; Meichenbaum, 1977). Although
at least two of the presently used sex-role
scales (BSRI; PAQ) confine their attributes
to positive and valued characteristics,
the contribution of negative self-attribution
to sex-role categorization has implications for
[...] of the adjustment indices listed above
(Kelly & Worell, 1977).
[...] evidence suggests that androgynous
males, and some androgynous females, endorse
the fewest negative self-statements (Kelly,
Caudill, Hathorn, & O’Brien, 1977; Wiggins
& Holzmuller, 1978). In contrast to these
positive findings, androgynous males were
also found to contribute more than their ex- 
pected proportion to self-reported problem 
drinking and social introversion (Jones et al., 
1978; Wiggins & Holzmuller, 1978). The re- 
search findings in this area of adjustment 
pathology remain cloudy and inconsistent, es- 
pecially with regard to androgynous males. 
Sex-role functioning, as defined by the present
measures, may have differential prediction for 
some psychopathological variables, but no 
consistent pattern has emerged. 

Finally, some investigators have hypothe- 
sized that the flexibility presumed to operate 
in instrumental and expressive domains will 
be manifested in more effective functioning in 
a wide variety of cognitive and interpersonal 
life-style variables. Research efforts in this 
direction have attempted to relate sex-role 
types to such variables as creativity, liberal 
political views, endorsement of feminist views, 
sexism, marital adjustment, parenting skills, 
multiple personality factors, and preferred 
coital position. In some of these studies, the 
theoretical relationship between effective psy- 
chological functioning and androgyny seems 
quite tenuous. The general theme seems to be 
that androgynous persons ought to be better 
at everything, because they are presumably 
more flexible and adaptive. When androgyny 
is approached from this viewpoint, it fre- 
quently slips into becoming a contest between 
white hats and black hats. 

The discussion of the theoretical underpin- 
nings of current androgyny research becomes 
important when we take a look at some of 
the methodological problems that beset the 
research efforts in this field. In the remainder 
of this article, I wish to consider the more
glaring examples of questionable practices, as
well as some of the better ones, in the
conceptualization, design, and analysis of
research into the nature of psychological sex
roles. In doing so, examples will be drawn as
prototypes from both published and unpublished
research and will be unreferenced. The
purpose here will be to organize and discuss
these problems in the spirit of constructivism,
rather than to point a finger at any particular
culprits. The intent of this discussion is to
inject a dose of preventative thought and
reflection at the early planning stages so that
small problems in design and presentation can
be remediated or obviated. 


Problems and Practices 


The following discussion will generally ap- 
proximate a theory, design, procedure, and 
analysis format. Specific topics to be covered 
are (a) application of theory to design; (b) 
sex-role and gender distinctions; (c) sampling 
practices; (d) test and construct validation 
procedures; and (e) psychometric and statis- 
tical considerations. A final section will deal 
with some unresolved issues.


Theory and Rationale 


A substantial portion of manuscripts on sex-role
issues fail to explicate a theoretical foundation
for the research. This is especially true
when a variety of tests or measures are
administered. Frequently, little consideration is
given to the constructs being measured and 
their proposed relationship to any sex-role 
theory. In particular default are studies that 
take a “one-shot” or a “statistical dragnet” 
approach. In the one-shot approach, the re- 
searcher tends to oversimplify the task of 
construct validation, implying that any one 
instance or situation can possibly confirm or 
negate a psychological concept. For example, 
suppose a researcher finds that sex-role categories
do not predict which women wish to be
addressed as Ms., rather than Miss or Mrs.
Does this finding negate the utility of the an- 
drogyny construct? In a similar mode, the 
statistical dragnet technique correlates a vari- [...]
research, [...]
hypotheses that prompted the research in the
first place. Even a replication study might dis- 
cuss briefly (a) why this study is useful for 
the procedures it intends to replicate, (b) 
what it expects to accomplish with this par- 
ticular sample, and (c) what the implications 
and/or limitations are in advance of what the 
results might uncover. What has occurred in
the area of sex-role research is a rash of new
scales and a bevy of eager researchers who
wish to relate androgyny to everything and
anything.

There also appear to be a number of confusions
and disagreements concerning the theory
behind the androgyny research. Bem, for
example, limited her conceptions of flexibility 
to situations that require effective, interper- 
sonal, and culturally sex-typed behavior. Sup- 
pose a researcher wishes to extend this notion 
of flexibility to preferences for differing coital 
positions. Here, it is important to specify in 
advance the evidence for any particular posi- 
tion to be sex-role appropriate or inappro- 
priate. Only in this manner can we conclude 
that sex-typed persons tend to inhibit cross- 
sex behavior or that androgynous persons can 
comfortably engage in particular practices. 
Likewise, if one wishes to correlate sex-role 
types with acceptance of the women’s libera- 
tion movement, what is the basis for predict- 
ing a favorable response from androgynous 
individuals? If a man describes himself as 
warm and supportive, does this necessarily 
mean that he is willing to assume responsibil- 
ity for dirty dishes and diapers? Perhaps this 
is a viable prediction, but again, it requires 
considerable support in advance. This issue 
also touches on the distinction between sex- 
role traits and sex-role behaviors. Within the 
present context there is no a priori reason
why sex-role traits cannot be predictive of a
broader band of sex-role behaviors than those
encompassed by the instrumental and expressive
domains. In practice, however, the basis
of this type of extended prediction is frequently
unclear, making the results uninterpretable.

Cross-cultural and contrast-group research
can be particularly fertile for application of
androgyny theory to sex-role research. Since
the scales currently in use were constructed
and standardized on middle-class American
college students, hypotheses can be tested on
how subcultures and differing vocational or
interest groups compare. In these contexts, it
is the responsibility of the researcher to propose
some hypotheses about pertinent differences
in cultural norms or sex-role practices
and how these might affect the obtained results.
Sex-role stereotypes in Formosa or Brazil
may indeed differ from U.S. responses, but
how do the obtained data articulate with the
cultural standards or role demands of each
culture? Similarly, there are implicit population
hypotheses embedded in an assessment of
sex-role responses of homosexual persons that
may or may not be logically derived from a
sex-role theory. Findings that indicate no differences
in sex-role endorsement between
specified groups and published college norms
may have important implications for the theory
or they may be allocated to the “so what”
catch-all box.

The inappropriate application of theory to
a particular measurement instrument, to a
class of behaviors or traits not directly related
to the theory, or to miscellaneous contrast
populations may result in a premature rejection
of hypotheses derived from the theory.
If an investigator uses a sex-role measure to
predict to a wider band of behaviors than
those encompassed by the theory, or to a tangential
class of responses, negative findings
do not negate the theory. For example, an
investigator wishes to test the hypothesis that
androgynous persons are more adaptive than
sex-typed individuals. A sex-role scale and
an anxiety scale are administered; androgynous
males turn out to have higher anxiety
scores than sex-typed males. Does this finding
mean that androgynous males are less, rather
than more, adaptive? And does this finding
point to a hole in the theory? When using a
particular scale, or dependent measure, to
assess predictions from a specific androgyny
theory, it is extremely important to differentiate
among scales (e.g., PAQ, BSRI), the construct
(androgyny), and the theory (adaptability).
In this case, the negative findings do not
necessarily imply a useless theory, a questionable
construct, or an invalid sex-role scale. A
competing hypothesis might suggest that the
higher anxiety of androgynous males reflects
their awareness of their unorthodox match
with culturally prescribed standards for masculine
behavior. The high obtained anxiety
scores for androgynous males may expand our 
information about this sex-role group, but it 
does not invalidate previous findings that an- 
drogynous males are more willing than sex- 
typed males to engage in cross-sex behaviors. 

How can these problems in relating theory 
to design be avoided? In the planning and 
conceptualization stage of sex-role research, 
the investigator might consider the following: 
(a) Relate the present research topic to pre- 
vious findings; (b) state clear experimental 
hypotheses; (c) relate these hypotheses to 
some aspect of sex-role theory; (d) state ex- 
pected outcomes for each hypothesis; and (e) 
indicate the conditions and limitations under 
which the hypotheses might support or fail 
to support the theory. Although these five 
steps may not eradicate all of the problems in 
application of theory to design, they may help 
to prevent unproductive forays and dead-end 
journeys. 


Sex-Role and Gender Distinctions 


Judging from the types of designs and statistical
analyses that appear in current manuscripts,
disagreements exist on how to deal
separately with the effects of sex role and
gender. Here, gender refers to categorical
distinctions between males and females regardless
of their behaviors. Sex role and sex
typing will be applied to cultural expectations
about the attitudes, beliefs, and behaviors
associated with masculinity and femininity.
Constantinople’s (1973) landmark review
presented clear arguments for rejecting gender
differences in response as a criterion for
item selection or scoring procedures for
masculinity-femininity scales. The raison d’être
of the four sex-role scales referenced here was
the measurement of sociocultural sex roles or
traits independent of their distribution by
gender.¹ Although gender and sociocultural
sex roles are not completely independent of
each other, they are by no means isomorphic.
This distinction has been violated or ignored
in at least three major ways.

¹ Two of the four scales under discussion, the
PAQ and Heilbrun’s Adjective Check List (ACL),
have components of gender differentiation built into
the structure of the scale. The PAQ contains a separate
Masculinity-Femininity scale on which items
have been selected by gender discrimination.

First, we would like to know whether the 
functional properties of sex-typed roles or 
traits are similar for males and females. Some 
studies have divided subjects by sex type 
alone, ignoring the Sex Type × Gender interactions
that have appeared in the literature.
Some of the obtained interactions between sex 
types and gender may have distinctive adjus- 
tive implications for males and females. Like- 
wise, cross-sex typing appears to function in 
different ways for the two genders in some 
studies. The early findings on the functional 
relationships between gender and sex roles 
are fragmentary and scattered. Many studies 
fail to isolate enough cross-sex-typed persons 
to analyze, but the specific results are suggestive.
Further research should clearly include
both gender and sex-role considerations.
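
One way to keep both factors in view is to tabulate a cell mean for every Gender × Sex Type combination and then compute a simple difference-of-differences contrast. The sketch below is a hypothetical illustration: the function names, the group labels, and any scores supplied to it are assumptions made for the example, and the contrast is descriptive only, not a significance test.

```python
from collections import defaultdict
from statistics import mean


def cell_means(records):
    """Mean score for every (gender, sex_type) cell.

    `records` is an iterable of (gender, sex_type, score) tuples.
    """
    cells = defaultdict(list)
    for gender, sex_type, score in records:
        cells[(gender, sex_type)].append(score)
    return {cell: mean(values) for cell, values in cells.items()}


def interaction_contrast(means, type_a, type_b):
    """Difference of differences between two sex-role groups.

    Returns (type_a - type_b among males) minus
    (type_a - type_b among females). A value near zero suggests the
    sex-role difference is similar for both genders; a large value
    suggests a Sex Type x Gender interaction worth testing formally.
    """
    male_gap = means[("male", type_a)] - means[("male", type_b)]
    female_gap = means[("female", type_a)] - means[("female", type_b)]
    return male_gap - female_gap
```

Collapsing the same records over gender would report only the pooled sex-type difference and would hide any divergence between the male and female gaps.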

A problem related to that of the Sex Role
× Gender interaction is the failure to specify
the gender distribution of subjects who make 
judgments or attributions of sex-typed behav- 
ior (32 counselors in training were asked to 
judge . . .). In a similar vein is the failure to 
analyze for gender even when it is specified 
and available as an independent variable (36 
counselors, 14 males and 22 females, were 
asked to rate . . .). Surely we are throwing
away much valuable data that should be examined
rather than swept under the rug. In
either instance, the resulting data will not be 
very meaningful. 

¹ The ACL items were originally selected by differentiation
between males who identified with masculine fathers
and females who identified with feminine mothers.

The second violation of the Sex Type ×
Gender dichotomy is the treatment of data
according to gender rather than sex role in
assessing or determining sex roles and traits.
The effects of sex role and gender should be
carefully separated so that we can determine
the contributions of each variable. Consider
the following examples: To demonstrate that
two tasks are sex-role appropriate, the investigator
shows that these tasks discriminate the
choices of males from those of females. A second
example reflects a similar Gender × Sex
Role contamination. Males and females are
assigned to sex-role groups on the basis of a 
median-split procedure computed with sepa- 
rate male and female frequency distributions, 
Similarly, sex-role groups are compared on a 
dependent measure, such as the Minnesota 
Multiphasic Personality Inventory or intro- 
version—extraversion, the T scores for which 
were computed using separate male and female
distributions. In each of these examples,
the implications for the resulting data and 
interpretations may differ. Especially in the 
latter case, this practice may alter the value 
of some of the criterion measures and may dis- 
tort the data in the direction of artificial 
gender differences. These practices may mask 
sex-role differences, or Sex Role × Gender interactions,
or they may have no significant
effects at all. In each case, the problem can 
be resolved by examining the data in both 
ways. Having agreed that sociocultural sex 
roles and gender are not isomorphic, let us 
not throw them both into the same pool so 
that their separate effects become diluted and 
indistinguishable. 
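
Examining the data "in both ways" can be sketched as follows for the median-split case. This is a minimal illustration under stated assumptions: the function names are invented for the example, a score counts as "high" only when it is strictly above the cut point, and the input is a plain dictionary mapping each gender label to a list of scale scores.

```python
from statistics import median


def split_pooled(scores_by_gender):
    """High/low flags using one median from the combined distribution."""
    pooled = [s for scores in scores_by_gender.values() for s in scores]
    cut = median(pooled)
    return {g: [s > cut for s in scores]
            for g, scores in scores_by_gender.items()}


def split_within_gender(scores_by_gender):
    """High/low flags using a separate median for each gender."""
    flags = {}
    for gender, scores in scores_by_gender.items():
        cut = median(scores)
        flags[gender] = [s > cut for s in scores]
    return flags


def disagreements(scores_by_gender):
    """List (gender, score) pairs classified differently by the two methods."""
    pooled = split_pooled(scores_by_gender)
    within = split_within_gender(scores_by_gender)
    changed = []
    for gender, scores in scores_by_gender.items():
        for i, score in enumerate(scores):
            if pooled[gender][i] != within[gender][i]:
                changed.append((gender, score))
    return changed
```

An empty result from `disagreements` would indicate that the choice of distribution made no difference for that sample; a non-empty result identifies the respondents whose group membership depends on it.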

A third confusion concerns the stated purpose
of current sex-role scales. The purpose
of these instruments is to measure sociocul- 
tural sex roles or traits orthogonally and in- 
dependently of gender. The criterion of rela- 
tive independence has been the demonstration 
of a low correlation between the masculinity 
and femininity scales on each measure. With 
some exceptions, each scale meets this criterion.
Now, an investigator repeats the correlation
procedure, supports the scale independence,
and concludes that the test is invalid
because it fails to discriminate males
from females. Or similarly, finding that masculinity
and femininity scales on a measure
of androgyny do differentiate males from females
to some extent, the conclusion is again
reached that the sex-role scales are bipolar
rather than two-dimensional and orthogonal.
In each instance, the purpose of the scales is
misunderstood.

Gender and sex-type confusions can be
avoided primarily by reversing the procedures
above that produced the original contamination.
(a) When using sex role as an independent
variable, include gender as another variable
and analyze for each variable separately. In
addition, look for possible interactions
(providing that the design is appropriate for
interactions to appear). (b) Compute median
splits for assigning sex-role groups by combining
male and female data into a single
distribution. (c) Use raw scores on dependent
variables in place of T scores to avoid criterion
problems. (d) Read relevant references,
such as Constantinople (1973), Bem (1974,
[...]), and Spence et al. (1975), before embarking
on any new sex-role research. (e) A
writing style addition to this topic is in order.
Use a consistent notation to refer to gender
(male, female) or to sex roles (masculinity,
femininity, androgyny). Frequent changes
within an article in notation can leave the [...]

Sampling Procedures


Adequate sampling procedures are critical
for all areas of psychological research. Certain
sampling problems appear particularly relevant
to sex-role research because the sociocultural
definitions of sex types and sex-appropriate
behavior may vary across populations,
generations, and settings. College
sophomores participating in experiments for
course credit present the fewest sampling
challenges; they are available in both genders,
in sufficient numbers, and are relatively
homogeneous on many relevant characteristics.
Restricting all of sex-role research to college
students, unfortunately, leaves us with many
unanswered questions about the generality of
results and the applicability to contrast
populations. Those researchers who brave the real
world to measure other kinds of people face
many more hazards in meeting the requirements
for an adequate sample. Therefore, it
is particularly incumbent on researchers who
leave the college classroom to make certain
that their samples meet at least four requirements:
Make sure that (a) sample characteristics
and selection procedures are recorded
and clearly described, (b) samples are sufficiently
large and (c) representative to allow
generalization to new groups, and (d) the
criteria for selection of contrast populations
are clearly defined. In the absence of any one
of these requirements, the experimental findings
may be difficult or impossible to interpret,
and the results may have little applicability
to other populations. Examples of each
of these may clarify the position.

Insufficient descriptions of sampling procedures
and population characteristics guarantee
that a research design cannot be appropriately
replicated. To relate a set of findings to
a larger body of sex-role research, it is im- 
portant for the reader to be informed of the 
exact N, age or age range, gender, and what- 
ever other demographic characteristics of the 
participants are deemed relevant. For sex- 
role research, information on career, marital, 
and educational status might be important. 
Readers also wish to know how the subjects 
were enlisted and what the conditions were 
under which some subjects dropped out or 
were eliminated from the sample (mortality). 
Some recent examples of inadequate descriptions
include such statements as “20 housewives
returning to school were administered
. . .” or “30 graduate students at University
X in Calcutta were approached and
asked. . . .” These vague sample descriptions
do not help us to accumulate a body of useful 
data from which to generalize either to similar 
or to contrast populations. In cross-cultural 
research, in which the impact of national char- 
acteristics is examined, the appropriate match 
between cross-national samples becomes criti- 
cal. 

A similar caution is appropriate when examining
sex typing in clinical groups such as
male alcoholics or female depressives. Unless
the demographic characteristics of the clinical
population are carefully described and
matched with an appropriate control group,
sex-role differences may be ascribed to the
dynamics of the clinical syndrome that are
probably a function of the restricted or atypical
sample. For example, a researcher who
finds a disproportionate number of feminine-typed
males among alcoholics in a Veterans
Administration hospital may draw conclusions
about sex typing and substance abuse that
are sample related rather than syndrome specific.

Inadequate or unrepresentative samples pre- 
sent two additional problems in drawing gen- 
eralizations from the obtained sex-role data. 
Although statistical requirements may fre- 
quently be met by small sampling procedures, 
application to larger populations is questionable.
For example, suppose the researcher
wishes to examine the effects of counselor 
gender and sex type on counselor and client 
behavior during assertion training. If only one 
counselor from each gender and sex-role cate- 
gory is used, the outcome is particularly vul- 
nerable to unreliable conclusions. 

Sampling problems also appear in the selec- 
tion of control or comparison groups. Re- 
search designs that cut across age ranges, such 
as in developmental comparisons, are singu- 
larly sensitive to differences in cohorts, socio- 
economic class, parental education, cultural 
milieu, and achievement or ability factors. 
Unrepresentative samples are easy to find. 
Take any set of high school juniors and com- 
pare them on a sex-role measure to groups of 
college and graduate students. Developmental 
hypotheses such as age or generation effects 
cannot be adequately tested between these 
sets of subjects unless each relevant external 
factor is accounted for, controlled, or covaried 
statistically. Socioeconomic factors may like- 
wise differentiate an experimental from a nor- 
mal control group. If a researcher wishes to 
examine psychological sex roles of girls with 
unplanned pregnancies who are now living in 
a home for unwed mothers, what is an appro- 
priate comparison population? 

Criteria for the selection of contrast or con- 
trol groups are frequently unclear or gratu- 
itous. The appearance of new ways to measure 
sex typing has encouraged the application of 
these scales to diverse populations that vary 
along unknown dimensions. Certainly, when
contrast groups are used to demonstrate the
discriminant validity of a sex-role measure,
considerable support should be marshaled for
assuming that these groups differ in sex-type
orientation. Are all women who choose not to
work feminine sex typed and all policemen
masculine sex typed? If a validation study
finds no difference in sex-role endorsement
between these two groups, where does the fault
lie? Is the test not valid, or are the groups
unrepresentative of each sex type?

These comments are not meant to [...] the
difficulties in obtaining appropriate [...]
groups for various populations. Simple
suggestions about how to avoid sampling
problems will necessarily leave many issues
untouched. However, consistent
attention at least to the four criteria outlined
above would be a welcome start.


Construct and Test Validation Procedures 


Four major criteria for test and construct
validity appear in the research literature: (a)
self-report; (b) interpersonal perception and
attribution; (c) task and situational performance;
and (d) contrast populations. Since contrast
populations were considered in the previous
section, they will not be covered again.
For purposes of brevity, both construct and
test validation procedures will be considered
together. However, it is important to point
out that scale and construct validation do not
always involve converging procedures (Fiske,
1971; Wiggins, 1973). Establishing the validity
of the BSRI should be a less exhaustive
procedure than considering the theoretical and
conceptual domains encompassed by the construct
of androgyny. For example, the interscale
correlations among the four sex-role
scales discussed above range from .61 to .8[...]
for masculinity and from .51 to .73 for femininity
(Kelly, Furman, & Young, in press).
These correlations shrink by 52%-58% when
the same correlations are computed for
dichotomized scores (above and below the
median). Although each scale is designed to
measure some aspect of androgyny, it is clear
that they overlap only partially and may be
assessing somewhat differing conceptual and
content domains. Consequently, the predictive
validity of each scale needs to be assessed
individually and independently of the scale
validity of the remaining measures. Nevertheless,
the following discussion will cover criterion
variables as they apply both to the validity
of specific tests and to the construct validity
of androgyny, recognizing that some comments
may apply differentially to each.
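
The loss of correlation under dichotomization can be illustrated with a short numerical sketch. It assumes, purely for illustration, that two continuous scale scores are bivariate normal with correlation r; under that assumption the correlation between their median-split versions is (2/pi) * arcsin(r). The function names are hypothetical, and the idealized value need not match the empirical shrinkage reported above, since real scale scores may depart from normality.

```python
import math


def dichotomized_correlation(r: float) -> float:
    """Expected phi coefficient after median-splitting both variables.

    Assumption (not from the article): the continuous scores are
    bivariate normal with correlation r.
    """
    return (2.0 / math.pi) * math.asin(r)


def percent_shrinkage(r: float) -> float:
    """Percentage by which the correlation drops after the median split."""
    return 100.0 * (1.0 - dichotomized_correlation(r) / r)


if __name__ == "__main__":
    # .61 is the lower masculinity interscale correlation quoted in the text.
    r = 0.61
    print(round(dichotomized_correlation(r), 3))
    print(round(percent_shrinkage(r), 1))
```

Even in this idealized case a median split discards a substantial share of the shared variance, which is one reason to compare continuous and dichotomized scoring before interpreting scale overlap.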

Self-report studies are by far the most popular
method of demonstrating test or construct
validity for androgyny. Self-report is
obviously quick and economical, and it has
face validity. Self-report methods base their
strength on the belief that persons are their
own best observers (Mischel, 1968; Mischel,
Note 2). Problems arise with self-report
measures in androgyny research when the
scales are (a) insufficiently described; (b)
marginally related to scales previously used
to measure a similar construct; (c) dubiously
related to the construct of androgyny or to
the domain of sex typing; (d) unpublished,
unvalidated, and possibly unreliable; or (e)
in violation of the separation of gender and
sex typing. When any of these shortcomings
appear, questions can be raised about the
significance of the data reported using self-report
measures. Scales with unknown psychometric
properties are particularly vulnerable when
negative results are obtained. If the investigator
finds no relationship between androgyny
scores on the PAQ and “a scale especially
conceived to measure openness to experience,” is
androgyny theory at fault, is the PAQ not
predictive of “openness,” or do we have an
invalid criterion? A further problem in
restricting the measurement of dependent
variables to self-report is the introduction of
substantial “method variance” (Campbell &
Fiske, 1959). For example, to what extent
does the obtained positive correlation between
masculine or androgynous sex types and
measures of self-esteem reflect the tendency of
these people to endorse a high degree of many
positive traits?

Despite their shortcomings under certain
circumstances, self-report measures of dependent
variables are not likely to disappear
from use. Therefore, investigators who use
such scales can refine their approach by
considering one or all of the following: (a)
give preference to scales that have been used
[...] in similar research; (b) describe
the scales used in sufficient detail to inform
the reader of relevant psychometric characteristics
of the scale and its previous use in similar
contexts; (c) provide a theory or rationale
for their selection of these particular scales,
including their relationship to other scales that have
been used to measure the same construct
(e.g., feminism or need for achievement);
and (d) develop directional hypotheses about
the [...] relationship between measures
of androgyny and the variables presumably
[...] other self-report measures. Progress
in [...] or theory depends on a body of reliable
and comparable research. New data are
meaningless unless they can be fitted into a
matrix of accumulated knowledge. When
divergencies do occur, it is the task of the
researcher to support the use of particular
measures and to indicate their relationship to
existing methods.

Interpersonal ratings and attributions repre- 
sent a second area of validation research. 
Studies using this approach generally fall into 
one of two categories: (a) Subjects rate 
selected others on sex-typed characteristics to 
obtain stereotypes or (b) clinical analogues 
are constructed in which subjects evaluate 
written, visual, or auditory material about 
standard stimulus persons who appear in sev- 
eral sex-typed roles. The most popular varia- 
tion of stereotype ratings has been to attempt 
to replicate the study by Broverman, Brover- 
man, Clarkson, Rosenkrantz, and Vogel 
(1970), mainly to demonstrate negative out- 
comes. A common difficulty in replication of 
sex-role stereotype studies is neglect to repeat 
exactly the procedures and instructions used 
by the earlier studies. For example, in a single 
study, researchers change the target person 
from adult to child, the raters from clinicians 
to child-care workers, the instructions from 
“healthy” to “ideal,” the scoring procedures 
from seven to two rating categories, and they 
shorten the number of items in the scale. Al- 
though this type of study may have utility 
for some purposes, it is clearly not a proper 
replication of Broverman et al. 

A more appropriate application of replication
procedures might be systematically to
vary each factor that could contribute to the
total variance in stereotyped attributions. In
this manner, perhaps the researcher could
substantiate a hypothesis that previous results
are a function of instructions, item format,
scoring criteria, or a sampled population. For
example, should a researcher seriously wish to
examine stereotyping in differing populations,
it would be useful to obtain more than one
contrast group. Thus, to show that teachers
do or do not hold these stereotypes is probably
not exciting enough to make a worm turn
these days. However, it might be interesting
to determine stereotyping differences among
groups that vary along a particular dimension
such as education, employment, administrative
power, and so forth, especially when
the chosen dimension has implications for
institutional or administrative change.

Clinical analogues in which standard stimulus
persons are given diagnoses, labels, or pre-
scriptions for behavior change have crept into 
sex-role research. Since analysis of analogue 
research is beyond the scope of this article, 
comments will be limited to sex-role implica- 
tions. A typical study runs like this: Subjects, 
usually undergraduates, are given a standard 
personality description and are asked to pro- 
vide a diagnosis, recommend treatment, or 
make predictions about marital adjustment. 
Sex roles enter into the design when stimulus 
persons vary on gender and sex-role-appropri- 
ate or sex-role-inappropriate behaviors. Sup- 
pose the outcome of this design shows that 
male and female stimulus persons who use 
sex-role-reversed behaviors are judged as more 
seriously ill, more in need of treatment, and 
less likely to be happy in marriage. What are 
the problems here? First, generalization about 
current diagnostic and treatment practices 
from college sophomores to practicing clin- 
icians is questionable. Do judgments on a two- 
paragraph written description replicate re- 
ferral behavior in clinical situations in which 
multiple sources of information are available 
and when economics and family relationships 
are involved? Second, sex-role implications 
for each stimulus person may be unequal. 
When John runs sobbing into the bedroom 
after a marital quarrel, is this equal in social 
value to Mary who tuned up the family car 
in preference to going out for lunch with her 
friends? Problems in task equivalence appear 
in performance studies as well as in attribu- 
tion or analogue studies. In either case, it is 
necessary to equate, by means of observer 
agreement, the sex-role relevance of the task 
or situation and its social value (degree of 
positive or negative valence). In the absence
of this information, we cannot conclude that
sex-role inversions lead to dire social
consequences.

Two related areas of attribution research
involving sex roles center on achievement
variables: fear of success and achievement
attributions. The issues related to these
topics are far too complex to consider
here, but a number of recent reviews are
available (see Frieze, 1975; Stein & Bailey, 1973;
Tresemer, 1976; Zuckerman & Wheeler,
1975). It should be noted, however, that many
of the same problems occur in all of these
attribution areas: task variables, expectancy
for and the value of success, and external
validity.

Task performance and behavior samples 
represent the most direct external validation 
procedures in sex-role research. Aside from 
what people tell us they do or prefer, or what 
others judge them to be, how do sex-typed or 
androgynous persons actually behave in sample
situations? Although there are many problems
involved in making behavioral predictions
from self-reported characteristics (Fiske,
1971; Mischel, 1968; Spence & Helmreich,
1978; Mischel, Note 2; Spence, Note 3),
behavior samples are essential to sex-role
validation research. Androgynous orientations
should predict greater adaptive or effective
behavior across situations designed to elicit
behaviors in the instrumental or expressive
domains (Bem, 1974, 1975). Persons who
describe themselves in few sex-typed terms
might be expected to show behavioral deficits
equally across sex-role correlated situations
(Kelly, O’Brien, Phillips, Hosford, &
Kinsinger, Note 4). Individuals who score high
on either instrumental or expressive domains 
might be expected to show competence on sex- 
typed tasks relevant to expressive and instru- 
mental domains and to exhibit behavioral in- 
hibition or deficits on similar sex-reversed 
tasks. 

Clearly, the conceptual and experimental
challenge here is to construct tasks and
situations that match the theoretical predictions.
The two most common problems appear to
be (a) criterion or construct validity: Is the
selected situation congruent with the
construct? Does the task represent a reasonable
and logical extension of the construct domain
to be examined (e.g., conformity,
competitiveness, nurturance, risk taking)? and (b)
sex-role validity: Does the task represent a logical
sample of sex-role behavior within the domains
assessed by the particular sex-typing
measure? Issues in the general criterion
validity of tasks are common to all research
(Fiske, 1971) and will be omitted here.
Sex-role validity of tasks has been considered only
recently in the literature (Stein & Bailey,
1973) and is still problematic in current
research efforts. One conceptual challenge
concerning task selection is to delineate the
probable domains of observable behaviors
encompassed by the instrumental and expressive
dimensions. To my knowledge, none of the
current sex-role scales was designed to measure
all of the traits that discriminate males
from females or to differentiate all sex-typed
cultural traits. When predictions to external
behavioral situations extend beyond the
conceptual network of the assessment instrument,
[...] or negligible results can be expected.

The experimental challenge involves, among
other things, careful attention to task selection
for the behavioral validation of sex-role
measures. Considerations should include
adequate pretesting of tasks for sex-role
endorsement; the value of the tasks for the
participants; and the expectancy for success on the
tasks, equated across sex-role or gender
groups. In many studies, no information is
provided on how the tasks were chosen. The
resulting tasks may be gender typed, sex role
typed, or only tangentially related to either
of these variables. For example, when examining
sex typing on the choice of luck or skill
games, an electronic ball game may not be the
most neutral task. It probably elicits both
value and expectancy differences among
[...]. Any conclusions that are drawn about
tendencies of sex-typed or androgynous persons
to prefer skill or luck tasks are immediately
biased by the nature of the activity.

If current sex-role measures are to receive a
fair trial, pretest procedures should clearly
differentiate tasks according to a sociocultural
sex-role definition (what is more desirable,
typical, appropriate, etc.), rather than
a gender discrimination (what differentiates
males from females). Although gender selection
may be used to demonstrate gender typing
once the tasks are chosen, it seems
appropriate to restrict the final task selection to
a [...] judgment criterion. Comparisons
between these two procedures may show that
the basis on which any set of tasks is chosen
has no appreciable effect on the data. However,
this comparison is seldom attempted.
[...] many researchers continue to confound
gender and sex roles in design and
task selection, thereby confusing and masking
the contributions of each.

A summary of current practices in the validation
of sex-role measures reveals a number
of design problems that may render the re- 
sults uninterpretable. In all three areas re- 
viewed here—self-report, attributions, and be- 
havior samples—procedural difficulties appear 
that are potentially avoidable. 


Psychometric and Statistical Considerations 


Discrepant approaches to data analysis are 
not limited to sex-role research. However, cer- 
tain procedures are sufficiently problematic 
and common that they merit a brief commentary.
Two topics will be considered: scoring
for androgyny and statistical treatment of
data. 

Scoring procedures that are problematic fall
into three categories: (a) failure to report
how scales were scored or to report mean scale
scores (A study reports the number of sex-typed
and androgynous subjects but does not
explain how these categories were obtained.);
(b) idiosyncratic scoring systems that do not
demonstrate the effects of these changes on
the data (Researcher figures up a new way
to calculate androgyny and reports only
resulting data, without comparison to previous
scoring criteria.); (c) misuse of androgyny
scores indicating some misunderstanding of
their meaning. (Study reports that one group
was twice as androgynous as the other, based
on the absolute differences between the
androgyny scores.) Any one of these scoring
practices prevents the reader from relating the
present research to previous findings. New
scoring procedures may be extremely useful,
but their utility should be demonstrated by
means of data-impact comparisons between
the new and the traditional methods.

Statistical procedures deserve more than a
passing nod. However, since many sex-role
studies attempt to demonstrate group
differences, a word of caution is justified. By far the
most common annoyance is data analysis that
treats each dependent variable separately,
using individual t tests, chi-squares, one-way
analyses of variance, or columns of single
correlations. Aside from the opportunity to
capitalize on chance variations, many dependent
variables in personality research tend to be
correlated. Single tests of significance cannot
correct for the common variance in a group
of correlated measures. Many research studies
would benefit from the use of multivariate
procedures, and readers would benefit by
encountering fewer dramatic results that fail to
hold up under cross-validation.
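
The opportunity to capitalize on chance can be made concrete with simple arithmetic. The sketch below assumes k independent tests, each run at the same alpha level; correlated dependent variables violate that assumption, so the figure is a rough guide only. The Bonferroni adjustment shown is a simple stand-in for illustration, not the multivariate procedure recommended above, and the function names are hypothetical.

```python
def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Chance of at least one false positive across n independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_alpha(alpha: float, n_tests: int) -> float:
    """Per-test alpha that keeps the familywise rate at or below alpha."""
    return alpha / n_tests


if __name__ == "__main__":
    # Ten separate t tests at the conventional .05 level.
    print(round(familywise_error_rate(0.05, 10), 3))
    print(bonferroni_alpha(0.05, 10))
```

With ten separate tests at the .05 level, the chance of at least one spurious "significant" difference is roughly .40 under independence, which helps explain why isolated dramatic results often fail to cross-validate.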


Conclusions 


The wealth of current research designed to 
evaluate the correlates and consequences of 
androgyny has been beset with multiple meth- 
odological problems. Many of the design and 
procedural difficulties that appear are com- 
mon to other areas of virgin research. How- 
ever, the current orthogonal conceptualization 
of sociocultural sex roles and their measure- 
ment by specific scales that meet this assump- 
tion introduce a number of methodological is- 
sues that are specific to androgyny investiga- 
tions. As a function of Bem’s unveiling of 
androgyny as a model of effective psychologi- 
cal functioning, validation studies have cen- 
tered on the adjustment advantages of a rela- 
tively balanced, as compared to a sex-typed, 
sex-role endorsement. The resulting research 
efforts have produced some very creative solu- 
tions to the challenges of construct validation, 
as well as some repetitive and inconclusive 
approaches.

At the risk of oversimplification, one major 
design and conceptual problem centers around 
a diffusion and confusion between gender 
(male, female) and sex-role endorsement 
(masculine, feminine, androgynous). The
distinction between these two variables is
frequently ignored in the development of
hypotheses, tasks, and tests. Even though Bem
and her associates (Bem, 1975, 1976) have
been careful to make this distinction clear,
many others have not. The position was taken
here that all components of androgyny
research should differentiate cultural sex roles
from gender typing. This consideration
implies that theory, measurement of both
independent and dependent variables, as well as
reporting procedures, would benefit by
respecting the sex role/gender distinction. Only
in this manner can we tease out the relative
contributions of sex-role advantages or
drawbacks to males and females separately.

The limited research data accumulated so
far suggest that sex roles emphasizing either
instrumental, expressive, or androgynous
orientations may have differing implications for
the psychological well-being of males and
females in American culture (Bem & Lenney,
1976; Berzins et al., 1978; Jones et al., 1978;
Kelly et al., 1977; Wiggins & Holzmuller,
1978). Although these group differences tend
to favor the presence of moderately high
masculinity scores (either alone or in a combined
androgyny score), the adjustment advantages
of masculinity are by no means universal and
are tempered by the situation, the task, the
measures used, and the gender of the
experimenter. In a recent series of studies across a
diversity of situations, Jones et al. (1978)
found a number of Gender × Sex Role
interactions, with androgynous males showing
more adjustive problems and both androgynous
and masculine-typed females appearing
more adaptive. The implication of these
diverse findings is clearly to direct attention to
refinements in research design, a specification
of limiting conditions under which particular
results can be expected to occur, and a
continued careful distinction at all levels of
research between sex-role and gender effects.

A second major issue relates to the
psychometric definition of androgyny. At the present
time, considerable disagreement exists
concerning the appropriate method for translating
raw scores on current sex-role scales into
a predictive metric that is both statistically
sound and psychologically meaningful. Two
major scoring procedures are currently in use:
Bem’s original t ratio, which emphasizes an
intraindividual subtractive balance definition
of androgyny (Bem, 1974), and a median-split
procedure, which includes the absolute
numbers of masculinity and femininity item
endorsements within a specified sample
(Spence et al., 1975). Comparison of data
using either of these two procedures suggests
that under some conditions the two scoring
methods produce equivalent results, and under
other conditions the additive method yields
new data (Bem, 1977; Berzins et al., 1978;
Jones et al., 1978; Wiggins & Holzmuller,
1978). Some researchers have combined the
two scoring approaches to produce a [...]
metric based on both the relative and absolute
scores (Babl, in press; Heilbrun & [...],
Note 5). Still others have suggested the
use of multiple regression analysis to take into
account the full range of scores on both
masculinity and femininity scales (Kelly et al., in
press; Wakefield, Sasek, Friedman, & Bowden,
1976). With linear data, a multiple
regression design might overcome some of the
measurement limitations of current scoring
typologies. However, there do not appear to
be any published data that examine the
empirical implications of regression analysis for
androgyny research. It is clear that some
gains and losses accrue to each scoring
procedure, and the issue is by no means closed.
Hopefully, accumulated data will suggest
which method yields a useful scoring
procedure that will enable research from different
laboratories to be compared. In the interim,
it is suggested here that researchers specify
carefully their scoring assumptions and
procedures.
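
A small sketch may help show why the two scoring approaches can classify the same respondent differently. It is illustrative only: `balance_categories` uses a plain femininity-minus-masculinity difference with a hypothetical cutoff as a simplified stand-in for the subtractive balance idea (it is not Bem's t ratio), and `median_split_categories` uses the conventional four-way labels associated with median-split scoring. The scores are invented for the example.

```python
from statistics import median


def median_split_categories(masc, fem):
    """Classify by position above or below the sample medians (absolute level)."""
    m_med, f_med = median(masc), median(fem)
    labels = []
    for m, f in zip(masc, fem):
        high_m, high_f = m > m_med, f > f_med
        if high_m and high_f:
            labels.append("androgynous")
        elif high_m:
            labels.append("masculine")
        elif high_f:
            labels.append("feminine")
        else:
            labels.append("undifferentiated")
    return labels


def balance_categories(masc, fem, cutoff=0.5):
    """Classify by the within-person difference (hypothetical cutoff)."""
    labels = []
    for m, f in zip(masc, fem):
        diff = f - m
        if abs(diff) <= cutoff:
            labels.append("androgynous")
        elif diff > 0:
            labels.append("feminine")
        else:
            labels.append("masculine")
    return labels


# Invented mean item ratings for four respondents.
masc = [5.5, 5.4, 3.0, 3.1]
fem = [5.6, 3.0, 5.5, 3.2]

if __name__ == "__main__":
    print(median_split_categories(masc, fem))
    print(balance_categories(masc, fem))
```

In this toy sample the fourth respondent, who is low on both scales, counts as androgynous under the balance rule but falls into a separate low-low group under the median split. That is the kind of discrepancy that makes explicit reporting of scoring assumptions necessary.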

Regardless of the scoring scheme adopted,
differences in measuring instruments will also
contribute to variations in the androgyny
validation theme. Although all four of the present
scales assessing orthogonal sex roles and traits
share a common variance, they are by no
means interchangeable. It has been pointed
out elsewhere (Kelly & Worell, 1977; Spence
& Helmreich, 1978) that these scales all vary
in assumptions about androgyny, item selection
(origins of item pool, instructions to item
raters, criteria for item inclusions, etc.), item
content and format, and instructions to
respondents (yes-no ratings on bipolar or
unipolar choices). In addition, the masculinity
and femininity scales show varying correlations
with corresponding scales on each of
the other tests, although the associations tend
to be moderate to high. Moreover, when the
same respondents fill out all four sex-role
inventories, a large majority (61% when
corrections are made for chance agreements) are
categorized discrepantly by any pair of scales
(Kelly et al., in press).
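
Agreement between two scales' categorizations can be corrected for chance in several ways. The sketch below uses Cohen's kappa, one common index; it is an assumption for illustration and not necessarily the correction applied in the study cited above. The labels and function name are hypothetical.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two categorizations of the same people.

    Undefined (division by zero) when expected agreement equals 1, for
    example when both scales place everyone in a single category.
    """
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    categories = set(count_a) | set(count_b)
    # Agreement expected if the two scales assigned categories independently.
    expected = sum(count_a[c] * count_b[c] for c in categories) / (n * n)
    return (observed - expected) / (1.0 - expected)


# Invented categorizations of six respondents by two sex-role scales
# (A = androgynous, M = masculine, F = feminine, U = undifferentiated).
scale_1 = ["A", "A", "M", "F", "U", "U"]
scale_2 = ["A", "M", "M", "F", "U", "A"]

if __name__ == "__main__":
    print(round(cohens_kappa(scale_1, scale_2), 3))
```

Here raw agreement is 4 of 6, but a quarter of the cases would be expected to agree by chance alone, so the corrected index is noticeably lower than the raw proportion.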

What these differences in scale characteristics
and concordance imply for androgyny
research is that generalizations concerning
adjustment and psychological well-being should
be confined to the particular instrument used
to assess sex roles and the particular scoring
methods adopted. None of the present scales
is sufficiently calibrated so that predictions
from one scale can be transferred to another
instrument. A recent study using matched 
predictions from the BSRI and the PRF 
ANDRO scales showed very little overlap on 
a set of dependent variables (Hicks, 1977). 

A final issue concerns the extent to which 
the characteristics measured by any of these 
current sex-role scales reflect unitary traits or 
dispositions that are predictive of a wide 
range of behaviors, attitudes, and life-style 
choices. The limited data accumulated thus 
far strongly suggest that endorsement of one’s 
typical degree of instrumental and expressive 
characteristics is not necessarily predictive of 
any or all sex-role and gender-correlated be- 
havior. It should come as no surprise to any- 
one familiar with personality assessment that 
behavioral prediction from self-described per- 
sonality traits is constrained by what one is 
willing and able to do in any particular situa- 
tion. Self-attributions will thus interact situa- 
tionally for each person with the value of an 
activity, expectancies for success, fear of fail- 
ure, desires to please self and others, and real 
or perceived behavioral skill. Specific role-de- 
termined behaviors may not coexist with trait 
descriptions, because they vary on one or all
of these limiting dimensions. 

Of particular interest to androgyny traits 
is the question of the relative contribution of 
social competency to performance on sex- 
typed tasks. The behavioral activities selected 
to test androgyny hypotheses have included 
tasks that involve trait-related skills (such as 
positive and negative assertiveness), as well
as role behaviors that require no skill but that 
may be differentially acceptable or aversive 
(e.g., diapering a baby, playing with a kitten).
Future research should attempt to tease out 
the extent to which interpersonal skills and 
the positive and aversive stimulus values of 
criterion tasks interact with measures of sex- 
typed orientations. Although it is apparent 
that orthogonal sex-role measures are not in- 
discriminately predictive of varying indices of 
psychological well-being, direct behavioral 
tests must continue to be important criteria 
of what people are willing and able to do. As 
Mischel has aptly pointed out (Mischel, Note 
2), situation-free personality variables hold
little interest or utility for any psychologist. 


Some Problems in Community Program Evaluation Research


Emory L. Cowen 
University of Rochester 


Realities of the community context militate against good program evaluation 
research. Many limiting factors in such research stem from a clash in values 
between those who must deliver and those who must evaluate community ser- 
vices. Detailed consideration is given to several clusters of difficulties that plague 
community program evaluation studies, including (a) sources of data bias, (b)
issues of design, (c) problems in the choice and use of criteria, and (d) prob- 
lems of experimental control. Although community program evaluation studies 
can surely be improved, it is unlikely that the purity of antiseptic, laboratory 
research will ever be attained. Ultimate conclusions about the effectiveness of 
community service programs may thus have to come about slowly and cumulatively,
based on convergent findings from many individual less-than-ideal outcome
studies.


The author is well qualified to write about 
research errors in community mental health 
because he has, personally, committed vir- 
tually every one of them. Had the editor told 
authors to start their article with a pithy 
folk wisdom that captured its essence, mine 
would have been: “Do as I say, not as I do!” 
This article is intended as a straightforward 
consideration of several hazards that plague 
community program evaluation research. It is 
not designed to lecture or to pontificate. If 
the words come through as “holier than thou,” 
it will reflect a serious communication failure.

Starting this article as I did was dictated 
by other than modesty—or even masochism. 
It was to suggest that some problems of com- 
munity research are so intrinsic to the nature 
of the beast that they are very difficult to 
surmount. As stated elsewhere (Cowen, 
Lorion, & Dorr, 1974), the choice for investi- 
gators in this field is often between doing far 
less than ideal research or no research at all. 

With that as preamble, a first practical task is to sharpen the article’s
focus, something easier said than done. Terms such as community research or
research in community psychology or mental health are broad and amorphous
(Cowen, 1973). In an earlier article, I suggested that although clinical
psychology, community mental health, and community psychology share common
prime concerns with people’s adjustment, adaptation, security, happiness,
self-concept, that is, their well-being, the nature and timing of their
defining practices differ radically (Cowen, 1977). Thus, clinical psychology,
as well as psychiatry and social work, has traditionally used repair
strategies such as psychotherapy addressed to already evident, crystallized
problems. Community mental health’s (CMH) roots lie in keenly felt
dissatisfactions with the effectiveness of classic mental health repair
systems. CMH does not abandon the casualty-repair orientation; rather, it
directs its efforts toward the perceived insufficiencies of past traditional
approaches. The thrusts of the CMH movement are to identify problems earlier,
in more natural settings (e.g., schools), and to use more flexible, hopefully
more realistic approaches sometimes carried out by nontraditional help
agents. The real importance of community in CMH is that it harbors settings
and contexts that make it easier to do those things. Community psychology,
sharing the same goals as traditional or CMH approaches, departs sharply from
them in its strategies. It is mass oriented rather than individual oriented,
and it seeks to build health from the start rather than to repair.

The preceding account is grossly oversimplified. It extracts distilled
essences of pure models that are blurred and muddied, in nature, at many
overlapping points. Moreover, the approaches are depicted unidimensionally
when, in fact, each consists of an agglomerate of strategies, techniques and,
ultimately, research terrains. Bloom (1977) documented that complexity for
CMH, identifying 10 important ways in which such approaches differ from past
traditional mental health practices. Of special interest to this discussion
is Bloom’s opening definition of CMH as “all activities undertaken in the
community in the name of mental health” (1977, p. 49). In that vein, he cites
as a first feature, which distinguishes CMH from traditional clinical
activities, the fact that the former are based on practice in the community.
Several of Bloom’s later key discriminanda (e.g., emphases on early service
delivery, indirect services, use of nontraditional manpower) are natural
derivatives of the community locus of CMH programs.

Important as the preceding structural emphases are, they do not yet begin to
identify CMH’s substantive complexities. The latter can be illustrated simply
by noting several of the field’s active current areas of programming and
research: needs assessment surveys; alcoholism and drug-abuse programs;
mental health consultation by, and for, diverse groups; varied types of
crisis intervention programs; selection, training, and performance of
nontraditional help agents; informal help-giving processes, natural
caregivers, and community support networks; early detection and intervention;
and alternative service delivery modes for inner-city and rural folk.

A brief account such as this could not encompass the myriad of research
problems of the foregoing areas even if the author had (which he does not)
the expertise to do so. That limitation, however, is more merciful than
tragic. Though CMH evaluation research has clear defining qualities, it is
not, thank goodness, a world unto itself with a totally unique technology,
methodology, and modus operandi. Its basic problems are related to those of
other major areas of outcome research (e.g., evaluation of psychotherapy or
educational programs).

The full array of research problems in- 
cludes those of designing, conducting, and in- 
terpreting studies. The greatest commonality 
in such problems between outcome research in 
CMH and other areas—and thus the topic 
least pursued in this article—is error in in- 
terpreting findings. Thus, overinterpreting 
chance findings (e.g., the “meaning” of five Fs
out of 100 significant at p < .05); confusing
statistical significance with meaningfulness in
interpreting correlation coefficients (e.g., r =
.07, significant at p < .05, because N = 1,000);
capitalizing on chance to develop a regression 
equation that is not cross-validated; or over- 
reacting to a significant chi-square based on 
small expected cell frequencies, unmodified by 
Yates’s correction, are common errors of data 
interpretation in many substantive areas—not 
just CMH. 
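
The interpretive errors just listed can be checked with a few lines of arithmetic. The Python sketch below is an illustration only and is not part of the original analysis. The function names, the assumption that the 100 F tests are independent, the normal approximation used for the correlation test, and the 2 x 2 table of counts are all hypothetical choices made for the example.

```python
import math


def prob_at_least_k_significant(n_tests, k, alpha=0.05):
    """Chance of k or more 'significant' results among n independent tests
    when every null hypothesis is true (upper tail of a binomial)."""
    below_k = sum(
        math.comb(n_tests, i) * alpha**i * (1 - alpha) ** (n_tests - i)
        for i in range(k)
    )
    return 1 - below_k


def correlation_p_value(r, n):
    """Approximate two-tailed p for a correlation r based on n cases.
    Uses the normal approximation to the t statistic (fine for large n)."""
    t = r * math.sqrt(n - 2) / math.sqrt(1 - r**2)
    return math.erfc(abs(t) / math.sqrt(2))


def chi_square_2x2(a, b, c, d, yates=False):
    """Chi-square (1 df) and its p value for the 2 x 2 table [[a, b], [c, d]].
    With yates=True the continuity correction is applied."""
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        diff = max(diff - n / 2, 0)
    chi2 = n * diff**2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 df, P(chi-square > x) equals P(|Z| > sqrt(x)).
    return chi2, math.erfc(math.sqrt(chi2 / 2))


if __name__ == "__main__":
    # Five or more of 100 tests "significant" at .05 purely by chance.
    print(prob_at_least_k_significant(100, 5))
    # r = .07 with N = 1,000: p value and proportion of variance explained.
    print(correlation_p_value(0.07, 1000), 0.07**2)
    # Hypothetical table with expected cell frequencies of 4.5 and 5.5.
    print(chi_square_2x2(7, 2, 3, 8))
    print(chi_square_2x2(7, 2, 3, 8, yates=True))
```

Under these assumptions, five or more "significant" Fs out of 100 turn up by chance more often than not; r = .07 clears the .05 level while accounting for less than half of 1% of the variance; and the hypothetical chi-square is significant at .05 only when Yates's correction is omitted.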

A key thesis of this article is that certain 
errors in designing and conducting CMH pro- 
gram evaluation studies flow naturally from 
the special hazards of doing research in the 
community. The latter include (a) the low 
priority that program evaluation research may 
have in an agency’s hierarchy of values; (b) 
the possibility that the researcher may be, or be seen as, a foreign
body to the system; (c) the threat that evaluation research poses for a
program’s funding or personnel; (d) difficulties in gaining entry into
community systems; (e) the [...] longitudinal programs impose; (f) [...]
review bodies; [...] human rights and the invasion of privacy.
Specific research problems considered later in 
this article can stem, directly or indirectly, 
from such qualities of community contexts. 
Accordingly, many gut problems of commu- 
nity research occur less because investigators 
do not “know any better” and more because 
reality keeps them from doing any better. 

One key “common denominator” obstacle 
to sound community evaluation studies is that 
the researcher’s and agency’s goals frequently
work at cross-purposes. Schools are for teach-
ing; courts are for meting out justice; com- 
munity mental health center (CMHC) clinics 
are for working with patients, and hospital 
wards are to care for the sick. Community 
agencies share the mandate of bringing needed 
services to people. Their first concern must be 
with the extent and quality of those services. 
Their prime goal is to insure delivery of op- 
timal services. The evaluator’s first allegiance, 
by contrast, is to sound design, methodology, 
and instrumentation, which are so important 
to the defensibility of a study’s conclusions. 
These perspectives can, and do, clash. The 
real rub comes when program personnel see 
the design of a prospective study as encroach- 
ing on, or restricting, services, and the re- 
searcher sees service pressures as a factor that 
will corrupt the study’s integrity. To compli- 
cate the matter further, some service personnel 
see program evaluations as personal evalua- 
tions and, therefore, as threatening. Others, 
not sympathetic to research in the first place, 
or who feel harassed by heavy job pressures, 
resent the extra burdens that an evaluation 
study places on them. 

Whether or not a just God in heaven might judge such perceptions to be
warranted is not the critical issue. Because they are real to beholders, they
result in very real behaviors such as lack of cooperation, anger, passive
aggression, and delayed and/or careless completion of forms in ways that
limit their validity and interpretability. There are few guaranteed solutions
for such problems. Involving program personnel in planning studies, taking
time to explain the purposes and significance of the research to them, using
maximally parsimonious measures that, insofar as possible, are relevant to
the respondents’ job turf, and feeding back findings in usable forms are
steps that can strengthen program evaluation studies. But even so, the
differing needs and values of program workers and evaluators remain as
background noise that often works against sound research.

That less-than-idyllic backdrop frames consideration of the specific problems
in CMH program evaluation research that follow. Although [...] neatly into
pigeonholes [...] broad categories such as [...] design bias, choice of
criteria, and the misuse of controls clearly are chronic offenders.


Data Bias 


In the past several decades, psychological research in general has become
more sensitive to, and sophisticated about, sources of data bias. Observer
judgments are highly subject to stylistic inputs such as social desirability
responding, “yeasaying” or “naysaying,” excessive use or avoidance of extreme
ratings, and halo effects, any of which can be confounded with substantive
variables that an instrument purports to assess. Although ways of instrument
construction and usage have been developed that minimize stylistic confounds
(or, at least, identify them after the fact), the corrosive effects of those
variables still plague community program evaluation studies. This is so
because such studies often depend heavily on the judgments of service
recipients and providers as prime data sources.

Few would disagree with the assertion that a client’s view of how he/she has
done in a program is one relevant way to evaluate that program’s
effectiveness. But there are many reasons why it is misleading to use such
data as the only way of evaluating a program (Bloom, 1972). For brevity’s
sake, the issue can be put in concrete, caricatured form. When a program
ends, the male client is asked, in 20 different guises: “So, how’d you do?”
His response, also in 20 ways, is “Terrific!” Problem No. 1: Did he respond
that way because that’s how he feels or because he senses that that’s what
the experimenter wants to hear? Problem No. 2: If he truly does feel better,
is it due to the program or because he just struck oil or won the Irish
Sweepstakes? Problem No. 3: If indeed he does feel better because of the
program, has his behavior changed in a parallel way?

Service providers are also relevant (indeed, important) data sources in
program evaluation, because they are uniquely familiar with clients and their
everyday behaviors. But they too are sources of potentially serious bias.
They can bias a study, obviously, by not completing essential forms, by
completing them carelessly, or by submitting them late. More subtly, their
responses to questions about client behavior and how it has changed can be
shaped by their stake in, and cathexes to, a program, that is, whether they
believe that they are really being asked to evaluate the client’s behavior or
their own effectiveness; whether they see the program as theirs or others’
and, if the latter, whether they like or dislike those people.

The dilemma is clear. On the one hand, people who staff a program and/or
receive its services have observations and information that are highly
pertinent to its evaluation. But, for whatever reasons (many extraneous to
the program’s content and thrust), they do not always respond in those terms.
The interpretability and ultimate contribution of rater judgments may be
increased by broadening the observational report bases. There is more danger
of bias, pro or con, with only one versus several judges or one versus
several perspectives (e.g., self, therapist, job, home). If observers with
different stakes and perspectives agree about change, it is more likely that
real change has taken place. Even more useful, if feasible, is including
behavioral anchor points in the overall evaluation net. Correspondences
between bona fide behavior change and the judgments of human observers
increase one’s confidence in the latter as a converging source of evidence in
evaluating program effects.


Design Problems


[...] (1969) has written a sophisticated treatise on program evaluation
designs in community research, a topic somewhat beyond the scope of this
article. The present discussion focuses on two specific design issues that
have created special problems for community research: follow-up and
systematic versus representative design.

The purpose of follow-up is to insure that changes observed when a program
ends accurately and stably mirror the program’s impact. Follow-up data thus
solidify generalizations about program effects over time. Such information is
important for planning future programs. Immediate postprogram findings can be
misleading in several respects. Thus, what seems to be improvement can
dissipate over time, because it was not (a) real in the first place, (b)
solid enough to permit the individual to meet life’s demands after the
program ended, or (c) supported by the postprogram environmental context.

Without follow-up, we can also underestimate program effects. Significant
experimental-control differences may not be found when a program ends, but
they may show up 6 months or a year later. Illustratively, a preventive
program is developed for children experiencing current life crises who have
not yet shown major signs of maladjustment. At the end of the program,
experimental subjects look much the same as comparable crisis (nonprogram)
control subjects. When evaluated a year later, however, experimentals are
found to have maintained sound adjustment, but nonprogram controls have
deteriorated behaviorally and educationally. In such a situation one might
infer that the program had important inoculative value but that it was too
soon for that effect to be detected when the program ended. Without follow-up
the community program evaluator is vulnerable to incorrect conclusions and
generalizations. He/she may thus either perpetuate a shaky program or dismiss
an effective one prematurely.

Brunswik (1947) wrote an informative es- 
say on systematic and representative design in 
psychological experiments, later applied to re- 
search in clinical psychology (Hammond, 
1954). Brunswik’s main argument was that in
order to generalize beyond a study’s specific
circumstances, one needs not only an adequate
subject N, which most experiments have, but
also experimental conditions that represent,
statistically, the universe of circumstances to
which the experimenter hopes to generalize.

A simple example would be a comparative 
study of the effects of male versus female ex- 
aminers on operant conditioning rates of male 
and female subjects. Assume that such a study 
was done with one male and one female re- 
search assistant, each of whom ran 50 male 
and 50 female subjects, for a total of 200 sub- 
jects. Although, on the surface, that seems a 
“reasonable” N, the question is, reasonable 
for what? How useful is it to know that two 
particular experimenters—one of whom hap- 
pened to be male and the other female—got 
different conditioning results either overall or
differentially by subjects’ sex? Does such a
finding mean that being a male or a female 
experimenter was the critical variable under- 
lying the observed performance differences 
rather than, let’s say, differences in their 
warmth, verbal styles, cues emitted, or degree 
of comfort with subjects? Most studies seek to 
generalize beyond their own literal conditions. 
For the hypothetical study cited, to generalize 
about the effects of male and female examiners 
would require representative sampling along 
the dimension of experimenter sex. Thus, in 
effect, the N for the study was not 200; it was
1 in each experimenter’s group.
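
A small simulation can make the point about the effective N concrete. The sketch below is hypothetical throughout. It assumes that experimenter sex has no true effect, that each experimenter contributes an idiosyncratic effect (warmth, verbal style, and so on) drawn from a normal distribution, and that the analyst naively compares the two sets of subjects with a large-sample z test. The function name, the standard deviations, and the number of simulated studies are illustrative assumptions, not values from any actual study.

```python
import random
import statistics


def naive_false_positive_rate(n_subjects=100, experimenter_sd=1.0,
                              subject_sd=1.0, n_studies=1000, seed=0):
    """Share of simulated studies in which a naive two-group z test declares
    a 'sex of experimenter' effect, although none exists by construction."""
    rng = random.Random(seed)
    false_positives = 0
    for _ in range(n_studies):
        groups = []
        for _experimenter in range(2):  # one male, one female experimenter
            # Everything personal to this experimenter, unrelated to sex.
            personal_effect = rng.gauss(0, experimenter_sd)
            scores = [personal_effect + rng.gauss(0, subject_sd)
                      for _ in range(n_subjects)]
            groups.append(scores)
        first, second = groups
        # Naive standard error: treats every subject as an independent case.
        standard_error = (statistics.variance(first) / len(first)
                          + statistics.variance(second) / len(second)) ** 0.5
        z = (statistics.mean(first) - statistics.mean(second)) / standard_error
        if abs(z) > 1.96:
            false_positives += 1
    return false_positives / n_studies


if __name__ == "__main__":
    print(naive_false_positive_rate())                     # idiosyncratic experimenters
    print(naive_false_positive_rate(experimenter_sd=0.0))  # interchangeable experimenters
```

With these assumed values, the naive test reports a "sex of experimenter" difference in the large majority of simulated studies, although sex plays no role in generating the data. Setting `experimenter_sd=0.0` brings the rate back near the nominal .05. The 200 subjects cannot separate sex from everything else that distinguishes the two experimenters.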

Community program evaluation studies are 
especially vulnerable to problems of system- 
atic versus representative design. Such re- 
search often seeks to generalize about large 
community units (e.g., schools, CMHCs). The 
researcher, however, may have access to one 
or, at most, a very limited number of such 
units. Thus a study is conducted to learn how 
low-income subjects use mental health services 
within a CMHC and how effective those ser- 
vices are. The study, done at a large urban 
CMHC, involves a consecutive sample of 500 
low-income subjects who sought services for 
whatever reasons, during the past X number 
of months. The study’s main findings are that 
54% of the subjects did not return after the 
initial visit and that short-term goal-oriented 
therapy was found (by whatever criteria) to 
be the most effective of four treatment condi- 
tions studied. Although such information may 
be very helpful in pinpointing the practices 
and strengths of the particular setting, a frequent
error is to generalize the findings to [...] physical
layout; poor/good reception and/or initial [...] and
committed, dedicated [...]), any of which [...] findings
obtained. To [...] consultation. Such a study is [...]
terms of the number and types of groups (e.g.,
teachers, lawyers, clergymen) with whom the
approaches were used and the numbers of con- 
tacts with each group. However, if there is 
only one consultant per approach, generaliza- 
tion with respect to the study’s main question 
is drastically restricted. Because there are so 
many ways (e.g., experience, comfort, and 
confidence with an approach; personal 
warmth; verbal facility) in which the consultants
could have differed, besides the ostensible
variable under study (i.e., type of
consulting approach), conclusions about the
relative effectiveness of the approaches could
not be made without representative sampling
on the consultant dimension.

Another example would be a comparative study of the attitudes and job
satisfactions of mental health professionals and paraprofessionals. One
convenient (often the most convenient) way of doing such a study is to
recruit fairly sizable Ns for both groups from a single, large facility that
employs, let’s say, [...] professionals and 40 paraprofessionals. Assume that
the study is done that way and that clear group differences in job
satisfaction are found. The investigator concludes that there are basic
cross-group (mental health professionals vs. paraprofessionals) differences
in job satisfaction. But, again, because the study was done in a single
center, the findings are more likely to reflect that setting’s particulars
(e.g., hours or conditions of employment, salary levels, promotion policies,
job security, how positions at various levels are perceived and valued) than
generalized cross-group differences on the variable in question.

Generalization of research findings depends on representativeness of design
on all pertinent dimensions. This cannot ordinarily be achieved in a
one-variable systematic design. If a community program evaluation study seeks
to reach conclusions that transcend a particular setting, it must adequately
sample the situations and variables that are central to its generalization
focus, in addition to the usual adequate sampling of subjects.


Criterion Problems 


The present discussion assumes that basic psychometric problems (e.g.,
reliability and validity) are well understood intellectually, and that
researchers as well as editors do what they can to keep the faith with
respect to them. Hence, this section focuses more particularly on criterion
problems that are predisposed by the special nature and pressures of
community research. Two such groups of problems are considered: (a) the
extent to which criterion measures are appropriate to a study’s purposes and
(b) pressures on the researcher to use less than optimal criterion measures.

Are the criteria appropriate to the study’s purposes? In considering how CMH
approaches differ from traditional ones, one frequently noted characteristic
is that they involve indirect rather than direct services (Bloom, 1977).
Thus, we do consultation with public health nurses, pediatricians, teachers,
and clergymen because these groups have extensive everyday contacts with
distressed individuals. The rationale for consultation is that upgrading
consultees’ knowledge and skills helps them to be more effective in dealing
with the personal problems that their clientele often bring to them.

Psychologically oriented educational programs for those who are about to
become parents, or for the parents of newborn or [...] children, have a
similar rationale. [...] strengthening the knowledge and/or [...] bases of
program participants can lead to more facilitating, health-producing [...]
attitudes and practices. Thus, consultation and mental health education
epitomize the structural pattern of indirect service; that is, they both are
directed toward groups that have systematic contacts with, and are hence in a
position to help, others. But the [...] ultimate concerns are with the
beneficial effects that intermediaries have on target people.

Mental health consultation and education seek to enhance the knowledge,
feelings, and attitudes of the groups that they touch. It is then assumed
that if those variables change, constructive change in the behavior of people
with whom the intermediary interacts will ensue. Because indirect service
programs have direct contacts only with intermediaries and because first-line
changes are easier to get at than once-removed ones, measures of change in
consultees or parents are often used as the only basis for evaluating their
effectiveness. Even if criterion measures are well selected at that level and
convincing evidence of change is found (i.e., consultees or parents are shown
to have enjoyed a program, learned a lot from it, and to have developed more
favorable mental health attitudes), it cannot be assumed that those changes
lead to more effective helping, or growth-supporting, practices.
Kelly (1971) addresses this issue: 


The payoff from a consultation program is not only 
an alteration of the feeling states, belief systems, and 
aspirations of the consultee, but should also reflect a 
change in a person’s relationships with those sig- 
nificant others who directly participate in his life 
setting. Therefore evaluation studies should not mea- 
sure change in attitudes of consultees, nor analyze 
samples of the interactions between consultant and 
consultee, nor note changes in the consultee’s self- 
concept, for such attempts at evaluation are not con- 
gruent with a conception of consultation as a pre- 
ventive intervention. ... If... consultation is effec- 
tive in initiating a change process, then indices of 
effectiveness should be defined not only by changes 
in consultee performance, such as the classroom 
teacher, but by cumulative and successive changes in 
the behavior of significant others, for example, students
in the classroom. . . . When considering research
designs to document the effects of consultation
. . . it is essential to provide for assessment of the
radiating effects of the intervention. . . . An intervention
such as consultation can be preventive only
if the consultee produces change in significant others. 


(pp. 114-115) 


The point to emphasize is that although there is nothing wrong per se with
using way-station criteria such as changes in the knowledge and attitudes of
consultees or parents in indirect service programs, such criteria alone are
insufficient. So-called instrumental changes, if found, do not guarantee that
positive behavior changes will follow in the ultimate target group. Without
assessing the latter directly, there is the danger that all concerned parties
will have had a pleasant, seemingly productive experience that fails to help
the program’s ultimate targets.

Although the preceding is a widespread 
problem, it is not universal. Behavioral con- 
sultation (Heller & Monahan, 1977), for ex- 
ample, is often evaluated only in terms of 
specific behavior changes observed in ultimate 
targets. Moreover, there are examples of (nonbehavioral)
indirect service programs in parent education (e.g.,
Hereford, 1963; Glidewell, Gildea, & Kaufman, 1973), in which
positive program effects (behavioral and adjustive)
were indeed shown for the intended ultimate targets,
children. Research designs that include measures of
changes both in direct recipients and ultimate targets
offer an added dimension of richness in that linkages
between the two levels of change can be explored.

Pressures affecting the choice of criterion 
measures. Community background contextual 
factors, such as those discussed earlier, that 
interfere in general with program evaluation 
hit especially hard when it comes to selecting 
and using research criteria. If the basic cli- 
mate is not conducive to evaluation, resist- 
ances to assessment procedures can, and do, 
develop. Such procedures can all too readily 
be seen as time-consuming, disruptive, and 
personally intrusive. 

Time-consuming is often defined phenom- 
enologically rather than objectively. The au- 
thor has had the experience of finding re- 
spondents more receptive to a 1-page format 
that required 5 minutes to complete than to a 
similar 10-page format that required only 3 
minutes to complete. That aside, the key prac- 
tical concern is that if key responders see a 
measure as too time-consuming, for whatever 
reason, it can effectively rule out that measure 
as a criterion. Take the following example:
The main objective of a day-care program is to “resocialize” patients along
dimensions such as (a) initiative, (b) self-help behaviors, (c) interaction
with peers, (d) interaction with program personnel, (e) outside recreational
activities, and (f) outside job activities. The researcher thus believes that
judgments by knowledgeable program personnel about changes in specific
behaviors, in the above subareas, [...] criteria of choice for [...] requires
only 20 minutes to complete, respondents decide they cannot, or will [...]
circumstances is to use “quick and dirty” but [...] the phenomena under the
study, pull for generalized attitudinal responses to the program and its
personnel, and narrow the base for evaluating specific program effects or
identifying particular program strengths and weaknesses as a guide to its
modification and improvement.

The reality bugaboo of the potential disruptiveness of evaluation procedures
is another source of noise that materially restricts the researcher’s choice
of outcome criteria. One might, for example, envision several other
data-gathering strategies for the hypothetical study just cited such as (a)
direct behavioral observations of subjects to assess variables such as peer
interaction, autonomy, initiative, and reactions to program personnel; and
(b) tests that provide data from which inferences about such variables could
be made. Both approaches present hazards. Apart from the intrinsic
complexities involved in developing reliable, valid frameworks for observing
and recording behaviors, such procedures are often seen as intrusive and are
therefore resisted. “Outsiders” must be introduced to the doings of an
ongoing program, a process that program personnel or participants may see as
disruptive, threatening, uncomfortable, or just plain not wanted. Similarly,
removing subjects from ongoing program activities (e.g., schoolwork) for
evaluations, especially time-consuming ones, also elicits resistance, if not
to stop the procedure entirely, then to pare it to the bone.

Sensitivity about potential invasion of privacy is another factor that
restricts the use of certain criteria in evaluating community programs. A
given program may seek to improve people’s sexual adjustment; another may be
aimed at solidifying disrupted parent-child relationships. But to probe
directly in these sensitive (albeit face-valid) areas may be so threatening
that the use of theoretically ideal criteria is blocked before the fact.

Zax and Klein (1960) argued that behavioral criteria are often among the best
to use in evaluating the effectiveness of mental health interventions.
Because behavior, and its disruption, often defines and is at the [...]
center of an individual’s problem, and because it tends to be objective and
palpable, using behavioral indices of change is both face valid and
commonsensical. Unfortunately, the inaccessibility of such data, plus the
fact that it may be costly or time-consuming to obtain, has caused it to be
underused in evaluating the effectiveness of community programs.

The preceding constraints on the use of criteria put dire pressures on the
community program evaluator to compromise. Compromise sometimes means using
indirect measures; instruments that are out of phase with a program’s goals;
measures of unknown or dubious reliability and validity; and vague, global
criteria that are difficult to relate to the study’s focal variables. Though
such problems are far from unique to community program evaluations, they are
pronounced in that field. This is an especially tough blow, since variables
of prime concern in CMH program evaluation research (e.g., health, pathology,
adjustment, coping) are difficult enough to measure even when we have a
“clean shot” at them (Bloom, 1977). Stated another way, program evaluation
research is limited by the “state of the art” in assessment. Illustratively,
a researcher may be asked to evaluate a program designed to strengthen the
self-concept of preschool children. If, however, a psychometrically sound
measure of self-concept for children of that age is not available, the
functional choice is between using an unsatisfactory measure of self-concept
or a psychometrically more sound instrument that comes as close as possible
to assessing that variable.

Special criterion problems come up in outcome studies that cut across groups
or settings. Thus, in a study designed to compare the effects of a specific
intervention on the attitudes, intellectual performance, and/or adjustment of
middle- and low-income groups, it is important that the criterion measures be
equally appropriate for the two groups being [...]. If not, extraneous
factors such as lack of item clarity or inappropriateness of [...] for either
group might easily be confounded with differential outcomes for the method
being evaluated.
i uch the same problem can occur in evalua- 
: oe that cut across structurally com- 
e settings, which, however, use differ- 
7 preset: procedures. Cowen et al. 
A ave addressed this question in the 
xt of a study designed to evaluate the ef- 
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fectiveness of a multidistrict school mental 
health program. Because the program was 
school based, an estimate of the child’s cur- 
rent school performance seemed to be one 
reasonable criterion for evaluating its effec- 
tiveness. But going across school districts 
made it virtually impossible to obtain such a 
measure. Formerly, the classic “A, B, GD; 
E” report card was an answer to the research- 
er’s prayers precisely because it offered a 
nearly universal metric for evaluating current 
academic performance. But “them simplistic 
days” seem to be gone forever. The ancient 
grading system has been supplanted by a near 
infinity of variants—single, double and triple 
checks; red, blue, and green stars; or lions, 
tigers, and giraffes. Even more perplexing for 
the aspiring quantifier are the extensive free- 
prose reports used by many school districts 
these days to evaluate children. Such reports 
have been known to cover 20 or more pages 
and to be as much, or more, oriented to unfamiliar turf,
such as identity problems, socialization skills,
and self-concept as to the erstwhile, inviolable
three Rs. However laudable efforts to find better,
less competitive, less accusatory ways to assess a child’s school
performance are, they make it difficult, if not 
impossible, for researchers to use report card 
grades as a performance estimate in cross-dis- 
trict comparisons of program effects. Similar 
problems are encountered with educational 
achievement tests. School districts have be- 
come more and more individualized about 
which tests they use, for whom, and when 
such tests are given. Thus, in designs that cut 
across school districts, the experimenter may 
need to combine apples and bananas if he/she 
hopes to use performance and achievement data.
Such reality factors will continue to restrict the researcher’s
use of criterion measures. In addition, the intrinsic complexity of many
community evaluation studies poses challenges 
in selecting criteria. Complexity means sev- 
eral things—obvious ones such as (a) the 
multicomponent nature of many community 
programs and (b) the fact that they seek to 
affect multiple functions—sometimes differ- 
ently for different people—and less than ob- 
vious ones, such as the fact that they usually 
take place in specific contexts of cost, morale, 
attitudes, and expectancies. Though it helps
a great deal to know that a program “works,” 
ultimately it is important to disaggregate com- 
ponent effects, separating active from inert in- 
gredients and identifying differential program 
effects for participants with different charac- 
teristics. Because such information is critical 
for improving programs, criteria should be 
selected with those realities in mind. 

Because programs have multivariate, differential,
and changing outcomes, multiple out-
come criteria, including behavioral ones wher- 
ever possible, should be used to evaluate them. 
Doing so not only accurately reflects a pro- 
gram’s true complexity, but it also reduces the 
risk of putting all one’s eggs in a single deli- 
cate criterion basket. Greater use should also 
be made of unobtrusive and/or unintended 
outcome measures; they are relatively easy to 
collect and potentially informative. Thus, an 
investigator might weep a bit less after finding 
that children in a school mental health pro- 
gram improved at p < .09, rather than p< 
.05, on a problem behavior inventory, were it 
also established that the principal had 30 
fewer disciplinary referrals during the active 
program period. And, finally, in evaluation 
studies that require human judgments (by 
program personnel or target persons), prag- 
matic decisions to use brief, objective, and 
easy to understand and handle measures that 
are relevant to respondents’ main interests and 
spheres of involvement can help the cause immeasurably.

Problems of control. Problems of control,
like those of criteria, are basic to most areas
of psychological research. Their uniqueness, if
any, to community program evaluation research derives from the
realities of the community context. Control, in evaluation studies,
seeks to insure that changes in behavior and/or performance are due
to the effects of the intervention, rather than to potentially
confounding factors that can produce similar changes.
Being a control is basically an unenviable fate. It commits the
“victim” to all the “dirty work” of research (e.g., time loss,
intrusiveness, disruptiveness) without direct “pay off”—characteristically
defined as needed services. Small wonder that few places jump
through hoops to be controls. Since many settings (schools,
hospitals, clinics, courts) are unwilling to serve as controls
under any circumstance, the theoretical pool of control groups is
limited before the researcher ever gets to it.
Moreover, locating “any old” control group is hardly enough to
assure a good outcome study. Ideally, experimentals and controls
should be matched on a host of variables which, left unattended,
could confound the findings. Although the exact nature of these
variables depends on the program’s nature and purposes as well as
characteristics of the subjects, they often include age, sex,
intelligence, race, socioeconomic status, and the nature and extent
of preprogram maladjustment. Both because many settings eschew the
control role and because experimentals and controls must be matched
on many control variables, the experimenter may have few degrees of
freedom in searching for appropriate community control groups. At
the very least, it can be a vexing, time-consuming challenge.
Not surprisingly, then, many program evaluation studies are done
without control groups. Such studies depend on within-group
pre-post comparisons, which, though sometimes helpful, entail
certain risks. For one thing, they do not control for spontaneous
or natural change over time; for another, because they rely heavily
on human judgments of change—by subjects themselves or by program
personnel—they are particularly susceptible to distortions such as
response bias, Hawthorne, and reverse Hawthorne effects (Zdep &
Irvine, 1970). Still another danger of studies that lack controls
is the tendency of initially extreme test scores to regress to the
mean on readministration. Such natural regression, on the surface,
“looks like” improvement and can be confused with it. Finally, base
rates for some behavioral criteria (e.g., delinquency rates and
employment) that are appropriate in evaluating the effectiveness of
certain programs change rapidly over short time periods (e.g., ages
12-14 for delinquency or 16-19 for employment). To use such
criteria, without controls or anchoring parametric data, could
seriously distort a study's interpretation (Freeman & Sherwood, 1969).
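The regression hazard described above can be illustrated with a small simulation. The sketch below is not drawn from any study cited here; it assumes a hypothetical model in which each observed score is a stable true score plus independent measurement noise, and all names and parameter values are illustrative. It selects the most extreme pretest scorers and retests them with no intervention at all.

```python
import random
import statistics


def simulate_regression_to_mean(n=10000, cutoff_fraction=0.10, seed=0):
    """Retest initially extreme scorers when NO program is delivered.

    Assumed (hypothetical) model: observed = true score + independent noise.
    Returns the pretest and posttest means of the selected extreme group.
    """
    rng = random.Random(seed)
    true_scores = [rng.gauss(50, 10) for _ in range(n)]
    pretest = [t + rng.gauss(0, 10) for t in true_scores]
    posttest = [t + rng.gauss(0, 10) for t in true_scores]

    # Select the highest pretest scorers, e.g. the children rated as
    # having the most problems on a behavior inventory.
    k = int(n * cutoff_fraction)
    extreme = sorted(range(n), key=lambda i: pretest[i], reverse=True)[:k]

    pre_mean = statistics.mean(pretest[i] for i in extreme)
    post_mean = statistics.mean(posttest[i] for i in extreme)
    return pre_mean, post_mean


pre_mean, post_mean = simulate_regression_to_mean()
print(f"extreme group pretest mean:  {pre_mean:.1f}")
print(f"extreme group posttest mean: {post_mean:.1f}")
```

Under these assumptions the selected group's posttest mean falls back toward the population mean of 50 even though nothing was done to them, which is the apparent "improvement" that a pre-post design without controls cannot separate from a genuine program effect.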
Another community reality that limits optimal control is the fact
that experimental programs must often start at a certain time of
year and run for X period of time if they are to be evaluated at
all. School-based programs, for example, are bounded by the
beginning and end of the school year. The program must start when
it must start, even if an adequate control group is not available.
Following all due effort, an approximately satisfactory control
group may be located several months later. But, by then, children
manifest different patterns of class adjustment problems (either
because they are better known or because of normal seasonal
variations), and class sociometric structures have changed from
what they were 2 months earlier. It is difficult to know, when
experimentals and controls are assessed at different times of the
year, whether scores on key criterion measures such as the
preceding ones mean the same thing for the two groups.


On the community scene, many adaptations to, and “solutions” of,
the control dilemma have been tried, some voluntarily, some
otherwise. The own-control procedure, in which a group goes through
an inert preprogram wait period, is designed to bypass the thorny
problems of finding matched controls. A variant of that approach is
for a group to serve simultaneously as a matched control for an
experimental group and as an own control for a specified time, with
the understanding that it will later participate in the regular
program. Although both of those variants are useful, they may still
present problems, for example, delay in providing services to
individuals who need them or the dangers of selecting as time
controls subjects with less pressing service needs.
In some situations the most direct route to control is to subdivide
a prospective pool of subjects, within a given setting, into
matched experimental and control subgroups. This approach is
attractive both because it is convenient and because it may be
easier to achieve a good overall match among within-setting
subjects who share similar backgrounds, status, and histories. But
there are also reasons why it may not work. Thus, if the program
involves needed services, personnel from the setting may
argue—sometimes vocally and insistently—that those with greatest
need must have first call on services. Withholding a promising
service from
someone who needs it badly to satisfy the 
niceties of an abstract research design simply 
will not fly. Indeed, if pushed too hard, it can 
sound the program’s demise—either before it 
starts or through later noncooperation or ac- 
tive resistance. It takes no genius to imagine 
the public relations confusion that can ensue. 
The obvious point is that community research 
has a real, vital ecological surround—more so 
than almost any other area of psychological 
research. That surround must be taken into 
serious account at all stages. 

Another factor that limits the usefulness of
within-setting control is intrasetting communication
about a program. If, for example, the
program intervention involves the use of cot- 
tage parents to teach verbal mediational tech- 
niques of self-control to residential delinquent 
adolescents, there is the danger that training 
procedures and program practices will spill 
over from experimentals to controls in a given 
institutional setting. Many community inter- 
vention programs involve indirect services 
such as consultation. If teachers are targets of 
a consultation program, it is unrealistic to ex- 
pect that they will apply newly learned skills 
only to experimental children in their class 
and not to controls, or that they will not dis- 
cuss useful new discoveries with other (non- 
experimental) teachers in the building. 

Control, as many investigators have learned, 
can be an elusive phenomenon; that is, “now 
you see it, now you don’t.” Seemingly pretty 
initial matches evaporate in the face of hazards
beyond the experimenter’s control. Thus
an experimenter, who must start a program 
by a certain date, matches experimental and 
control groups on all major sociodemographic 
measures but not on preprogram adjustment 
or performance status. Because the latter data 
take longer to collect and score, the experimenter makes the
perfectly reasonable assumption that random assignment of subjects
will yield approximately matched groups on those dimensions. But it
does not! Or, after careful matching is completed, attrition while
the program is underway destroys the match. There are many reasons
why such attrition occurs. People move; rater judges who furnished
predata change jobs; administrative decisions are made to shift
individuals out of a program; needed data cannot be collected
or prove to be invalid. And so it goes! If 
everyone who has been burned by such a prob- 
lem submitted a brief description of his/her 
special headache, the resulting compilation of 
unanticipated, and “undeserved,” false turns 
would fill many entire issues of this journal. 
After-the-fact loss of initial control for any of 
the preceding reasons is common, not rare, in 
community program outcome research. That 
is one reason why not all program evaluation 
studies reported—and even fewer of those 
done—ever “mess with” controls. Of those 
that do, a substantial number, present com- 
pany included, have been victimized either by 
incomplete or far less than ideal initial con- 
trol, or by unavoidable breakdown of control 
during the study. 
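The earlier observation that random assignment need not yield matched groups can be made concrete with a rough simulation. The sketch below is illustrative only: it assumes a standardized preprogram adjustment score (mean 0, SD 1), and the group sizes, threshold, and function name are hypothetical choices rather than values from any study discussed here.

```python
import random
import statistics


def imbalance_rate(group_size=10, threshold_sd=0.5, trials=5000, seed=1):
    """Estimate how often random assignment leaves the two groups'
    preprogram means more than `threshold_sd` SD units apart."""
    rng = random.Random(seed)
    mismatched = 0
    for _ in range(trials):
        # Hypothetical standardized preprogram adjustment scores.
        scores = [rng.gauss(0, 1) for _ in range(2 * group_size)]
        rng.shuffle(scores)  # random assignment to the two conditions
        experimental = scores[:group_size]
        control = scores[group_size:]
        gap = abs(statistics.mean(experimental) - statistics.mean(control))
        if gap > threshold_sd:
            mismatched += 1
    return mismatched / trials


print("n=10 per group: ", imbalance_rate(group_size=10))
print("n=100 per group:", imbalance_rate(group_size=100))
```

With groups of the size often available in a single setting, a sizable share of random splits start out noticeably mismatched on the unmeasured variable; the problem shrinks as groups grow, which is of little comfort when subject pools are small.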
Faced with such natural disasters, investi- 
gators who care about control have several 
compensatory options to pursue. One is to bail 
out, as best one can, statistically. Analysis of 
covariance is a generic procedure designed to 
minimize initial mismatches and to bring com- 
parison groups back to approximately the 
same starting point. Another way to deal with 
initial mismatches is to hack and chisel at the 
subject groups in hopes of paring them down 
to approximately comparable samples. Among 
the dangers of this procedure is that the adjustments may have to
be asymmetrical (either because the original disproportionality
comes more from one group than from the other or because the N is
more robust in one group than the other). If the reduction in N
comes primarily from one group, it may distort the group’s defining
characteristics. Moreover, such a procedure often highlights an
inherent conflict between the ideal of a tight match (which
necessarily entails loss of N) versus the robustness and
representativeness of the samples retained. Since this conflict is
real, it is sometimes resolved by tolerating noise in the match to
preserve sufficiently large and representative Ns for the major
substantive program evaluation analyses.
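As a rough illustration of the statistical "bail out" mentioned above, the following sketch computes covariance-adjusted posttest means in the classical way: each group's posttest mean is corrected by the pooled within-group slope times the group's distance from the grand pretest mean. The data and function name are invented for the example, and the sketch assumes equal regression slopes across groups, which a real analysis would need to check.

```python
from statistics import mean


def ancova_adjusted_means(groups):
    """Return covariance-adjusted posttest means.

    groups: dict mapping a group name to a list of (pretest, posttest) pairs.
    Assumes a common within-group slope of posttest on pretest.
    """
    grand_pre = mean(pre for pairs in groups.values() for pre, _ in pairs)
    sxy = sxx = 0.0
    summaries = {}
    for name, pairs in groups.items():
        pre_mean = mean(pre for pre, _ in pairs)
        post_mean = mean(post for _, post in pairs)
        # Accumulate within-group sums of cross-products and squares.
        sxy += sum((pre - pre_mean) * (post - post_mean) for pre, post in pairs)
        sxx += sum((pre - pre_mean) ** 2 for pre, _ in pairs)
        summaries[name] = (pre_mean, post_mean)
    slope = sxy / sxx  # pooled within-group regression slope
    return {
        name: post_mean - slope * (pre_mean - grand_pre)
        for name, (pre_mean, post_mean) in summaries.items()
    }


# Invented data: the experimental group starts 4 points higher at pretest,
# and the program adds 2 points.
data = {
    "control": [(10, 10), (12, 12), (14, 14)],
    "experimental": [(14, 16), (16, 18), (18, 20)],
}
adjusted = ancova_adjusted_means(data)
print(adjusted["experimental"] - adjusted["control"])
```

In this contrived case the raw posttest difference between groups is 6 points, while the adjusted difference reflects only the 2 points not attributable to the initial mismatch. The adjustment is only as good as the covariate: it cannot repair mismatches on variables that were never measured.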
An interesting but more subtle question that the researcher
sometimes faces is, When is a control not a control? Thus a group
may be “well matched statistically but not psychologically.” Take
as an example a program in which crisis interventionists are
trained to
use special abreactive techniques in which they have interest but
no special investment. The experimental question is whether the use
of such techniques improves intervention outcomes with people
experiencing current life crises. The groups seen by the specially
trained and “regular” workers are well matched demographically and
in terms of the nature and seriousness of the crises that they have
experienced. The study’s criteria include the clinician’s
preratings and postratings of a series of relevant patient
behaviors. In such a situation, experimental and control
interventionists may have different cognitive sets about the study,
with experimentals thinking that “they are evaluating the
effectiveness of this new program” and controls believing that
“they are evaluating my effectiveness as a clinician.” If such
differential program views exist, experimentals’ postprogram
ratings are more likely either to be objective or to reflect
personal (pro or con) views of the program, whereas controls, who
see themselves as the focus of the study, are more likely to
provide positive change estimates for clients. Should that happen,
genuine program effects are obscured or lost. A similar example can
be cited in evaluating the effectiveness of school-based
intervention programs. Teachers of experimental children in such
programs are more likely than those of controls to see their
behavioral evaluations of children as program related. Hence, their
judgments may be influenced by their views of, and attitudes
toward, the program. By contrast, teachers of control children,
lacking a program metric, are more likely to see the rating task in
the context of how good a job I have personally done with Johnny or
Mary this year. If so, there is a pull for them to give more
positive end-of-year ratings (Cowen et al., 1974).
Standard statistical controls may not suffice in situations in
which a program’s main content and activities happen, incidentally,
to involve a major structural change in the everyday lives of the
target subjects. Assume, for example, that college student
volunteers are trained to work with chronic hospitalized patients
using rational-emotive techniques. Is it sufficient in evaluating
the effectiveness of the intervention to have a matched
experimental and control group? Probably not! Such a setup might
confound the program’s ostensibly active ingredient (i.e., the
therapeutic approach) with the fact that the procedure,
incidentally, involved establishing a meaningful interpersonal
relationship with people whose lives ordinarily lacked such
relationships. An added third group that would strengthen the
study's interpretive base is one that could control for the
personal relationship (e.g., through games and recreational
activities) in the absence of an intentional, therapy-system
thrust. Similar examples can be identified for therapeutic programs
with adolescents in correctional settings, children in institutions
for the retarded, or geriatric patients.
The unusual complexity of certain community settings (e.g.,
hospitals and schools) plus the fact that many are veritable
laboratories for exploring many, ever-changing, program variations
underscores the fact that mere demographic-statistical
comparability of experimentals and controls does not automatically
solve the control problem. From an experimenter’s standpoint,
evaluation of a specific program would be clearest if there were no
other special programs in either the experimental or control
settings. Indeed, experimenters’ special blinders may impel them to
see the world that way, even though that view does not correspond,
ecologically, to reality.
More often than not, settings such as schools and hospitals house a
variety of programs—formal and informal and short- and long-lived.
Some of these programs may address the same objectives and
behaviors as the experimental program in question (Freeman &
Sherwood, 1969). A behavior modification program for mental
patients may take place alongside drug-therapy and patient
ward-governance programs. A school mental health intervention may
co-occur with Glasser circle and Distar programs (Cowen et al.,
1974). The intermixing of such programs not only makes it difficult
to evaluate their separate contributions but also often means that
an ostensibly pure experimental program is in fact that program
plus several overlapping services or programs in a setting,
compared to another (so-called control) setting, which happens not
to have that particular program but does have three or four other
programs addressed to similar behaviors in comparable target
subjects.
Sometimes, in fact, an administrative decision 
is made to assign the special program to one 
setting because (compared to other similar 
ones) it is deficient in the type of services that 
the program provides. Conversely, control set- 
tings may be assigned other similar programs 
as part of an (understandable) administra- 
tive philosophy of sharing the wealth. Practi- 
cal problems of control, in such situations, are 
magnified by the facts that some of the over- 
lapping programs, either in experimental or 
control settings or both, are likely to be short-lived
or to change in the process, and new
programs may be introduced while the experi- 
ment is in progress. Although each of the fore- 
going possibilities is regrettable experimen- 
tally, they are part of the community’s reality. 

Problems of proper control, in community 
research, are diabolically complex and create 
serious persistent stumbling blocks to sound 
program evaluation research.


Overview and Summary 


Communities are many things. One thing 
they are not is an ideal laboratory for antiseptic
psychological studies. Their extraordi-
nary complexity, omnipresent flux, action-ser- 
vice orientation, and susceptibility to day- 
to-day pressures present real and formidable 
barriers to “Mr. Clean” program evaluation 
studies. These factors place major constraints 
on the design of studies, the types of criteria 
that can be used, and the rigor or sophistication of the control
that can be exercised. Although some of those problems can be
reduced through judicious planning, others, quite beyond the
experimenter’s control, cannot. This is one reason why theory,
logic, and the actual development and implementation of new
community programs have outpaced the field’s supporting research
base.
The tugs and pulls of this situation are clear. On the one side is
the obvious need to pose important, socially significant questions
and to understand the impact and value of innovative practices
designed to overcome long-standing, refractory problems in mental
health. On the other are our training and bloodlines as
experimenters and our understandings of past accepted canons for
accreting new knowledge. These opposing ten-
sions are as apparent in community research 
as in any subdomain of psychology today. 

The intent of this article is not to discour- 
age trying harder. Such effort is sorely needed; 
it can have great payoff value. Much can be 
done to strengthen community program evalu- 
ation technology and to design studies that 
reduce sources of confound or error. Weak- 
nesses in specific measures or in classes of cri- 
teria typically used in community program 
outcome research dictate that greater em- 
phasis be placed on converging sources of 
evidence. But we must still expect that com- 
munity realities will remain to militate against 
ideal research studies. The vulnerability of 
findings from any single community evaluation 
study points to the importance both of replica- 
tion and of tolerance for a slow accretive 
process, in which small pieces in a puzzle 
gradually cumulate toward weight-of-evidence 
conclusions about major new programming ap- 
proaches. Although such a process is not in- 
trinsically inimical to the way of science, it 
may be more caricatured in community re- 
search than in other fields. 

The compelling logic of the community ap- 
proach, the significance of the problems it ad- 
dresses, and the excitement and clinical prom- 
ise of some of its early innovative programmatic 
efforts have been sufficient to carry the field’s 
infancy and early childhood. The future, however,
will stand or fall on the solidity of its
empirical footing. Social significance cannot, 
in that process, be sacrificed at the altar of 
laboratory precision. Hence, we must expect 
that successive approximations—the gradual 
putting together of sometimes chipped or 
scarred building blocks—will be the way of 
community program evaluation research in the
coming decades. 

For the reader who seeks wisdom and sophistication beyond the
frailties of the present account, the following additional sources
are suggested: Schulberg, Sheldon, and Baker, 1969; Bloom, 1972;
Roen, 1971; Glass, 1976; Hammer, Landsberg, and Neigher, 1976;
Fairweather and Tornatzky, 1977; Neigher, Hammer, and Landsberg,
1977; and Guttentag and Saar, 1978.
Theoretical Risks and Tabular Asterisks: Sir Karl,


Sir Ronald, and the Slow Progress of Soft Psychology 


Paul E. Meehl 


University of Minnesota 


Theories in “soft” areas of psychology lack the cumulative character of scientific 
knowledge. They tend neither to be refuted nor corroborated, but instead merely 
fade away as people lose interest. Even though intrinsic subject matter difficul- 
ties (20 listed) contribute to this, the excessive reliance on significance testing 
is partly responsible, being a poor way of doing science. Karl Popper’s approach, 
with modifications, would be prophylactic. Since the null hypothesis is quasi- 
always false, tables summarizing research in terms of patterns of “significant 
differences” are little more than complex, causally uninterpretable outcomes of 
statistical power functions. Multiple paths to estimating numerical point values 
(“consistency tests”) are better, even if approximate with rough tolerances; and 
lacking this, ranges, orderings, second-order differences, curve peaks and valleys, 
and function forms should be used. Such methods are usual in developed sciences 
that seldom report statistical significance. Consistency tests of a conjectural
taxometric model yielded 94% success with zero false negatives.


I had supposed that the title gave an easy 
tipoff to my topic, but some puzzled reactions 
by my Minnesota colleagues show otherwise, 
which heartens me because it suggests that 
what I am about to say is not trivial and uni- 
versally known. The two knights are Sir Karl 
Raimund Popper (1959, 1962, 1972; Schilpp, 
1974) and Sir Ronald Aylmer Fisher (1956, 
1966, 1967), whose respective emphases on 
subjecting scientific theories to grave danger 
of refutation (that’s Sir Karl) and major reliance
on tests of statistical significance
(that’s Sir Ronald) are, at least in current 
practice, not well integrated—perhaps even 
incompatible. If you have not been accustomed to thinking about
this incoherency, and my remarks lead you to do so (whether or not
you end up agreeing with me), this article will have served its
scholarly function.

I consider it unnecessary to persuade you 
that most so-called “theories” in the soft areas 
of psychology (clinical, counseling, social, personality,
community, and school psychology) are scientifically unimpressive
and technologically worthless. Documenting that statement would of
course require a considerable amount of time, but you can quickly
get the flavor by having a look at Braun (1966); Fiske (1974);
Gergen (1973); Hogan, DeSoto, and Solano (1977); McGuire (1973);
Meehl (1960/1973a, 1959/1973f); Mischel (1977); Schlenker (1974);
Smith (1973); and Wiggins (1973). These are merely some highly
visible and forceful samples; I make no claim to bibliographic
completeness on the large theme of “What’s wrong with ‘soft’
psychology.” A beautiful hatchet job, which in my opinion should be
required reading for all PhD candidates, is by the sociologist
Andreski (1972). Perhaps the easiest way to convince yourself is by
scanning the literature of soft psychology over the last 30 years
and noticing what
happens to theories. Most of them suffer the fate that General
MacArthur ascribed to old generals—They never die, they just slowly
fade away. In the developed sciences, theories tend either to
become widely accepted and built into the larger edifice of
well-tested human knowledge or else they suffer destruction in the
face of recalcitrant facts and are abandoned, perhaps regretfully
as a “nice try.” But in fields like personology and social
psychology, this seems not to happen. There is a period of
enthusiasm about a new theory, a period of attempted application to
several fact domains, a period of disillusionment as the
negative data come in, a growing bafflement 
about inconsistent and unreplicable empirical 
results, multiple resort to ad hoc excuses, and 
then finally people just sort of lose interest in 
the thing and pursue other endeavors. 

Since I do not want to step on toes lest my propaganda falls on
deaf ears, I dare not mention what strike me as the most egregious
contemporary examples, so let us go back to the late 1930s and
early 1940s when I was a student. In those days we were talking
about level of aspiration. You could not pick up a psychological
journal—even the Journal of Experimental Psychology—without finding
at least one and sometimes several articles on level of aspiration
in schizophrenics, or in juvenile delinquents, or in Phi Beta
Kappas, or whatever. It was supposed to be a great powerful
theoretical construct that would explain all kinds of things about
the human mind from psychopathology to politics. What happened to
it? Well, I have looked into some
of the recent textbooks of general psychology 
and have found that either they do not men- 
tion it at all—the very phrase is missing from 
the index—or if they do, it gets cursory treat- 
ment in a couple of sentences. There is no 
doubt something to the notion. We all agree 
(from common sense) that people differ in 
what they demand or expect of themselves, and that this probably
has something to do, sometimes, with their performance. But it did
not get integrated into the total nomological network, nor did it
get clearly liquidated as a nothing concept. It did not get killed
or resurrected or transformed or solidified; it just sort of dried
up and blew away, and we no longer wanted to talk about it or do
experimental research on it. A more recent example is the theory of
“risky shift,” about which Cartwright (1973) wrote, after reviewing
196 papers that appeared in the 1960s:


As time went by... it gradually became clear that 
the cumulative impact of these findings was quite dif- 
ferent from what had been expected by those who 
produced them. Instead of providing an explanation 
of why “groups are riskier than individuals,” they in 
fact cast serious doubt on the validity of the proposi- 
tion itself (p. 225). 


It is now evident that the persistent search for an 
explanation of “the risky shift” was misdirected and 
that any adequate theory will have to account for a 
much more complicated set of data than originally 
anticipated. But it is not clear how theorizing should 
proceed, since serious questions have been raised as to 
whether, or in what way, “risk” is involved in the ef- 
fects to be explained (p. 226). 


After 10 years of research, [the] original problem re- 
mains unsolved. We still do not know how the risk- 
taking behavior of “real-life” groups compares with 
that of individuals (p. 231). 


I do not think that there is any dispute 
about this matter among psychologists fa- 
miliar with the history of the other sciences. 
It is simply a sad fact that in soft psychology 
theories rise and decline, come and go, more as 
a function of baffled boredom than anything 
else; and the enterprise shows a disturbing
absence of that cumulative character that is 
so impressive in disciplines like astronomy, 
molecular biology, and genetics. 

There are some solid substantive reasons for 
this that I will list here, lest you think that I am beating up on
the profession, unaware of the terrible intrinsic difficulty of our
subject matter. Since (in 10 minutes of superficial thought) I
easily came up with 20 features that make human psychology hard to
scientize, I invite you to pick your own favorites.
Differences as to which difficulties are emphasized will not, I am
sure, cause any disagreement about the general fact. This is not
in detail the thesis that 
d is hard to scientize, let alone 


it—or who, 
would maintain 
to develop shortly, methods adequate to overcome
or circumvent it. Each of these alleged
difficulties in scientizing the human mind is 
sufficiently controversial to deserve a meth- 
odological article by itself. This being so, to 
substitute a once-over lightly (and hence in- 
evitably dogmatic) defense of each as a real 
difficulty is, for those who accept it, a work 
of supererogation, and for the others, it is 
doomed to failure. I therefore confine myself 
to listing and explaining the problems, re- 
peating that my purpose in so doing is to 
prevent the rest of my article from being 
taken as a kind of malicious and unsympa- 
thetic attack on psychologists (of which, after 
all, I am one!) based on an inadequate ap- 
preciation of the terrible difficulties under 
which we work. In a few cases I have ex- 
plained at some length and replied to objec- 
tions, these being cases in which a difficulty 
is not widely recognized in our profession or 
in which it is generally held to have been dis- 
posed of by a familiar (but erroneous) refuta- 
tion or solution. Regrettably, some psycholo- 
gists use “philosophical” arguments that are a 
generation or more out of date. 

Since I am listing and summarizing rather 
than developing or proving, it seems appropri- 
ate to present the set of difficulties as follows: 


1. Response-Class Problem 


This involves the well-known difficulties of 
slicing up the raw behavioral flux into mean- 
ingful intervals identified by causally relevant 
attributes on the response side, a problem that 
exists already in the Skinner box (Skinner, 
1938, p. 70), worsens in field study by an 
ethologist, and reaches almost unmanageable 
proportions in studying human social behavior 
of the kind to which clinical, social, and personology
psychologists must address themselves (see, e.g., MacCorquodale &
Meehl, 1954, pp. 218-231, after a quarter century still considered
by some as the best statement of the problem; Hinde, 1970,
pp. 10-13; Meehl,

, Pp. 4 and im; Ski 
a chap. 6 passim; Skinner, 


2. Situation-Taxonomy Problem


As is well known, the importance of an adequate
classification and sampling of environments and
situations has received less attention
than Problem 1, above, despite emphasis
by several major contributors such as Roger 
Barker (1968), Egon Brunswik (1955), and 
Saul B. Sells (1963). It seems likely that the 
problems of characterizing the stimulus side, 
even though often neglected by the profession 
or dealt with superficially, are about as intractable
as the characterization of the response class. It is
not even clear whether identification and measurement
of the relevant stimulus dimensions (e.g., size) is the
same task as concocting a taxonomy of “situations”
and “environments,” nor whether the answer
to this question would quickly generate
rules for an adequate statistical ecology applicable
to research design. So I am perhaps
lumping under this “situation-taxonomy” rubric
three distinguishable but related problems.
I am inclined to think that most (not all) of
the current methodological controversy concerning
traits versus situations is logically
mathematically reducible to this and the preceding
category, since I think that traits are
disposition clusters, and dispositions always
involve at least implicit reference to the stimulus
side; but this is not the place to push
that view.


3. Unit of Measurement


One sometimes hears this conflated with one
or both of the preceding, but, of course, it is
not the same. There are questions in rating
scales and in psychometrics (as well as in certain
branches of nondifferential psychology) in
which disagreements persist about such fundamental
matters as the necessity of a genuine
interval or ratio scale for the use of certain
kinds of sampling statistical inference.


4. Individual Differences 


Perhaps the shortest way to discuss this problem
is to point out the oddity that what is one
psychologist’s subject matter is another psychologist’s
error term (Cronbach, 1957).
More generally, the fact is that organisms
differ not only with respect to the strength
of various dispositions, but, more commonly and
more distressing for the researcher, they differ
as to how their dispositions are shaped and
organized. As a result, the individual differences
involved in “mental chemistry” are
tougher to deal with than, say, the fact that
different elements have different atomic numbers
or that elements with the same atomic
number vary in atomic weights (isotopes).


5. Polygenic Heredity


It is generally conceded that the measurement
and causal inference problems that arise
in biometrical genetics are, with some exceptions,
more difficult than those found in the
kind of single factor dominant or recessive
gene situation on which the science of genetics
was originally founded. Except for Mendelizing
mental deficiencies and perhaps some psychiatric
disorders that are transmitted in a
Mendelizing fashion, most of the attributes
[...] psychologists are influenced
by polygenic systems. Usually we [...]


(1963) and others. However, [...]
disposition of the adult-acculturated individual,
it presumably results from a confluence of
different polygenic contributors such as basic
anxiety readiness, mesomorphic toughness,
garden-variety social introversion, dominance,
need for affiliation, and the like.


6. Divergent Causality 


As pointed out 35 years ago by the physical
chemist Irving Langmuir (1943; London, 1946;
Meehl, 1954, pp. 60-61; Meehl, 1967/1970b,
especially Footnotes 1-8 on pp. 395 ff.),
there are complex systems whose causal
structure and boundary conditions are such
that slight differences—including those that
are, for practical predictive and explanatory
purposes, effectively “random” (whatever their
inner deterministic nature may be)—tend to
“wash out,” “cancel each other,” or “balance”
in the long run. On the other hand, there
are systems in which such slight perturbations
or differences in the exact character
of initial conditions are, so to speak, amplified
over the long run. Langmuir christened
the former kind of causality as “convergent,”
as when we say that the average errors in 
making repeated measurements of a table 
length tend to cancel out and leave us with a 
stable and highly trustworthy mean value of 
the result. On the other hand, an object in 
unstable equilibrium can lean slightly toward 
the right instead of the left, as a result of 
which a deadly avalanche occurs burying a 
whole village. Although both sorts of systems 
are found at all levels of Comte’s Pyramid of 
the Sciences, it seems regrettably true that the 
incidence of important and pervasive types of 
divergent causality is greater in the sciences 
of behavior. 
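
The contrast between the two kinds of causality can be made concrete with a small numerical sketch. It is offered only as an illustration under assumptions that are not in the text: the convergent case is modeled as the mean of independent Gaussian measurement errors, and the divergent case uses the logistic map, a standard textbook example of sensitivity to initial conditions that is substituted here for the avalanche. All function names and parameter values are hypothetical.

```python
import random
import statistics


def mean_of_measurements(true_length, n, sd, rng):
    """Convergent case: independent errors tend to cancel in the mean."""
    readings = [true_length + rng.gauss(0.0, sd) for _ in range(n)]
    return statistics.fmean(readings)


def logistic_trajectory(x0, steps, r=4.0):
    """Divergent case: iterate x -> r*x*(1-x); r=4.0 is an assumed
    parameter value that puts the map in its chaotic regime."""
    x = x0
    path = [x]
    for _ in range(steps):
        x = r * x * (1.0 - x)
        path.append(x)
    return path


def max_gap(x0, delta, steps):
    """Largest separation reached by two trajectories started delta apart."""
    a = logistic_trajectory(x0, steps)
    b = logistic_trajectory(x0 + delta, steps)
    return max(abs(p - q) for p, q in zip(a, b))


if __name__ == "__main__":
    rng = random.Random(0)
    for n in (10, 1000, 100000):
        m = mean_of_measurements(150.0, n, sd=0.5, rng=rng)
        print(f"n={n:6d}  error of mean = {abs(m - 150.0):.4f}")
    print("max gap after 60 steps:", max_gap(0.2, 1e-9, 60))
```

In the first loop the error of the mean shrinks as readings accumulate, whereas in the second case a starting difference of one part in a billion grows until the two trajectories are unrelated.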


7. Idiographic Problem 


It is not necessary to “settle” the long-con- 
tinued methodological controversies regarding 
idiographic versus nomothetic methods in psy- 
chology and history (e.g., whether they are 
philosophically, metaphysically fundamentally 
different) to agree with strong proponents of 
the idiographic method, such as Gordon All- 
port (Allport, 1937) or my long-time friendly 
adversary on the prediction issue, Robert R. 
Holt (1958), that the human personality— 
unless one approaches it with the postulate of 
impoverished reality—has in its content, 
structure, and, conceivably, even in individual
differences as to some of its “laws,” and very
much in its origins, properties and relations
that make the study of personality rather
more similar to such disciplines as history,
archeology (historical), geology, or the reconstruction
of a criminal case from police evidence
than the derivation of the molar gas
laws from the kinetic theory of heat or the
mechanisms of heredity from molecular biology.
Some would argue that such explanatory
derivations aside, even the mere inductive
subsumption of particulars (episodes, molar traits,
persons) under descriptive generalizations is
a more difficult and problematic affair in these
disciplines than in most branches of physical
and biological science.


8. Unknown Critical Events 


Related to divergent causality and idiographic
understanding but distinguishable
from them is the fact that critical events in
the history of personality development are fre- 
quently hard to ascertain. There is reason to 
believe that in some instances they are liter- 
ally never ascertained by us or known to the 
individual under study, even somebody who 
has spent 500 hours on the analytic couch. 
They are sometimes observable events that, 
however, were not in fact observed and re- 
corded, such as the precise tone of voice and 
facial expression that a patient’s father had 
when he was reacting to an off-color joke that 
the patient innocently told at the dinner table 
at age 7. Every thoughtful clinician realizes 
that the standard life history that one finds in 
a medical chart is, from the standpoint of 
thorough causal comprehension, so thin and 
spotty and selective as to border on the ludicrous.
But there is also what I would view as
an important causal source of movement in 
one rather than another direction of divergent 
causality, namely, inner events, such as fan- 
tasies, resolutions, shifts in cognitive struc- 
ture, that the patient may or may not report 
and that he or she may later be unable to re- 
call. 


9. Nuisance Variables


Other things equal, it is handy for research 
and theorizing if we can sort out the variables 
into three classes, namely, (a) variables that 
we manipulate (in the narrow sense of the 
word experimental), (b) variables that we do 
not manipulate but can hold constant or effectively
exclude from influence by one or
another means of isolating the system under
study, and (c) variables that are quasirandom 
with respect to the phenomena under study 
so that they only contribute to measurement 
error or the standard deviation of a statistic. 
Unfortunately, there are systems, especially
social and biological systems of the kind that
clinical psychologists and personologists study,
in which there is operative a nonnegligible
class of variables that are not random but systematic,
that exert a sizable influence, and are
themselves also sizably influenced by other
variables, either exogenous to the system (F.
M. Fisher, 1966) or contained in it, such that
we have to worry about the influence of these
variables, but we cannot always ascertain the
direction of the causal arrow. Sometimes we
cannot even get sufficiently trustworthy measurements
of these variables so as to “partial
out” or “correct” their influence even if we are
willing to make conjectures about the direction
of causality. There are some circumstances
in which we can extrapolate from experimental
studies or from well-corroborated
theory to make a high-confidence decision
about the direction of causal influence,
but there are many other circumstances—in
soft psychology, the preponderating ones—in
which this is not possible. Further, lacking
special configurations such as highly atypical
cells in a multivariate space or correlation
coefficients that impose strong constraints on a
causal interpretation, or provisional assumptions
as relied on in path analysis (Li, 1975),
the system is statistically and causally indeterminate.
(Why these constraints are regularly
treated as “assumptions” instead of
refutable conjectures is itself a deep and fascinating
question that I plan to examine some
other time.) The well-known difficulties in assessing
the influence of socioeconomic status
(SES) on children’s IQ when unscrambling
the hereditary and environmental contributors
to intelligence is perhaps the most dramatic
one, but other less emotion-laden examples
can be found on all sides in the behavioral
sciences. (See Meehl, 1970a, 1971/1973.)
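
The point about untrustworthy measurement of a nuisance variable can be illustrated with a short simulation. This is a minimal sketch under assumed conditions, not an analysis from the text: two variables x and y are generated so that they depend only on a common variable z, and the first-order partial correlation is then computed twice, once with z itself and once with a fallible measure of z. The data-generating model, the noise level, and all names are hypothetical.

```python
import math
import random


def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def partial_r(xs, ys, zs):
    """First-order partial correlation of x and y with z held constant."""
    rxy, rxz, ryz = pearson(xs, ys), pearson(xs, zs), pearson(ys, zs)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


def simulate(n=20000, noise_sd=1.0, seed=1):
    """Assumed model: x and y have no influence on each other."""
    rng = random.Random(seed)
    z = [rng.gauss(0, 1) for _ in range(n)]             # true nuisance variable
    x = [zi + rng.gauss(0, 1) for zi in z]              # x depends only on z
    y = [zi + rng.gauss(0, 1) for zi in z]              # y depends only on z
    z_obs = [zi + rng.gauss(0, noise_sd) for zi in z]   # fallible measure of z
    return x, y, z, z_obs


if __name__ == "__main__":
    x, y, z, z_obs = simulate()
    print("raw r(x, y):              ", round(pearson(x, y), 3))
    print("partial r given true z:   ", round(partial_r(x, y, z), 3))
    print("partial r given noisy z:  ", round(partial_r(x, y, z_obs), 3))
```

Under this assumed model, partialling out the true z removes the x-y correlation, while partialling out the noisy measure leaves a sizable residual correlation that could be mistaken for a direct influence.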


10. Feedback Loops 


A special case in engineering is the usual case in
psychology, that a person’s behavior affects
the behavior of other persons and hence alters
the schedule imposed by the “social Skinner
box.” The complexities here are so refractory
to quantitative decomposition that yoked
setups came to be used even for the (relatively
simple) animal case as a factual substitute for
piecewise causal-dispositional analysis. In the
human social case, they may be devastating.


11. Autocatalytic Processes 


The chemist is familiar under the name
autocatalysis with a rare but important kind of
preparation in which one of the end products
of the chemical process is itself capable of
catalyzing the process. Numerous common
examples spring to mind in psychology, such as
anxiety and depression as affects or economic
failure as a social impact. Much of neurosis is
autocatalytic in the cognitive-affective-volitional
system, as are counterneurotic healing
processes. When this kind of complicated setup
is conjoined with the critical event, idiographic,
and divergent causality factors, and
also with the individual differences factor (that
parameters relating the growth of one state or
schedule to a dependent variable, which itself
in turn acts autocatalytically, show individual
differences), the task of unscrambling such a
situation becomes terribly difficult.
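
A toy calculation may help show how an autocatalytic process combines with individual differences in its parameters. The sketch below assumes the simplest textbook autocatalytic rate law, dx/dt = k x (1 - x), integrated by Euler steps; neither the equation nor the parameter values come from the text, and the rate constant k merely stands in for a hypothetical individual difference.

```python
def autocatalytic_course(k, x0=0.01, dt=0.1, steps=200):
    """Euler integration of dx/dt = k * x * (1 - x): the product x speeds
    its own formation until the remaining substrate (1 - x) runs out."""
    x = x0
    course = [x]
    for _ in range(steps):
        x += dt * k * x * (1.0 - x)
        course.append(x)
    return course


def time_to_half(course, dt=0.1):
    """First time at which the state reaches 0.5, or None if it never does."""
    for i, x in enumerate(course):
        if x >= 0.5:
            return i * dt
    return None


if __name__ == "__main__":
    # Three hypothetical individuals who differ only in the rate constant k.
    for k in (0.2, 0.5, 1.0):
        print(f"k={k}: time to half-saturation =",
              time_to_half(autocatalytic_course(k)))
```

With the same starting state and the same observation window, a modest difference in k decides whether the process takes off quickly, slowly, or not at all within the window.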


12. Random Walk


There is a widespread and understandable
tendency to assume that the class of less-probable
outcomes, given constancy of other
classes of causally efficacious variables, should
in principle be explicable by detecting a class
of systematic input differences. Thus, for instance,
we try to understand the genetic/environmental
contributions to schizophrenia by
studying discordant monozygotic twins. If I
develop a florid clinical schizophrenia and my
monozygotic twin remains sane and wins the
Pulitzer Prize for poetry, it is a sensible strategy
for the psychologist to consider my case
and similar cases with an eye for “systematic
differences” (such as who was born first, who
was in what position in the uterus, or who
had a severe case of scarlet fever with delirium)
as responsible for dramatic difference
in final outcomes. When one reflects on the
rather meager yield of such assiduous ferreting
out of systematic differences by, say, Gottesman
and Shields (1972) in their excellent
book, one experiences bafflement. On the one
hand, the concordance rate for monozygotic
twins is only a little over 50%, indicating a
very large nongenetic component in causality.
Yet, on the other hand, we find feeble or null
differences when we look at the list of “obvious,
plausible” differentiators between the
twins who fall ill and the twins who remain
sane. Of course, one can always say—and
would no doubt be partly right in this—that
we have not been clever enough to hit on
the right ones; or even if, qualitatively, they
are the right ones, we do not have sufficiently
construct-valid measures of them to show up
in the statistics.

There is, however, an alternative explanation
that, when one reflects on it, is plausible
(at least to a clinical practitioner like myself)
and that has analogues in organic medicine
and in other historical sciences like geology or
the theory of evolution, to wit, that we are
mistaken to look for a “big systematic variable”
of the kind that is already in our standard
list of influences, such as organic disease,
parental preference, or SES of an adoptive
home. Rather, we might emphasize that a human
being’s life history involves, as one form
of divergent causality, something akin to the
stochastic process known as a “random walk”
(Bartlett, 1955, pp. 15-20, 47-50, 89-96; 
Feller, 1957, pp. 73, 311; Kemeny, Snell, & 
Thompson, 1957, pp. 171-177; Read, 1972, 
pp. 779-782). At several points that are individually
minor but collectively critically determinative,
it is an almost “chance” affair
whether the patient does A or not A, whether
his girl friend says she will or will not go out
with him on a certain evening, or whether he
happens to hit it off with the ophthalmologist
that he consults about some peculiar vision
disturbances that are making him anxious
about becoming blind, and the like. If one
twin becomes psychotic at the end of such a 
random walk, it is possible that he was suffer- 
ing from what was only, so to speak, “bad 
luck”—not a concept that appears in any 
standard list of biological and social nuisance 
variables! 

Luck is one of the most important contrib- 
utors to individual differences in human suffer- 
ing, satisfaction, illness, achievement, and so 
forth, an embarrassingly “obvious” point that 
social scientists readily forget (Gunther, 1977;
Jencks, 1972, pp. 8-9, 227-228; Popper, 1974,
pp. 36-37; Stoddard, 1929; for further discussion
of this see Meehl, 1972/1973g, pp.
402-407; Meehl, 1973d, pp. 220-221). Of
course, the fact that a process resembles a 
random walk does not mean that it is not 
susceptible to quantitative treatment. Witness
the extensive formal development of this sort
of process in the field of finite mathematics
by engineers and others. The point is that its
analytical treatment will not look like the
familiar kind of search for a systematic class
of differentiating variables like SES as a
nuisance variable in relationship to educa- 
tional outcome and intelligence. 
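
The random-walk idea lends itself to a brief simulation. The sketch below is only an illustration under assumed conditions: each member of a pair follows a symmetric walk of plus-or-minus-one steps with identical parameters, and an "outcome" is said to occur when the final position reaches an arbitrary threshold. The step count, the threshold, and the function names are hypothetical and carry no empirical meaning.

```python
import random


def life_course(steps, rng):
    """Final position of a symmetric +/-1 random walk starting at 0."""
    position = 0
    for _ in range(steps):
        position += 1 if rng.random() < 0.5 else -1
    return position


def discordance_rate(pairs, steps, threshold, seed=0):
    """Fraction of identical pairs in which exactly one member ends at or
    beyond the threshold, although both share every systematic input."""
    rng = random.Random(seed)
    discordant = 0
    for _ in range(pairs):
        a = life_course(steps, rng) >= threshold
        b = life_course(steps, rng) >= threshold
        if a != b:
            discordant += 1
    return discordant / pairs


if __name__ == "__main__":
    rate = discordance_rate(pairs=5000, steps=100, threshold=10)
    print("discordant pairs:", rate)
```

Because the two members of each simulated pair differ in nothing except the sequence of chance steps, any search for a systematic differentiator between the discordant members would come up empty by construction.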


13. Sheer Number of Variables


I suppose that this is the most commonly 
mentioned of the difficulties of social science, 
and I assume that my readers would accept 
it without further elaboration. But it is worth 
mention that the number of variables is 
large from several different viewpoints. Thus 
we deal on one side with a large number of 
phenotypic traits, conceiving a phenotypic 
trait as a related family of response disposi- 
tions that (a) are correlated to some stipu- 
lated degree pairwise and that (b) have some 
kind of logical, semantic, social, or other 
“meaning” overlap or resemblance that en- 
titles us to class them together. Or, again, we 
consider a large number of dimensions on the
stimulus side and on the response side that
are relevant in formulating a law of behavior
acquisition, as well as in the subsequent control
and activation of dispositions thus acquired.
From still another viewpoint, the list of historical
causal influences is long and heterogeneous,
ranging from such diverse factors as a
mutated gene or a never-diagnosed subclinical
tuberculosis to a mother who mysteriously
absented herself the day after a patient first
permitted himself the fantasy that a brutal
father would go away, and the like. It should
be noted that this matter of sheer number of
variables would not be so important (except
as a contributor to residual “random variation”
in various kinds of outcomes) if they
were each small contributors and independent,
like the sources of error in
shots at a target in classical theory of errors.
But in psychology this is not typically the
situation. Rather, the variables, although large
in number, are each nuisance variables that


14. Importance of Cultural Factors 


This source of individual differences, both
in acquired response clusters (traits) and in
the parameters of acquisition and a


functions, especially when taken to 
the genetic factors contributing, for 
to social competence, mental health, 
and so on, makes for unusual comp) 
understanding how somebody got 
way he is. We are, for instance, so 
to referring to nuisance variables lil 
considering the design of experiments 
volve SES-related individual differ 
we readily forget something every ri 
person knows—that the measures o! 
like SES are general and not tailor-mai 
what is idiographically more significant 
development of a particular person, 
we speak of “controlling for SES,” 
loose use of language in comparison Wil 
trolling the temperature” in a Skinner 
controlling the efflux of calories in a 
lab by use of a bomb calorimeter, A tre 
on the principles of internal medicine 
as Harrison et al., 1966) sometimes | € 
cultural factors, including those that 
at all understood—in the way that, 
etary deficiency might be mediated by 
poverty in a backward country—and 
says that for some reason this disease i 
more frequently among the rich than 
the poor. But the important causal cha 
prime interest to the physician, 
role as an advisor of preventive mei 
not typically involve worry about 
somebody is fifth-generation upper € 
third child of parents who became 
after the birth of the second oldest 
However, this kind of consideration mig 
crucial in reconstructing the life 
such a person, 


15. Context-Dependent Stochastologicals


Cronbach and Meehl (1955/1973) and subsequent
writers adopted (from the positivist
philosophers of science) the phrase nomological
network to designate the system of
lawlike relationships conjectured to hold between
theoretical entities (states, structures,
events, dispositions) and between theoretical
entities and their observable indicators. The
“network” metaphor is chosen to emphasize
the structure of such systems, in which the
nodes of the network, representing the postulated
theoretical entities, are connected by
the strands of the network, representing the 
lawful relationships hypothesized to hold be- 
tween the entities. What makes such a set of 
theoretical statements a system (rather than 
a mere conjunction of unrelated assertions, a 
“heap of hypotheses”) is the semantic fact of 
their shared terms, an overlap in the proposi- 
tions’ inner components, without which, of 
course, no deductive fertility and no deriva- 
tion chains to observational statements would 
be formally possible. The network is empirical 
(and “scientifically respectable”), because a 
proper subset of the theoretical terms is co- 
ordinated in fairly direct ways (“operation- 
ally”) with terms designating perceptual or 
instrument-reading predicates. These latter
predicates normally possess the admirable
properties of quick decision, minimal theory
dependence, and high interpersonal consensus.
Despite the current distaste for these “objectivist”
conceptions, I remain an old-fashioned
unreconstructed positivist to the limited
extent that I think science—both “normal
science” and “revolutionary, paradigm-replacing
science”—differs from less promising, noncumulative,
and personalistic enterprises like
politics, psychotherapy, folklore, ethics, metaphysics,
aesthetics, and theology in part because
of its skeptical insistence on reliable (intersubjective,
replicable) protocols that describe
observations. Skinner is in better shape
than Freud partly because Norman Campbell
(1920/1957, p. 29) was right in saying that
the kinds of judgments for which universal
assent can be obtained are (a) judgments of
temporal simultaneity, consecutiveness, and
“betweenness”; (b) judgments of coincidence
and “betweenness” in space; and (c) judgments
of number. I cannot view the increasingly
fashionable dismissal of these objectivity-oriented
views as other than obscurantist
in tendency. (See Kordig, 1971, 1973.)

However, the nomological network, even
though correlated directly, here and there,
with observational data, is not “operational”
throughout, since some of the nodes and
strands are connected with the observational
data base only via other subregions of the network.
As Hempel said (1952):


A scientific theory might therefore be likened to a
complex spatial network: Its terms are represented
by the knots, while the threads connecting the latter
correspond, in part, to the definitions and, in part, to
the fundamental and derivative hypotheses included
in the theory. The whole system floats, as it were,
above the plane of observation and is anchored to it
by rules of interpretation. These might be viewed as
strings which are not part of the network but link
certain points of the latter with specific places in the
plane of observation. By virtue of those interpretive
connections, the network can function as a scientific
theory: From certain observational data, we may
ascend, via an interpretive string, to some point in
the theoretical network, thence proceed, via definitions
and hypotheses, to other points, from which
another interpretive string permits a descent to the
plane of observation. (p. 36)
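
The network image can be rendered as a small graph to show what it means for a theory not to be "operational" throughout. The following is a rough sketch only: the node labels are invented placeholders (prefix `obs_` for points in the plane of observation, `T_` for theoretical knots), and breadth-first search is simply one convenient way to trace a chain from one observation, up through the network, and down to another.

```python
from collections import deque

# Hypothetical toy network: all labels are illustrative placeholders.
NETWORK = {
    "obs_test_score": ["T_anxiety"],
    "T_anxiety": ["obs_test_score", "T_drive"],
    "T_drive": ["T_anxiety", "T_habit"],
    "T_habit": ["T_drive", "obs_error_rate"],
    "obs_error_rate": ["T_habit"],
}


def is_observable(node):
    """Observational nodes are marked by the assumed 'obs_' prefix."""
    return node.startswith("obs_")


def indirectly_anchored(network):
    """Theoretical nodes with no direct interpretive string to observation."""
    return sorted(
        node for node, links in network.items()
        if not is_observable(node) and not any(is_observable(n) for n in links)
    )


def derivation_path(network, start, goal):
    """Breadth-first search for a shortest chain of links from start to goal."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in network[path[-1]]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


if __name__ == "__main__":
    print(indirectly_anchored(NETWORK))
    print(derivation_path(NETWORK, "obs_test_score", "obs_error_rate"))
```

In this toy network one theoretical node has no direct tie to observation and is reached only through its neighbors, which is the situation described for some nodes and strands of a real theory.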


Even though the core of these ideas is sound 
and important, the word nomological is in soft 
psychology at best an extension of meaning 
and at worst a misleading corruption of the 
logician’s terminology. Originally it designated 
strict laws as in W. E. Johnson’s (1921/1964) 
earlier use of “nomic necessity” (p. 61). The 
lawlike relationships we have to work with in 
soft psychology are rarely (never?) of this 
strict kind, errors of measurement aside. In- 
stead, they are correlations, tendencies, sta- 
tistical clusterings, increments of probabilities, 
and altered stochastic dispositions. The ugly
neologism stochastological (as analogue to
nomological) is at least shorter than the usual
“probabilistic relation” or “statistical dependence,”
so I shall adopt it. We are so accustomed
to our immersion in a sea of stochastologicals
that we may fail to notice what a terrible
disadvantage this sort of probabilistic
law network puts us under, both as to the 
clarity of our concepts and, more importantly, 
the testability of our theories. (One still hears
the tiresome complaint that a theoretical system
cannot be simultaneously concept definatory
and factually assertive, despite repeated
explanations of how this works. See, e.g.,
Braithwaite, 1960, pp. 76-87; Campbell, 1920/1957,
pp. 119-158; Carnap, 1936-1937/1950, 1952/1956,
1966, pp. 225-226, 265-274; Feigl,
1956, pp. 17-19; Hempel, 1952, 1958, pp.
81-87; Lewis, 1970; Maxwell, 1961, 1962;
Meehl, 1977, pp. 35-37; Nagel, 1961, pp. 87,
91-93; Pap, 1958, pp. 318-321, 1962, pp.
46-52; Popper, 1974, pp. 14-73; Ramsey,
1931/1960; Sellars, 1948.)

When the observational corroborators of 
the theory consist wholly of percentages,
curve fits, correlations, significance tests, and
distribution overlaps, it is difficult or impossible
to see clearly when a given batch of
empirical data refutes a theory or even when
two batches of data are (in any interesting
sense) “inconsistent.” All we can usually say
with quasi-certainty is that context-dependent
statistics should not be numerically identical
in different studies of the same problem. (A 
dramatic recent example of this was the dis- 
covery that some of Sir Cyril Burt’s correla- 
tion coefficients were too consistent to have 
been derived from the different tests and pop- 
ulations that he reported!) 

In heading this section “Context-Dependent 
Stochastologicals,” I mean to emphasize the 
aspect of this problem that seems to me most 
frustrating to our theoretical interests, namely, 
that the statistical dependencies we observe 
are always somewhat, and often strongly, de- 
pendent on the institution-cum-population set- 
ting in which the measurements were ob- 
tained. Lacking a “complete (causal) theory” 
of what influences what, and how much, we 
simply cannot compute expected numerical 
changes in stochastic dependencies when mov- 
ing from one population or setting to another. 
Sometimes we cannot even rationally predict 
the direction of such changes. If the difference 
between two Pearson correlations were safely 
attributable to random sampling fluctuation 
alone, we could use the statistician’s standard
tools to decide whether Jones’s study “fails
to replicate” Smith’s. But the usual situation
is not one of simple cross-validation shrinkage
(or “boostage”)—rather, it involves the validity
generalization problem. For this, there
are no standard statistical procedures. We
may be able, relying on strong theorems in
general statistics plus a backlog of previous
experience and a smattering of theory, to say
some fairly safe things about restriction of
range and the like. However, thoughtful theorists
realize how little quantitatively we can
say with sufficient confidence to warrant
counting an unexpected shift in a stochastic
quantity as a strong “discorroborator.” This
being so, we cannot fairly count an “in the
ball park” predicted value as a strong corroborator.
For example, Meehl’s Mental Measure
correlates .50 with SES in Duluth junior
high school students, as predicted from Fisbee’s
theory of sociability. When Jones tries
to replicate the finding on Chicano seniors in
Tucson, he gets r = .34. Who can say anything
theoretically cogent about this difference?
Does any sane psychologist believe that
one can do much more than shrug?
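
For reference, the "standard tool" that would apply if sampling fluctuation were the only issue is the Fisher z test for two independent correlations. The sketch below applies it to the two values in the example, r = .50 and r = .34. The sample sizes are pure assumptions, since the text gives none, and the function name is hypothetical.

```python
import math


def fisher_z_difference(r1, n1, r2, n2):
    """z statistic and two-sided p value for H0: rho1 == rho2, assuming two
    independent samples and bivariate normality (Fisher's r-to-z method)."""
    z1, z2 = math.atanh(r1), math.atanh(r2)
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (z1 - z2) / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail area
    return z, p


if __name__ == "__main__":
    # Sample sizes below are assumed for illustration only.
    for n in (100, 1000):
        z, p = fisher_z_difference(0.50, n, 0.34, n)
        print(f"n={n} per study: z={z:.2f}, p={p:.4f}")
```

Whatever the result for a given pair of sample sizes, the test speaks only to random sampling error within a fixed population. It says nothing about how much the correlation ought to shift between Duluth and Tucson, which is the validity generalization problem raised above.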

Although probability concepts (in the theory)
and statistical distributions (in the data)
sometimes appear in both classical and quantum
physics, their usual role differs from that
of context-dependent stochastologicals in social
science. Without exceeding space limitations
or my competence, let me briefly suggest
some differences. When probabilities appear
in physics and chemistry, they often drop out
in the course of the derivation chain, yielding
a quasi-nomological at its termination (e.g.,
derivation of gas laws or Graham’s diffusion
law from the kinetic theory of heat, in which
the postulates are nomological, the “conditions”
are probability distributions, and the
resulting theorems are again nomological).
Second, when the predicted observational result
still contains statistical notions, their numerical
values are either not context dependent
or the context dependencies permit precise
experimental manipulation. A statistical
scatter function for photons or electrons can
be finely tuned by altering a very limited
number of experimental variables (e.g., wavelength,
slit width, screen distance), and the
law of large numbers assures that the expected
“probabilistic” values of, say, photon incidence
in a specified band will be indiscernibly
different from the observed (finite but huge)
numbers.

All this is very unlike the stochastologicals
of soft psychology, in which strong context
dependence prevails, but we do not know (a)
the complete list of contextual influences, (b)
the function form of context dependency of
those influences that we can list, (c) the numerical
values of parameters in those function
forms that we know or guess, or (d) the values
of the context variables if we are so fortunate
as to get past Ignorances a-c. Finally, unlike
physics, our sample sizes are usually so small
that the Bernoulli theorem does not assure
a close fit between theoretical and observed
frequencies—perhaps one of the few good
uses for significance tests?
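
The role of sample size in the Bernoulli theorem can be seen in a two-line calculation. The sketch below is illustrative only: it computes the standard error of an observed proportion and the Chebyshev bound used in the classical proof of the theorem, for a sample size typical of a small study and for a count of the order met in physical experiments. The particular numbers are assumptions chosen for contrast.

```python
import math


def standard_error(p, n):
    """Standard error of an observed proportion under binomial sampling."""
    return math.sqrt(p * (1.0 - p) / n)


def chebyshev_bound(p, n, eps):
    """Upper bound on P(|observed - p| >= eps), as used in the classical
    proof of Bernoulli's theorem (the weak law of large numbers)."""
    return min(1.0, p * (1.0 - p) / (n * eps ** 2))


if __name__ == "__main__":
    # Assumed values: a true proportion of .5 and a tolerance of .05.
    for n in (25, 10 ** 10):
        print(f"n={n}: SE={standard_error(0.5, n):.6f}, "
              f"bound={chebyshev_bound(0.5, n, 0.05):.2e}")
```

At the small sample size the bound is vacuous, so a close fit between theoretical and observed frequencies is not guaranteed, whereas at the large count a deviation of the stated size is all but excluded.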
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| of the tough-minded, superscientific orienta- 
z tion) or to be contented with fuzzy verbalisms 
As a consequence of the factors listed supra, on the other side (if we are more artsy-craftsy 
especially those numbered 4, 7, 9, 15, it is or literary), thinking that it is the best we 
usually not possible in the soft areas of social can get. The important point for methodology 
science to provide rigorous, explicit, or—the of psychology is that jus 
‘holy word when I was in graduate school— can have a reasonably precise theory of prob- 
“operational definitions for theoretical con- able inference, being “quasi-exact about the 
“cepts. This difficulty occurs not because psy- inherently inexact,” so psychologists should 
chologists are intellectually lazy or sloppy, al- jearn to be sophisticated and rigorous in their 
though most of us are at times (some routinely metathinking about open concepts at the sub- 
“and on principle). Rather, it arises from the  stantive level. I do not mean to suggest in 
intrinsic nature of the subject matter, that is, saying this that the logicians’ theory of open 
from the organism’s real compositional nature concepts is in a highly developed state, but it 
and structure and the causal texture of its en- js far more developed than one would think 
vironment. As has often been pointed out, one from reading or listening to most psycholo- 
can concoct quick and easy “operational defi- gists. 
nitions” of psychological terms, but they will I have elsewhere (Meehl, 1977) distin- 


usually lack theoretical interest and, except guished three kinds of openness that are in- 


for some important special cases (e.g, purely volved in varying degrees i 
predictive task-tailored psychometrics and logical concepts 
some kinds of operant behavior control), gen- the same theoretical construct, namely, (a 


eralizable technological power (Lazarus, 1971; openness arising from the indefinite extensibil- 


Loevinger, 1957). It is remarkable evidence ity of our provisional list of operational indi- 
penness associ- 


of cultural lag in intellectual life that one can cators of the construct; (b) 0) 

still find quite a few psychologists who are ated with each indicator singly, because of the 
hooked on the dire necessity of strictly opera- empirical fact 

tional definitions, and who view open concepts abilistically, rather than nomologically, linked 
as somehow methodologically sinful, although to the inferred theoretical construct; and (c 
it is now a quarter of a century since the late openness due to the fact that most of our 
Arthur Pap published his brilliant article on theoretical entities are introduced by an 1m- 


open concepts (Pap, 1955) see also chap. 11 plicit or contextual definition, that is, 
cepted nomologi network, 


of Pap, 1958). To do justice, and highlight role in the ac k 
the cultural lag, I should mention the related pather than by their inner nature. By their 
article of Waismann that antedated Pap’s by “inner nature” I mean nothing spooky ot 
8 years (Waismann, 1945) and even Carnap’s metaphysical heir ontological 
Or aO NES (1936-1937/ 1950). I cannot structure oF composition as the latter will, 
name a single logician or 4 philosopher (or with the progress of research, be formulatable 


historian) of science who today defends strict jn terms of the theoretical entities of more 
operationism in the sense that some psycholo- asic sciences in Comte’s pyramid. In social 
gists claim to believe in it. (They don’t really and piological science, one should keep 10 
—but you have to listen awhile to catch the mind that ex? 
deviations in substance when pseudooperation- entities is seldom achiev 
ists are not discoursing dogmatically about initial observational variables of those 
merkot) ences, but it 

The problem of open concepts and their re- theoretical reduction oF fusi 
lation to empirical falsifiability warrants a tion is achieved, if ever, in terms of some more 
separate article, with which I am currently pasic underlying science eehl, 1977, see 
engaged, but suffice it to say here that the also Cronbach & Meehl (1955/ 1973); Meehl, 


unavoidability of open concepts Ìn social and 1959/1973f, 1973h, pp. 285-288). 
biological science tempts us tO sidestep it by A final remark, which also deserves fe 
treatment in another place, 15 that when W 


fake operationism on the one side (if we are 


16, Open Concepts 


becomes possible instead by 
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deal with open concepts, as in personality psy- 
chometrics of traits or taxa, the statistical 
phenomenon of psychometric drift as a result 
of bootstrap operations, refinement of mea- 
sures, and theoretical reflection on the big ma- 
trix of convergent and discriminative validities 
(Campbell & Fiske, 1959) also generates, via 
our reliance on implicit or contextual defini- 
tions of theoretical entities, an associated con- 
ceptual drift, a meaning shift. When we re- 
assign weight to fallible indicators of an en- 
tity to the extent that the very meaning of the 
term designating that entity is specified by its 
role in the network, such reassignment of 
weights—especially under drastic revisions of 
the system such as dropping a previously re- 
lied-upon indicator—constitutes a change in 
the theoretical concept. Difficult interpreta- 
tive and research strategy problems arise here, 
because, on the one hand (especially in psy- 
chometrics) we encounter the danger that the 
resulting conceptual drift has pulled us away 
from what we started out to measure, but we 
also recognize that in psychology, as in the 
other sciences, part of the research aim is pre- 
cisely that of bringing about revisions of con- 
cepts on the basis of revisions of the nomologi- 
cal network that implicitly defines them. We 
want, as Plato said, to carve nature at its 
joints; and the best test of this achievement is 
increased order in our material. 


17. Intentionality, Purpose, and Meaning 


We do not need to settle the philosopher’s 
question of what is the essential condition for 
the existence of intentionality, nor buy Bren- 
tano’s famous criterion that intentionality is 
the distinctive mark of the mental, to recog- 
nize that human beings think and plan and 
intend, that if rats do so they do it at a much 
lower level, that sunflowers probably do not, 
and that stones certainly do not. The formula- 
tion of powerful functional relationships for 
systems that do not possess the capacity to 
think, worry, regret, plan, and intend is ob- 
viously on the average an easier task. (But 
see Vico, 1744/1948, for a view so different 
that an American social scientist of our time 
can hardly grasp it.)


18. Rule Governance


Related to intentionality but sufficiently important
to deserve a special listing is the fact
that human behavior is rule governed. People
do something not merely “in accordance with”
a generalization but because they feel bound
to obey the generalization stated in the form
of a rule. Nobody has succeeded in coming up
with a fully satisfactory definition of when a
rule is a rule, but a sufficiently good approximation
is to say that a rule differs from an empirical
generalization in that a rule is not
liquidated by being broken, whereas an em- 
pirical generalization is thereby liquidated 
(assuming that the conditions stated in its 
antecedent clause are granted, and the viola- 
tion event is admitted into the corpus). Con- 
tinued controversies in psycholinguistics re- 
flect the importance of this kind of considera- 
tion in any discussion of human conduct. 


19. Uniquely Human Events and Powers 


In addition to being rule governed, there 
are several other human features that we do 
not share with chimpanzees, let alone sponges 
or boulders. I recall the late Richard M. El- 
liott saying that the main reason that psy- 
chology had done so poorly in its “theories” of 
humor is that man is the only animal that 
laughs. I think he had a good point here, since 
we have learned so much about aspects of human
functioning, such as digestion and reproduction,
by the experimental study of animals.
There are a number of other things that human
beings do that no infrahuman animal
does, so far as we know. Only man speculates
about nonpractical, theoretical matters; only
man worships; only man systematically goes
about seeking revenge, years later, for an injury
done to him; only man carries on discussions
about how to make decisions; and there
are some features of cultural transmission that
only man engages in, although the evidence
now indicates that numerous other species
transmit learned forms of behavior to subsequent
generations.


20. Ethical Constraints on Research


This one is so obvious as to [...]. One can
readily conceive quasi-de[...] on the IQ-heredity
controversy, or whether there are family
dynamics sufficient [...] into a manic-depressive;
[...] performed because to do so
would be immoral.

Not to be overly pessimistic, let me mention
(without proof) five noble traditions in clinical
psychology that I believe have permanent
merit and will still be with us 50 or 100 years
from now, despite the usual changes. Some of
these are currently unpopular among those
addicted to one of the contemporary fly-by-night
theories, but that does not bother me.
These five noble traditions are (a) descriptive
clinical psychiatry, (b) psychometric assessment,
(c) behavior genetics, (d) behavior
modification (I lump under this rubric positive
contingency management, aversion therapy,
and desensitization), and (e) psychodynamics.
This list should convince you that I
am not using methodological arguments to
grind any substantive ax. I am probably one
of the few psychologists [...]
would list all five of these as great,
enduring intellectual traditions. I particularly
[...]hodynamics, since I [...]
or useful if it is not either based on laboratory
experiments or statistical correlations.
There is not a single experiment reported in
[...] 23-volume set of the standard [...]
Freud nor is there a t test. [...]
clinical observation [...] any time. I am
confident [...] concepts will be around after
rubber [...] theory, transactional theory, attachment
theory, labeling theory, dissonance theory,
attribution theory, and so on, have subsided
into a state of innocuous desuetude like
risky shift and level of aspiration. At the very
[...] is an interesting theory,
[...] than I can say about some of
the [...] that are currently fashionable.

The five noble traditions differ greatly in
the [...] they use and their central concepts,
and I am hard put to say what is common
[...] them. Some of them, such as behavior
genetics, are not conceptually exciting
to those of us who are interested in ideas
like Freud’s, but they more than make up for
that by their remarkable technological power.

[...] graphs of the developmental change of items.
I suggest to you that Sir Ronald has befuddled
us, [...] us, and led us down the primrose
path. [...] reliance on merely re[...]sis
as the standard method for corroborating
substantive theories in the soft areas is a
terrible mistake, is basically unsound, poor
[...] of the worst things
that ever happened [...]ogy.

It is easiest to [...] methodological
viewpoint [...] have here a rare instance in which
Sir Karl’s posi[...] statistics an[...]
(briefly and simplistically) [...]
Popper and [...] neo-Popperians is that we
[...]es by some kin[...]
with my “predicti[...]
than in May, even showing three asterisks
(for p < .001) in my t-test table! If I pre- 
dict from my theory that it will rain on 7 of 
the 30 days of April, and it rains on exactly 
7, you might perk up your ears a bit, but still 
you would be inclined to think of this as a 
“lucky coincidence.” But suppose that I spec- 
ify which 7 days in April it will rain and ring 
the bell; then you will start getting seriously 
interested in Meehl’s meteorological conjectures.
Finally, if I tell you that on April 4th it
will rain 1.7 inches (4.3 cm), and on April
9th, 2.3 inches (5.8 cm) and so forth, and get
seven of these correct within reasonable toler- 
ance, you will begin to think that Meehl’s the- 
ory must have a lot going for it. You may 
believe that Meehl’s theory of the weather, 
like all theories, is, when taken literally, 
false, since probably all theories are false 
in the eyes of God, but you will at least 
say, to use Popper’s language, that it is be- 
ginning to look as if Meehl’s theory has con- 
siderable verisimilitude, that is, “truth-like- 
ness.” (An adequate reconstruction of the 
verisimilitude concept has yet to be provided 
by our logician friends, see, e.g., Popper, 1976, 
but few reflective psychologists will doubt that 
some such notion of “nearness to the truth” is 
unavoidable when we evaluate theories. It is 
crucial to recognize that verisimilitude is an 
ontological, not an epistemological, concept 
that must not be conflated with confirmation, 
probability, evidence, proof, corroboration, be- 
lief, support, or plausibility.) 
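
The escalating riskiness of the three weather predictions can be put in rough numbers. The following is a minimal sketch, not part of the original argument: it assumes a hypothetical "no theory" baseline in which each April day is rainy independently with probability 7/30, and it treats the April-versus-May comparison as a coin flip. Under those assumptions it computes how likely each prediction would be to succeed by luck alone.

```python
from math import comb

# Hypothetical baseline (an assumption for illustration only): without any
# theory, each of the 30 April days is rainy independently with P_RAIN.
P_RAIN = 7 / 30
DAYS = 30


def prob_exact_count(k: int, n: int = DAYS, p: float = P_RAIN) -> float:
    """Chance of exactly k rainy days out of n under the baseline."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def prob_named_days(k: int, n: int = DAYS, p: float = P_RAIN) -> float:
    """Chance that one pre-named set of k days is rainy and all others dry."""
    return p**k * (1 - p) ** (n - k)


# Directional prediction ("more rain in April than in May"): if the two
# months were exchangeable, it would succeed about half the time.
p_directional = 0.5
p_count = prob_exact_count(7)   # "it will rain on 7 of the 30 days"
p_days = prob_named_days(7)     # "it will rain on these particular 7 days"

for label, p in [
    ("direction only", p_directional),
    ("exactly 7 rainy days", p_count),
    ("which 7 days", p_days),
]:
    print(f"{label:>22}: chance of success by luck = {p:.2e}")
```

The smaller the chance of succeeding by luck, the more a success tells in the theory's favor; predicting the rainfall amounts as well would shrink that chance further still.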

Popperians would speak of low logical or 
prior probability, of the high content (for- 
bidding much), because it specifies exactly 
which days it will rain how many inches. A
Bayesian (who would reject Popper’s philos- 
ophy on the grounds that we want our “the- 
oretical prior” to be high to get a nice boost 
out of Bayes’ theorem when the facts turn 
out right) would express Popper’s point by 
saying that we want what Pap (1962, p. 160) 
calls the expectedness, the prior on the observations
that is found in the denominator of
Bayes’ theorem, to be low. An unphilosophical
chemist or astronomer or molecular biologist
would say that this was just good sensible
scientific practice, that a theory that makes
precise predictions and correctly picks out narrow
intervals or point values out of the range
of experimental possibilities is a pretty strong
theory. There are revisions (as I think, neces- 
sary) of the classic Popperian position urged 
on us by his heretical ex-students P. K. Feyerabend
and the late Imre Lakatos, but psychologists
must reach at least the stage of Bayes
and Popper before they can profitably go on 
to the refinements and criticisms of these 
gentlemen. 
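
The Bayesian restatement of Popper's point can be illustrated numerically. This is a minimal sketch under assumed values: the prior and the two likelihoods are hypothetical numbers chosen only to contrast a safe prediction (high expectedness) with a risky one (low expectedness).

```python
def posterior(prior_t: float, p_obs_given_t: float, p_obs_given_not_t: float) -> float:
    """Bayes' theorem: P(T | O) = P(O | T) * P(T) / P(O).

    The denominator P(O) is the "expectedness" of the observation.
    """
    expectedness = p_obs_given_t * prior_t + p_obs_given_not_t * (1 - prior_t)
    return p_obs_given_t * prior_t / expectedness


PRIOR = 0.10  # hypothetical prior probability of the theory

# Safe prediction: the observation is fairly likely even if the theory is false.
weak = posterior(PRIOR, p_obs_given_t=0.9, p_obs_given_not_t=0.5)

# Risky point prediction: the observation is very unlikely unless the theory is true.
risky = posterior(PRIOR, p_obs_given_t=0.9, p_obs_given_not_t=0.001)

print(f"posterior after a safe prediction succeeds:  {weak:.3f}")
print(f"posterior after a risky prediction succeeds: {risky:.3f}")
```

With the same prior, the successful risky prediction lifts the posterior far more than the successful safe one, because the expectedness in the denominator is so much smaller.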
The most important caveat I would adjoin
to Sir Karl’s falsifiability requirement arises
from the considerations pressed by Feyerabend
(1962, 1965, 1970, 1971), Lakatos
(1970, 1974a, 1974b), and others concerning
the crucial role of auxiliary theories in subjecting
the main substantive theory of interest
to danger of modus tollens. As is well-known
(and not disputed by Popper), when we spell
out in detail the logical structure of what purports
to be an observational test of a theoretical
conjecture T, we normally find that we
cannot get to an observational statement from
T alone. We require further a set of often
complex and problematic auxiliaries A, plus
the empirical realization of certain conditions
describing the experimental particulars, commonly
labeled collectively as C. So that the
derivation of an observation from a substantive
theory T amounts always to the longer
formula (T.A.C) → O, rather than the simplified
schema (T → O) that most of us learned
in undergraduate logic courses. This presents
a problem not perhaps for Popper’s main
thesis (although some critics do say this) but
for its application as a criterion of the scientific
status of theories (or the scientific approach
of a particular theoretician or investigator).
The modus tollens now reads: Since
(T.A.C) → O, and we have falsified O
observationally, we have the consequence
~(T.A.C). Unfortunately, this result does
not entail the falsity of T, the substantive
theory of interest, but only the falsity of the
conjunction (T.A.C); that is, we have proved a
disjunction of the falsities of the conjuncts. So
the failure to get the expected observation O
proves that ~T ∨ ~A ∨ ~C, which is not
quite what we would like to show.
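
The logical point can be checked mechanically. The sketch below is only an illustration (the variable names are ours): it enumerates every truth assignment compatible with the two premises, (T.A.C) → O and not-O, and asks what those premises actually force.

```python
from itertools import product


def implies(p: bool, q: bool) -> bool:
    """Material implication: p -> q."""
    return (not p) or q


# Keep every assignment of (T, A, C) that is consistent with the premises
# "(T and A and C) -> O" and "O was observed to be false".
survivors = [
    (t, a, c)
    for t, a, c, o in product([True, False], repeat=4)
    if implies(t and a and c, o) and not o
]

# Is the conjunction false in every surviving case? Is T itself?
conjunction_refuted = all(not (t and a and c) for t, a, c in survivors)
theory_refuted = all(not t for t, a, c in survivors)

print("conjunction (T.A.C) refuted:", conjunction_refuted)
print("substantive theory T refuted:", theory_refuted)
print("cases in which T is still true:", [s for s in survivors if s[0]])
```

The premises rule out only the single case in which T, A, and C are all true; T survives in every case where an auxiliary or a condition failed.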
One need not subscribe to the famous Duhemian
thesis regarding falsification of science
as a whole (Grünbaum, 1960, 1962, [...]
1976) or to the Lakatosian exposition
(Lakatos, 1970, 1974a, 1974b) about the protective
belt of auxiliaries against which the
modus tollens is directed versus the hard core
of the theory against which the modus tollens
is, prior to a Kuhnian revolution (Kuhn,
1970a, 1970b, 1970c), forbidden to be directed,
to see that there is a difficult problem
presented to even a neo-Popperian (like myself),
because in social science the auxiliaries
A and the initial and boundary conditions of
the system C are frequently as problematic as
the theory T itself. Example: Suppose that a
personologist or social psychologist wants to
investigate the effect of social fear on visual
[...] attempts to mobilize anxiety in
a sample of adolescent males, chosen by their
[...] Social Introversion (Si) scale of
the Minnesota Multiphasic Personality Inventory
(MMPI), by employing a research
assistant who is a raving beauty, instructing
her to wear Chanel No. 5, and adopt a mixed
seductive and castrative manner toward the
subjects. An interpretation of a negative empirical
result leaves us wondering whether the
main substantive theory of interest concerning
social fear and visual perception has been
falsified, or whether only the auxiliary theories
that the Si scale is valid for social introversion
and that attractive but hostile female experimenters
elicit social fear in introverted young
males have been falsified. Or perhaps even the
particular conditions were not met; that is,
she did not consistently act the way she was
instructed to or the MMPI protocols were
misscored.

There is nothing qualitatively unique about
this problem for the inexact sciences, but it is
quantitatively more severe for us than for the
chemist or astronomer, for at least two reasons,
which I shall set forth without either
[...] or developing them here. First, independent
testing of the auxiliary theories
(which often means validation of psychometric
instruments or ascertaining efficacy of
[...] inputs) is harder to carry out.
Due [...] unavoidable looseness of the nomological
[...] (Cronbach & Meehl, 1955/1973)
[...] the factors in the list of 20 difficulties
[...] the range of research circumstances
[...] auxiliaries A are problematic
[...] than in the exact sciences or in some
but not all of the biological sciences. Second,
a point to which philosophers of science have
devoted little attention, in physics or chem- 
istry there is usually a more intimate connec- 
tion, sometimes one of contributing to de- 
rivability, between the substantive theory of 
interest T and components of the auxiliaries 
A. This is sometimes even true in advanced 
branches of biology. Example: There is a com- 
plicated, well-developed, and highly corrobo- 
rated theory of how a cyclotron works, and 
the subject matter of that auxiliary “theory 
of the instrument” is for the most part identi- 
cal to the subject matter of the physical the- 
ories concerning nuclear particles, and so on, 
being investigated by the physicist. Devices 
for bringing about a state of affairs, for isolating
the system under study, and for observing
[...] result are all themselves [...]
in which auxiliary
theories used by [...] and biological scientists
are at least subtly informed by what
may be loosely called the [...] the leading
ideas, the core [...] main substantive theory
[...] system [...] ad hoc. [...] be analyzed and [...]
but most scientists and historians o[...]
are, however informally, well aw[...]
influence. (See, e.g., Holton, 1973.)

In the social sciences, no such intimate connection,
and almost never a relation of theoretical
derivability, exists; hence, the auxiliary
theory (such as a theory that the
Rorschach is valid for detecting subclinical
schizoid cognitive slippage or that Chanel-[...]
research assistants are anx[...])
must stand on its own feet. Al[...]iary.
The situation in which A is merely conjoined
to T in setting up our [...]
it hard for us social scien[...]
Popperian falsifiability requirement, to state
before the fact what would count as a strong
falsifier.

I shall illustrate this problem further with
a simple example whose adequate exposition
will appear elsewhere (Golden & Meehl, in 
press). Suppose that I wish to test my domi- 
nant gene conjecture (Golden & Meehl, 1978; 
Meehl, 1972, 1972/1973g, 1977) concerning 
schizotaxia as the central nervous system condition
for the development by social learning
of schizotypy (Meehl, 1962/1973c), which in 
turn is the personality precondition for the 
development of a clinical schizophrenia—al- 
though the latter must then occur only in one 
fourth of the persons carrying the gene, given 
the roughly 12% concordance for first-degree 
relatives as regards diagnosable clinical schiz- 
ophrenia. (See also Böök, 1960; Heston, 1966, 
1970; Slater, 1958/1971). I might rely on 
some complex neurological or projective or 
structured test “sign” as having such-and-such 
estimated construct validity for the schizo- 
typal personality makeup. Such a quantitative 
estimate might be made relying on a combina- 
tion of empirical evidence concerning dis- 
cordant monozygotic twins of known schizo- 
phrenics, protocols of persons tested as col- 
lege freshmen who subsequently decompensate 
into a recognizable schizophrenia, and the like.
Such numerical estimates will all suffer not
only from the usual test unreliability and random
sampling fluctuations, but they will also
have some unknown degree of systematic bias.
For instance, it clearly will not do to assume
that the taxon of all compensated schizotypes
would average the same scores on a Rorschach 
or MMPI indicator variable as do the com- 
pensated (discordant) monozygotic twins, the 
latter being a biased selection, since they have 
the same potentiating genes that their decom- 
pensated twins have. However, there must be 
something else about them—of an environ- 
mental sort—that works strongly in their 
favor and helps keep them discordant, that is, 
clinically well. One simply has no way of as- 
certaining the net impact of these two op- 
posed kinds of forces on the psychometric 
results.

Suppose that we take some combination of
earlier findings on preschizophrenics, remitted
schizophrenics, compensated discordant monozygotic
twins of schizophrenics, and so forth,
and we ascertain that while the valid positive
rate p_s among these safely presumed schizotypes
[...] sample sizes are huge,
it will [...] amount unexplainable
by random sampling fluctuation), it nevertheless
shows a “reasonably close” agreement.
(Again, we think like physicists or physiologists
instead of like social scientists fooling
around with t tests.) So we strike some kind
of rough average p_s of these several valid positive
rates, knowing that it is the best we can
do at this point with data on different groups
of schizotypes, who, despite their differences,
must all have somehow been tagged as such.
Given that estimated valid positive rate, and
given a false positive rate p_n (also systematically
biased because of the undiagnosed compensated
schizotypes in any “control population”),
we record our numerical predictions
for the incidence of our psychometric sign
among parent pairs of schizophrenic probands
(where, on the dominant gene theory, we expect
not only a 50% schizotypy incidence but
something stronger; to wit, at least one member
of each parent pair must be a schizotype).
We also compute it for siblings and dizygotic
twins and (although here things get a bit
feeble) with sufficiently large samples, maybe
second-degree relatives. Thus, for instance, the
expected sign-positive rate among parents
(and sibs, if they all cooperate) is given by
the simple expression p* = ½p_s + ½p_n.
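
The point prediction just described is a two-component mixture, and it can be sketched in a few lines. The rates used below are hypothetical placeholders, not estimates from any study; only the form of the expression comes from the text.

```python
def expected_sign_rate(p_s: float, p_n: float, schizotype_fraction: float = 0.5) -> float:
    """Expected sign-positive rate in a group of relatives.

    p_s: valid positive rate of the sign among schizotypes
    p_n: false positive rate of the sign among non-schizotypes
    schizotype_fraction: share of schizotypes in the group; the dominant gene
        conjecture puts this at one half for parents and siblings.
    """
    return schizotype_fraction * p_s + (1 - schizotype_fraction) * p_n


# Hypothetical illustrative rates (assumptions, not data).
P_S = 0.80
P_N = 0.10

p_star = expected_sign_rate(P_S, P_N)
print(f"predicted sign-positive rate among parents and sibs: {p_star:.2f}")
```

Because the prediction is a single number rather than a mere direction of difference, an observed rate far from it counts against the conjunction of theory, auxiliaries, and conditions.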
Now the substantive dominant gene theory
T, when conjoined with the auxiliary theory A
concerning psychometric validity, and assuming
that we have identified the right relatives
and the probands were all schizophrenics
[=C], generates point predictions and therefore
takes a high Popperian risk when the
conjunction (T.A.C) is considered as the “theory”
under test. Hence, the verification of
those numerical point predictions as to the
values of the psychometric incidence in relatives
of different degrees of consanguinity provides
a strong Popperian test for that conjunctive
“theory.” One would then normally say
that successful negotiation of this hurdle, the
failure to be clobbered modus tollens by the
outcome of the empirical study, provides
moderate to strong corroboration of the conjunctive
theory. Hence, (T.A.C) is doing
well; that is, it has escaped falsification,
despite taking a high risk by making several
numerical point predictions.

So far, so good, and Popper as well as his
critics would have no complaint. However, the
classical Popperian requirement on playing
the scientific game fairly involves the theoretician’s
saying, before doing the research, what
would count as a strong basis for rejecting the
theory. If “the theory” is taken to be the substantive
theory T (which it is, if one is not
being philosophically disingenuous) rather
than the psychometric auxiliary and diagnostic
validity conjectures A and C, then one will be
committing what amounts in spirit to a Popperian
sin against falsificationism as a method.
If the empirical research does not pan out as
predicted, one does not abandon T; instead
one tells us that either T is incorrect, A is
incorrect, or the diagnoses were untrustworthy!
I am not persuaded from his writings nor 
from conversations that I have had with him 
that Sir Karl adequately appreciates the de- 
gree to which this theory and auxiliary prob- 
lem permeate research in the inexact sciences, 
especially the social sciences in their soft 
areas. Whether it presents a general problem 
for the Popperian formulation of scientific 
method is beyond the scope of this article and 
my competence. It is perhaps worth saying, 
however, for the benefit of philosophically oriented
readers, that the above described situation
(certainly no rarity in our field or in
biology) may represent a social fact about the
way science works that presents grave difficulties
for the Popperian reconstruction. That
is, the stipulation beforehand that one will be
pleased about substantive theory T when the
numerical results come out as forecast, but
will not necessarily abandon it when they do
not, seems on the face of it to be about as
[...] a violation of the Popperian commandment
as you could commit. For the investigator,
in a way, is doing what Popper
[...] ought not to do, and what astrologers
[...] and psychoanalysts allegedly do,
[...] “heads I win, tails you lose.” But it
[...] accordance with much scientific practice
[...] as far as I have sampled, with most
[...] common sense or intuitions,
[...] if the combination (T.A.C) generates
[...] high-risk numerical point prediction,
[...] really does support all three of
[...] The reason it does so seems
pretty [...] despite its commonsense,
nonformalized character: Because of the lack of
inner connection in the inexact sciences
between the components of these conjunctions,
it would strike us as a very strange
coincidence if the substantive theory T should 
have low verisimilitude (which would, were T 
true, also generate mispredictions of the nu- 
merical point values) and yet the two (largely 
unrelated) “wrongs” of T and A are somehow 
systematically balanced so as to generate the 
same numerical prediction generated from the 
conjecture that T and A both have relatively 
high verisimilitude. 

Such a delicate quantitative counterbalanc- 
ing of theoretical errors is not impossible, but 
it seems quite implausible, assuming that na- 
ture is (as Einstein says) “subtle but not 
malicious.” So I think we are not being unreasonable
to congratulate ourselves on arriving
at a successful prediction of high-risk
point values or other antecedently improbable
observational patterns from the conjunction
(T.A.C), despite the fact that we seem to be
[...]tion by [...]ence, especially i[...]
of science, and [...]
question of whether there [...]ences between
[...]ences, or even [...]cial sciences, [...]
Popperian methodology should [...]
and applied.

But, you may [...] significance testing?
Isn’t the social scientist’s [...] of the null
hypothesis simply the application of Popperi[...]
danger. The kin[...] theoretical [...]
we use significance [...]hod are not like testing
[...] well it fore[...]
to testing the theory by seeing whether it
rains in April at all, or rains several days in 
April, or rains in April more than in May. It 
happens mainly because, as I believe is gen- 
erally recognized by statisticians today and by 
thoughtful social scientists, the null hypothe- 
sis, taken literally, is always false. I shall not 
attempt to document this here, because among 
sophisticated persons it is taken for granted. 
(See Morrison & Henkel, 1970, especially the 
chapters by Bakan, Hogben, Lykken, Meehl, 
and Rozeboom.) A little reflection shows us 
why it has to be the case, since an output vari- 
able such as adult IQ, or academic achieve- 
ment, or effectiveness at communication, or 
whatever, will always, in the social sciences, 
be a function of a sizable but finite number of 
factors. (The smallest contributions may be 
considered as essentially a random variance 
term.) In order for two groups (males and fe- 
males, or whites and blacks, or manic depres- 
sives and schizophrenics, or Republicans and 
Democrats) to be exactly equal on such an 
output variable, we have to imagine that they 
are exactly equal or delicately counterbal- 
anced on all of the contributors in the causal 
equation, which will never be the case. 
Following the general line of reasoning 
(presented by myself and several others over 
the last decade), from the fact that the null 
hypothesis is always false in soft psychology,
it follows that the probability of refuting it 
depends wholly on the sensitivity of the ex- 
periment—its logical design, the net (attenu- 
ated) construct validity of the measures, and, 
most importantly, the sample size, which de- 
termines where we are on the statistical power 
function. Putting it crudely, if you have 
enough cases and your measures are not to- 
tally unreliable, the null hypothesis will al- 
ways be falsified, regardless of the truth of 
the substantive theory. Of course, it could be
falsified in the wrong direction, which means
that as the power [...]
[...] like the situation desired from either a Bayesian,
a Popperian, or a commonsense scientific
standpoint. As I have pointed out elsewhere 
(Meehl, 1967/1970b; but see criticism by 
Oakes, 1975; Keuth, 1973; and rebuttal by 
Swoyer & Monson, 1975), an improvement in 
instrumentation or other sources of experi- 
mental accuracy tends, in physics or astron- 
omy or chemistry or genetics, to subject the 
theory to a greater risk of refutation modus
tollens, whereas improved precision in null
hypothesis testing usually decreases this risk. 
A successful significance test of a substantive 
theory in soft psychology provides a feeble 
corroboration of the theory because the pro- 
cedure has subjected the theory to a feeble 
risk. 
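
The dependence of "refuting the null" on sample size alone can be made concrete. The sketch below is an illustration under stated assumptions, not a method from the text: it uses the normal approximation for a two-sided, two-sample test with equal group sizes and known unit variance, and a hypothetical, substantively trivial true difference of 0.05 standard deviations.

```python
from math import erf, sqrt

Z_CRIT = 1.959964  # two-sided critical value for alpha = .05


def normal_cdf(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def power_two_sample(d: float, n_per_group: int) -> float:
    """Power of a two-sided, two-sample z test for standardized difference d."""
    shift = d * sqrt(n_per_group / 2.0)
    return (1.0 - normal_cdf(Z_CRIT - shift)) + normal_cdf(-Z_CRIT - shift)


TINY_EFFECT = 0.05  # hypothetical true difference, in standard deviation units

sample_sizes = (50, 500, 5_000, 50_000)
powers = [power_two_sample(TINY_EFFECT, n) for n in sample_sizes]

for n, pw in zip(sample_sizes, powers):
    print(f"n per group = {n:>6}: probability of rejecting the null = {pw:.3f}")
```

With enough cases the rejection becomes a near certainty even though the difference is negligible, which is why a significant result, taken by itself, puts the substantive theory at so little risk.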

But, you may say, we do not look at just 
one; we look at a batch of them. Yes, we do; 
and how do we usually do it? In the typical 
Psychological Bulletin article reviewing re- 
search on some theory, we see a table showing 
with asterisks (hence, my title) whether this 
or that experimenter found a difference in the 
expected direction at the .05 (one asterisk), 
.01 (two asterisks!), or .001 (three asterisks!!)
levels of significance. Typically, of
course, some of them come out favorable and 
some of them come out unfavorable. What 
does the reviewer usually do? He goes through 
what is from the standpoint of the logician an 
almost meaningless exercise; to wit, he 
counts noses. If, say, Fisbee’s theory of the 
mind has a batting average of 7:3 on 10 sig- 
nificance tests in the table, he concludes that 
Fisbee’s theory seems to be rather well supported,
“although further research is needed
to explain the discrepancies.” This is scien- 
tifically a preposterous way to reason. It com- 
pletely neglects the crucial asymmetry be- 
tween confirmation, which involves an infer- 
ence in the formally invalid third figure of 
the implicative syllogism (this is why induc- 
tive inferences are ampliative and dangerous 
and why we can be objectively wrong even
though we proceed correctly), and refutation,
which is in the valid fourth figure, and which
gives the modus tollens its privileged position
in inductive inference. Thus the adverse t
tests, seen properly, do Fisbee’s theory far
more damage than the favorable ones do it good.

I am not making some nit-picking statistician’s
correction. I am saying that the whole
business is so radically defective as to be sci- 
entifically almost pointless. This is not a tech- 
nical hassle about whether Fisbee should have 
used the varimax rotation, or how he esti- 
mated the communalities, or that perhaps 
some of the higher order interactions that 
are marginally significant should have been
lumped together as a part of the error term,
or that the covariance matrices were not 
quite homogeneous. I am not a statistician, 
and I am not making a statistical complaint. 
I am making a philosophical complaint or, if 
you prefer, a complaint in the domain of sci- 
entific method. I suggest that when a reviewer 
tries to “make theoretical sense” out of such 
a table of favorable and adverse significance 
test results, what the reviewer is actually engaged
in, willy-nilly or unwittingly, is meaningless
substantive constructions on the properties
of the statistical power function, and
almost nothing else. 

This feckless activity is made worse by the 
almost universal practice of what I call step- 
wise low validation. By this I mean that we 
rely on one investigation to “validate” a particular
instrument and some other study to
validate another instrument, and then we correlate
the two instruments and claim to have
validated the substantive theory. I do not
argue that this is a scientific nothing, but it is
about as close to a nothing as you can get
without intending to. Consider that I first
show that Meehl’s Mental Measure has a validity
coefficient (against the criterion I shall
here for simplicity take to be quasi-infallible
or definitive) of, say, .40, somewhat higher
than we usually get in personology and social
psychology! Then I show that Glotz’s Global
Gauge has a validity for its alleged variable
of the same amount. Relying on these results,
having stated the coefficient and gleefully recorded
the asterisks showing that these coefficients
are not zero (!), I now try to corroborate
the Glotz-Meehl theory of personality by
showing that the two instruments, each having
been duly “validated,” correlate .40, providing,
happily, some more asterisks in the table.
Now just what kind of a business is this? Let
us suppose that each instrument has a reliability
of .90 to make it easy. That means that
the portion of construct-valid variance for
each of the devices is around one fifth of the 
reliable variance and the same for their over- 
lap when correlated with each other. I do not 
want to push the discredited (although re- 
cently revived) principle of indifference, but 
without other knowledge, it is easily possible, 
and one could perhaps say rather likely, that 
the correlation between the two occurs in a 
region of each one’s components that has lit- 
erally nothing to do with either of the two 
criterion variables used in the validity studies 
relied on. This is, of course, especially dan- 
gerous in light of the research that we have 
on the contribution of methods variance. 
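
The arithmetic behind the "one fifth" figure, and the worry that two "validated" instruments can correlate for reasons unrelated to either criterion, can be sketched numerically. In the minimal Python sketch below, the variance components SHARED_METHOD and SPECIFIC are hypothetical values chosen only so that each instrument has unit variance; they come from no study, and the function names are illustrative.

```python
import math

def valid_share_of_reliable_variance(validity, reliability):
    """Fraction of an instrument's reliable variance that is criterion-valid."""
    return validity ** 2 / reliability

# Validity .40 and reliability .90, as in the text.
share = valid_share_of_reliable_variance(0.40, 0.90)
print(f"valid share of reliable variance: {share:.3f}")

# Hypothetical variance components for each instrument (they sum to 1.0).
CRITERION = 0.16      # validity .40 squared
SHARED_METHOD = 0.40  # assumed nuisance variance common to both instruments
SPECIFIC = 0.34       # assumed reliable variance unique to each instrument
ERROR = 0.10          # unreliable variance, 1 - .90

def implied_correlations():
    """Return (validity, inter-instrument correlation) implied by the components.

    The two criteria are assumed to be uncorrelated, so the instruments
    covary only through the shared method component.
    """
    total = CRITERION + SHARED_METHOD + SPECIFIC + ERROR
    validity = math.sqrt(CRITERION / total)
    inter_instrument = SHARED_METHOD / total
    return validity, inter_instrument

validity, inter_instrument = implied_correlations()
print(f"validity of each instrument: {validity:.2f}")
print(f"correlation between instruments: {inter_instrument:.2f}")
```

Under these assumed components each instrument correlates .40 with its own criterion and .40 with the other instrument, although the two criteria are unrelated to each other.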

I seem to have trouble conveying to my stu- 
dents and colleagues just how dreadful a mess 
of flabby inferences this kind of thing in- 
volves. It is as if we were interested in the 
effect of sunlight on the mating behavior of 
birds, but not being able to get directly at 
either of these two things, we settle for cor- 
relating a proxy variable like field-mice den- 
sity (because the birds tend to destroy the 
field mice) with, say, incidence of human skin 
cancer (since you can get that by spending too 
much time in the sun!). You may think this
analogy dreadfully unfair; but I think it is a 
good one. Of course, the whole idea of simply 
counting noses is wrong, because a theory that 
has seven facts for it and three facts against 
it is not in good shape, and it would not be 
considered so in any developed science. 

You may say, “But, Meehl, R. A. Fisher 
was a genius, and we all know how valuable 
his stuff has been in agronomy. Why shouldn’t 
it work for soft psychology?” Well, I am not 
intimidated by Fisher’s genius, because my 
complaint is not in the field of mathematical 
statistics; and as regards inductive logic and 
philosophy of science, it is well-known that Sir 
Ronald permitted himself a great deal of dog- 
matism. I remember my amazement when the 
late Rudolf Carnap said to me, the first time 
I met him, “But, of course, on this subject 
Fisher is just mistaken; surely you must know 
that.” My statistician friends tell me that it
is not clear just how useful the significance
test has been in biological science either, but 
I set that aside as beyond my competence to 
discuss. The shortest answer to this rebuttal 
about agronomy, and one that has general importance
in thinking about soft psychology, is
that we must carefully distinguish substantive 
theory from statistical hypothesis. There is a 
tendency in the social sciences to conflate these 
in talking about our inferences. (A neglected 
article by Bolles, 1962, did not cure the psy- 
chologists’ disease.) The substantive theory is 
the theory about the causal structure of the 
world, the entities and processes underlying 
the phenomena; the statistical hypothesis is a 
much more restricted and “operational” con- 
jecture about the value of some parameter, 
such as the mean of a specified statistical pop- 
ulation. The main point in agronomy is that 
the logical distance, the difference in meaning 
or content, so to say, between the alternative 
hypothesis and substantive theory T is so 
small that only a logician would be concerned 
to distinguish them. Example: I want to find 
out whether I should be putting potash on the 
ground to help me raise more corn. Now every- 
body knows from common sense as well as 
biology that the corn gets its nutrients from 
the soil, and furthermore that the yield of 
corn at harvest time is not causally efficacious 
in determining what I did in the spring, ran- 
dom numbers aside. If I refute the statistical 
null hypothesis that plots of corn with potash 
do not differ in yield from plots without pot- 
ash, I have thereby proved the alternative hy- 
pothesis—that there is a difference between 
these two sorts of plots; and the only substan- 
tive conclusion to draw, given such a differ- 
ence, is that the potash made the difference. 
Such a situation, in which the content of the 
substantive theory is logically quasi-identical 
with the alternative hypothesis, which was re- 
futed by our significance test, is completely 
different from the situation in soft psychology. 
Fisbee’s substantive theory of the mind is not 
equivalent, or anywhere near equivalent, to 
the alternative hypothesis. All sorts of com- 
peting theories are around, including my 
grandmother’s common sense, to explain the 
nonnull statistical difference. So the psychologist
can take little reassurance about the use
of significance tests from knowing that Fisher’s 
approach has been useful in studying the effect
of fertilizer on crop yields.

Although this presents a pretty depressing
picture, I daresay that the Skinner disciples 
among you will be inclined to think,

Well, that’s just one more way of showing what we
have known all along. The point is to prove that you 
have achieved experimental control over your sub- 
ject matter, as Skinner says. If you have, I am not 
much interested in tabular asterisks; if you haven’t, 
I'm not interested in them either. 


But that is easy for Skinnerians because their 
theory (it is a theory in Sir Karl Popper’s 
sense) is close to a pure dispositional theory
and does not usually present us with the kind 
of evidentiary evaluation problem that we get 
with entity-postulating theories such as those 
of Freud, Hull, Albert Ellis, or, to come closer 
to home, my conjectures about schizophrenia 
or hedonic deficit (Meehl, 1972, 1974, 1975, 
1962/1973c, 1972/1973g). Those of us whose 
cognitive passions are incompletely satisfied 
by dispositional theories, whether Skinnerian 
or psychometric, should ask ourselves what
kind of inferred entity construction we want 
and how it could generate the sorts of intel- 
lectual “surprises” that Robert Nozick (1974, 
pp. 18-22) considers typical of invisible hand 
theories, which have proved so eminently suc- 
cessful in the physical and biological sciences 
and—somewhat less so—in economics. Some 
directions of solution (before I go on to the 
one that I am using in my own research) fol- 
low. 

We could take the complex form of Bayes’s 
theorem more seriously in concrete application 
to various substantive theories to take into 
account, even if crudely in the sense of set- 
ting upper and lower bounds to the probabil- 
ities involved, the logical asymmetry between 
confirmation and refutation (see, e.g., Maxwell,
1974). Second, it may be that the Fisherian
tradition, with its soothing illusion of
quantitative rigor, has inhibited our search 
for stronger tests, so we have thrown in the 
sponge and abandoned hope of concocting sub- 
stantive theories that will generate stronger 
consequences than merely “the Xs differ from 
the Ys.” Thus, for instance, even when we 
cannot generate numerical point predictions
(the ideal case found in the exact sciences), it 
may be that we can at least predict the order 
of numerical values or the rank order of the 
first-order numerical differences, and the like. 

Sometimes in the other sciences it has been 
possible to concoct a middling weak theory
that, while incapable of generating numerical
point values, entails a certain function form,
such as that a graph should be an ogive or that it
should have three peaks and that these peaks
should be increasingly high, and that the distance
on the abscissa between the first two
peaks should be less than the distance between
the second two. In the early history of quantum
theory, physicists relied on Wien’s law,
which related “some (unknown) function” of
wavelength to energy multiplied by the fifth
power of wavelength. In the cavity radiation
experiment, the empirical points were simply
plotted at varying temperatures, and it was
evident by inspection that they fell on the
same curve, even though a formal expression
for that curve was beyond the theory’s capabilities
(Eisberg, 1961, pp. 50-51).
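
The observation that points measured at different temperatures fall on one curve can be illustrated numerically. The sketch below is an illustration only: it uses Planck's later radiation formula, which was not available to the theory being described, to generate spectral energy densities. It then checks that energy density multiplied by the fifth power of wavelength depends only on the product of wavelength and temperature. The function names and the chosen wavelengths and temperatures are hypothetical.

```python
import math

H = 6.62607015e-34   # Planck constant, J s
C = 2.99792458e8     # speed of light, m/s
K = 1.380649e-23     # Boltzmann constant, J/K

def planck_energy_density(wavelength, temperature):
    """Spectral energy density of cavity radiation per unit wavelength."""
    x = H * C / (wavelength * K * temperature)
    return (8.0 * math.pi * H * C / wavelength ** 5) / math.expm1(x)

def scaled_density(wavelength, temperature):
    """Energy density multiplied by the fifth power of wavelength."""
    return planck_energy_density(wavelength, temperature) * wavelength ** 5

# Two different temperatures, with wavelengths chosen so that
# wavelength * temperature is the same (3e-3 m K) in both cases.
a = scaled_density(1.0e-6, 3000.0)
b = scaled_density(2.0e-6, 1500.0)
print(math.isclose(a, b, rel_tol=1e-9))
```

Plotting the scaled density against wavelength times temperature for several temperatures would therefore trace a single curve, which is the kind of function-form agreement the passage describes.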

Talking of Wien’s law is a good time for me
to recommend to psychologists who disagree
with my position to have a look at any textbook
of theoretical chemistry or physics, where
one searches in vain for a statistical significance
test (and finds few confidence intervals).
The power of the physicist does not
come from exact assessment of probabilities
that a difference exists (which physicists
would view as a ludicrous thing to show), nor
from the verbal precision of so-called “operational
definitions” in the embedding text. The
physicist’s scientific power comes from two
other sources, namely, the immense deductive
fertility of the formalism and the accuracy of
the measuring instruments. The scientific trick
lies in conjoining rich mathematics and experimental
precision, a sort of “invisible hand
wielding fine calipers.” The embedding text is
sometimes surprisingly loose, free-wheeling,
even metaphorical, as viewers of television’s
Nova are aware, seeing Nobel laureates discourse
whimsically about the charm, strangeness,
and gluons of nuclear particles (see, e.g.,
Nambu, 1976). One gets the impression that
when you have a good science going, with potent
mathematics and accurate instruments,
you can be relaxed and easygoing about the
words. Nothing is as stuffy and pretentious as
the verbal “pseudorigor” of the soft branches
of social science. In my modern physics text, I
am unable to find one single test of statistical
significance. What happens instead is that the
physicist has a sufficiently powerful invisible
hand theory that enables him to generate an
expected curve for his experimental results. He
plots the observed points, looks at the agree- 
ment, and comments that “the results are in 
reasonably good accord with theory.” Moral: 
It is always more valuable to show approxi- 
mate agreement of observations with a the- 
oretically predicted numerical point value, 
rank order, or function form, than it is to 
compute a “precise probability” that some- 
thing merely differs from something else. Of 
course, we do not have precise probabilities 
when we do significance testing because of the 
falsity of the assumptions generating the 
table’s values and varying robustness of our 
tests under departures from these assumptions.

The only possible “solution” to the theory-refutation
problem that I have time to discuss
in any detail is what I call consistency
tests (Meehl, Note 3). Unfortunately, this approach
is not easily available for most theoretical
problems in soft psychology, although
I am not prepared to say that it is limited
to the domain in which I have been using
it, namely, taxometrics, that is, the application
of psychometric procedures to detection
of a taxonic situation and classification
of individuals into the taxon or outside of it. 
From our conjectures about the latent causal 
situation, we derive formulas for estimating 
the theoretical quantities of interest, such as 
the proportion of schizotypes in a given clini- 
cal population, the mean values of the schizo- 
typal and nonschizotypal classes, the optimal 
cut (“hitmax”) on each phenotypic indica- 
tor variable for classifying individuals, and 
the proportion of valid and false positives 
achieved by that cut. But we realize that our 
conjectures about the latent situation may be 
false or that the indicators relied on may have 
too low validity, or that they may be more
correlated within the taxa than desired, and 
so forth. Second, even if the basic formal 
structure postulated is approximated by the 
state of nature (e.g., there is a schizoid taxon,
the indicators have sizable validity, the intra- 
taxon distributions are quasi-normal or at 
least unimodal, the correlation of the indica- 
tors within the groups is small, and the de- 
partures from these various hypotheses are 
within the tolerance allowed by the method’s 
robustness), it may still be that we have suffered
some kind of systematic bias on one of
the indicators due to a nuisance variable such
as social class, or that we have had bad luck in 
the sample, so the method’s numerical deliver- 
ances on this occasion are untrustworthy. 

Whether the abstract causal structure postu- 
lated is unsound or the numerical values found 
in this sample are seriously in error, we need 
some method of checking the data internally 
to find out whether these unfortunate possibil- 
ities have materialized. We do this by deriving 
theorems within the formalism specifying how 
various numerical values (observed or cal- 
culated from the observed) should be related 
to each other, so that when they are not re- 
lated as the consistency theorem demands, we 
are alerted to the danger that something is 
rotten in the state of Denmark (see Meehl, 
1973d). Unfortunately, most of the work, 
both mathematical and empirical, is as yet 
only available in mimeographed reports from 
our laboratory (Golden, 1976; Golden & 
Meehl, Note 1, Note 2; Meehl, Note 3, Note 
4). What survives scrutiny will be found in a
book in preparation with my former student 
and research colleague Robert Golden (Golden 
& Meehl, in press). 

One taxometric procedure, which I have
christened maxcov-hitmax (Meehl, 1973d), relies
on the following theorem: If three fallible
indicator variables [...] within a diagnostic
taxon and within the extra taxon population
[...] pair of these is [...] taxon, the frequency
distributions of the fallible indicators, the
location of all three hitmax cuts, and the
inverse probability of taxon membership (via
Bayes’s theorem) for a patient who [...]
combination of [...] namely, biological sex
diagnosed by three MMPI femininity keys, have
been most encouraging and suggest that the
method is powerful and quite robust under
departures from the simplifying hypotheses. But
applying it to a situation in which we do not know
the true answer (such as “What is the propor- 
tion of unrecognized schizotypes in a mixed 
psychiatric population?”), how much faith 
should we have in our numerical results? The 
best way I know to go about this, since mere 
replication of the inferred parameter estimates
does not answer the question, is by the use
of consistency tests. For example, one of the
consistency tests in this kind of two-category 
taxonic situation is this: If we form the prod- 
uct of the differences between the inferred la- 
tent means on y and z (schizotypes minus non- 
schizotypes) and then multiply this product 
Δy Δz by the product of the inferred schizotypal
base-rate P and its complement Q, then
it can be shown that this theoretically cal- 
culated quantity should equal the grand co- 
variance of y and z computed directly from 
the observations. We call this the “total covariance
consistency test.”
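
The identity behind the total covariance consistency test can be checked on simulated data. The following minimal Python sketch assumes a two-class mixture in which the indicators y and z are independent within each class. The base rate, latent means, standard deviation, sample size, and seed are hypothetical values chosen for illustration. Class membership is known here, whereas in practice the latent quantities would be inferred by the taxometric method.

```python
import random

def mean(values):
    return sum(values) / len(values)

def covariance(xs, ys):
    """Covariance with divisor n (population form)."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

def simulate(n, base_rate, taxon_mean, other_mean, sd, seed):
    """Draw n cases; y and z are independent within each class."""
    rng = random.Random(seed)
    cases = []
    for _ in range(n):
        in_taxon = rng.random() < base_rate
        mu = taxon_mean if in_taxon else other_mean
        cases.append((in_taxon, rng.gauss(mu, sd), rng.gauss(mu, sd)))
    return cases

cases = simulate(n=20000, base_rate=0.5, taxon_mean=12.0,
                 other_mean=8.0, sd=2.0, seed=1)

taxon = [(y, z) for t, y, z in cases if t]
other = [(y, z) for t, y, z in cases if not t]

p = len(taxon) / len(cases)          # base rate P
q = 1.0 - p                          # complement Q
delta_y = mean([y for y, _ in taxon]) - mean([y for y, _ in other])
delta_z = mean([z for _, z in taxon]) - mean([z for _, z in other])

predicted = p * q * delta_y * delta_z
observed = covariance([y for _, y, _ in cases], [z for _, _, z in cases])
discrepancy = observed - predicted
print(f"predicted {predicted:.3f}, observed {observed:.3f}")
```

In this sketch the discrepancy equals the weighted within-class covariance of y and z, so a large discrepancy would signal that the assumption of negligible within-class correlation has failed.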

Of course, such a relation is not required to 
be literally true, because it is known in ad- 
vance that (a) the impoverished theory has 
imperfect verisimilitude and (b) all statisti- 
cal estimates are subject to both systematic 
and random error. (We are not going to do a
significance test!) What we have is a problem
of robustness and detection of excessive de- 
partures from the postulated latent conditions. 
Golden and I arbitrarily said that we would 
consider a particular sample as delivering suffi- 
ciently accurate information if the estimates 
of base rate and hit rate were within .10 of 
the true values, and estimated latent means 
and standard deviations within one class in- 
terval of the truth. (Actually we did much 
better than that on the average. For example, 
with sample sizes greater than 400, equal 
variances, two sigma differences of latent
means, and zero intrataxon correlations, the 
average error for P was only .01 and for la- 
tent means and sigmas, less than one fourth 
standard deviation which is one-half the small- 
est integral class interval.) But if these toler- 
ances strike you as excessively large, I remind 
you how much more powerful such numerical 
claims are in soft psychology than the usual 
flabby “the boys are taller than the girls” or
“the schizophrenics are shyer than the manic

Table 1
Description of Sample Sets

Set  Variable  N  P  Me  Mt  SDe  SDt  D'  SDt/SDe  r  F
4 N 1,000.5 STRT S02 2 7 1 0 0 
n 800 -5 BA AIDIA 2 2 1 0 0 
i 6002 SP ees Oommen 2 2 1 o 0 
‘ 400-5 ES PAPA 2 2 1 oP 0 
mi P 1000 6 8 12 2 2 2 1 oè 3 
a 1,000.7 es Ogio 2 2 1 oP 2 
23 1000 8&8 8 12 2 p 2 1 oP 8 
i 1,000 9 PVE 2 2 1 0 0 
3.1 D' 1000 5 9 12 
: ‘ 2 2 Sey oP 0 
a 1000 5 «10122 2 1 1 oP 15 
A 1000 fe Soa E 2 2 rn 0 0 
i 1,000 -aS a ARIE 2 0 1 0 0 
A SD1/SDe 1,000 5 8 12 19a) B12 1.1 oP 0 
i 000 -5 AEAN O E A 13 oP 0 
a o00 S (iS a ee 17 oè 0 
$ 1,000 5 SU N 3 2 3 0 0 
a r 1,000.5 BAINZ 2 2 1 we 0 
ee 1,000.5 Bisa) lanes 2 2 1 3b 0 
eS 1,000 5 BIZA 2 2 1 5b 8 
E 1,000 5 ee aD) 2 2 1 8 0 
Telti 
6.1 ee 
i ren = 4 1,000 -8 Sue DOEA 2 2 1 5/125, 0 
a N 800 8 PN) 3 2 2 1 5/1 2B nN, 
SH 600 8 BE AZEZ 2 2 1 5/125 0 
4008 PEERY 2 2 1 IROA

Note. N = sample size; P = base rate of the taxon;
Me = mean of the extra taxon class on each indicator;
Mt = mean of the taxon on each indicator;
SDe = standard deviation of the extra taxon class on each indicator;
SDt = standard deviation of the taxon on each indicator;
D' = (Mt − Me)/S, where S = (SDe + SDt)/2;
r = latent correlation between indicator pairs;
F = number of failures of consistency tests in 25 samples.
94% correct.
Parameter estimates are always or nearly always accurate.

We then imposed tolerances on
each of the four most promising consistency
tests derived within the formalism. For example,
in the total covariance consistency
test, if $\mathrm{cov}(y,z) - PQ(\bar{y}_s - \bar{y}_n)(\bar{z}_s - \bar{z}_n)$
is greater than .64 + .145°,
a cut chosen by a combination of
derivation with preliminary Monte
Carlo trials, then this particular sample is
“numerically inconsistent” with
Test T1. Now if any one of the
four consistency tests is, so to speak, rejected
for a given sample, this is a red flag warning
us that we ought not to have much faith in
the parametric estimates of interest.

The important question then is, how sensi- 
tive are the consistency tests to sample de- 
partures from the parametric truth in excess 
of the tolerance allowed? How often will we 
draw a sample in which the inferred param- 
eters are in error by more than the tolerance 
limit imposed but all four consistency tests 
are satisfied within their tolerance limits, 
leading us mistakenly to trust our results? 
Second, how often is at least one of the four
consistency tests numerically inconsistent (i.e.,
outside its tolerance limit) leading us to mistrust
the sample when in fact all of the sample
estimates of the parameters are within their
tolerances? The first of these we might call a
“false negative” failure on the part of the
consistency tests to function jointly; the second
is then a false positive.

Table 2
Consistency Test Result

                    Sample
Actual situation    Trustworthy    Suspicious    Total
Accurate                    336            36      372
Inaccurate                    0           228      228
Total                       336           264      600

I restrict my data presentation to Monte 
Carlo runs in which the samples are generated 
from a multivariate normal model, although 
I want to emphasize that our methods are not 
generally confined to the normal case. Nor- 
mality was imposed because of Monte Carlo 
generating problems. In Table 1, the numbers 
“Set 1.1, 1.2,...” in the first column merely 
name conditions of fixed population properties 
and sample sizes, and 25 Monte Carlo samples 
were drawn per set. The column heads indi- 
cate the various population properties, such 
as taxon base-rate P, the two latent taxon 
means and standard deviations, the mean dif- 
ference in standard deviation units, the ratio 
of latent standard deviations, and the within-group
correlations. The important result (F)
indicates how many of the 25 samples under 
the given set conditions were failures of the 
consistency tests. Thus, the four consistency 
tests were applied to each sample, which was 
classified as probably trustworthy (or prob- 
ably not) in accordance with the tolerance 
rules for consistency tests. Then the sample
was classified as to whether it was in fact 
trustworthy, that is, whether the main latent 
parameters were all estimated within their 
allowed tolerance. 

Despite the high average accuracy of the
taxometric method [...] percent errors in
estimating each of the latent parameters (base
rate, hit rates, means, standard deviations),
[...] on the method, hoping to
be accurate on all seven parameters on any
sample drawn, he would be misled distressingly
often were he to lack consistency tests.
Among our 600 Monte Carlo samples, all 
seven latent parameters of the artificial popu- 
lation were estimated to an accuracy within 
the tolerance levels in 372 samples; that is, 
on 228 samples at least one parameter was 
inaccurate. This shows that a trustworthy device
for detecting such bad samples is much
to be desired. It will not do a taxonomic scientist
much good to be “usually quite accurate”
if the procedure relied on is nevertheless often 
(38% of the time) somewhat inaccurate and 
the investigator is without a method that 
warns him when the untoward event has, on a 
given occasion, occurred. 

In Table 2 the 600 Monte Carlo samples are
tallied with respect to each sample’s parameter 
estimation accuracy and whether it passed all 
four consistency tests. It is encouraging that 
overall the consistency tests were 94% accurate.
Furthermore, the 6% of the samples
in which the consistency tests erred were all 
samples in which they erred conservatively; 
that is, one or more of the consistency tests 
was suspiciously outside its tolerance limits, 
yet none of the latent parameters estimated 
by the methods was outside its tolerance
limits. We have not as yet drawn a single 
Monte Carlo sample (among 600) in which 
the four consistency tests were conjunctively 
reassuring but the sample was in fact mislead- 
ing. This finding suggests that we were unduly 
stringent, so that if some small amount of lee- 
way were permitted for errors of the other 
kind, the consistency tests could be somewhat 
relaxed and, perhaps concurrently, the toler- 
ance limits on the parameter estimates could 
be somewhat tightened. 
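
The percentages quoted from Table 2 follow directly from its four cell counts. A short Python sketch of that arithmetic is given here; the variable names are illustrative, and the labels follow the text's usage, in which a "false negative" is an inaccurate sample that passes all four tests and a "false positive" is an accurate sample that is flagged.

```python
# Cell counts from Table 2: (actual situation, consistency test result).
table = {
    ("accurate", "trustworthy"): 336,
    ("accurate", "suspicious"): 36,
    ("inaccurate", "trustworthy"): 0,
    ("inaccurate", "suspicious"): 228,
}

total = sum(table.values())
accurate = table[("accurate", "trustworthy")] + table[("accurate", "suspicious")]
inaccurate = total - accurate

# The tests are correct when they pass accurate samples or flag inaccurate ones.
correct = table[("accurate", "trustworthy")] + table[("inaccurate", "suspicious")]
overall_accuracy = correct / total

# Conservative errors ("false positives"): accurate samples flagged as suspicious.
false_positive_share = table[("accurate", "suspicious")] / total
# Dangerous errors ("false negatives"): inaccurate samples that pass all four tests.
false_negative_share = table[("inaccurate", "trustworthy")] / total
inaccurate_share = inaccurate / total

print(f"overall accuracy: {overall_accuracy:.0%}")
print(f"accurate samples flagged: {false_positive_share:.0%}")
print(f"inaccurate samples passed: {false_negative_share:.0%}")
print(f"samples with an inaccurate estimate: {inaccurate_share:.0%}")
```

All of the errors fall in the conservative cell, which is why the text suggests the consistency tolerances could be relaxed somewhat.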

There is some interchangeability between 
original estimators and consistency tests, and 
the maxcov-hitmax method itself was originally
derived by me as a consistency test before
I realized that it could better be used as
an original search device (see Meehl, Note 3, 
pp. 28-29; Note 4, pp. 2-6). 

Not in reliance on these results, which I 
present merely as exemplars of a general meth- 
odological thesis, I want now to state as 
strongly as I can a prescription that we should 
adopt in soft psychology to help get away
from the feeble practice of significance testing:
Wherever possible, two or more nonredundant
estimates of the same theoretical
quantity should be made, because multiple ap- 
proximations to a theoretical number are al- 
ways more valuable, provided that methods of 
setting permissible tolerances exist, than a so- 
called exact test of significance, or even an 
exact setting of confidence intervals. This is a 
special case of what my philosopher colleague 
Herbert Feigl refers to as “triangulation in 
logical space.” It is, as you know, standard 
procedure in the developed sciences. We have, 
for instance, something like a dozen inde- 
pendent ways of estimating Avogadro’s num- 
ber, and since they all come out “reasonably 
close” (again, I have never seen a physicist do 
a t test on such a thing!), we are confident 
that we know how many molecules there are 
in a mole of chlorine. 

This last point may lead you to ask, “If 
consistency tests are as important as Meehl 
makes them out to be, why don’t we hear
about them in chemistry and physics?” I 
have a perfect answer to that query. It goes 
like this: Consistency tests are so much a 
part of standard scientific method in the de- 
veloped disciplines, taken so much for granted 
by everybody who researches in chemistry or 
physics or astronomy or molecular biology or 
genetics, that these scientists do not even 
bother having a special name for them! It 
shows the sad state of soft psychology when a 
fellow like me has to cook up a special metatheory
expression to call attention to something
that in respectable science is taken as a
matter of course. 

Having presented what seems to me some 
encouraging data, I must nevertheless close 
with a melancholy reflection. The possibility 
of deriving consistency tests in the taxonic 
situation rests on the substantive problems 
presented by fields like medicine and behavior
genetics, and it is not obvious how we would 
go about doing this in soft areas that are nontaxonic.
It may be that the nature of the subject
matter in most of personology and social
psychology is inherently incapable of permitting
theories with sufficient conceptual power
(especially mathematical development) to
yield the kinds of strong refuters expected by
Popperians, Bayesians, and unphilosophical
scientists in developed fields like chemistry.
This might mean that we could most profitably 
confine ourselves to low-order inductions, a 
(to me, depressing) conjecture that is some- 
what corroborated by the fact that the two 
most powerful forms of clinical psychology 
are atheoretical psychometrics of prediction 
on the one hand and behavior modification on 
the other. Neither of these approaches has the 
kind of conceptual richness that attracts the 
theory-oriented mind, but I think we ought 
to acknowledge the possibility that there is 
never going to be a really impressive theory in 
personality or social psychology. I dislike to 
think that, but it might just be true. 


Addendum 


My colleague, Thomas J. Bouchard, Jr., on 
reading a draft of this article faulted me for 
what he saw as a major inconsistency between 
my neo-Popperian emphasis on falsifiability 
and my positive assessment of Freud. There 
is no denying that for such a quantitatively 
oriented product of the “dust-bowl empiricist” 
tradition as myself, I do have a soft spot in 
my heart (Minnesota colleagues would prob- 
ably say in my head) for psychoanalysis. So, 
the most honest and straightforward way to 
deal with Bouchard’s complaint might be 
simply to admit that the evidence on Freud is 
inadequate and that Bouchard and I are 
simply betting on different horses. But I can- 
not resist the impulse to say just a bit more 
on this vexatious question, because while I am 
acutely aware of a pronounced (and possibly 
irrational) difference in the “educated prior” 
I put on Freud as contrasted with rubber band 
theory or labeling theory or whatever, I am 
not persuaded that my position is as grossly 
incoherent as it admittedly appears. Passing 
the question whether attempts to study psy- 
choanalytic theory by the methods of experi- 
mental or differential psychology have on the 
whole tended to support rather than refute it 
(see, e.g., Fisher & Greenberg, 1977; Rapaport,
1959; Sears, 1943; Silverman, 1976),
my own view is that the best place to study 
psychoanalysis is the psychoanalytic session 
itself, as I have elsewhere argued in a far too 
condensed way (Meehl, 1970/1973e). 

I believe that some aspects of psychoanalytic
theory are not presently researchable be-
cause the intermediate technology required— 
which really means instruments-cum-theory— 
does not exist. I mean auxiliaries and methods 
such as a souped-up, highly developed science 
of psycholinguistics, and the kind of mathe- 
matics that is needed to conduct a rigorous 
but clinically sensitive and psychoanalytically 
realistic job of theme tracing in the analytic 
protocol. This may strike some as a kind of 
cop-out, but I remind you that Lakatos, Kuhn, 
Feyerabend, and others have convincingly 
made the point that there are theories in the 
physical and biological sciences that are un- 
testable when first propounded because the 
theoretical and technological development nec- 
essary for making certain kinds of observa- 
tions bearing on them had not taken place. It 
is vulgar positivism (still held by many psy- 
chologists) to insist that any respectable em- 
pirical theory must be testable, if testable 
means definitively testable right now. 

But I do think that there is another class of 
consequences of psychoanalytic theory, close 
to the original “clinical connections” alleged 
by Freud, Ferenczi, Jones, Abraham, and 
others that does not involve much of what 
Freud called the witch metapsychology, where 
no complicated statistics are needed, let alone 
the invention of any new formal modes of 
protocol analysis. Here the problem is mainly 
that none of us has bothered to carry out some 
relatively simple-minded kinds of analyses on 
a random sample of psychoanalytic protocols 
collected from essentially naive patients to 
whom no interpretations have as yet been 
offered. This second category is, in my view, 
a category of research studies that we could 
have done, but have not done. Example: We 
can easily ascertain whether manifest dream 
content of a certain kind is statistically as- 
sociated (in the simple straightforward sense 
of a patterned fourfold table) with such and 
such kinds of thematic material in the pa- 
tient’s subsequent associations to the dream. 
I would not even object to doing significance 
tests on a batch of such tables, but to explain 
why would unduly enlarge what is already an 
addendum.
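
A minimal sketch of the kind of fourfold-table analysis described above might look as follows. The counts and the category labels are invented for illustration (neither appears in the text), and the function name is hypothetical. The sketch computes the phi coefficient for one table and the corresponding Pearson chi-square with one degree of freedom.

```python
import math


def fourfold_association(a, b, c, d):
    """Return (phi, chi_square) for a 2 x 2 table of counts.

    Assumed layout (labels are hypothetical):
                            theme present   theme absent
        dream content yes         a               b
        dream content no          c               d
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    denom = row1 * row2 * col1 * col2
    if denom == 0:
        raise ValueError("every row and column needs at least one case")
    phi = (a * d - b * c) / math.sqrt(denom)
    # Pearson chi-square (1 df, no continuity correction) equals n * phi^2.
    chi_square = n * phi ** 2
    return phi, chi_square


# Invented counts for 100 hypothetical sessions.
phi, chi_square = fourfold_association(30, 10, 15, 45)
print(f"phi = {phi:.3f}, chi-square(1) = {chi_square:.2f}")
```

The same function could be applied to each table in a batch, which is all that the "simple-minded" analysis envisaged here would require.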

I cheerfully admit, in this matter, to the 
presence of a large distance between my subjective
personalistic probability (based on my
experiences as analysand and practitioner of
psychoanalytic therapy) and the present state 
of the “intersubjective public evidence.” That 
is what I mean by saying that Bouchard and 
I are betting on different horses. But one 
must distinguish, as I know from subsequent 
conversations that he does, between a criti- 
cism (a) that what is proper evidence does 
presently exist and is adverse to a conjecture 
and (b) an anti-Popperian claim that falsifi- 
ability in principle does not matter. If I 
thought (as does Popper) that Freudian the- 
ory was in principle not falsifiable, then I 
would have to confess to a major inconsist- 
ency. But I do think it is falsifiable, although 
I agree that some parts of it cannot at present 
be tested because of the primitive development 
of the auxiliary theories and the measure- 
ment technologies that would be jointly nec- 
essary. 

A final point on this subject is one that I 
hesitate to include because it is very difficult 
to explain in the present state of philosophy 
of science, and I could be doing my main 
thesis damage by presenting a cursory and 
somewhat dogmatic statement of it. Neverthe- 
less, having made the above statements about 
psychoanalytic theory and having contrasted 
it favorably with some of the (to me, trivial 
and flabby) theories in soft psychology, I fear 
I have an obligation to say it, however in- 
eptly. Once one sees that it is inappropriate to 
conflate the concepts rational and statistical, 
then it is a fuzzy open question, in the present 
state of the metatheoretician’s art, just when 
a mass of nonquantitative converging evidence 
can be said to have made a stronger case for 
a conjecture than the weak kinds of noncon- 
verging quantitative evidence usually repre- 
sented by the significance testing tradition. I 
say “when” rather than “whether,” because 
it is blindingly obvious that sometimes quali- 
tative evidence of certain sorts is superior in 
its empirical weight to what a typical social, 
personality, or clinical psychologist gets in 
support of a substantive theory by the mere 
refutation of the null hypothesis. Take, for 
instance, the evidence in a well-constructed 
criminal case, such as the evidence that Bruno 
Hauptmann was the kidnapper of the Lind- 
bergh baby. I do not see how anybody who 
reads the trial transcript of the Hauptmann
case could have a reasonable doubt that he
was guilty as charged. Yet I cannot recall any 
of the mass of data that convicted him as 
being of a quantitative sort (one cannot fairly 
except the serial numbers on the gold notes, 
they being not “measures” but “football num- 
bers”). 

All of us believe a lot of things that we 
would not have the vaguest idea how to ex- 
press as a probability value (pace strong 
Bayesians!) or how to compute as an indirect
test of statistical significance. I believe, for in- 
stance, that Adolf Hitler was a schizotype; I 
do not believe that Kaspar Hauser was the 
son of a prince; I believe that the domestic cat 
probably was evolved from Felis lybica by the 
ancient Egyptians; I hold that my sainted 
namesake wrote the letter to the Corinthians 
but did not write the letter to the Hebrews; I 
am confident that my wife is faithful to me; 
and so forth. The point is really a simple one 
—that there are many areas of both practical 
and theoretical inference in which nobody 
knows how to calculate a numerical probabil- 
ity value, and nobody knows how to state the 
manner or degree in which various lines of 
evidence converge on a certain conjecture as 
having high verisimilitude. There are proposi- 
tions in history (such as, “Julius Caesar 
crossed the Rubicon”) that we all agree are 
well corroborated by the available documents 
but without any t tests or the possibility of
calculating any, whereas Fisbee’s theory of 
social behavior is only weakly corroborated 
by the fact that he got a significant t test
when he compared the boys and the girls 
or the older kids and the younger kids on 
the Hockheimer-Sedlitz Communication Scale. 
Now I consider my betting on the horse of 
psychoanalysis to be in the same kind of ball 
park as my beliefs about Julius Caesar or the 
evolution of the cat. But, I repeat, this may 
be a terribly irrational leap of faith on my 
part. For the purposes of the present article
and Bouchard’s criticism of it, I hope it is suf- 
ficient to say that one could arguably hold 
that significance testing in soft psychology is 
a pretentious endeavor that falls under a tol- 
erant neo-Popperian criticism, and could 
nevertheless enter his personalistic prediction 
that when adequate tests become available to 
us, a sizable portion of psychoanalytic theory
will escape refutation. So I do not think I am
actually contradicting myself, but I am personalistically
betting on the outcome of a future horse race.


A Reader’s, Writer’s, and Reviewer’s Guide to
Assessing Research Reports in Clinical Psychology


Brendan A. Maher 
Harvard University 


The Editors of the Journal of Consulting and Clinical Psychology who served 
between 1974 and 1978 have seen some 3,500 manuscripts in the area of con- 
sulting and clinical psychology. Working with this number of manuscripts has 
made it possible to formulate a set of general guidelines that may be helpful in 
the assessment of research reports. Originally developed by and for journal re- 
viewers, the guidelines are necessarily skeletal and summary and omit many 
methodological concerns. They do, however, address the methodological concerns 
that have proved to be significant in a substantial number of cases. In response
to a number of requests, the guidelines are being made available here. 


Topic Content 


1. Is the article appropriate to this journal? Does it fall within the boundaries 
mandated in the masthead description?


Style


1. Does the manuscript conform to APA style in its major aspects? 


Introduction 


1. Is the introduction as brief as possible given the topic of the article? 


2. Are all of the citations correct and necessary, or is there padding? Are important
citations missing? Has the author been careful to cite prior reports contrary to
the current hypothesis?


3. Is there an explicit hypothesis?
4. Has the origin of the hypothesis been made explicit? 


5. Was the hypothesis correctly derived from the theory that has been cited? Are 
other, contrary hypotheses compatible with the same theory?


6. Is there an explicit rationale for the selection of measures, and was it derived
logically from the hypothesis?


Method


1. Is the method so described that replication is possible without further information?

2. Subjects: Were they sampled randomly from the population to which the results
will be generalized?

3. Under what circumstances was informed consent obtained?

4. Are there probable biases in sampling (e.g., volunteers, high refusal rates,
institution population atypical for the country at large, etc.)?

5. What was the “set” given to subjects? Was there deception? Was there control
for experimenter influence and expectancy effects?

6. How were subjects debriefed?

7. Were subjects (patients) led to believe that they were receiving “treatment”?

8. Were there special variables affecting the subjects, such as medication, fatigue,
and threat that were not part of the experimental manipulation? In clinical
samples, was “organicity” measured and/or eliminated?

9. Controls: Were there appropriate control groups? What was being controlled for?

10. When more than one measure was used, was the order counterbalanced? If so,
were order effects actually analyzed statistically?

11. Was there a control task(s) to confirm specificity of results?

12. Measures: For both dependent and independent variable measures, were
validity and reliability established and reported? When a measure is tailor-made
for a study, this is very important. When validities and reliabilities are already
available in the literature, it is less important.

13. Is there adequate description of tasks, materials, apparatus, and so forth?

14. Is there discriminant validity of the measures?

15. Are distributions of scores on measures typical of scores that have been reported
for similar samples in previous literature?

16. Are measures free from biases such as
    a. Social desirability?
    b. Yeasaying and naysaying?
    c. Correlations with general responsivity?
    d. Verbal ability, intelligence?

17. If measures are scored by observers using categories or codes, what is the
interrater reliability?

18. Was administration and scoring of the measures done blind?

19. If short versions, foreign-language translations, and so forth, of common measures
are used, has the validity and reliability of these been established?

20. In correlational designs, do the two measures have theoretical and/or
methodological independence?


Representative Design


1. When the stimulus is a human (e.g., in clinical judgments of clients of differing
race, sex, etc.), is there a sample of stimuli (e.g., more than one client of each
race or each sex)?

2. When only one stimulus or a few human stimuli were used, was an adequate
explanation of the failure to sample given?


Statistics


1. Were the statistics used with appropriate assumptions fulfilled by the data (e.g.,
normalcy of distributions for parametric techniques)? Where necessary, have
scores been transformed appropriately?

2. Were tests of significance properly used and reported? For example, did the
author use the p value of a correlation to justify conclusions when the actual size
of the correlation suggests little common variance between two measures?

3. Have statistical significance levels been accompanied by an analysis of practical
significance levels?

4. Has the author considered the effects of a limited range of scores, and so forth, in
using correlations?

5. Is the basic statistical strategy that of a “fishing expedition”; that is, if many
comparisons are made, were the obtained significance levels predicted in advance?
Consider the number of significance levels as a function of the total number of
comparisons made.
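
Two of the statistical points above lend themselves to simple arithmetic. The following is a sketch only: the function names are hypothetical, the example figures (a correlation of .20 and 40 comparisons) are invented, and the probability of at least one chance result assumes that the comparisons are independent, which is rarely strictly true in practice.

```python
def shared_variance(r):
    """Proportion of variance two measures have in common (r squared)."""
    return r ** 2


def expected_chance_hits(n_comparisons, alpha=0.05):
    """Significant results expected by chance if every null hypothesis were true."""
    return n_comparisons * alpha


def chance_of_any_hit(n_comparisons, alpha=0.05):
    """Probability of at least one chance result, assuming independent comparisons."""
    return 1 - (1 - alpha) ** n_comparisons


# Invented example: a "significant" correlation of .20 and a study with 40 comparisons.
common = shared_variance(0.20)
expected = expected_chance_hits(40)
any_hit = chance_of_any_hit(40)
print(f"common variance: {common:.0%}")
print(f"expected chance results among 40 comparisons: {expected:.1f}")
print(f"probability of at least one chance result: {any_hit:.2f}")
```

A reviewer can set the number of significant results actually reported against the number expected by chance before treating them as findings.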


Factor Analytic Statistics 


1. Have the correlation and factor matrices been made available to the reviewers and
to the readers through the National Auxiliary Publications Service or other
methods?

2. Is it stated what was used for communalities and is the choice appropriate? Ones
in the diagonals are especially undesirable when items are correlated as the
variables.

3. Is the method of termination of factor extraction stated, and is it appropriate in
this case?

4. Is the method of factor rotation stated, and is it appropriate in this case?

5. If items are used as variables, what are the proportions of yes and no responses
for each variable?

6. Is the sample size given, and is it adequate?

7. Are there evidences of distortion in the final solution, such as singlet factors,
excessively high communalities, obliqueness when an orthogonal solution is used,
linearly dependent variables, or too many complex variables?

8. Are artificial factors evident because of inclusion of variables in the analysis that
are alternate forms of each other?


Figures and Tables 


1. Are the figures and tables (a) necessary and (b) self-explanatory? Large tables
of nonsignificant differences, for example, should be eliminated if the few obtained
significances can be reported in a sentence or two in the text. Could several
tables be combined into a smaller number?

2. Are the axes of figures identified clearly?

3. Do graphs correspond logically to the textual argument of the article? (E.g., if
the text states that a certain technique leads to an increment of mental health and
the accompanying graph shows a decline in symptoms, the point is not as clear
to the reader as it would be if the text or the graph were amended to achieve
visual and verbal congruence.)


Discussion and Conclusion


1. Is the discussion properly confined to the findings or is it digressive, including new 
post hoc speculations? 


2. Has the author explicitly considered and discussed viable alternative explana- 
tions of the findings? 


3. Have nonsignificant trends in the data been promoted to “findings”? 


4. Are the limits of the generalizations possible from the data made clear? Has the 
author identified his/her own methodological difficulties in the study? 


5. Has the author “accepted” the null hypothesis? 


6. Has the author considered the possible methodological bases for discrepancies be- 
tween the results reported and other findings in the literature? 


Many detailed responses to a first draft were reviewed. Particular acknowledgment is due 
to Thomas Achenbach, George Chartier, Andrew Comrey, Jesse Harris, Mary B. Harris, Alan
Kazdin, Richard Lanyon, Eric Mash, Martha Mednick, Peter Nathan, K. Daniel O'Leary, 
N. D. Reppucci, Robert Rosenthal, Richard Suinn, and Norman Watt. 

Requests for reprints should be sent to Brendan A. Maher, Department of Psychology and 
Social Relations, Harvard University, Cambridge, Massachusetts 02138. 

This material may be reproduced in whole or in part without permission, provided that
acknowledgment is made to Brendan A. Maher and the American Psychological Association.


Personality Traits and Environmental Variables as
Independent Predictors of Posthospitalization Outcome


George A. Clum
Virginia Polytechnic Institute and State University


A set of predictor variables, identified as intrapsychic, and a second set, identified
as environmental, were examined in a multiple regression analysis as to
their independent contribution in predicting posthospitalization adjustment. The
analyses indicated that level of adjustment at baseline hospitalization was the
most salient prognostic variable. Significant others’ expectations for the patients’
self-help performance contributed independently to a follow-up criterion of total
symptomatology as rated by the significant other. The results provide moderate
support for the hypothesis that environmental as well as intrapsychic variables
are important prognostic indicators.


A recent review of prognostic factors of hos- 
pitalized psychiatric patients (Clum, 1975b) 
has shown that two types of variables, intra- 
psychic and environmental, can be hypothe- 
sized to be independent predictors of post- 
hospital outcome. This conclusion was largely 
inferential, however, and no substantive data 
exist that demonstrate the independent con- 
tribution of each set of variables. To accom- 
plish this, three conditions must be met: (a) 
A set of predictor variables, identified as in- 
trapsychic, must be found to predict post- 
hospital outcome; (b) a set of predictor vari- 
ables, identified as environmental, must be 
found to predict posthospital outcome; and 
(c) both sets must contribute independent 
variance to the criterion in a multiple regres- 
sion format. In line with the previous defini- 
tion of these variables, intrapsychic variables 
include any “measurable cognitive or personality
characteristics the patient exhibits. Environmental
variables are defined as sociopsychological
phenomena within the patient’s life space, but
external to the patient himself”
(Clum, 1975b, p. 416).


This study was supported by Grant UVAC-12-70
University of Virginia.

Requests for reprints should be sent to George A.
Clum, Department of Psychology, Virginia Polytechnic
Institute and State University, Blacksburg,
Virginia 24060.

One example that may clarify the problem 
involves the relationship between marital sta- 
tus and prognosis. The personality trait view 
suggests that individuals who are single and 
socially withdrawn will not be selected as 
marriage partners and will tend to have a bad 
prognosis. Being single has been found to be 
related to an individual’s continued stay in a 
mental institution and also to poor social ad- 
justment on release from the hospital (Clum, 
1975b). In contrast, the environmental view 
suggests that being married acts as a buffer 
that leads to a greater likelihood of the patient 
having a good prognosis. People who are married
have a definite role in the family, whereas
people who are single are neither the home- 
maker nor the breadwinner and hence have 
roles secondary to the functioning of the fam- 
ily. The press to perform will accordingly be 
less for the single individual and might be 
anticipated to lead to poorer performance for 
such people according to an expectation hy- 
pothesis. 

To determine whether the relationship be- 
tween single status and poor prognosis is due 
to a selection process or to a buffer hypothesis, 
the initial level of patient dysfunction must 
be controlled. If it can be shown that single
individuals do not exhibit more disturbance
than married individuals on admission to a
hospital but that they do exhibit a poorer
prognosis, the buffer hypothesis would be
supported.
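
One way to picture what controlling for initial dysfunction involves is a first-order partial correlation between single status and outcome with admission disturbance held constant. This is an illustrative sketch, not the analysis used in the present study (which relied on multiple regression); the three correlations are invented and the function name is hypothetical.

```python
import math


def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation of x and y with z held constant (first-order partial r)."""
    denom = math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    if denom == 0:
        raise ValueError("z must not be perfectly correlated with x or y")
    return (r_xy - r_xz * r_yz) / denom


# Invented values: x = single status, y = poor outcome, z = disturbance at admission.
r_partial = partial_correlation(r_xy=0.30, r_xz=0.10, r_yz=0.50)
print(f"partial r = {r_partial:.3f}")
```

If single status is nearly unrelated to admission disturbance, as in these invented figures, the partial correlation stays close to the simple one, which is the pattern the buffer hypothesis would lead one to expect.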

Another example that can be addressed to 
the question of independent contribution to 
prognosis would be that of expectations and 
outcome. The personality trait view would 
predict that the more disturbance exhibited by 
the individual, the lower would be the expec- 
tations, resulting in a poor prognosis; that is, 
level of disturbance would determine level of 
expectations. The environmentalist would pre- 
dict outcome from expectations independent 
of level of disturbance. 

The relative importance of these two sets 
of variables has been studied previously in 
relation to in-hospital criteria of length of 
hospitalization and rated improvement at dis- 
charge (Clum, 1975a). Although Clum’s 
(1975a) study supported the hypothesis that 
environmental variables are predictive of length 
of hospitalization, it must be considered an in- 
conclusive test. Length of hospitalization is a 
relatively weak criterion because of its fluctua- 
tion on the basis of administrative considera- 
tions and because of the fact that in a short- 
stay hospital such as the one in his study, 
the criterion variance is severely truncated. 
Accordingly, the present study expands this 
analysis using a criterion of rated adjustment 
1 year after initial hospitalization. 


Method 
Subjects 


Subjects in the present study (N = 79) included
all patients admitted to the University of Virginia 
psychiatric service who were also available for fol- 
low-up evaluation 1 year after admission. This sam- 
ple represented a severely truncated subsample (50%) 
of the subjects who had completed all baseline data, 
a reduction of subjects which, however, is compara- 
ble to other studies using mailings and phone calls 
to obtain follow-up data. A comparison of the two 
groups, however, revealed no differences on variables 
of marital status, age, race, social class, and initial 
level of symptomatology on five symptom factors. 
The subjects were between the ages of 16 and 65 
and included only those who did not have a diagnosis
of organic brain damage.


Predictors


At the time of hospitalization, each patient was 
administered a biographical inventory, the Life 
Change Inventory (described previously by Clum, 
1976), and a measure of expectations for improvement.
Information culled from the biographical inventory
included total time previously spent in mental
hospitals, education, income, marital status, age,
and number of friends. The patients’ expectations 
were derived from the Katz Adjustment Scales (Katz 
& Lyerley, 1963). 

A significant other also rated his/her expectations 
for the patient’s performance once the patient re- 
turned home from the hospital. Two measures of ex- 
pectations were obtained—one regarding the patient's 
performance on self-help tasks, the other regarding 
the patient’s social adjustment. 


Criterion Measures 


Two criteria were used—total number of symp- 
toms as noted by the patient and total number of 
symptoms as noted by the significant other. The Katz 
Adjustment Scales were used to evaluate symptom- 
atology. 


Procedure 


At the time of hospitalization, patients were asked 
to complete all research forms. A significant other 
(spouse, parent, relative, or friend) who was familiar
with the patient's adjustment in the month
prior to hospitalization was also asked to complete 
the research instruments. One year later the patient 
and the same significant other were again contacted, 
and the same forms were readministered. The sam- 
ple was then divided randomly into validation and 
hold-out samples to determine the stability of the 
results. 
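
The report does not say how the random division was carried out or how large each part was. A minimal sketch, assuming an approximately even split of 79 subject identifiers and a fixed seed for reproducibility (both assumptions, as is the function name), could be:

```python
import random


def split_sample(subject_ids, seed=0):
    """Randomly divide subjects into validation and hold-out halves."""
    rng = random.Random(seed)      # fixed seed so the split can be reproduced
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# 79 hypothetical subject identifiers, 1 through 79.
validation, hold_out = split_sample(range(1, 80))
print(len(validation), len(hold_out))
```

Predictors identified in the validation half are then re-examined in the hold-out half, and only those that reappear are treated as stable.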


Results 
Patient-Related Symptomatology 


Both environmental and personality trait 
measures were entered into a stepwise mul- 
tiple regression equation in order to assess 
the independent contribution of each variable 
to a criterion of total symptomatology as rated 
1 year after hospitalization. As Table 1 indi- 
cates, only total symptomatology at baseline, 
time in mental hospital, and education con- 
tributed independently to patient-rated total 
symptomatology at the 1-year follow-up. Fur- 
ther, only total symptomatology at baseline 
was found to contribute independently to the 
criterion in the hold-out cross-validation group. 
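
What it means for a predictor to contribute independently can be illustrated, in the simplest two-predictor case, by the increment in R squared that the predictor adds over the other one. The following is a sketch only: the correlations are invented for illustration and are not taken from the tables, the function names are hypothetical, and the study itself used a stepwise procedure with many predictors.

```python
def r_squared_two_predictors(r_y1, r_y2, r_12):
    """Squared multiple correlation of y on two predictors, from simple correlations."""
    if abs(r_12) >= 1:
        raise ValueError("predictors must not be perfectly correlated")
    return (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)


def unique_contribution(r_y_new, r_y_other, r_12):
    """Increase in R squared when the new predictor is added to the other one."""
    return r_squared_two_predictors(r_y_other, r_y_new, r_12) - r_y_other ** 2


# Invented values: predictor 1 = baseline symptoms (intrapsychic),
# predictor 2 = significant other's expectations (environmental).
r_y1, r_y2, r_12 = 0.50, 0.40, 0.30
total = r_squared_two_predictors(r_y1, r_y2, r_12)
added_by_environment = unique_contribution(r_y2, r_y1, r_12)
print(f"R squared = {total:.3f}, added by environmental predictor = {added_by_environment:.3f}")
```

A predictor whose simple correlation with the criterion is sizable can still add little once a correlated predictor is already in the equation, which is the distinction the tables draw between a simple R and an independent contribution.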




Stress was not found to be related to the
criterion. However, significant others’ expectations
for the patient’s performance on self-help
tasks correlated in the expected direction
with outcome but did not contribute independently
to its prediction.


Table 1
Regression Analysis of Predictor Variables at Hospitalization to Patient-Rated Total
Symptomatology at 1-Year Follow-Up

Sample:              Validation                Hold out
Predictor variable   Simple R   Beta weight    Simple R   Beta weight

Baseline patient-rated total symptoms^a    .50b**   .46   .479*   .28
Time in mental hospital (complex)    .26°   .15   Al*
Education (complex)    -.39b*   -.02   -.04
Income (complex)    -.30   -.03
Significant others’ expectations—self-help^c    .36*   .32
Marital status^d    -.04   .04
Age (complex)    -.21   -.12
Patients’ expectation—baseline^a    —A5   -.03
Level of performance as rated by significant other—social adjustment^a    .03   .12
No. friends (complex)    -.16   -.18
Total stress^c    .26
Significant others’ expectations—social adjustment^c    AS
Marital status (complex)
    Single = 2    .02   414
    All others = 1    -69   AT

a Intrapsychic.
b Significant independent contribution in multiple regression.
c Environmental.
d Married = 1; all others = 2.
* p = .05.
** p = .01.


Significant-Other-Rated Symptomatology


The regression analysis of significant-other-rated
total symptomatology at the 1-year
follow-up (see Table 2) demonstrated that
baseline total symptomatology, significant
others’ expectations for self-help, number of
friends, marital status, and patients’ baseline
expectations were the only variables of those
examined that contributed independently to
the criterion. Significant others’ expectations
of self-help were significant at the .01 level,
supporting the importance of expectations as
a prognostic variable, independent of personality
traits. Marital status was also reliably related
to significant others’ ratings of total
symptomatology, again, in a negative direction.
This was contrary to predictions. Of these
five predictors only total baseline symptomatology
and significant others’ expectations
cross-validated as independent contributors to
the criterion.

The vast majority of previous research has 
found single people to have poorer prognoses. 
Since most of these studies were conducted on 
predominantly schizophrenic populations, it 
was decided to reanalyze the relationship be- 
tween marital status and outcome separately 
for schizophrenic and schizoid personality pa- 
tients and all other diagnostic groups. Pearson 
correlation coefficients were computed between 
marital status and total symptomatology at 
follow-up for patients diagnosed as schizo- 
phrenic or schizoid personality. The results 
indicated that marital status and total symptomatology
at follow-up were related (r = .41
between marital status and significant others’
ratings; r = .43 between marital status and
patients’ ratings) in a generally positive direction,
although not reaching significance due
to the small sample size (n = 8). Thus, for this
subgroup of patients, a single status led to a
poorer prognosis, as was previously predicted.


Table 2
Regression Analysis of Predictor Variables at Hospitalization to Significant-Other-Rated Total
Symptomatology at 1-Year Follow-Up

Sample:              Validation                Hold out
Predictor variable   Simple R   Beta weight    Simple R   Beta weight

Baseline significant-other-rated total symptoms^a    Eh aig   .27   .40b*   .36
Significant others’ expectations—self-help^c    .54d   .33   .24   .25
Marital status (complex)^d    -.41b*   —A5   -.33
No. friends (complex)    .21b   .27
Patients’ expectations—baseline^a    .03b   -.15   -.11
Significant others’ ratings of level of performance-social adjustment^a    k bed
Age (complex)    -.08   -.08
Education (complex)    -.31   -.23
Significant others’ expectations—social adjustment^c    .17   .27
Time in mental hospital (complex)    .22   .33
Total stress^c    .14   -.13
Income (complex)    .17   -.14
Marital status (complex)
    Single = 2    -.23**   -.28   = 13
    All others = 1    86   59

a Intrapsychic.
b Significant independent contribution in multiple regression.
c Environmental.
d Married = 1; all others = 2.
* p = .05.
** p = .01.
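
The statement that subgroup correlations of .41 and .43 fall short of significance with only eight patients can be checked with the usual t test for a correlation coefficient, t = r * sqrt(n - 2) / sqrt(1 - r^2), on n - 2 degrees of freedom. The sketch below is an illustration added for the reader rather than a computation reported in the study; the function name is hypothetical, and the critical value of 2.447 is the standard two-tailed .05 value for 6 degrees of freedom.

```python
import math


def t_for_correlation(r, n):
    """t statistic for testing a Pearson r against zero, with n - 2 degrees of freedom."""
    if n < 3 or abs(r) >= 1:
        raise ValueError("need n >= 3 and |r| < 1")
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


CRITICAL_T_DF6 = 2.447  # two-tailed .05 critical value for 6 degrees of freedom

t_others = t_for_correlation(0.41, 8)    # significant others' ratings
t_patients = t_for_correlation(0.43, 8)  # patients' ratings
print(f"t(6) = {t_others:.2f} and {t_patients:.2f}; critical value = {CRITICAL_T_DF6}")
```

Both values fall well below the critical value, so correlations of this size in a sample of eight are best read as suggestive only.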


Discussion 


The present study sought to determine 
whether environmental factors are of inde- 
pendent prognostic significance when consid- 
ered conjointly with personality trait varia- 
bles. A multiple regression format in which 
both types of predictors were considered was 
used to make this determination. 

The results provided partial support for the 
hypothesis that predictor variables representa- 
tive of both classes of variables would con- 
tribute independent predictor variance. Spe- 
cifically, initial level of symptomatology was 
consistently related to the criterion. In addi- 


tion, significant others’ expectations for self- 
help were related to total symptomatology as 
rated by the significant other but not as rated 
by the patient. This provides moderate sup- 
port for the import of an environmental factor 
of others’ expectations as an independent 
prognostic variable. Heretofore, it has been 
unclear whether the relationship of expecta- 
tions to outcome was attributable to the fact 
that others’ expectations were lower for those 
individuals whose adjustment was poor. It is 
arguable whether poor prognosis in this case 
was a function of the expectations or the al- 
ready poor adjustment. The fact that expec- 
tations for self-help were not independently 
related to patients’ ratings of symptomatology 
suggests that for this criterion the impact of 
expectations is determined largely by its re- 
lationship to initial level of symptoms. 

No other prognostic variable, either per- 
sonality or environmental, cross-validated. 


PREDICTING OUTCOME 


This is especially confusing with regard to 
marital status, since this variable has been 
most consistently related to prognosis (Clum, 
1975b). The vast majority of these studies, 
however, included Veterans Administration or 
state hospital patients, most of whom were 
schizophrenic and chronic. In the present 
study a marital status of married was posi- 
tively correlated with significant-other-rated 
symptomatology, as compared with a negative 
relationship in most other studies. Since most 
of the patients in the present study were 
neither schizophrenic nor chronic, it was de- 
cided to analyze the relationship between 
marital status and outcome separately for 
those patients diagnosed as schizophrenic or 
schizoid and those with all other diagnoses. 
The results confirmed the finding of previous 
studies: Married status was related to a bet- 
ter outcome for the schizophrenic and schizoid 
subgroup. It is possible, therefore, that diag- 
nosis is a moderator of the marital status—out- 
come relationship, rendering marital status an 
even more powerful prognostic variable. 
Whether it contributes independently to out- 
come for schizophrenic patients still remains 
an unanswered question. 

The environmental variable of life stress 
was not found to be of any reliable conse- 
quence. However, in another study (Clum, 
1976), it was demonstrated that whereas life 
changes prior to hospitalization were unre- 
lated to symptomatology at follow-up, they 
were related to the level of symptoms at the 
time of hospitalization. Similarly, life changes 
subsequent to hospitalization were predictive 
of level of symptomatology 1 year after hospitalization.
This supports Rahe’s (1972) notion
that stress only in the preceding year is
predictive of symptomatology. 


Implications 


The implications of this study are twofold. 
First, the question of prognosis that confronts 
the clinician is related to decisions regarding 
disposition. For example, if a patient can be 
predicted to have a poor prognosis, the insistence
on further follow-up treatment could
be increased. Decisions to continue hospitalization
for a longer period of time or until such
time as those factors predictive of outcome 
have been modified could be affected. 

The second implication concerns those areas 
targeted for change. Since previous studies 
have focused on intrapsychic variables and 
have related these to outcome, the focus of the 
clinician has remained at effecting changes in 
the patient’s personality. With the addition, 
in the present study, of the importance of ex- 
pectations and possible role models as inde- 
pendent predictors of outcome, a new direc- 
tion for targets for change should be forthcom- 
ing. Specifically, attention should be paid to 
changing the expectations of the patient’s family
for performance, such that higher demands on
the patient’s performance of self-help tasks
are emphasized. The importance of assuming
a significant role in the family as a
breadwinner or homemaker appears specific 
to schizophrenic patients. For this group, ac- 
cordingly, the importance of developing sig- 
nificant social roles would seem to be a target 
for therapeutic intervention. In contrast, being 
married was found to be a negative prognostic 
factor for nonschizophrenic patients. Accord-
ingly, effecting a more positive relationship
within the marital context should receive 
greater attention in this group. 
This study sought to provide a more complete exploration of the relationship
between Rorschach developmental level scores and intelligence than has ap- 
peared to date in the literature. Factor scores for both Wechsler Adult Intelli- 
gence Scale and Rorschach measures were used in comparing test protocols 
obtained from 86 psychiatric inpatients. Results reaffirmed a relationship be- 
tween intellectual level and developmental level. Further, the relationship was 
found to be of a general nature. It did not rely specifically on verbal, percep-
tual, or memory factors. Results also indicated that the relationship held for
females as well as males.


On the basis of Werner’s (1948) develop- 
mental theory, Friedman (1953) derived a 
Rorschach measure of developmental level 
(DL). Goldfried, Stricker, and Weiner (1971) 
have reviewed a large number of the studies 
making use of this index and concluded that 
Friedman’s scoring of the Rorschach results 
is a good measure of the DL of functioning. 
Findings indicate impressively stable and theo- 
retically consistent age effects, theoretically 
consistent differences among pathological 
groups, correlations with the degree of regres- 
sion among schizophrenics, and successful pre- 
diction of superior interpersonal functioning 
among high-scoring groups. 

The relationship between DL and intelli- 
gence has been one issue of discussion in the 
literature. Werner (1948) speculated that 
mental development proceeds in the direction 
of increasing differentiation and hierarchical 
integration. This view offers a developmental 
framework for Ainsworth and Klopfer’s 
(1954) hypothesis that “vague, global per- 
ception reflects a relatively low level capacity 
and... the more refined and differentiated 
the perception the higher the level of intelli-
gence” (p. 353). Thus, viewing the DL scor-
ing of the Rorschach as an indicator of gen-
eral cognitive functioning suggests a relation-
ship with measures of intellectual ability.

In general, the literature has strongly sup- 
ported the idea of such a relationship. For 
example, Goldfried (1962) found significant 
correlations between DL and IQ scores in a 
sample of male neuropsychiatric patients. 
Similarly, Blatt and Allison (1963) found IQ 
scores to be significantly related to develop- 
mentally high Whole Response scores in sam- 
ples of graduate students and neuropsychiatric 
patients (Allison & Blatt, 1964; Blatt & Alli- 
son, 1963). Friedman and Orgel (1964) found 
a DL–IQ association in paranoid schizophre-
nics. They did not find the relationship in
groups of catatonics, hebephrenics, neurotic 
brain-damaged patients, or normals. However, 
as Kissel (1965) has noted and demonstrated, 
the failure to find a relationship probably re- 
sulted from Friedman and Orgel’s use of per- 
centage scores rather than from an inherent 
lack of relationship between DL and intelli- 
gence. 

Kissel (1965) did find the predicted rela- 
tionship in a sample drawn from a child 
guidance clinic, and there has been a report 
of an association between DL and mental age 
in children (O'Neill, O'Neill, & Quinlan,
1976). Gerstein, Brodzinsky, and Reiskind
(1976) also found the general relationship in
a group of white children. Interestingly, how-
ever, in looking at children of below-average
ever, in looking at children of below-average 
intelligence, they discovered that black sub- 
jects produced higher DL scores than did their 
white counterparts. They raised the possibility 
that a developmental-structural analysis of 
the Rorschach may provide a more realistic 
assessment of the cognitive capacity of some 
black children than do standard IQ tests. 

Thus, there is now consistent evidence of 
a link between a Rorschach measure of cog- 
nitive development and tests of intellectual 
ability. However, since measures of intelli- 
gence are usually a composite of different 
kinds of abilities, Goldfried et al. (1971) sug- 
gested that it would be of value to specify 
which aspects of intellectual functioning are 
related to DL. The goal of the present article 
is to state more exactly the nature of the rela- 
tionship between DL and intelligence. This 
goal is pursued by examining the relationship 
between two independent factors underlying 
DL and three independent factors underlying 
performance on the Wechsler Adult Intelli- 
gence Scale (WAIS). These factors are de- 
rived from previous factor analytic studies of 
the WAIS and a project of our own that ex- 
amined the factor structure among Rorschach 
cards scored for DL. 

Specifically, studies of the WAIS have ex- 
tracted three factors in addition to a dominant 
general factor. These three dimensions have 
been labeled Verbal Comprehension, Percep- 
tual Organization, and Memory (or Freedom 
from Distractibility) (Berger, Bernstein, 
Klein, Cohen, & Lucas, 1964; Cohen, 1957a, 
1957b). Scores for the factors can be obtained 
by adding the scaled scores of the WAIS sub-
tests making up each factor.

A construct validity study of the DL mea- 
sure revealed two independent factors among 
Rorschach cards scored for DL (Cardwell & 
Greenberg, Note 1). One factor was the com- 
posite DL score obtained by adding up the 
subject’s DL scores on each of the 10 Ror- 
schach cards. The second factor was obtained 
by taking the difference between a subject’s 
DL scores on two contrasting sets of four 
cards (i.e., DL score on Cards 3, 4, 5, and 6
minus the DL score on Cards 7, 8, 9, and 10). 

In sum, the present study concerns the re-
lationship between three distinct sources of
variation within IQ data and two distinct 
sources of variation within the DL measure. 
This permits an examination of the total pat- 
tern of correlations among the five measures 
and a look at the number and form of di- 
mensions that may contribute to covariation. 
If one predicts a nonspecific general factor 
underlying both DL and IQ, a relationship
across all IQ factors and DL scores would be 
expected. This is the implication of Ains- 
worth and Klopfer’s (1954) view. Alterna- 
tively, a more specific relationship would be 
suggested if DL dimensions were related to 
only verbal, perceptual, or memory factors on 
the WAIS. 

In addition to providing a more thorough 
analysis of the relationship between IQ and
DL, the present study also permits an exami- 
nation of sex differences in the relationships. 
Previous studies of the DL construct have 
largely not included female subjects (Gold- 
fried et al., 1971). 


Method 


Data used in the study were obtained from the 
file of a medical center psychology service. All Ror- 
schach tests in the file had been administered and 
scored for DL by faculty members with extensive 
Rorschach experience or by psychology interns un- 
der the direct supervision of these faculty members. 
The interscorer reliability of the DL system has been 
found to be quite high, ranging from 89.7 to 95.5 
(Friedman, 1953; Goldfried et al., 1971). 

In selecting protocols for study, all referrals from 
the hospital’s psychiatric inpatient service over a 7- 
year period were reviewed, and 86 records (57 women 
and 29 men) met the following criteria for study: 
(a) an available summary sheet listing developmental 
scoring for each card, (b) the production of at least 
one response to each card, and (c) an available 
summary sheet of WAIS subtest scores. The mean
age of the sample was 29.1, with a range of 16 to


psychoneuroses). Patients with a diagnosis of organic 
brain syndrome were excluded from the sample. 

Friedman applied Werner's “orthogenetic principle” in the development of his scoring system.
Werner (1957) holds that with increasing maturity, 
mental functioning “proceeds from a state of relative 
globality and lack of differentiation to a state of in-
creasing differentiation, articulation, and hierarchic
integration” (p. 126). Friedman’s system, which at-
tempts to measure such changes in perceptual func-
tioning, uses only structural and organizational as-
pects of the percept. Rorschach responses are first
scored for location (whole or detail) and are then 
placed into one of six categories. The categories 
represent increasing levels of differentiation and inte- 
gration. They are labeled (in order of increasing 
maturity) amorphous (a), minus (—), vague (v), 
mediocre (m), plus (+), and plus-plus (++) re-
sponses. The scoring criteria are described in
detail by Friedman (1952, 1953) and
Goldfried et al. (1971).

For the present study, each response given to the 
Rorschach was assigned a value between 1 and 12 
according to the corresponding developmental level 
of maturity. All responses to a single card were then 
summed, and this total was divided by the number 
of responses given to that card. The weightings used 
for each response are as follows: Wa = 1; Da = 2;
W− = 3; D− = 4; Wv = 5; Dv = 6; Dm = 7; Wm = 8;
D+ = 9; W+ = 10; D++ = 11; W++ = 12.
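
The scoring arithmetic just described can be sketched in Python. This is only an illustration under stated assumptions: the function names and the sample protocol are hypothetical and were invented here. The weights, the per-card averaging, and the two DL factors (the sum over all 10 cards, and Cards 3 to 6 minus Cards 7 to 10) follow the text.

```python
# Weights for the location-by-category scores, as listed in the text.
DL_WEIGHTS = {
    "Wa": 1, "Da": 2, "W-": 3, "D-": 4, "Wv": 5, "Dv": 6,
    "Dm": 7, "Wm": 8, "D+": 9, "W+": 10, "D++": 11, "W++": 12,
}


def card_dl_score(responses):
    """Mean weight of all scored responses given to one card."""
    if not responses:
        raise ValueError("each card needs at least one response")
    return sum(DL_WEIGHTS[r] for r in responses) / len(responses)


def dl_factors(protocol):
    """Return (composite, contrast) for a 10-card protocol.

    protocol maps card number (1-10) to a list of scored responses.
    composite = sum of the 10 card scores (DL Factor 1).
    contrast  = Cards 3-6 minus Cards 7-10 (DL Factor 2).
    """
    scores = {card: card_dl_score(protocol[card]) for card in range(1, 11)}
    composite = sum(scores.values())
    contrast = (sum(scores[c] for c in (3, 4, 5, 6))
                - sum(scores[c] for c in (7, 8, 9, 10)))
    return composite, contrast


# Hypothetical protocol, invented for illustration only.
example_protocol = {card: ["Wm"] for card in range(1, 11)}
example_protocol[3] = ["W+", "Dm"]        # mean weight (10 + 7) / 2
example_protocol[8] = ["Dv", "Wv", "Dm"]  # mean weight (6 + 5 + 7) / 3
composite, contrast = dl_factors(example_protocol)
```

Averaging within each card keeps a highly productive subject from inflating the card score simply by giving more responses.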

The order of the major categories amorphous 
through plus-plus follows directly from Friedman’s 
interpretation of Werner. The relative ranking of
whole and detail categories is consistent with Wer-
ner’s orthogenetic principle and parallels the develop-
mental trends reported by Friedman, Hemmendinger,
and Siegel (cited in Phillips & Smith, 1953). We
might also add that our 12-point ranking system rep-
resents a more differentiated and congruent quan-
titative translation of Friedman’s scoring system than
has been used before. For example, Becker (1956)
used only a 6-point system by sometimes ignoring 
the distinction between whole and detail responses 
and by arbitrarily grouping minus, vague, and amor- 
phous responses in some categories. 

Consistent with the factor analytic studies of the 
WAIS (Berger et al., 1964; Cohen, 1957a, 1957b),
three factor scores were derived by summing the 
total scaled scores of the appropriate WAIS subtests: 
Verbal Comprehension (Information, Comprehension, 
Similarities, and Vocabulary), Perceptual Organiza- 
tion (Picture Completion, Block Design, and Object 
Assembly), and Memory (Arithmetic, Digit Span, 
and Digit Symbol). 
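
As a rough sketch of this step, the three factor scores are plain sums of scaled scores. The subtest groupings below are those given in the text; the function name and the sample scaled scores are hypothetical.

```python
# Subtest groupings as given in the text.
WAIS_FACTORS = {
    "Verbal Comprehension": ("Information", "Comprehension",
                             "Similarities", "Vocabulary"),
    "Perceptual Organization": ("Picture Completion", "Block Design",
                                "Object Assembly"),
    "Memory": ("Arithmetic", "Digit Span", "Digit Symbol"),
}


def wais_factor_scores(scaled):
    """Sum the scaled scores of the subtests making up each factor."""
    return {factor: sum(scaled[name] for name in subtests)
            for factor, subtests in WAIS_FACTORS.items()}


# Hypothetical scaled scores, invented for illustration only.
example_scaled = {
    "Information": 11, "Comprehension": 10, "Similarities": 12,
    "Vocabulary": 9, "Picture Completion": 8, "Block Design": 10,
    "Object Assembly": 9, "Arithmetic": 7, "Digit Span": 9,
    "Digit Symbol": 10,
    "Picture Arrangement": 9,  # not assigned to any factor in the text
}
factor_scores = wais_factor_scores(example_scaled)
```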


Results and Discussion 


As an initial step in the analysis, covariance 
matrices were estimated for the five variables 
(three WAIS factors and two DL factors) 
separately for each sex. Box’s (1949) tech- 
nique was used in comparing these matrices, 
and this test produced an F ratio less than 
one, indicating little difference between the 
pattern of correlations for male and female 
subjects. This finding is in line with previous 
research, indicating that the DL construct can 
be meaningfully and consistently applied to 
both sexes (Cardwell & Greenberg, Note 1). 
Table 1
Correlations of DL Factors with WAIS
Factors and Full Scale IQ


DL factor
WAIS factor 1 2
Verbal .40**
Memory .28**
Perceptual By bed
Full Scale IQ .40**


Note. WAIS = Wechsler Adult Intelligence Scale;
DL = developmental level.

* p < .05.

** p < .01.


The data were then combined across sexes, 
and the five-variable Pearson product-moment 
correlation matrix was estimated. The corre- 
lations between the three WAIS factors and 
the two DL factors from this matrix are pre- 
sented in Table 1 along with the correlations 
between Full Scale IQ and the two DL factors.

As can be seen, the correlations are all posi-
tive, significant, and of similar magnitude.
Bartlett’s (1941, 1947) test of complete inde- 
pendence between two sets of data was ap- 
plied to correlations between WAIS and DL 
factors, and the result was significant, χ²(6) =
44.74, p < .001. The correlations between
Full Scale IQ and the DL factors were also 
significant (p < .01). These results reaffirm 
previous research findings of a relationship 
between DL and intellectual measures. 

To help answer questions concerning the 
specificity of this relationship, canonical cor- 
relations were estimated (Morrison, 1967). 
This technique attempts to identify sources 
of common variance between two sets of mea- 
sures by estimating weighted combinations of 
the variables that are most strongly corre- 
lated. After one such dimension is specified, 
its effects are removed from the matrix of 
correlations. Additional canonical variables, 
which are independent of the first dimension, 
can then be identified among the residual cor- 
relations. Each time a dimension is extracted 
from the matrix, Bartlett’s test of complete 
independence is reapplied to the residual cor- 
relation matrix until a nonsignificant result 
indicates that all common variance has been 
accounted for. 
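
The procedure described above can be sketched in plain Python. This is an illustration under stated assumptions, not the analysis actually run: the helper names are hypothetical, the correlation matrix is built from invented one-factor loadings rather than the study's data, the second set is assumed to contain exactly two measures (as with the two DL factors), and the chi-square uses a common textbook form of Bartlett's approximation, -[N - 1 - (p + q + 1)/2] Σ ln(1 - r²), which may differ in detail from the formula the authors applied.

```python
import math


def solve(A, B):
    """Solve A X = B by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [[float(v) for v in A[i]] + [float(v) for v in B[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda row: abs(M[row][col]))
        M[col], M[pivot] = M[pivot], M[col]
        pv = M[col][col]
        M[col] = [v / pv for v in M[col]]
        for row in range(n):
            if row != col:
                f = M[row][col]
                M[row] = [a - f * b for a, b in zip(M[row], M[col])]
    return [row[n:] for row in M]


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def canonical_correlations(R, p):
    """Canonical correlations between the first p variables and the last two."""
    Rxx = [row[:p] for row in R[:p]]
    Rxy = [row[p:] for row in R[:p]]
    Ryx = [row[:p] for row in R[p:]]
    Ryy = [row[p:] for row in R[p:]]
    if len(Ryy) != 2:
        raise ValueError("this sketch assumes two variables in the second set")
    # Eigenvalues of Ryy^-1 Ryx Rxx^-1 Rxy are the squared canonical correlations.
    M = matmul(solve(Ryy, Ryx), solve(Rxx, Rxy))
    tr = M[0][0] + M[1][1]
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    disc = math.sqrt(max(tr * tr / 4 - det, 0.0))
    return [math.sqrt(min(max(tr / 2 + s * disc, 0.0), 1.0)) for s in (1, -1)]


def bartlett_sequence(r, n, p, q):
    """(chi-square, df) for the roots remaining after removing the first k."""
    scale = n - 1 - (p + q + 1) / 2
    return [(-scale * sum(math.log(1.0 - x * x) for x in r[k:]),
             (p - k) * (q - k)) for k in range(len(r))]


# Hypothetical one-factor loadings for 3 + 2 measures; not the study's data.
loadings = [0.8, 0.7, 0.6, 0.7, 0.5]
R = [[1.0 if i == j else a * b for j, b in enumerate(loadings)]
     for i, a in enumerate(loadings)]
r = canonical_correlations(R, p=3)
tests = bartlett_sequence(r, n=86, p=3, q=2)
```

Because the invented matrix has a single common factor, the second canonical correlation is essentially zero, which is the pattern a purely general relationship would produce. With three and two measures, the degrees of freedom come out as 6 and 2, matching those reported in Table 2.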
Table 2
Loadings for Canonical Roots, Canonical
Correlations, and Significance Tests Between
WAIS and DL Factors


                          Loadings for
                         canonical roots
Factor and test          First     Second
WAIS
  Verbal                  .761       .052
  Memory                  ATS       -.725
  Perception              .443       .686
DL
  1 (total score)         .869       A158
  2                       .684       .170
r                         .611       .265
Bartlett’s test (χ²)     44.74**     6.59*


Note. WAIS = Wechsler Adult Intelligence Scale;
DL = developmental level.

* p < .025 (df = 2).

** p < .001 (df = 6).


Applied to the correlations between WAIS 
and DL factors, this technique produced the 
results summarized in Table 2. 

A maximum of two canonical correlations 
is possible in these data, and both were sig- 
nificant. The first and most important dimen- 
sion related, with essentially equal loading, 
the sum of the three WAIS factors to the sum 
of the two DL factors and produced a corre- 
lation of .61. This result strongly supports the 
notion of a nonspecific general factor as the 
most important source of covariance between 
WAIS and DL variables. Among WAIS vari- 
ables, such a dimension appears very similar 
to Spearman’s (1927) construct of “general 
intelligence (g).” Stern (1956) has concep-
tualized Spearman’s g as a property of the
mind that determines its capacity for “col-
lective coupling” of separate intellectual fac-
tors. In essence, his discussion, much like dis-
cussions of the DL construct, focuses on an
individual’s ability to effectively integrate dif-
ferentiated (mental) systems. Thus, differen-
tiation and integration seem to be key concepts
for both a theory of g and a theory of DL.
As already noted, these concepts are also con- 
sonant with Ainsworth and Klopfer’s (1954) 
view of a relationship between perceptual dif- 
ferentiation and level of intelligence. 

In addition, it is not surprising to find that
a measure based on Rorschach responses is 
associated with verbal abilities as well as per- 
ceptual skills. Indeed, numerous factor ana- 
lytic studies of the Rorschach have shown that 
verbal abilities play a significant role in the 
style and quality of Rorschach responses given 
(Murstein, 1965). 

The second canonical variable found in the 
present study accounted for substantially less 
common variance and produced a correlation 
of only .265. However, Bartlett’s test revealed
that this correlation is significant. The load- 
ings indicate that within the patient sample 
studied, there was a tendency for high DL
factor scores to be associated with high WAIS 
perceptual factor scores and low memory fac- 
tor scores. 

Overall, then, the present findings reaffirm 
a relationship between intellectual and psy- 
chological developmental factors in a psychi- 
atric population. Other statements about the 
relationship can now also be made—namely, 
that it is largely of a general nature, it does 
not rely specifically on verbal, perceptual, or 
memory factors, and it applies to females as 
well as males. 


Reference Note 


1. Cardwell, G. F., & Greenberg, R. P. A multi-
variate analysis of the Rorschach developmental
level score. Manuscript submitted for publication, 
1977. 


Life Event Scales: Psychophysical Training and
Rating Dimension Effects on Event-Weighting Coefficients

Subjects chosen randomly from the community (N = 54) responded to questionnaires
and rated either “social readjustment” or “stressfulness” for the events in the
Social Readjustment Rating Scale. Some subjects were given psychophysical training
before rating the events. The training did not have a significant effect on the
ratings. However, stressfulness ratings were consistently higher than those of
social readjustment, and the data suggest that this effect interacted with the
events rated. The interaction may mean that choice of event dimension influences
the ability to predict a dependent measure from life events.


The systematic scaling of life events, in- 
troduced by Holmes and Rahe (1967), was a 
major breakthrough in quantifying the rela- 
tionship between life events and both physi- 
cal and psychiatric illness. These investi- 
gators used Stevens’ (1974) technique of 
magnitude estimation to determine the 
amount of “social readjustment” that certain 
events would cause for an “average” person, 
regardless of the desirability of the event. 
Each subject was given a three-paragraph 
explanation of the rating task and then esti- 
mated the magnitude of social readjustment 
that each event would require. The original 
event coefficients, so-called since they were 
used to multiplicatively weight the occur- 
rence of events, were produced by a “sample 
of convenience” of 394. 

Results of investigations using events 
weighted by Holmes and Rahe’s (1967) co- 
efficients are plentiful. Weighted event scores 
have been used to predict sudden cardiac ar- 
rest (Rahe & Lind, 1970), time of myocardial 
infarction (Rahe & Paasikivi, 1971; Theorell 
& Rahe, 1971), occurrence of bone fractures 
(Tollefson, 1972), scholastic achievement 
(Harris, 1973), disease and illness rates
(Gunderson & Rahe, 1974), slight colds and 
fevers (Holmes & Holmes, 1970), and depres- 
sion (Paykel, 1973). Though only partial, 
this list illustrates the diverse usage that this 
method has recently enjoyed. The majority 
of these studies have demonstrated modest, 
positive correlations between life event scores 
and criterion measures. 

In this article we address several issues con- 
cerning Holmes and Rahe’s (1967) method. 
First, will a sample of subjects randomly 
selected from the community produce the 
same coefficients as the original sample? Sec- 
ond, given the brief instructions used by 
Holmes and Rahe, does practice with match- 
ing numbers to lines and lines to numbers as 
opposed to no training affect the coefficients? 
Third, does asking a subject to rate “stress-
fulness,” a term that incorporates the concept
of desirability rather than social readjustment, 
affect the ratings? If any of these variables 
are found to significantly affect the data, then 
we must address the question of what the 
implications are for investigators who must 
choose among the available sets of event
weightings.


Method 
Subjects 
One hundred households were randomly selected 
from the county telephone directory. Each was then
randomly assigned to one of four experimental con-
ditions such that there were 25 households in each.

ARTHUR A. STONE AND JOHN M. NEALE


Table 1
Summary of Analysis of Variance on Logs of Raw Scores


                                                      p
Source                     df*         F         F       Conservative F
Between subjects            53
  Rating dimension (A)       1      12.150      .002         .002
  Training (B)               1       1.781      .189         .189
  A × B                      1        .204     >.300        >.300
Within subjects          2,173
  Events (C)                41      24.185     <.001        <.001
  A × C                     41       1.582      .012         .215
  B × C                     41       1.102     >.300        >.300
  A × B × C                 41       1.096     >.300        >.300
  C × Subjects           2,050
Total                    2,267


* Not corrected for variance-covariance violations.


Households were sent a letter of introduction, a 
payment of $2, and, depending on experimental 
assignment, one of four questionnaires. The en- 
closed directions stated that any adult member of 
the family was eligible to complete the form. 
Several follow-up reminders were sent if the ques- 
tionnaire was not returned within 2 weeks. The 
final sample consisted of 54 people; all respondents 
save 1 were white, 76% were male, and most were 
considered middle class based on income and edu-
cation.

All subjects completed magnitude estimation of 
the 43 items found on the Holmes and Rahe Social 
Readjustment Rating Scale (SRRS; Holmes & Rahe, 
1967). The ratings were done in one of four con- 
ditions and formed a 2X2 factorial design. The 
first factor was whether the questionnaire incor- 
porated a magnitude-estimation training procedure 
prior to filling out the SRRS. The training condition
consisted of matching numbers to line lengths rang- 
ing from .3 cm to 30.0 cm and drawing lines to 
represent numbers ranging from 3 to 300. Each of 
these tasks was composed of two parts: a page 
with detailed instructions explaining how to use 
numbers and line lengths to represent relative mag- 
nitudes, followed by 13 number trials and 13 line 
trials. In the no-training condition, subjects com-
pleted the SRRS with no practice. Although the
training procedure required that participants com- 
plete a longer form, return rates were not affected 
by this variable: 27 subjects in each group returned 
correctly completed questionnaires.

The second factor was the dimension that subjects 
used in rating the SRRS events. In one condition, 
social readjustment, the instructions were identical 
to those used by Holmes and Rahe (1967), that is, 
a three-paragraph description of the meaning of 
social readjustment, how to do magnitude estima-
tions, and three examples of proportionally match-
ing numbers to events. However, in the second con-
dition, stressfulness, the words social readjustment
were replaced with stressfulness, and the word change 
was replaced with stress. One of the examples in the 
original SRRS instruction set was omitted from the 
instructions for the stressfulness condition, as there 
was no satisfactory means for changing the example 
so that it concerned stress and not social readjust- 
ment. One sheet of each questionnaire requested 
demographic information such as age, sex, education, 
income, and occupation. 


Results 


The hypotheses that training and rating 
dimension affect event coefficients were tested 
by a 2 (training vs. no training) × 2 (rating
dimension: readjustment vs. stress) × 42
(the SRRS events) unweighted means, re-
peated measures analysis of variance
(ANOVA). Data based on magnitude estima-
tion procedures have repeatedly been demon-
strated to conform to a log normal distribu-
tion (Stevens, 1974); thus, to meet the
normality assumptions of the ANOVA, 
base-10 logarithms of raw scores were sub- 
mitted to the analysis. Since departures from 
homogeneity of the variance—covariance 
matrices in repeated-measures ANOVAs pro- 
duce positively biased F ratios (Box, 1954), 
the obtained F ratios were tested assuming 
both the worst case of the variance-covari- 
ance violations and assuming the case with 
no violations at all. If F ratios are significant 
with degrees of freedom adjusted for the 

Table 2
Geometric Means on Social Readjustment Rating Scale Events for Stressfulness and Social
Readjustment Conditions Collapsed over Training Factor


                                               Geometric M rating
Item                                     Stressfulness   Social readjustment
Trouble with boss                              222               184
Jail term                                      989               540
Death of spouse                              1,905               916
Change in sleeping habits                      150               164
Death of close family member                 1,194               438
Change in eating habits                        135               150
Foreclosure on mortgage                        649               310
Change in personal habits                      164               156
Death of close friend                          804               252
Minor violations of law                        111                66
Personal achievement                           261               227
Pregnancy                                      367               304
Change in health of family member              687               342
Sexual difficulties                            611               338
In-law troubles                                265               200
Change in family get-togethers                 161               112
Change in financial state                      617               262
Gaining new family member                      433               259
Change in residence                            296               158
Child leaving home                             403               226
Marital separation                           1,219               443
Change in church activities                    160                95
Marital reconciliation                         716               375
Fired from work                                755               286
Divorce                                      1,151               438
Change in line of work                         420               292
Change in number of arguments                  569               399
Change in work responsibilities                406               215
Wife begin or stop work                    [illegible]       [illegible]
Change in working hours or conditions          308           [illegible]
Change in recreation                           247               173
Mortgage over $10,000                      [illegible]           234
Mortgage less than $10,000                 [illegible]       [illegible]
Personal illness                                65           [illegible]
Business readjustment                          581               247
Change in social activities                    226               154
Change in living conditions                    361               237
Retirement                                     347               332
Vacation                                       189               117
Christmas                                      192                90
Change to new school                           215               141
Begin schooling                                237               161



worst violation, then we have confidence in 
the effect. If an F ratio is not significant with 
this conservative test, yet is significant when 
tested under the no-violation condition, we 
are less confident in the effect and need more 
information to make any conclusive state- 
ments (Greenhouse & Geisser, 1959). 
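
The log transformation described above also underlies the geometric means reported for each event. A minimal sketch, assuming a hypothetical set of raw magnitude estimates (the numbers below are invented), shows how averaging base-10 logarithms and transforming back yields the geometric mean, and why it sits below the arithmetic mean when ratings are positively skewed:

```python
import math


def geometric_mean(ratings):
    """Back-transform the mean of the base-10 logs of the ratings."""
    logs = [math.log10(x) for x in ratings]
    return 10 ** (sum(logs) / len(logs))


# Hypothetical magnitude estimates for one event, invented for illustration.
hypothetical_ratings = [100, 1000, 10000]
arithmetic = sum(hypothetical_ratings) / len(hypothetical_ratings)
geometric = geometric_mean(hypothetical_ratings)
```

Here the arithmetic mean is 3,700 and the geometric mean is about 1,000, so the choice of summary statistic alone can shift a coefficient considerably.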

Table 1 presents the summary of the 
ANOVA with both uncorrected and conserva-
tive probabilities for each F ratio. The hy-
pothesis that psychophysical training affects
ratings of events was not confirmed, F(1, 50)
= 1.78, ns. The significant main effect of
rating dimension, F(1, 50) = 12.15, p =
.002, confirms the hypothesis that ratings of
stressfulness were different from ratings of
social readjustment. For all but three of the
events rated, coefficients produced in the 
stressfulness condition were equal to or 
higher than the ratings of social readjustment. 
A significant main effect for events with a 
very high F ratio, F(41, 2050) = 24.18, 
p < .001, was expected. This simply indicates 
that different events were given different 
ratings. A significant Rating Dimension X 
Events interaction was also observed, F(41, 
2050) = 1.58, p = .012; but because the in- 
teraction is not significant under the con- 
servative test, F(1, 41) = 1.58, p=.215, 
we are less confident in the effect. Geometric 
Figure 1. Relationship between arithmetic and geometric means of Social Readjustment Rating
Scale (SRRS) weightings and geometric means of stressfulness and social readjustment conditions. 


means for each of the rating dimensions, 
averaged across the training factor, are pre- 
sented in Table 2. 

Event coefficients obtained in the social 
readjustment condition, collapsed over levels 
of the training factor, are similar in magni- 
tude to the coefficients that Holmes and 
Rahe (1967) found for the SRRS. Figure 1 
presents the coefficients of the social read- 
justment condition found by us and Holmes 
and Rahe, as well as those obtained in the 
stressfulness condition of the present study. 
The observed differences between Holmes 
and Rahe and the present study’s coefficients 
obtained in the social readjustment condi- 
tion are largely a function of the measure of 
central tendency used to represent the data 
(see Figure 1). Thus, the social readjustment 
condition of the present study replicates the 
data that Holmes and Rahe obtained.


Discussion 


To our surprise, trained subjects did not
produce coefficients significantly different
from those produced by the nontrained sub-
jects. Thus, it is plausible that subjects can
do magnitude estimations with little train-
ing, though Stevens (1974) believes that
some practice may be useful. At least one
other interpretation of this finding must, how-
ever, be noted: The magnitude-estimation
training instructions may not have been ef-
fective. Some support for this hypothesis lies
in our observation that several subjects ap- 
parently used a straightedge while complet- 
ing the training. Since the instructions did 
not explicitly prohibit this, an oversight on 
our part, it is possible that this occurred 
frequently. If this was the case, we must 
conclude that the training hypothesis was 
not adequately tested. 

Coefficients from the stressfulness condi- 
tion were shown to deviate significantly from 
the coefficients in the social readjustment 
condition in both elevation and configuration 
over events. But what do these differences 
mean to the life events researcher? Scale dif- 
ferences must be evaluated in light of the 
original purpose of weighting the events with 
these ratings; namely, is illness contraction 
related to the experience of life events? In 
predicting various illnesses from measures of 
life events, a single life events score is usually 
computed as a weighted linear combination 
of the events experienced. The weights are
determined by subjects’ ratings of events 
along some semantic dimension. Therefore, 
different rating dimensions would be ex- 
pected to alter the weights used in forming 
the linear combination. Changing these
weights is likely, in turn, to alter the correla- 
tion between the life events score and the 
criterion, particularly if the relationship be- 
tween the weights from the two rating dimen- 
sions is not monotonic and linear. Given the 
obtained interaction between rating dimen- 
sion and events, we expect that correlations 
between events and an illness criterion will 
vary depending on the dimension used. 
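
The weighted linear combination discussed above can be sketched as follows. The weights are geometric mean ratings read from Table 2 for five events; the function name and the two respondents are hypothetical and were invented for illustration.

```python
# (stressfulness, social readjustment) geometric means from Table 2.
WEIGHTS = {
    "Death of spouse": (1905, 916),
    "Divorce": (1151, 438),
    "Change in sleeping habits": (150, 164),
    "Vacation": (189, 117),
    "Christmas": (192, 90),
}
DIMENSIONS = {"stressfulness": 0, "social readjustment": 1}


def life_event_score(events, dimension):
    """Sum the weights of the events a person reports, on one rating dimension."""
    column = DIMENSIONS[dimension]
    return sum(WEIGHTS[event][column] for event in events)


# Hypothetical respondents and the events each experienced.
respondents = {
    "A": ["Divorce", "Vacation"],
    "B": ["Change in sleeping habits", "Christmas", "Vacation"],
}
scores = {
    name: {dim: life_event_score(events, dim) for dim in DIMENSIONS}
    for name, events in respondents.items()
}
```

In this made-up case, respondent A's score is about 2.5 times B's under stressfulness weights (1,340 vs. 531) but only about 1.5 times B's under social readjustment weights (555 vs. 371). Because the two sets of weights are not proportional, the relative standing of respondents, and hence any correlation with a criterion, can change with the rating dimension.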

Our data demonstrate that the set of 
weights that an investigator uses may be 
linked to the prediction of criterion scores. 
Unfortunately, the direction of this effect is 
unknown. The researcher is thus left in the
unenviable position of choosing among scales. 
An obvious and somewhat tempting resolution 
of this dilemma is for the investigator to use 
all sets of coefficients and empirically, for 
example, using a multiple regression analysis, 
determine the best combinations of scales for 
the set of data. One must wonder, though,
what is to be learned about the perception of 
life events and their relationship to the cri- 
terion from such an exercise. Stated differ- 
ently, the meaning of the weights yielded by 
a multiple regression is not tied to any sub- 
stantive dimension of the events, such as 
social readjustment or stressfulness. Because 
the meaning of the prediction is of basic im- 
portance, identification of the factors that 
subjects use to rate life events is necessary 
for the advancement of an understanding of 
the interaction of life events with illness and 
psychological functioning. 

We suggest systematic investigation of the 
properties or dimensions that subjects incor- 
porate in their ratings. Multidimensional 
scaling methods are available for both test- 
ing the validity of rationally constructed di- 
mensions and for deriving dimensions with 
no preconceptions. Nelson (1967) has simi- 
larly suggested that multidimensional analy- 
ses, in particular discriminant and factor 
analysis, be used for evaluating life events. 
Explorations along these lines have the po-
tential to greatly clarify the complex inter-
actions that exist between person and en-
vironment.


Mood, Pleasant Events, and Unpleasant Events:
Two Pilot Studies


Lynn P. Rehm 
University of Pittsburgh 


The relationship between mood and both pleasant and unpleasant events is as-
sessed in two studies. Undergraduates made self-ratings of mood and kept daily
logs of pleasant and unpleasant events for approximately 2 weeks. Intrasubject
correlations in both studies suggested that mood was related to pleasant and
unpleasant events independently. Intersubject correlations were consistent but
nonsignificant. Cross-lagged correlations were significantly less than same-day
correlations. Weighted event scores produced marginally higher correlations
with mood than unweighted scores. Minor sex differences are noted. The impli-
cations of these results for theory and practice are discussed.


A common assumption in the behavioral 
literature on depression is that mood is a 
function of reinforcement (e.g., Ferster, 
1973; Lewinsohn, 1974). Although some 
basic questions relating to the precise defini- 
tion of reinforcement in daily life remain un- 
answered, empirical relationships have been 
obtained between mood and pleasurable 
events or activities that have a likely corre- 
spondence to reinforcement. Using the Pleas- 
ant Events Schedule (MacPhillamy & Lewin- 
sohn, Note 1), a 320-item list empirically 
developed to assess potentially reinforcing 
events in daily experience, intraindividual 
correlations have been demonstrated between 
mood and number of pleasant events (Lewin- 
sohn & Graf, 1973; Lewinsohn & Libet, 
1972). Depressed persons have been found to 
report fewer pleasant events than nonde- 
pressed psychiatric and normal groups (Mac- 
Phillamy & Lewinsohn, 1974). Wener and 
Rehm (1975) demonstrated a causal rela- 
tionship between manipulated rate of positive 
feedback in a laboratory task and subsequent 
mood.

Self-monitoring of mood and pleasant
events has been a part of a number of treat- 
ment programs for depression with a variety 
of rationales and functions. Lewinsohn’s 
(1974) clinical research program employed 
the Pleasant Events Schedule as a means of 
empirically selecting targets for behavioral 
intervention. Events that correlate with mood 
for a specific individual are increased in 
order to influence mood (Lewinsohn, 1976). 
Fuchs and Rehm (1977) developed the Posi- 
tive Activities Schedule, consisting of 20 
categories of instrumental behavior likely to 
be associated with reinforcement. Depressed 
subjects kept a log of their daily activities 
using the schedule as a guide. Logs were used 
as a basis for self-selection of target behaviors
and were also assumed to be an intervention
in and of themselves, that is, an intervention
modifying depressive, pessimistic
self-monitoring.

Anton, Dunbar, and Friedman (1976) used 
activity logs as part of a therapy program 
that included scheduling of individual rein- 
forcing activities. Enjoyability ratings of each 
activity logged were described as a potential 
dependent variable. Ad hoc activity sched- 
ules were used in three case studies described 
by Rush, Khatami, and Beck (1975). Activ- 
ity data were used to confront clients’ cog- 
nitively distorted interpretation of their be- 
havior. Another ad hoc use of activity logs 
was described by McLean (1976), who used
them primarily to assess improvement in behavioral
productivity as one alternative component
of therapy for depression.

Table 1
Average Mood and Events Data Across Subjects: Study 1

                                 Males^a        Females^b       Total
Variable                         M      SE      M      SE       M      SE
Mean daily mood rating           4.91   .16     5.99   .16      5.63   .91
Mean daily pleasant events       3.66   1.05    4.12   1.66     3.97   1.48
Mean daily unpleasant events     2.85   1.36    2.51   1.28     2.62   1.29

a n = 10.
b n = 20.

Given the widespread use of pleasant event 
or activity monitoring in depression therapy 
programs and the empirical support for the 
relationship between pleasant events and 
mood, it is somewhat surprising that only 
activities associated with reward have been 
studied. All of the research cited has dealt 
with “pleasant events” or “positive activi- 
ties,” yet aversive events have been related 
to depression in a number of ways. Aversive, 
stressful life events have been found to pre- 
cede clinical depression (cf. Dohrenwend & 
Dohrenwend, 1974). Seligman (1975) has 
demonstrated empirically that noncontingent 
aversive events lead to a state of learned 
helplessness, which he associates with depres- 
sion. Lewinsohn, Lobitz, and Wilson (1973) 
found that depressed persons were particularly
sensitive to aversive events. It would
seem logical that mood would be related to
aversive events in the daily life of normal
persons.

Two pilot studies were conducted to ex- 
plore the feasibility of assessing the relation- 
ship between unpleasant events and mood in 
a manner that could be adapted to clinical 
usage. In both studies events were listed by 
individuals in positive or negative columns on 
a daily log with a general definition as a 
guide. Pleasant events were defined as any 
event that is pleasant, enjoyable, or reward- 
ing, and unpleasant events were defined as 
any event that is unpleasant, aversive, or 
punishing. Mood was assessed on a daily basis
on a 0-10 rating scale, anchored at 0 = worst
mood ever to 10 = best mood ever. This
scale was adopted for simplicity and ease of
administration. Aitken (1969; Aitken &
Zealley, 1970) described the use of a similar
simple scale that correlated well with psychiatric
ratings of depression. In both studies
subjects were instructed to make their mood 
ratings and then fill out the events logs daily 
at the end of the day. 


Study 1 
Method 


Log sheets and standard instructions were given 
as a class assignment to 33 undergraduates enrolled 
in a psychology course. Usable data were obtained 
from 30 subjects (10 males and 20 females) who 
kept logs for approximately 2 weeks (1 for 13 days, 
13 for 14 days, 15 for 15 days, and 1 for 16 days). 


Results 


The actual events recorded varied some- 
what in specificity and type. Some sample 
pleasant events that were recorded by stu- 
dents included “read Sunday paper,” “good 
lunch,” “sunny day,” “letter from Adrienne,” 
“bought albums,” “good grade on bio test,” 
“Joni Mitchell concert,” “party,” “saw good 
movie,” “got high,” and “complimented by 
Mrs. F.” Some examples of unpleasant events 
were “bio test,” “dentist appointment,” “got 
parking ticket,” “missed bus,” “got blister 
playing squash,” “Pitt lost game,” “argument 
with roommate,” “did the laundry,” and 
“dull Geography class.” 

Means for data collected are shown in 
Table 1. Males’ ratings of their mood were 
significantly lower than those for females, 
t(28) = 3.66, p < .01. Differences in number
of events were not significant though in directions
consistent with the mood differences.

Table 2
Average Intraindividual Correlations: Study 1

                                                  M
                                                               With          With
                                                               previous      subsequent
Correlated variables             Males^a  Females^b  Total^c   events^c      events^c
Mood and pleasant events          .65      .55        .58      [illegible]   [illegible]
Mood and unpleasant events       -.36     -.48       -.45      -.11           .00
Pleasant and unpleasant events   -.15     -.28       -.22       —             —
Mood and events (R)               .75      .68        .70       —             —

a n = 10.
b n = 20.
c n = 30.

Mean correlations were calculated using 
Fisher’s z transformation and were then con- 
verted back to Pearson product-moment cor- 
relations. No significant differences in corre- 
lations were found between sexes. Looking at 
the total group, the average correlations be- 
tween pleasant and unpleasant events with 
mood were all statistically significant. The 
relatively low, negative, average correlation 
between the two classes of events suggests 
that pleasant and unpleasant events make 
relatively independent contributions to mood. 
The average multiple correlations for pleas- 
ant and unpleasant events correlated with 
mood (see Table 2) were larger than either 
single correlation and this further suggests 
mood is a function of both pleasant and un- 
pleasant events. 
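
The averaging step described above (transform each subject's correlation to Fisher's z, average, and convert back) can be sketched as follows. This is a minimal illustration, not the authors' computation; the function name and the example correlations are hypothetical.

```python
import math

def mean_correlation(rs):
    """Average correlations via Fisher's z: r -> atanh(r), mean, then tanh back."""
    zs = [math.atanh(r) for r in rs]   # Fisher's z transformation of each r
    mean_z = sum(zs) / len(zs)         # average in z space
    return math.tanh(mean_z)           # convert back to a Pearson r

# Hypothetical per-subject correlations between mood and pleasant events.
example_rs = [0.30, 0.60, 0.80]
print(round(mean_correlation(example_rs), 3))
```

Because the transformation stretches values near 1, the z-based mean is somewhat larger in absolute value than the plain arithmetic mean of unequal correlations with the same sign.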

Lewinsohn and Libet (1972) posed the 
question of whether mood would be affected 
by the previous day’s activity or whether 
mood would affect the subsequent day’s activ- 
ity. This question can be answered by look- 
ing at correlations between mood and events 
of the previous day and between mood and 
events of the subsequent day. Table 2 shows
that these mean cross-lagged correlations are 
quite low and comparable to those obtained 
by Lewinsohn and Libet. The same-day cor- 
relations were significantly larger than those 
for mood and events of the previous day
(for mood and pleasant events, t(29) = 9.30,
p < .01; for mood and unpleasant events,
t(29) = 5.40, p < .01) or for mood
and events of the subsequent day (for mood
and pleasant events, t(29) = 6.55, p < .01;
for mood and unpleasant events, t(29) =
5.95, p < .01). Thus there was no indication
of causality from events of one day to mood 
on the next or from the mood of one day to 
the events of the next. Lewinsohn and Libet’s 
finding for pleasant events and mood was 
replicated and extended to unpleasant events 
as well. 
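
The same-day and cross-lagged correlations discussed above can be computed for a single subject by shifting one daily series against the other. The sketch below assumes each subject's log is two equal-length lists ordered by day; the function names are hypothetical and the routine is illustrative only.

```python
from statistics import mean

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def lagged_correlations(mood, events):
    """Return (same-day r, r with previous day's events, r with subsequent day's events)."""
    same_day = pearson(mood, events)
    with_previous = pearson(mood[1:], events[:-1])    # mood on day t, events on day t-1
    with_subsequent = pearson(mood[:-1], events[1:])  # mood on day t, events on day t+1
    return same_day, with_previous, with_subsequent
```

Each lagged series is one day shorter than the same-day series, so with about 14 days of data the lagged correlations rest on about 13 pairs.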

Although the methodology is directed 
toward intrasubject questions, intersubject 
data can also be derived. For the 30 subjects, 
average mood correlated .31 with average 
pleasant events and —.28 with average un- 
pleasant events. The multiple correlation be- 
tween mood and both event scores was .57. 
These data are consistent with the intra- 
subject data, though only the latter correlation
is statistically significant (df = 28, p <
.01). A correlation of .46 (df = 28, p < .05)
between average pleasant and average un- 
pleasant events suggests individual differences 
in list length for this self-monitoring format. 
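
As a check on the intersubject figures above, the multiple correlation for two predictors can be recovered from the three pairwise correlations with the standard two-predictor formula. The sketch below is illustrative; the function name is hypothetical, and only the three reported correlations are taken from the text.

```python
import math

def multiple_r(r_y1, r_y2, r_12):
    """Multiple correlation of a criterion with two predictors.

    R^2 = (r_y1^2 + r_y2^2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12^2)
    """
    r_squared = (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)
    return math.sqrt(r_squared)

# Reported values: mood with pleasant events (.31), mood with unpleasant
# events (-.28), and pleasant with unpleasant events (.46).
print(round(multiple_r(0.31, -0.28, 0.46), 2))
```

The positive correlation between the two event counts, combined with their opposite-signed relations to mood, is what lets the multiple correlation exceed either single correlation by a wide margin.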


Study 2 


The second study had three purposes: 
(a) to replicate the first study with regard to 
monitoring pleasant events; (b) to determine 
whether a value weighting for each event 
would yield an obtained reinforcement mea- 
sure that would enhance correlations with 
mood; and (c) to determine whether perceived
contingency contributes to mood correlations.

Table 3
Average Mood and Events Data Across 14 Subjects: Study 2

                                                                   Contingent
                                 Raw score      Weighted score     score
Variable                         M      SE      M       SE         M       SE
Mean daily mood ratings          5.83   .80     —       —          —       —
Mean daily pleasant events       5.53   1.96    32.52   12.51      19.33   7.66
Mean daily unpleasant events     3.68   1.71    21.00   10.60      11.22   8.27


Method 


Thirty-four undergraduates enrolled in a psy- 
chology course were assigned a choice of projects, 
one of which was to participate in this study. Four- 
teen students elected to participate and kept moni- 
toring data for 14 days. Instructions were the same 
as in Study 1, except for the addition of weighting 
and perceived contingency instructions. After re- 
cording each event, subjects were instructed to give 
it a value on a 0-10 scale. For pleasant events, 10 
was to signify “an extremely enjoyable or pleasant 
event,” and for unpleasant events, 10 was to signify 
“an extremely aversive or unpleasant event.” Zero 
signified neutrality on either scale. 

To assess perceptions of contingency, participants 
were asked to indicate how many of the value points 
assigned to each event were directly attributable to 
themselves or their own behavior (i.e., their effort or
lack of effort or their skills or lack of skill). This 
method allowed for a more continuous assessment of 
degree of perceived response contingency than would 
a dichotomous yes or no. Attribution research (cf. 
Weiner et al., 1971) suggests that such judgments
are continuous. 
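
The three daily scores implied by these instructions (a raw count, a value-weighted sum, and a sum of self-attributed points) can be sketched as below. The record layout and names are hypothetical, and the validation rule that self-attributed points cannot exceed an event's value is an assumption drawn from the wording "how many of the value points."

```python
from dataclasses import dataclass

@dataclass
class Event:
    description: str
    value: int        # 0-10 rating of how pleasant (or unpleasant) the event was
    self_points: int  # portion of `value` attributed to one's own behavior

def daily_scores(events):
    """Return raw, weighted, and contingent scores for one day's list of events."""
    for e in events:
        if not 0 <= e.self_points <= e.value <= 10:
            raise ValueError(f"invalid ratings for event: {e.description!r}")
    return {
        "raw": len(events),                              # number of events logged
        "weighted": sum(e.value for e in events),        # sum of value ratings
        "contingent": sum(e.self_points for e in events) # sum of self-attributed points
    }

# Hypothetical pleasant-events column for one day.
day = [Event("good grade on bio test", 8, 6), Event("sunny day", 4, 0)]
print(daily_scores(day))
```

Pleasant and unpleasant columns would be scored separately, giving six event scores per subject per day alongside the mood rating.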


Results 


The average daily mood rating for these 
subjects was 5.83, which corresponds closely 
with Study 1. Subjects in this study (see 
Table 3) recorded more events, both pleasant,
t(42) = 2.652, p < .05, and unpleasant, t(42)
= 2.050, p < .05, than subjects in Study 1.
These differences may be due to a different
set acquired from the class discussion of the
project or to self-selection differences. In
either case they do not affect the hypotheses
in question.

Mean correlations between mood and 
events (see Table 4) were slightly smaller 
but comparable to those obtained in Study 1.
Again, pleasant and unpleasant events were
uncorrelated with each other on the average,
but each contributed to mood variance.

The use of weightings enhanced the aver- 
age intraindividual correlations to a minor 
degree. A comparison of pairs of correlations 
with mood between unweighted and weighted 
pleasant events scores indicated that the 
weighted correlations were larger to a marginally
significant degree, t(13) = 1.809,
p < .05, one-tailed. The increase in the mean
unpleasant events correlation with weighting
was not significant, t(13) = 1.240. Thus the
evidence suggests that weighting event scores
may have only a slight value in identifying 
relationships with mood more accurately. 
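
The comparison of paired correlations described above corresponds to a paired t test with one pair per subject (14 subjects, hence 13 degrees of freedom). The sketch below is illustrative only: the function name is hypothetical, and whether the original analysis transformed the correlations (for example, to Fisher's z) before differencing is not stated in the text, so no transformation is applied here.

```python
import math
from statistics import mean, stdev

def paired_t(unweighted, weighted):
    """Paired t statistic and degrees of freedom for two matched lists of scores."""
    diffs = [w - u for u, w in zip(unweighted, weighted)]
    n = len(diffs)
    standard_error = stdev(diffs) / math.sqrt(n)  # SE of the mean difference
    return mean(diffs) / standard_error, n - 1
```

A positive t indicates that the weighted scores correlated more strongly with mood on average; the one-tailed test reported above reflects a directional prediction.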

Table 4
Average Intrasubject and Intersubject Correlations: Study 2

                                 Intrasubject M                   Intersubject M
Correlated variables             Raw    Weighted  Contingent     Raw          Weighted  Contingent
Mood and pleasant events         .51    .59       .52            .12          .18       [illegible]
Mood and unpleasant events       [illegible: asana —.27 = 13. = ae]
Pleasant and unpleasant events   .08    -.01      .03            [illegible]  .62       [illegible]
Mood and events (R)              .70    .15       .70            .33          .34       [illegible]

The average correlations of mood and
self-attributed points were of the same magnitude
as the correlations between mood and
number of events. No particular advantage
appears to accrue from specifying reinforce- 
ment value that is perceived as contingent on 
the person’s own behavior. 

Again it is notable that the average corre- 
lation between pleasant and unpleasant events 
was close to zero. Each class of events ap- 
pears to contribute to mood independently. 
Average multiple correlations between mood 
and both sets of event scores (Table 4) bear 
out this assertion. 

As can be seen in Table 4, interindividual 
correlations between mood and events were 
in the predicted directions but were quite 
small and not statistically significant. Corre- 
lations between pleasant and unpleasant 
events scores were significant for raw scores 
(p < .01) and weighted scores (p < .05).
These suggest significant individual differ- 
ences in reporting rates for pleasant and un- 
pleasant events even though these classes are 
relatively independent within subjects. Mul- 
tiple correlations for mood with the events 
scores combined were again consistent with 
intrasubject findings but were not statisti- 
cally significant. 

It is particularly interesting that perceived 
contingent event scores do not enhance cor- 
relations with mood. As an additional way of 
examining whether perceived contingency is 
related to mood, a ratio of self-attributed 
points to total weighted scores was calcu- 
lated for each subject for both pleasant and 
unpleasant events. Averaged across all individuals,
59.4% of the value of pleasant events
was self-attributed, whereas 49.5% of the 
value of unpleasant events was self-attributed. 
Correlations between these ratios and average 
mood would indicate whether subjects who 
tend to attribute a greater proportion of their 
pleasant or unpleasant events value to them- 
selves would be more or less depressed than 
subjects who see these events as less con- 
tingent. The correlation between the ratio of 
perceived contingent pleasant events and 
mood was —.14. The correlation between 
the ratio of perceived contingent unpleasant 
events and mood was .004. Needless to say, 
neither was significant. Thus there was no 
evidence that tendency to perceive con- 


tingency between events and behavior influ- 
enced mood in this study. 
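
The self-attribution ratio used above can be sketched for one subject and one class of events as follows. The function name is hypothetical, and the sketch assumes the ratio is taken over a subject's totals across all logged days; the text does not say whether totals or daily ratios were averaged.

```python
def self_attribution_ratio(weighted_totals, contingent_totals):
    """Proportion of total event value that a subject attributed to own behavior.

    Both arguments are per-day totals for one class of events (pleasant or unpleasant).
    """
    total_value = sum(weighted_totals)
    if total_value == 0:
        raise ValueError("no event value recorded")
    return sum(contingent_totals) / total_value
```

One such ratio per subject for each event class would then be correlated with the subject's average mood across the group.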


Discussion


These two studies suggest that even with 
this simple methodology, it is feasible to 
assess the relationships between mood and 
events, both pleasant and unpleasant, in 
meaningful ways. Despite individual differ- 
ences in recording rates, pleasant and un- 
pleasant events were recorded with relative 
independence, and each correlated with mood.
The magnitude of correlations between mood 
and pleasant events using these less struc- 
tured methods was comparable to that ob- 
tained by Lewinsohn and Libet (1972) and 
Lewinsohn and Graf (1973). Although the 
correlations with unpleasant events were 
regularly of a lesser magnitude, they never- 
theless make a significant additional contribu- 
tion to mood. 

Evidence from the second study suggests 
that a simple scaling of event magnitude con- 
tributes only marginally to greater precision 
in identifying relationships between mood 
and events. Differentiating the value by the 
degree to which events are self-attributed 
did not seem to contribute to correlations 
with mood for either pleasant or unpleasant 
events. Events are related to mood whether 
contingent on one’s own behavior or entirely 
external in origin. 

Thus there appears to be a potential 
clinical value to self-monitoring unpleasant 
as well as pleasant events in conjunction with 
behavior therapy for depression. Unpleasant 
events that correlate with mood may also be 
appropriate target activities for modification 
(Lewinsohn, 1976). The data generated by 
monitoring unpleasant events may also con- 
tribute to helping clients make more accurate 
discriminations about functional relationships 
between their behavior and their mood. As a 
result, they may be able to make more realis- 
tic evaluations and set more realistic goals 
via techniques analogous to those used by 
Fuchs and Rehm (1977) or Rush et al. 
(1975). 

Certain qualifications must accompany 
these conclusions. First, these studies should 
indeed be considered as only promising pilot 
studies. Problems concerning the precise 
definition and validity of unpleasant events 
remain, as they remain for pleasant event recording.
Further basic research is certainly
necessary. Second, these studies suggest po- 
tential for monitoring unpleasant events only 
in terms of intrasubject investigations. Inter- 
subject correlations were largely nonsignifi- 
cant; individual differences in recording rates 
were evident; and sex differences were ob- 
tained. The development of structured meth- 
ods for assessing unpleasant events, parallel 
to those available for assessing pleasant 
events, may allow for valid nomothetic 
studies. 


Reference Note 


1. MacPhillamy, D., & Lewinsohn, P. M. The Pleas- 
ant Events Schedule. Unpublished manuscript,
University of Oregon, 1971. 


References 


Aitken, R. C. B. Measures of feeling using analogue 
scales. Proceedings of the Royal Society of Medi- 
cine, 1969, 62, 989-993. 

Aitken, R. C. B., & Zealley, A. K. Measurement of 
mood. British Journal of Hospital Medicine, 
1970, 4, 214-224. 

Anton, J. L., Dunbar, J., & Friedman, L. Anticipa-
tion training in the treatment of depression. In 
J. D. Krumboltz & C. E. Thoresen (Eds.), Coun- 
seling methods. New York: Holt, Rinehart & 
Winston, 1976. 

Dohrenwend, B. P., & Dohrenwend, B. S. (Eds.). 
Stressful life events: Their nature and effects. 
New York: Wiley, 1974. 

Ferster, C. B. A functional analysis of depression.
American Psychologist, 1973, 28, 857-870.

Fuchs, C. Z., & Rehm, L. P. A self-control behavior
therapy program for depression. Journal of Consulting
and Clinical Psychology, 1977, 45, 206-215.


Lewinsohn, P. M. A behavioral approach to depres- 
sion. In R. M. Friedman & M. M. Katz (Eds.), 
The psychology of depression: Contemporary 
theory and research. New York: Wiley, 1974.

Lewinsohn, P. M. Activity schedules in treatment of 
depression. In J. D. Krumboltz & C. E. Thoresen 
(Eds.), Counseling methods. New York: Holt, 
Rinehart & Winston, 1976. 

Lewinsohn, P. M., & Graf, M. Pleasant activities 
and depression. Journal of Consulting and Clini- 
cal Psychology, 1973, 41, 261-268. 

Lewinsohn, P. M., & Libet, J. Pleasant events, activ- 
ity schedules and depression. Journal of Abnormal 
Psychology, 1972, 79, 291-295. 

Lewinsohn, P. M., Lobitz, W. C., & Wilson, S. Sensi- 
tivity of depressed individuals to aversive stimuli.
Journal of Abnormal Psychology, 1973, 81, 259- 
263. 

MacPhillamy, D. J., & Lewinsohn, P. M. Depression 
as a function of levels of desired and obtained 
pleasure. Journal of Abnormal Psychology, 1974,
83, 651-657. 

McLean, P. Therapeutic decision-making in the be- 
havioral treatment of depression. In P. O. David- 
son (Ed.), The behavioral management of anxiety, 
depression and pain. New York: Brunner/Mazel, 
1976. 

Rush, A. J., Khatami, M., & Beck, A. T. Cognitive 
and behavior therapy in chronic depression. Be- 
havior Therapy, 1975, 6, 398-404. 

Seligman, M. E. P. Helplessness: On depression, de- 
velopment and death. San Francisco: Freeman,
1975. 

Weiner, B. et al. Perceiving the causes of success and 
failure. Morristown, N.J.: General Learning Press, 
1971. 

Wener, A. E., & Rehm, L. P. Depressive affect: A 
test of behavioral hypotheses. Journal of Abnormal
Psychology, 1975, 84, 221-227.


Received March 2, 1977


Journal of Consulting and Clinical Psychology
1978, Vol. 46, No. 5, 860-868


Cognitive and Personality Factors in Suicidal Behavior 

The role of aggression in suicidal behavior is studied. The personality functioning
of 20 suicide attempters, 20 nonsuicidal psychiatric controls, and 20 suicide
completers was assessed using the Rorschach. There were 11 women and 9 men 
in each group, and their ages ranged from 21 to 63. Feffer’s role-taking task 
provided a test of the cognitive functioning of the first two groups. All three 
groups experienced the breakthrough of more aggressive than libidinal drive 
derivatives, but no significant differences between the groups were found. Only 
the suicide attempters’ aggressive responses were more primitive than their 
libidinal responses. On the role-taking task, the suicidal group’s cognitive func- 
tioning in the neutral situation was superior to their functioning in the aggres- 
sive one. The control group yielded no such difference. The suicidal group's 
performance in the aggressive situation was also significantly inferior to that of the
control group. These results are interpreted as underscoring the role of cognition
in symptom choice.


Conflicts over aggression play a central role 
in the psychoanalytic theory of suicide. Freud 
(1917/1957) struggled with the problem of 
how the ego, with its self-love and narcissistic 
libido, could allow its self-destruction. The 
answer, he theorized, lies in the ego’s treating 
itself as an object, directing against that ob- 
ject the hostility and sadism that were its 
original reaction to objects in the outside 
world. Freud (1920/1950) later added to his 
theory the concept of a drive toward death, 
to be found in the controlling, coercing, and 
punishing components of the superego. In de- 
pression, the superego obtains a hold on con- 
sciousness and, as a “pure culture of the death 
instinct,” drives the ego into death. 

Other psychoanalytic thinkers have modi- 
fied this theory. Zilboorg (1937) wrote that 
suicide is likely only when the individual has 
identified with a dead person, and that the 
process of identification has to take place 
during childhood or adolescence, at a time
when the incorporated person is already dead.
Menninger (1938) agreed with Freud that
suicide is the wish to kill another turned inward,
but he also saw it as the ego’s punishing
itself for that crime and wishing to be
killed.

This article is based on a doctoral dissertation submitted
by Andrew M. Geller to Yeshiva University
under the supervision of Alvin Atkins.

Although the early psychoanalytic literature 
on suicide was based primarily on case stud- 
ies, later writers have attempted broader, 
more experimental analyses of the role of ag- 
gression in suicidal behavior. Their results are 
inconclusive. Several researchers found that 
suicidal individuals are more hostile than non- 
suicidals (Vinoda, 1966), that their dreams 
contain more themes of violence than those of control
subjects (Raphling, 1970), and that they tend 
to resent those on whom they depend (Lester, 
1969). Others found no significant differences 
in aggressive impulses or ideation between 
suicidal and nonsuicidal subjects (Eisenthal, 
1967; Fisher, 1971). 

Little research has focused on the role of 
cognitive functioning in suicidal behavior, al- 
though there is evidence to suggest that cog- 
nitive dysfunction can lead to symptomatic 
behavior. Feffer (1959) extended Piaget’s 
(1950) concept of decentering activity in the 
physical, inanimate world to an analysis of
the individual’s cognitive structuring of so- 
cial content. In his theory, roles and role 
reciprocals are the social polarities in a rela- 
tionship of subject and interpersonal object 
that parallel the subject-object relationship 
of impersonal cognition. In order for subjects 
to construct the reality of the other accurately, 
to be able to modulate their behavior in an- 
ticipation of the behavior of the other, they 
must be able to coordinate simultaneously 
their role and the role of the other in the 
same situation. They must be aware of both 
perspectives. Prior to attainment of this abil- 
ity to simultaneously decenter in social situa- 
tions, the subjects are able only to react to 
the behavior of the other. At yet a more 
primitive level, subjects have great difficulty 
even being aware of perspectives different 
from their own, and their behavior is almost 
exclusively egocentric. 

Lowenherz and Feffer (1969) found that 
adult subjects had greater difficulty coordinat- 
ing role perspectives when the role was one 
that was defensively isolated, that is, when 
the subject had defined one of the role recip- 
rocals as “least like me” or “personally unac- 
ceptable to me.” This elaboration of Feffer’s 
model of interpersonal cognitive development 
has been used to explain symptomatic behav- 
ior. Such behavior may be thought of as the 
result of isolation between dynamically rele- 
vant schemata. The subjects’ relationship to 
reality is distorted, as they focus on only one 
dimension in the situation, unaware of the 
corrective influence of the opposite perspec- 
tive. For example, Ward (1976) tested the 
hypothesis that the “other” in psychotic de- 
lusions, particularly if that other is a malev- 
olent, threatening figure, is a representation 
of defensively isolated aspects of the self. She 
found that the subjects indeed functioned at 
a more primitive cognitive level when taking 
the roles with delusional themes than they 
did when working with neutral themes. 

The present study examined both cognitive 
and defensive functioning in seriously suicidal 
individuals. Given the central role that aggression
plays in the psychoanalytic theories of
suicidal behavior, it was predicted that within
the suicidal group there would be more breakthrough
of primary process elements in response
to aggressive stimuli or impulses than
in response to libidinal stimuli and impulses, 
and the defenses against these breakthroughs 
would be less effective for aggressive than for 
libidinal material. It was expected that as a 
result of these weak defenses, the thinking of 
the suicidal subjects would contain a greater 
proportion of primary process elements than 
would the thinking of the control group in 
dealing with aggressive stimuli and impulses. 
Finally, it was predicted that subjects whose 
symptomatic behavior included making serious 
suicide attempts would have difficulty co- 
ordinating the roles represented by that symp- 
tom, namely victim and aggressor. This dif- 
ficulty has been interpreted as a function of 
the subject’s more primitive cognitive func- 
tioning when assuming the role of the aggres- 
sor, which is the reciprocal of the sympto- 
matically exaggerated role orientation of the 
victim. 


Method 


Subjects 


There were two groups of subjects, one experi- 
mental and one control. Each group was comprised 
of 20 subjects with a wide range of age, education, 
and socioeconomic status. All subjects were inpatients 
in a private psychiatric hospital in New York City. 

The first experimental group was made up of 20 
randomly selected patients who had made serious 
suicidal attempts. Only subjects whose suicide at- 
tempts were potentially lethal were included in this 
group. Patients who met that criterion but at the 
time scheduled for testing were overtly psychotic 
(actively hallucinating and delusional) were excluded 
from the study. Also excluded were patients who were 
diagnosed as suffering from more than minimal brain 
damage, more than minimal retardation, or a life- 
threatening physical illness. Although each subject’s 
diagnosis, at both admission and discharge, medica- 
tion regimen, family history, sex, marital status, edu- 
cation, and socioeconomic level were noted, these 
factors played no part in determining eligibility for 
the study. 

The control group, also comprised of 20 subjects, 
was made up of randomly selected inpatients at the 
same hospital. Each subject had no history of sui- 
cidal ideation, suicidal impulses, or gestures. Both 
the history of the subject’s present illness and 
past history as recorded in the chart, plus consulta- 
tion with the subject’s therapist, were used in mak- 
ing this determination. As with the experimental 
group, patients who were actively psychotic or suffering
from serious organic brain disease, mental retardation,
or a life-threatening physical illness were
excluded. 

A third “group” was composed of data from pa- 
tients who had actually committed suicide. Twenty 
Rorschach protocols were randomly collected from 
records of subjects who were tested while inpatients 
at the hospital and who within 1 year of that test- 
ing killed themselves. The use in this study of the 
available data from this group is an attempt to par- 
tially eliminate a conceptual difficulty found in most 
research on suicide. In general, such studies have 
examined subjects who attempted suicide and main- 
tained that the findings were also applicable to those 
individuals who succeed in killing themselves. In 
this study, the data from those who committed sui- 
cide were used as a comparison with the information 
obtained from the suicide attempter group, providing 
some notion of the validity of that information. 

The demographic characteristics of all three groups 
were similar. The ages ranged from below 20 to 
over 60, with a majority of subjects between 20 and 
50 years of age. There were 11 women and 9 men 
in each group, and all three groups were virtually 
all Caucasian with the exception of 2 black sub- 
jects. Over half of the subjects in each group were 
single. In the nonsuicidal group, over half were 
Catholic, but religion was fairly equally represented 
in the other groups. All subjects in each group were 
high school or college graduates. 


Procedure 


All testing was done in individual sessions held in 
an office on the subject’s inpatient unit. Inclusion in
the study was voluntary, and each subject was as- 
sured of confidentiality and anonymity. 

There were two parts to the stimulus situation. 
The Rorschach was used to test the defensive func- 
tioning of the subjects in the presence of aggressive 
and nonaggressive stimulation. Feffer’s (1959) role- 
taking task (RTT) was used to measure their cognitive
functioning in aggressive and neutral situations.
The two tests were presented to the subjects in 
counterbalanced order. 


Rorschach 


The Rorschach inkblots were administered to each 
subject in the usual fashion. Subjects were first 
shown the 10 cards in sequence and were asked to
tell what they looked like or reminded them of. On 
the first two cards, if only one response was given, 
another was requested. Thereafter, no further en- 
couragement was given by the examiner. After the 
subjects had responded to all 10 cards, they were 
asked to go through each one again with the ex- 
aminer to explain what it was about the blot that 
determined the response given. 

Subjects’ responses to the Rorschach were scored
according to Holt’s (Note 1) system for scoring
primary process on the Rorschach. For Holt, the
distinction between primary and secondary process
is a function of the degree to which drive dominates 
logic and reality in a response and the nature of 
the drive itself. Holt’s system measures manifesta- 
tions of primary process on the Rorschach and the 
means by which a person tries to control it and de- 
fend against the anxiety its emergence presumably 
entails. Scorable Rorschach responses are divided 
into two groups, those motivated by libidinal drives 
and those resulting from aggressive drives. These two 
categories are further divided into 11 subgroups, such 
as oral, anal, aggression-subject, and aggression-ob- 
ject. Each response is scored 1 or 2 depending on 
its level of what Holt called “primariness.” The 
arbitrary boundaries between the two levels are de- 
fined by two points. One is a continuum from raw, 
shocking, blatant, or primitive forms of the drive 
in question (Level 1) to civilized, socially accepta- 
ble forms that are more appropriate to social com- 
munication between strangers in a professional situa- 
tion (Level 2). The second criterion is the degree 
to which the response focuses on the drive-relevant 
aspect of an image, a division useful primarily for 
libidinal responses. 

Because this binary division seemed too crude, Holt 
introduced a 5-point scale of Defense Demand (DD), 
which serves primarily to further differentiate re- 
sponses along a continuum of primariness, with the 
DD score rising from 1 to 5 as the response be- 
comes less socially acceptable and more bizarre. The 
DD score is a function only of the underlying idea 
of the response. However, the way in which that 
idea is expressed is obviously also important in an 
assessment of primary process. Holt dealt with this 
by including 47 categories in which to score meth- 
ods of control or defense manifested in the response. 
The effectiveness of these defenses is recorded by 
yet another score, Defense Effectiveness (DE). This 
rating is a combined measure of a response’s form 
level and the affect accompanying it, with a posi- 
tive DE score reflecting a well-defended response 
and a negative score indicating one that was poorly 
defended. 
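
The scoring scheme just described can be pictured as a small record per response. The sketch below is only an illustration of how the three summary measures (percentage of drive-related responses, mean DD, mean DE) might be tabulated for one protocol. The class name, field names, and example record are hypothetical, and it assumes that percentages are taken over all responses in the record; Holt's manual defines the actual scoring rules.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List, Optional


@dataclass
class Response:
    """One Rorschach response; field names are hypothetical."""
    drive: Optional[str]  # "libidinal", "aggressive", or None when no drive content is scored
    dd: int = 0           # Defense Demand, 1 (socially acceptable) to 5 (bizarre)
    de: float = 0.0       # Defense Effectiveness; positive means well defended


def summarize(protocol: List[Response], drive: str) -> dict:
    """Return percentage of the record, mean DD, and mean DE for one drive category."""
    scored = [r for r in protocol if r.drive == drive]
    if not scored:
        return {"percent": 0.0, "mean_dd": None, "mean_de": None}
    return {
        # Assumption: the percentage is taken over all responses in the record.
        "percent": 100.0 * len(scored) / len(protocol),
        "mean_dd": mean(r.dd for r in scored),
        "mean_de": mean(r.de for r in scored),
    }


# Invented 10-response record: 2 aggressive, 1 libidinal, 7 without drive content.
example_protocol = (
    [Response("aggressive", dd=2, de=1.0), Response("aggressive", dd=4, de=-0.5)]
    + [Response("libidinal", dd=1, de=0.5)]
    + [Response(None) for _ in range(7)]
)
aggressive_summary = summarize(example_protocol, "aggressive")
```

Per-subject summaries of this kind would then be the values entering the group comparisons.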


Role-Taking Task 


The assigned content of the RTT revolved around 
two themes, “hurt” and “please.” Hurt was chosen 
to stimulate stories containing both aggressor and 
victim, with the role reciprocals viewed as central to 
the choice of suicide as a symptom behavior. The 
control stimulus was please, a theme chosen for its 
relatively neutral quality with regard to social de- 
sirability and dynamic impact. 

Each subject was shown a white 5 × 8 inch (12.5 ×
20.5 cm) index card on which two identical stick
figures, separated by the word HURT, were drawn.
The subject was directed to use the card as a takeoff
point for making up a story about two people,
one of whom hurts the other. When the initial story
was completed, the subject was shown another card
identical to the first, except that an arrow was
drawn under the figure on the left. This time the
directions were to “make believe that you are one
of the characters in the story, the figure on the left.
Tell the same story that you told before, but retell
it from the perspective of that person, as he or
she would tell it.” Following completion of this
story, the subject was shown a third card identical
to the others, except that the arrow was now under
the figure on the right. Instructions for this card
were the same as for the second card, except this
time the subject was asked to take the role of the
other character in the story.

The subject was then shown another three-card
series, with the word PLEASE replacing the word HURT
on otherwise identical cards, and was directed to go
through the RTT procedure again, this time telling
a story about two people, one of whom does something
to please the other. Half of the subjects were
presented with the please theme before the hurt theme.

The two sets of stories produced by each subject
were then scored to determine the subject’s
decentering level for each of the two themes. The
scores obtained for the please stories were taken as
a baseline of the subject’s ability to coordinate roles
and their reciprocals in a relatively neutral situation.
Scores for stories containing the roles of victim
and aggressor were used to indicate the change
in the subject’s cognitive functioning when dealing
with defensively isolated aspects of personality.

All stories were scored twice, in accordance with 
the RTT coordination method and the RTT differ- 
entiation method. Although both methods assess the 
subject’s ability to take different role perspectives, 
they provide different measures of this ability. The 
RTT coordination method focuses on the degree and 
quality of coordination between the different perspectives.
Increasingly higher scores are given to reflect
superior ability to assume different roles and
shift from one to the other while maintaining inner-role
consistency and balance in the situation described.
At one extreme are the categories for obvious
inconsistency between the characters’ viewpoints.
At the other extreme, the subject manifests
an ability to synthesize the two perspectives, with
each character mindful of the external role orientation
of the other.

The RTT differentiation method provides a measure
of the subject’s ability to differentiate and coordinate
attributes of the “self” and “other” within
a single retelling of the initial story. Each retelling
provides a measure of decentering ability from the
perspective of one given role. The subject’s ability
to decenter was thus measured in four different situations,
as victim, as aggressor, as “pleaser,” and as
“pleased,” or in other words, as the subject of dynamically
isolated role attributes and as the subject
of attributes not dynamically isolated. Low scores
indicate little or no shifting of perspective from one
character to the other, whereas higher scores reflect
an increasingly sophisticated ability of one character
to be taken as the object of an internalized
state of the other.


Table 1
Comparisons Within Groups of Responses
With Libidinal or Aggressive Content

Group             n     Libidinal   Aggressive   Tᵃ

Mean percentage
  Attempters      19      11.25       23.25      29.5**
  Completers      18       8.10       18.05      [illegible]
  Nonsuicidals    17      13.80       17.60      30.5*

Mean defense demand
  Attempters      19       1.48        2.00      [illegible]
  Completers      20       1.80        2.22      73.0
  Nonsuicidals    18       1.83        2.34      63.0

Mean defense effectiveness                       Uᵇ
  Attempters                .45         .58      122
    n                      14          19
  Completers               1.02         .95       98
    n                      14          20
  Nonsuicidals              .68         .70      118
    n                      15          18

ᵃ Wilcoxon matched-pairs signed-ranks test.
ᵇ Mann-Whitney U.
* p < .05.
** p < .005.


Reliability of Scoring 


Protocols from the Rorschach and the RTT were 
scored by both the examiner and another scorer, a 
graduate student in clinical psychology. Neither 
scorer knew to which subject group the protocol 
being scored belonged. Interjudge reliability coeffi- 
cients were obtained for scores of the Rorschach and 
the RTT. The overall interjudge correlation on the 
Rorschach scores was .83. The coefficients were higher 
for the RTT. For the coordination scores the reliability
coefficient was .88, and it was .96 for the differentiation
scores. For all numerical scores on which
the judges differed, the final score was the average 
of the original two. 
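
As a rough illustration of the reliability procedure, the sketch below computes a correlation between two judges' scores and averages the pairs to obtain final scores. It assumes the interjudge coefficient was a Pearson product-moment correlation, which the text does not state, and the function names and example scores are invented for the illustration.

```python
from math import sqrt
from typing import List, Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson product-moment correlation between two equally long score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def final_scores(judge_a: Sequence[float], judge_b: Sequence[float]) -> List[float]:
    """Average the two judges; identical ratings are left unchanged by the average."""
    return [(a + b) / 2 for a, b in zip(judge_a, judge_b)]


# Invented scores for five protocols.
judge_a = [3, 2, 4, 1, 5]
judge_b = [3, 3, 4, 1, 4]
reliability = pearson_r(judge_a, judge_b)
resolved = final_scores(judge_a, judge_b)
```

Averaging every pair gives the same result as averaging only the disagreements, because two identical ratings average to themselves.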


Results 


The results of this study fall into two 
groups, those that indicate the subjects’ per- 
sonality functioning as reflected in their Ror- 
schach responses and those based on the RTT 
scores that describe their cognitive function- 
ing in interpersonal situations. The analyses 
of the results of this study were done with 
nonparametric tests, as the results were not
normally distributed and did not have equal
variances.


Table 2
Comparison of All Rorschach Scores: Suicide Attempters and Completers

Group           M %     n    Uᵃ     M DD    n    Uᵃ     M DE    n    Uᵃ

Aggressive
  Attempters   23.25   20           2.00   20            .58   19
                            160                 182                 156
  Completers   18.05   20           2.22   20            .95   20
Libidinal
  Attempters   11.25   20           1.48   20            .45   14
                            163                 183                  73
  Completers    8.10   20           1.80   20           1.02   14

Note. DD = defense demand; DE = defense effectiveness.
ᵃ Mann-Whitney U; all values are ns.


Personality Functioning: Rorschach Results 


Table 1 summarizes the within-group com- 
parisons for all three subject groups. As pre- 
dicted, the suicidal subjects, both attempters 
and completers, experienced breakthroughs of 
significantly more aggressive than libidinal 
drive material in responding to the Rorschach.
The nonsuicidal control group also had a 
greater percentage of aggressive than libidinal 
drive-related responses. However, the level at 
which that difference is significant was lower 
for the control group than for either of the 
suicidal groups. 
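
For readers unfamiliar with the Wilcoxon matched-pairs signed-ranks test used in these within-group comparisons, the following is a minimal sketch of how the T statistic is obtained from paired scores. It assumes the common convention that zero differences are dropped and that T is the smaller of the two signed rank sums; the function names and the example scores are invented and are not the study's data.

```python
from typing import List, Sequence, Tuple


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1 upward, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def wilcoxon_t(first: Sequence[float], second: Sequence[float]) -> Tuple[float, int]:
    """Return (T, number of nonzero pairs) for paired scores."""
    diffs = [a - b for a, b in zip(first, second) if a != b]  # zero differences dropped
    ranks = average_ranks([abs(d) for d in diffs])
    positive = sum(r for d, r in zip(diffs, ranks) if d > 0)
    negative = sum(r for d, r in zip(diffs, ranks) if d < 0)
    return min(positive, negative), len(diffs)


# Invented percentages of libidinal and aggressive responses for six subjects.
libidinal = [18, 17, 20, 15, 19, 16]
aggressive = [21, 19, 20, 18, 18, 20]
t_value, n_pairs = wilcoxon_t(libidinal, aggressive)
```

A small T relative to the number of nonzero pairs indicates that the differences run predominantly in one direction; significance is then read from tables of critical values for T.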

The aggressive responses of the suicide at- 
tempters were significantly more primitive 
than their libidinal responses as indicated by 
the differences in DD scores. Comparison of 
those scores for the nonsuicidal control group 
did not yield significant differences between 
their aggressive and libidinal responses. The 
degree to which their responses were rated as 
primitive or bizarre was not a function of the 
particular drive content of the responses, al- 
though this was the case for the suicide at- 
tempters. 

The suicide completer group also showed 
no significant differences between the DD 
scores of their aggressive and libidinal responses.
This was the one instance in which
their results did not parallel those of the at- 
tempter group. 

Contrary to the prediction, no significant
differences were found between the DE scores
of the libidinal and aggressive responses for
either suicidal group. It therefore appears that
even the attempter group, which experienced
the breakthrough of a great deal of primitive
aggressive drive material, defended against
their aggressive responses as well as they did
against their libidinal responses. The completer
group also was equally effective in dealing
with the aggressive and libidinal material
in the records.

Comparisons between the three subject 
groups on each of the Rorschach measures 
studied were all nonsignificant. The suicidal 
groups did not seem to be dealing with more 
aggressive drive material than was the non- 
suicidal group. In addition, the suicidal 
group’s defenses against those impulses were 
not more primitive or less effective than were 
the defenses of the control group. Table 2 
summarizes the between-group comparisons 
for the two suicidal groups, and Table 3 contains
the results of comparisons between the
suicidal and control groups.


Cognitive Functioning: RTT Results 


The results of the RTT coordination scores 
fully support the hypothesis that suicidal 
subjects function at a more primitive cognitive 
level than nonsuicidal subjects when presented
with aggressive stimuli. Table 4 summarizes 
the within-group comparisons of coordination 
scores. Suicidal subjects were clearly deficient 
in their ability to maintain a consistent theme 
while taking the roles of both aggressor and vic- 
tim, when their performances were compared 
to their ability to coordinate perspectives in 
a neutral situation. Nonsuicidal subjects manifested
no such difficulty when their performances
with the aggressive and neutral themes
were compared.


Table 3
Comparison of All Rorschach Scores: Suicidal and Control Groups

Group                   M %     n    Uᵃ     M DD    n    Uᵃ     M DE    n    Uᵃ

Aggressive
  Suicide attempters   23.25   20           2.00   20            .58   19
                                    180                 187                 [illegible]
  Nonsuicidals         17.60   20           2.34   20            .70   18
                                    195                 197                 154
  Suicide completers   18.05   20           2.22   20            .95   20
Libidinal
  Suicide attempters   11.25   20           1.48   20            .45   14
                                    193                 182                  93
  Nonsuicidals         13.80   20           1.83   20            .68   18
                                    154                 194                 107
  Suicide completers    8.10   20           1.80   20           1.02   14

Note. DD = defense demand; DE = defense effectiveness.
ᵃ Mann-Whitney U; all values are ns.

Although the coordination scores of the suicidal
subjects for the hurt situation were significantly
lower than their scores in the please
situation, it is important to note that both of
the mean coordination scores of the nonsuicidal
control group were lower than either mean of
the suicidal group. Possibly the control
group’s lower coordination scores were a manifestation
of the effect that their different psychopathologies
had on their cognitive performance
in that interpersonal situation. Perhaps
the suicidal subject, in dealing at an unconscious
level with aggressive impulses, has
overemphasized the role of benevolent or nurturing
person. In any case, this difference between
the groups requires that the coordination
scores of each subject group in the aggressive
situation be compared to their scores
in the neutral situation in order to measure
[...] suicidal group differed significantly from that
of the control group, with the suicidal subjects
generally performing at a higher level in
the neutral situation and the nonsuicidal subjects
doing better in the aggressive situation.
The mean difference between hurt and please
for the suicidal and nonsuicidal groups was
−2.75 and 1.45, respectively (U = 132,
p < .05).




The results of the analyses of the differen- 
tiation scores of the two subject groups also 
support the hypotheses. The cognitive func- 
tioning of the suicidal subjects when taking 
the isolated role of aggressor was significantly 
more primitive than their performance in the 
ego-syntonic role of victim. There were no 
significant differences in their abilities to as- 
sume the role reciprocals with a neutral theme 
or in the nonsuicidal subjects’ performances 
in any of the four roles. Interestingly, the sui- 
cidal subjects achieved their lowest differen- 
tiation scores among all roles when taking the 
role of aggressor and the highest of all their 
differentiation scores for the role of victim. 
These results are displayed in Table 5. 

Comparisons between the two groups for 
the differentiation scores, found in Table 6, 
also support the prediction. The magnitude of 
the difference between the suicidal group’s 
scores in the roles of aggressor and victim was 
significantly greater than the magnitude of the 
difference between the control group’s scores 
for those roles. There were no such findings 
in the between-group comparison for the neu- 
tral situations. 
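
The between-group comparisons reported here rest on the Mann-Whitney U test. As a hedged sketch of the statistic itself, the function below counts, over all cross-group pairs, how often a score in one group exceeds a score in the other, with ties counted as one half, and reports the smaller of the two resulting U values. The function name and the example difference scores are invented for illustration and do not reproduce the study's data.

```python
from typing import Sequence


def mann_whitney_u(group_a: Sequence[float], group_b: Sequence[float]) -> float:
    """Return the smaller of the two Mann-Whitney U statistics."""
    u_a = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                u_a += 1.0   # a pair in which the group_a score is higher
            elif a == b:
                u_a += 0.5   # tied pairs are split between the groups
    u_b = len(group_a) * len(group_b) - u_a
    return min(u_a, u_b)


# Invented aggressor-minus-victim difference scores for two small groups.
suicidal_diffs = [-1.0, -0.5, -1.5, 0.0]
control_diffs = [0.5, 0.0, 1.0, -0.5]
u_value = mann_whitney_u(suicidal_diffs, control_diffs)
```

With two groups of 20, U can range from 0 to 200 under this convention, and smaller values indicate less overlap between the groups.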


Discussion 


This study examined two major areas of 
functioning, cognitive and defensive. The results
fully supported the hypotheses about


Table 4
Comparisons Within Groups of Role-Taking
Task Coordination Scores for Dynamically
Isolated and Relatively Neutral Role
Reciprocals

                 M coordination
Group            Hurt     Please     Tᵃ
Suicidal        18.15     20.90      37*
Nonsuicidal     17.95     16.50      70

Note. n = 18.
ᵃ Wilcoxon matched-pairs signed-ranks test.
* p < .025.


cognitive functioning. The suicidal subjects 
differed from the control group in a very de- 
fined manner. When asked to take the re- 
ciprocal roles of aggressor and victim in a 
story about one person hurting another, a 
situation that highlighted what is presumably 
the central conflict in suicidal behavior, these 
subjects had difficulty in coordinating the 
same thematic material from two opposite per- 
spectives. In addition, they were more adept 
at assuming the ego-syntonic role of victim 
than the ego-alien role of aggressor, with sev- 
eral of the suicidal subjects actually comment- 
ing that they “just couldn’t” tell the story 
from the aggressor’s point of view. These same 
subjects performed significantly better on all 
measures when taking roles in the dynamically 
neutral situation of one person pleasing an- 
other. They had less difficulty coordinating 
the two roles and took the roles of both 


Table 5
Comparisons Within Groups of Role-Taking
Task Differentiation Scores for Dynamically
Isolated and Relatively Neutral Roles

Group                  M differentiation       Tᵃ

                      Aggressor    Victim
Suicidal (17)           2.67        3.33      21.0*
Nonsuicidal (14)        2.75        2.44      33.5

                      Pleaser      Pleased
Suicidal (17)           3.27        3.14      80.5
Nonsuicidal (15)        2.68        2.85      52.0

Note. Numbers in parentheses are ns.
ᵃ Wilcoxon matched-pairs signed-ranks test.


pleaser and pleased with equal facility. For
the nonsuicidal subjects, there were no sig- 
nificant differences in their performance in 
either the aggressive or neutral situation, and 
their scores in the aggressive situation were 
significantly higher than those of the suicidal 
group. 

The results of this study that pertain to de- 
fensive functioning partially supported the 
hypotheses. As predicted, suicidal subjects experienced
breakthroughs into consciousness of
more aggressive drive-related than libidinal
drive-related material. Subjects who later committed
suicide also seemed to have been dealing
with more aggressive impulses during testing.
In addition, the aggressive responses of
subjects who had attempted suicide were more
primitive and more bizarre than their libidinal
responses.

Some of the results did not turn out as predicted.
Although, as noted above, the aggressive
responses of the suicide attempters were
more primitive than their libidinal responses,
there were no significant differences in the
suicidal subjects’ ability to defend against
either type of response. There would seem to
be at least two explanations for this. It may
be that Holt’s DE score, used in this study
as a measure of the sophistication of a subject’s
defensive functioning, was not the most
appropriate instrument for the task. That is,
individuals may defend against bizarre or
primitive responses in ways that are more or
less immature or pathological. However, if the
defense mechanisms, regardless of their sophistication,
succeed in protecting the subjects
from conscious anxiety or confusion, then they
have been effective and would be scored as

Table 6
Comparisons of Mean Differences Between
Differentiation Scores for Dynamically Isolated
and Relatively Neutral Role Reciprocals

M difference between    Suicidal    Nonsuicidal    Uᵃ
Aggressor–victim          −.67         +.36         92*
Pleaser–pleased           +.06         −.31        194

Note. n = 20 for all four conditions.
ᵃ Mann-Whitney U.
* p < .01.


such. So, in interpreting these findings, it
could be said of suicidal individuals that
though in aggressive situations they may indeed
use defenses that are developmentally
immature, they do not necessarily decompensate
in such cases to any greater degree than
in other circumstances. Indeed, nowhere in
the literature is it suggested that a clear sign
of high suicide potential is a person’s overt
disorganization in response to aggression.
Rather, the response is said to be an unconscious
regression of the ego to more primitive
forms of defense.

There is another explanation for the lack
of significant differences between the libidinal
and aggressive DE scores of the suicidal subjects.
It is possible that there actually was no
difference in the maturity of these subjects’
defensive responses to the breakthrough of
various types of drive-related material. Their
ego functioning was not clearly defined under
a variety of different circumstances; instead it
remained fairly constant throughout. Such an
interpretation can be expanded to deal with
the other two findings in this study that were
not predicted by the hypotheses.

It had been expected that significant differences
between suicidal and nonsuicidal
subjects would appear in their personality
functioning in aggressive situations. This was
not the case. There were no significant differences
between the three groups on any of the
Rorschach scores. Furthermore, just as with
the two suicidal groups, the nonsuicidal psychiatric
controls also experienced the breakthrough
into consciousness of more aggressive
than libidinal material on the Rorschach, although
the degree of difference was not nearly
as high for the control group. These results
may be used to bring into question the psychoanalytic
theory of the significance of aggression
in suicide.

However, the similarity between the suicidal
and nonsuicidal psychiatric groups can
also be understood as a reflection of the importance
of aggressive conflicts not only in
self-destructive behavior but also in a number
of other diagnostic entities. Jacobson
(1971), in attempting to combine ego psychology
with psychoanalytic drive theory, implicated
aggression as a [...]
depression, whether suicidal ideation is present
or not. Silverman (1966), using the technique
of subliminal stimulation, showed that
unconscious aggressive impulses are directly
related to ego pathology in schizophrenia.
One therefore might well view various forms
of psychopathology not as discrete entities but
as forming a continuum of symptomatic expressions
of universal conflicts. Both suicidal
and nonsuicidal psychiatric subjects must deal
with a large amount of aggressive drive-related
material, because aggressive conflicts are
a major factor in pathological ego functioning.
Perhaps the overall severity of the pathology,
whatever its form, is a function of the extent
to which drive-related material breaks into
consciousness.

This is not to imply that any such breakthrough
is pathological. It is clear that aggressive
impulses are an important component
of the healthiest personality. The presence of
these impulses can be seen as evidence of ego
dysfunction only when it leads to severe anxiety
or when aggression dominates the personality
to the exclusion of other drives.
Therefore, if so-called normal control subjects
were tested and compared with suicidal or
other hospitalized subjects, differences might
be found in the percentages of aggressive responses
in the total record or in the predominance
of aggressive over libidinal material.
But one could not expect to find such differences
when dealing exclusively with a hospitalized
population.

For the same reason, no significant differences
were found either within or between
groups for DE scores. As discussed above,
these scores can be understood as measures of
the individual’s overall ego functioning, and
none of the three groups functioned significantly
more poorly than any other.

If aggression plays a major role in both
schizophrenia and depression, whether suicidal
ideation is present or not, what dictates the
form in which the pathology will manifest itself?
What determines the various symptomatic
expressions of general ego dysfunction?
The results of this study suggest that the
individual’s cognitive functioning, the area in
which the clearest differences between the suicidal
and nonsuicidal groups were [...]
define the environment and thus determine the
nature of symptom expression. Suicidal sub- 
jects do not differ significantly from their 
nonsuicidal psychiatric counterparts in over- 
all level of ego pathology or overall level of 
cognitive functioning. However, they do show 
more markedly primitive functioning in inter- 
personal situations with aggressive content. 
In this area there is evidence of clear isola- 
tion among suicidal subjects between the role 
reciprocals of aggressor and victim. They are 
at their best cognitively when taking the role 
of victim, but they have so dissociated them- 
selves from the ego-alien role of aggressor that 
perspective in the aggressive situation remains 
relatively unavailable to them. 

One could say that they have thus defined 
their environment as one that precludes the 
direct outward expression of aggression, a 
relatively stable condition that in itself might 
not give rise to pathological behavior. How- 
ever, when external events lead to the con- 
scious experience of aggressive drive deriva- 
tives, the suicidal subjects can deal with the 
resulting breakdown of ego defenses only by 
treating themselves as victims. At deeper 
levels of the unconscious, such actions may 
indeed succeed in killing introjected others, 
but consciously the suicidal subjects can per- 
ceive the situation only as one in which they 
alone have been hurt. Because nonsuicidal 
subjects are able to assume the role of ag- 
gressor in an aggressive situation, self-destruc- 
tive behavior is a less likely, and certainly a 
less necessary, outcome of the breakthrough 
of aggressive drive derivatives, though it may 
lead them to ego pathology as serious as that 
of the suicidal group. 


Reference Note

1. Holt, R. R. Manual for scoring primary process
material in the Rorschach. Unpublished paper,
New York University, 1970. (Mimeo)


Relationships Between WISC-R Factors, Wide-Range
Achievement Test Scores, and Visual-Motor Maturation in 
Children Referred for Psychological Evaluation 


James M. Stedman
Department of Psychiatry
University of Texas Health Science
Center at San Antonio and
Community Guidance Center, San Antonio

G. Frank Lawlis
North Texas State University

Robert H. Cortner
University of Texas Health Science
Center at San Antonio and
Community Guidance Center, San Antonio

Gloria Achterberg
Community Guidance Center, San Antonio


The present study investigated relationships between the Kaufman Wechsler
Intelligence Scale for Children—Revised factors (Verbal Comprehension,
Perceptual Organization, and Freedom from Distractability), Wide-Range
Achievement Test (WRAT) scores, and visual-motor maturation in a sample of 106
children. These children ranged in age from 6 to 13, and they had been referred
for clinical evaluation because of a variety of school-related problems,
including learning and classroom behavior problems. Results indicated significant
correlations between the Verbal Comprehension factor and the Reading, Spelling,
and Arithmetic measures on the WRAT and all measures of visual-motor
maturation. The Perceptual Organization factor correlated significantly with
Reading, Spelling, and all visual-motor maturation measures. The Freedom
from Distractability factor correlated significantly with Arithmetic.


Since Luty (1967) and Sattler (1974) devised
methods for the clinical use of Cohen’s
(1959) factor analysis of the Wechsler Intelligence
Scale for Children (WISC), clinicians
have increasingly used a factor score
approach to WISC interpretation. No doubt
this tradition will continue with Kaufman’s
(1975) factor analysis of the Wechsler Intelligence
Scale for Children—Revised (WISC-R);
and, in fact, Sattler has suggested WISC-R
interpretations. Given this increasing trend
toward use of the factor approach in clinical
interpretation, it would seem important to
establish correlational networks between
WISC-R factor scores and other criteria, particularly
those tests most commonly used by
clinicians.

The present study sought to investigate
relationships between the Kaufman factors
Verbal Comprehension (VC), Perceptual Organization
(PO), and Freedom from Distractability
(FD); achievement, as reflected on
the Wide-Range Achievement Test (WRAT;
Jastak & Jastak, 1965); and perceptual maturation,
as reflected in the Koppitz (1964)
scoring of the Bender-Gestalt in a sample of
school children referred for psychological
evaluation. The following relationships were
hypothesized: (a) Since the VC and PO factors,
as identified by Kaufman, are equivalent
to the Verbal and Performance IQ scores on
the WISC-R, a positive relationship with
Reading and Spelling on the WRAT was
anticipated. (b) Since Verbal IQ has been
shown to be positively related to WRAT
Math scores (Jastak & Jastak, 1965) and
since math is thought to be sensitive to distractability,
it was anticipated that the FD
and the VC factors would be correlated with
Arithmetic achievement on the WRAT. (c) It
was anticipated that the PO factor would be
positively correlated with Koppitz scores on
the Bender-Gestalt.


Gloria Achterberg is currently a student at North
Texas State University.

Requests for reprints should be sent to James M.
Stedman, Department of Psychiatry, University of
Texas Health Science Center at San Antonio, 7703
Floyd Curl Drive, San Antonio, Texas.

Copyright 1978 by the American Psychological Association, Inc. 0022-006X/78/4605-0869$00.75


Table 1
Intercorrelation Matrix of Wechsler Intelligence Scale for Children—Revised Subtests

Test                          1     2     3
 1. Information
 2. Similarities             58
 3. Arithmetic               57    52
 4. Vocabulary               68    60    56
 5. Comprehension            68    59    51
 6. Picture Completion       21    31    31
 7. Picture Arrangement      38    43    34
 8. Block Design             31    32    26
 9. Object Assembly          38    36    35
10. Coding                   23    14    36
11. Digit Span               36    33    36


Method 
Subjects 


The subjects for this study were students (76 
males and 30 females). These students ranged in age 
from 6 to 13 with a mean age of 9.5 and were 
distributed across grade levels as follows: first, 
13.2%; second, 16.9%; third, 17.9%; fourth, 11.3%;
fifth, 10.4%; sixth, 14.1%; seventh, 7.5%; and
eighth, 8.4%. Their IQs ranged from 60 to 118 with
a mean of 88.1 and a standard deviation of 13.1. All 
were referred for evaluation because of learning 
and/or classroom behavior problems. Of these 106 
subjects, 90% had a Spanish surname, and it is 
likely that a high percentage of these subjects are 
bilingual. The remainder were Anglo or black, with 
blacks constituting only 2% of the total sample.


Procedure 


As part of a service contract with the parochial 
school system, Archdiocese of San Antonio, WISC-R, 
WRAT, and Bender protocols were collected during 
the 1975-1976 school year. Because of a desire to 
study factor structure per se, the WISC-R protocols 
were intercorrelated with respect to the 11 subtests, 
resulting in an 11 × 11 matrix. (The Maze subtest
was not administered.) A principal-components
analysis (with squared multiple correlations in the
diagonal) was performed, terminating with an eigenvalue
less than 1. Three factors, accounting for 64.6%
of the variance, were retained and rotated according
to the varimax criterion. Factor scores were computed
for each subject, and these scores were transformed
into a normal distribution.
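
The retention rule described above (keep factors until an eigenvalue falls below 1) can be sketched as follows. The eigenvalues are hypothetical and chosen only to sum to 11, which assumes unities in the diagonal of an 11-variable correlation matrix; the study itself used squared multiple correlations in the diagonal, so its actual eigenvalues and variance figures would differ. The function name is likewise an assumption of this sketch.

```python
from typing import Sequence, Tuple


def retain_factors(eigenvalues: Sequence[float], cutoff: float = 1.0) -> Tuple[int, float]:
    """Return (number of factors kept, proportion of total variance they explain)."""
    ordered = sorted(eigenvalues, reverse=True)
    kept = [value for value in ordered if value >= cutoff]  # stop once below the cutoff
    return len(kept), sum(kept) / sum(ordered)


# Hypothetical eigenvalues for 11 subtests; not taken from the study.
hypothetical_eigenvalues = [4.0, 1.5, 1.25, 0.75, 0.75, 0.5, 0.5, 0.5, 0.5, 0.5, 0.25]
n_factors, variance_share = retain_factors(hypothetical_eigenvalues)
```

In this invented case three factors pass the cutoff and together account for 6.75 of the 11 units of variance. Rotation to the varimax criterion is a separate step applied to the retained loadings and is not shown.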


These factor scores, which were essentially equivalent
to the Kaufman factors, were then correlated
with age-appropriate WRAT Level 1 or Level 2
standard scores for Reading, Spelling, and Arithmetic
and Koppitz error scores for the Bender-Gestalt
protocols. Because Koppitz raw scores are
more normally distributed in the 6- to 10-year-old
age range, correlations were calculated only with
subjects up through the fifth grade (n = 77). The
error scores included total number of scorable errors,
total significant, and total highly significant errors.
Significant and highly significant errors are those
that Koppitz (1964) found to occur in protocols of
children diagnosed as brain injured.


Results 


Table 1 presents the intercorrelation 
matrix for the 11 subtests, and Table 2 pre- 
sents the factor structure obtained by the 
methods described above. 

Table 3 presents Pearson correlations between
the three factor scores and the WRAT
standard scores for Reading, Spelling, and
Arithmetic. As anticipated, the VC factor
contributed significantly to Reading scores
(p < .005), but it should be noted that the
PO factor also was significantly related
(p < .05), although the magnitude of the relationship
was considerably less. The VC
factor also was significantly related (p <
.005) to Spelling scores, and again the PO
factor made a small but significant contribution
(p < .05). The FD factor, as expected,
related significantly (p < .005) to the Arithmetic
score and, as anticipated, the VC factor
also contributed (p < .05). The PO factor
apparently contributes little to Arithmetic
as measured by the WRAT.
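
As a rough check on how correlations of this size relate to significance levels, a Pearson r can be converted to a t statistic with n - 2 degrees of freedom. The sketch below uses the sample size of 106 from the Subjects section and three coefficients that are legible in Table 3; whether the reported tests were one- or two-tailed is not stated in the text, so the t values are given without a verdict.

```python
import math


def t_from_r(r, n):
    """t statistic for a Pearson r based on n paired observations (df = n - 2)."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


n_subjects = 106  # sample size reported in the Subjects section
examples = [("VC-Spelling", 0.38), ("VC-Arithmetic", 0.20), ("PO-Reading", 0.19)]
for label, r in examples:
    t = t_from_r(r, n_subjects)
    print(f"{label}: r = {r:.2f}, t({n_subjects - 2}) = {t:.2f}")
```

The resulting t values can be compared against a t table at the chosen alpha level and directionality.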




Table 4 gives Pearson correlations between 
the three factor scores and the total Koppitz 
Bender errors, the total significant errors, 
and the total highly significant errors. As 
predicted, there was a significant relationship, 
expressed negatively, between the PO factor 
and absolute and highly significant Koppitz 
Bender scores (p < .05 to p < .005). Of note
are the significant (p < .01 to p < .005)
associations of the VC factor with all Koppitz
scores and the significant (p < .01 to
p < .005) associations of the FD factor with 
both the absolute and significant Koppitz 
scores. 


Discussion 


The present study confirms that the Kauf- 
man factors are related in clinically expected 
ways to WRAT achievement scores and 
visual-motor maturation scores in a sample 
of children referred for psychological evalu- 
ation. This finding is perhaps reassuring to 


Table 2
Factor Structure of the Wechsler Intelligence
Scale for Children—Revised


                        Unrotated        Varimax rotation
Test                    first factor   VC            PO            FD

Verbal
  Information               78         84            10            16
  Similarities              75         77            23            04
  Arithmetic                72         66            13            40
  Vocabulary                82         82            30            03
  Comprehension             78         82            48            09
  Digit Span                55         38            [illegible]   42
Performance
  Picture Completion        55         [illegible]   [illegible]   01
  Picture Arrangement       61         48            31            21
  Block Design              56         23            75            04
  Object Assembly           64         23            68            36
  Coding                    38         07            05            94


Note. VC = Verbal Comprehension; PO = Perceptual
Organization; FD = Freedom from Distractibility.
Table 3
Correlations Between WISC-R Factors and
WRAT Standard Scores


WISC-R                        WRAT
factor      Reading        Spelling       Arithmetic

VC          [illegible]    .38**          .20*
PO          .19*           [illegible]    .10
FD          .04            [illegible]    [illegible]


Note. WISC-R = Wechsler Intelligence Scale for
Children—Revised; WRAT = Wide-Range Achievement
Test; VC = Verbal Comprehension; PO
= Perceptual Organization; FD = Freedom from
Distractibility.


clinicians who, all too often, see Bender pro- 
tocols that do not seem to fit at all with find- 
ings on the PO factor and distracted, anxious 
children with good Arithmetic and Digit 
Span scores. 

Other relationships are also of interest. 
The PO factor’s small but significant contri- 
bution to Reading and Spelling indicates that 
nonverbal intellectual factors make their 
contribution to those skills, as well as to 
Arithmetic. Also, both the VC factor and the
FD factor were significantly related to the Koppitz
scores, suggesting that
verbal mediators and attention factors are 
involved in visual-motor maturation processes. 


Table 4
Correlations Between WISC-R Factors and
Koppitz Bender-Gestalt Errors


                        Total errors
                                           Highly
Factor      No.          Significant       significant

VC          -.43***      -.27**            [illegible]
PO          -.30***      -.15              -.22*
FD          -.28**       [illegible]       .08


Note. WISC-R = Wechsler Intelligence Scale for
Children-Revised; VC = Verbal Comprehension;
PO = Perceptual Organization; FD = Freedom
from Distractibility.

*p < .05.
**p < .01.
***p < .005.
Perhaps all of these additional relationships
are not surprising, but they do suggest areas 
for further research regarding relative 
WISC-R factor contributions to related 
achievement and visual-motor maturation 
measures. 

Finally, it should be noted that these data 
suggest that perhaps the factor structure for 
bilingual Mexican Americans is approximately 
equivalent to that of English-speaking sub- 
jects. Although this conclusion cannot be 
stated definitely due to the mixed nature of 
the sample, the evidence seems to point in 
that direction. Further research is also needed 
to ascertain whether bilingual factor struc- 
tures are indeed equivalent. 
Freedom of Choice and Behavioral Change


Frederick H. Kanfer and Laurence G. Grimm 
University of Illinois 


A 3 × 2 (Treatment × Population) factorial design with repeated measures
(pretest/posttest) was used to evaluate the effects of perceived freedom of
choice on behavior change in a therapy analogue study. Ninety subjects were
assigned to three groups that varied in the amount of perceived choice given
to subjects in determining the type of training procedure used for speed-reading
enhancement. Experimental conditions were crossed with two populations of
subjects to examine two levels of perceived freedom. Half of the subjects were
drawn from a pool of psychology students required to participate in psychology
experiments, and the remaining half of the subjects were volunteers. The main
dependent variable was the amount of change in reading rate. A marginally
significant (p = .06) increase in reading speed was obtained by volunteer subjects
in comparison to subject pool participants. Subjects who perceived that
they were given a choice in training procedures improved significantly more
(p < .02) in reading speed than subjects who lost the freedom of choice. No
changes in reading comprehension were noted. These findings are discussed in
terms of the relationship between freedom of choice and performance in a
behavior change program.


Increased sophistication about psychological
treatment, a social trend toward self-help
therapies, and the declining doctrine of the
infallibility of mental health professionals
have resulted in greater consumer selectivity
among therapy agents and methods. With
greater client understanding of behavior
change methods, the client’s acceptance of a
treatment program has become even more
critical for therapeutic improvement than
before. It has long been established that
motivation to change is essential for treatment
success. Clinical folklore and current
psychological theories suggest ample reasons
for enhancing a client’s belief that the treatment
is voluntary and that he or she has
some voice in deciding among treatment approaches
in order to further cooperation. In
fact, in many cases, especially when clients are
referred by agencies or other persons, the development
of a motivation to change becomes a
first treatment objective (Kanfer, 1975).
Laboratory findings closely match clinical
experience. Liem (1975) found that students
who were given a choice in selecting among
different types of classroom sections did significantly
better on exams and reported
greater satisfaction with their sections and
class leaders in comparison to no-choice subjects.
Similarly, residents in a home for the
aged who were given greater personal responsibility
and choice in their daily routines
showed significantly greater improvement in alertness,
active participation, and self-rated well-being
than control subjects (Langer & Rodin,
1976). The belief that one has a choice also
appears related to perception of increased
control over an outcome. For example, choice
of the order of taking a number of tests can
reduce anxiety (Stotland & Blumenthal,
1964). The presence of an escape response
from an aversive noise decreased the aversiveness
of the threatening stimulus (Corah &
Boffa, 1970). Thus, a belief in personal control
strongly influences the tolerance of aversive
events (Averill, 1973; Kanfer & Seidner,
1973; Langer, Janis, & Wolfer, 1975). Langer
(1975) found that choice in a chance situation
increased confidence and risk taking. The
freedom to choose among alternatives is thus
related to perception of personal control.
The clinical implications of attributing a 
behavior change to one’s own actions lie in 
the findings of better maintenance of that 
behavior (Davison & Valins, 1969; Kopel & 
Arkowitz, 1975). The finding of the facilitating
effects of freedom of choice is supplemented
by the finding that elimination of
freedom or a threat to eliminate freedom has
the opposite effect, activating the person to
oppose such restrictions (Brehm, 1966). For
example, a study by Brehm and Brehm (see 
Brehm, 1966) showed that subjects resisted 
persuasion when an authority figure acted in 
a dictatorial manner (i.e., “You must agree”).
When the communication was phrased so as 
to not pose a threat to freedom, it was highly 
persuasive. Experiments on the effect of re- 
actance suggest that a threat to eliminate 
freedom or an intentional deprivation of free- 
dom yields detrimental behavioral changes 
with regard to an assigned task. The clinical 
parallel would be the referral of a client for 
therapy under conditions of coercion. Reac- 
tance and countercontrol (Davison, 1973) 
would be expected to interfere with the treat- 
ment process. 

The forced referral to the therapist is some- 
what paralleled in the experimental situation 
in which students in psychology courses are 
required to participate in research. Both the 
ethical and technical problems of using such 
subjects have been repeatedly discussed (Kel- 
man, 1972; Schultz, 1969). Cox and Sip- 
prelle (1971) demonstrated that “true volun- 
teers” markedly differed from nonvolunteers 
in mean heart rate changes over trials during 
an operant verbal conditioning procedure. In 
a recent study, Gordon (1976) examined 
the behavior of volunteer and nonvolunteer 
subjects who could either choose or were as- 
signed to different relaxation treatments. Gor- 
don found that volunteers valued the treat- 
ment more and reported it to be significantly 
more effective than subjects who had no 
choice. These findings are also consistent with
the work presented by Rosenthal and Rosnow
(1975) that emphasized that the use of volunteer
versus nonvolunteer subjects introduces
important sources of bias in the experimental
results.

The present study used a laboratory ana- 
logue to examine two critical problems in 
clinical practice: the client’s freedom of choice 
in deciding to submit to therapy and the cli- 
ent’s perceived choice in selecting among 
therapeutic procedures. A free-choice condi- 
tion in selecting training procedures was used 
as an analogue to treatments that emphasize 
negotiation in establishing change programs. 
A lost-choice condition represented a situa- 
tion in which a client enters therapy with 
expectations for negotiation but is then given 
the treatment that the therapist recommends. 
The analogue for this group is somewhat 
weaker, since in this study the experimenter 
also manipulated the prior expectations. A 
no-choice condition corresponded to the client 
who has no strong expectations about participating
in deciding on treatment plans and
does not choose a treatment. To increase similarity
to clinical procedures, skill in reading
was chosen as an area in which college stu- 
dents frequently experience problems. Specifi- 
cally, this study examined the effect of (a) 
volunteer versus nonvolunteer participation 
and (b) choice versus imposition in selection 
of training procedures on improvement in 
brief speed-reading practice. 


Method 
Subjects 


The subjects were 90 students at the University of 
Illinois at Urbana-Champaign. Forty-five subjects 
(23 males and 22 females) were enrolled in intro- 
ductory psychology courses in which research par- 
ticipation credit was given toward their course 
grades. Subjects were required to submit their 
names for participation and were then assigned by 
computer to five experiments. The subjects had no
choice in this assignment, but they could refuse to 
participate in a given experiment. The remaining 
45 subjects (27 males and 18 females) were volun- 
teers who answered campuswide advertisements that 
read “Wanted: Participants for a study concerning 
increased reading speed and comprehension.” The 
nonvolunteers were assigned to a room and were 
scheduled for experimental participation by routine 
psychology department procedures. Volunteers contacted
the experimenter by telephone to arrange
their appointments. Neither the advertising nor initial
contact included an explicit or implicit statement
about the subjects’ opportunity to obtain remedial
training. All subjects were randomly assigned to one
of three groups in a 3 × 2 factorial design. Test and
questionnaire data indicated that all groups were 
equivalent in pretest reading speed and comprehen- 
sion, expectations for change, prior training experi- 
ence, and motivation to learn speed reading. 


Procedure 


Subjects were seen individually and were told 
that the aim of the study was to examine the immediate
effects of several different training techniques
on reading rate and on comprehension. The
long-term purpose of the experiment was said to be
to assist in the development of reading skill programs.
All subjects filled out a questionnaire to assess their 
interest in improving reading skills while the ex- 
perimenter was seated behind them. After completion 
of the questionnaire, the subjects were given instruc- 
tions for the pretest. The pretest consisted of a nar- 
rative passage at the low college level used by the 
University Counseling Center for establishing reading 
ability. The selected passages were from the book 
Toward Better Reading Skills. The subjects were
told to read the passage quickly but to try also to 
understand the material, since they would be asked 
questions afterwards. When individual subjects indi- 
cated their readiness, the experimenter started a 
stopwatch and signaled them to begin reading. The 
amount of time taken to complete the reading of 
the standard passage was used as a reading speed 
measure.

After subjects had finished the reading task, they 
were given a comprehension test, comprised of 
multiple-choice questions. Following the administra- 
tion of the comprehension test, subjects were seated 
in front of a small screen, which was used for read- 
ing skill training. Other training materials included 
a Carousel projector, set up for rear projection, and 
50 slides of phrases ranging from 4 to 5 words each. 
The phrases were taken from a film strip produced 
by the Society for Visual Education, Inc., Chicago, 
Illinois, designed to improve reading skills. Each 
phrase was set up tachistoscopically so that it appeared
on the screen for approximately .14 sec.
Timers provided an interval of 5 sec between slide
presentations. Immediately prior to the presentation
of training materials, the subjects were told: 


This study is examining training techniques and 
their effect on reading rate and comprehension. 
With the techniques that I will describe to you in 
a moment, it has been shown that even in one 
training session some improvement is possible. In 
the long run the techniques are all equally effec- 
tive, but, due to individual differences, the method 
the person feels most comfortable with will prob- 
ably be the most effective because it best fits his/ 
her style of learning. 
The remainder of the instructions were varied according
to the group to which the subject was assigned.
Subjects in the free-choice conditions were
told: “We have found that almost everyone prefers 
to choose the techniques they work with. How 
about you? Would you also prefer to have a choice 
of which reading technique to use?” Subjects in 
the lost-choice group were given the identical in- 
structions. For no-choice subjects the preceding in- 
structions were omitted. All subjects, with the ex- 
ception of one male and one female, indicated that 
they would like to have a choice. Since they did 
not wish to make a choice, these two subjects were 
excluded from further participation in the study. 

Following these instructions, three training tech- 
niques were described to subjects in the following 
manner: 


The first technique that can be used in training is 
a derivation of the Flourens-Derjeu ocular scan 
method. Although it has been shown to induce 
some eye strain in the learner, it is learned 
quickly and offers rapid improvement. The second 
technique is called the Mueller peripheral scope 
method. This method has been shown to produce 
a small amount of eye strain, takes longer to 
learn, and results in slower improvement. The 
third technique is the Morgan lateral field method. 
It involves a moderate amount of eye strain and 
requires a medium length of time to learn but 
yields a moderate and immediate improvement in 
reading. 


Following the characterization of the training tech- 
niques, the free-choice group was asked, “Now that 
you have heard the description of the character- 
istics of each technique, which one would you prefer 
to work with?” After these subjects made their 
choice, the experimenter said, “OK, that’s fine, we'll 
use that one.” The lost-choice group was asked, 
“Now that you have heard a description of the 
characteristics of each technique, which one would 
you prefer to work with?” After these subjects had 
stated a preference, the experimenter went to the
back of the laboratory, turned on the equipment,
returned, and said, “I am sorry but I cannot give you
that one, I will have to give you the _____ method.”
No-choice subjects were told that although these
were the three methods generally used in research
of this type, this experiment only used the _____
method. In spite of the fact that specific methods
were identified, all subjects were exposed to an 
identical procedure. Prior to the presentation of the 
slides, all subjects were told: 


You will be shown a slide for a fraction of a 
second, then there will be a brief rest, followed 
by another slide. The sequence will continue until 
all slides have been shown. Now, each slide will 
have a group of words on it. Try to take in the 
entire line in one look instead of reading across 
the line from left to right. I will let you know
when all the slides have been shown. 
The experimenter then went behind a partition and
began to show the slides. After all 50 slides had been 
presented, the experimenter returned and asked the 
subject to move to the testing table. A second read- 
ing and comprehension test, matched for reading 
difficulty to the pretraining material, was then 
administered. This administration followed the same 
procedures outlined for the pretraining test. It should 
be noted, however, that in the presentation of the 
pretests and posttests, the passage contents were 
alternated to control for an order effect. Following 
the reading of posttest materials, a comprehension 
test was again given. All subjects were then told 
that it would require considerable practice over a 
period of weeks to actually improve their reading 
skills. However, the experimenter offered to give 
them the results of their reading tests as well as 
those of the study and to discuss with them any 
desire to take a remedial reading course. The subjects
were then asked to fill out an “experimenter evaluation
form.” The evaluation form contained seven
questions, rated from strongly agree (1) to strongly 
disagree (5). The items contained statements about 
the experimenter’s manner, his competency, and his 
presentation of the material. After completion of the 
experimenter evaluation form, subjects were asked to 
deposit the form in the second author’s mailbox on 
a different floor in the same building, immediately 
after the experiment. Those subjects who requested 
information about improving their reading skills 
were given prepared materials referring them to 
the University Counseling Center where such classes 
are offered.


Overall Design 


The experiment consisted of a factorial design with 
two subject classifications (volunteers and nonvolun- 
teers) and three treatment classifications (free 
choice, lost choice, and no choice). For all subjects,
pretraining and posttraining scores were obtained
for reading speed and comprehension. In addition,
the pretraining motivation questionnaire and the 
posttraining experimenter evaluation questionnaire 
provided data to ascertain initial level of motivation 
for improving reading skills and postexperimental 
reactions to the experiment and the experimenter (as 
a test of potential reactance).


Results and Discussion 


The main measure of the effects of the 
volunteer and choice variables was the change 
in reading rates following the 50-item training 
presentation. Table 1 presents the reading 
rates of all groups on the prereading and post- 
reading tests and mean change scores. To as- 
sess the initial reading rate for all groups, a 
one-way analysis of variance on pretraining 
reading scores was performed. The nonsignificant
F(5, 84) of 1.67 indicates that the
groups did not differ prior to the training sessions.
Table 1
Mean Pretest, Posttest, and Change Scores in
Reading Rate


Choice        Nonvolunteers                   Volunteers

Free
  Pre             298                            346
  Post            303                            376
  Change            5                             30 a,b,c
Lost
  Pre             322                            287
  Post            303                            285
  Change          -19 [subscript illegible]       -2
No
  Pre             294                            336
  Post            289                            341
  Change           -5 [subscript illegible]        5


Note. Significant comparisons at p < .05 were obtained
for pairs of groups identified by the same
subscript. Reading rate was measured in words per
minute.


To evaluate the relative change in reading 
rates among groups, a two-way analysis of 
variance (Groups × Populations) was performed
on raw change scores in reading rates
from the pretest to posttest (Overall & Wood- 
ward, 1975, 1976). The main effect for the 
choice conditions was significant, F(2, 84) = 
3.13, p < .05. A marginally significant main 
effect was obtained for the volunteer versus 
nonvolunteer comparison, F(1, 84) = 3.38, p 
= .06. Volunteer subjects showed a greater 
increase in reading rate in comparison to 
subjects drawn from the introductory psy- 
chology subject pool. No significant interac- 
tion effect was obtained. 
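
The raw change scores entering this analysis can be recomputed from the cell means in Table 1. The sketch below does only that arithmetic; it assumes equal cell sizes (the 90 subjects spread evenly over six cells), which the text implies but does not state outright, so that marginal means are simple averages of the cell means. The dictionary layout and function name are illustrative.

```python
# Cell means in words per minute, transcribed from Table 1.
pre = {
    ("free", "nonvolunteer"): 298, ("free", "volunteer"): 346,
    ("lost", "nonvolunteer"): 322, ("lost", "volunteer"): 287,
    ("no", "nonvolunteer"): 294, ("no", "volunteer"): 336,
}
post = {
    ("free", "nonvolunteer"): 303, ("free", "volunteer"): 376,
    ("lost", "nonvolunteer"): 303, ("lost", "volunteer"): 285,
    ("no", "nonvolunteer"): 289, ("no", "volunteer"): 341,
}

# Raw change score for each cell: posttest mean minus pretest mean.
change = {cell: post[cell] - pre[cell] for cell in pre}


def marginal_mean(level, position):
    """Average the cell changes for one level of one factor.

    position 0 selects the choice factor, position 1 the population factor.
    Assumes equal cell sizes, so an unweighted average is appropriate.
    """
    values = [value for cell, value in change.items() if cell[position] == level]
    return sum(values) / len(values)


for choice in ("free", "no", "lost"):
    print(choice, marginal_mean(choice, 0))
for population in ("volunteer", "nonvolunteer"):
    print(population, round(marginal_mean(population, 1), 2))
```

Under the equal-cell-size assumption, the ordering of the choice-condition means (free, then no choice, then lost) matches the ordering reported in the text.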

The direction of the significant effect of 
choice conditions is indicated by the array of 
mean change scores. Subjects in the free-choice
condition showed the greatest change,
followed by the no-choice and lost-choice sub- 
jects. A Scheffé test was performed to locate 
the sources of the differences demonstrated 
by the significant main effect. The free-choice 
groups showed a significantly greater increase 
in reading rate relative to the lost-choice 
groups (p < .02). These results indicate that
the opportunity to choose among alternative 
training techniques significantly affected per- 
formance as a result of the training session. 
A two-way analysis of variance (Groups ×
Population) on raw gain scores was used to 
assess changes in reading comprehension 
scores. The results indicated that neither vol- 
unteering nor choice conditions significantly 
affected reading comprehension changes. 
Reading rate and comprehension change scores 
were then transformed to residual change 
scores and were intercorrelated (Tucker, 
Damarin, & Messick, 1966). The correlation 
coefficient was nonsignificant (rxy = .06).
These findings indicate that changes in read- 
ing rates occurred independently of compre- 
hension, and thus faster reading did not lower 
comprehension. 
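
One common way to form residual change scores is to regress each posttest score on its pretest score and keep the residuals, which are then uncorrelated with the pretest. The sketch below illustrates that idea with hypothetical data for six subjects; it is a simple regression-residual version and is not claimed to reproduce the exact procedure of Tucker, Damarin, and Messick (1966).

```python
import math


def mean(values):
    return sum(values) / len(values)


def residual_change(pre, post):
    """Residuals from an ordinary least-squares regression of post on pre."""
    mean_pre, mean_post = mean(pre), mean(post)
    sxx = sum((x - mean_pre) ** 2 for x in pre)
    sxy = sum((x - mean_pre) * (y - mean_post) for x, y in zip(pre, post))
    slope = sxy / sxx
    intercept = mean_post - slope * mean_pre
    return [y - (intercept + slope * x) for x, y in zip(pre, post)]


def pearson_r(xs, ys):
    mean_x, mean_y = mean(xs), mean(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical scores (not data from this study).
pre_rate = [250, 300, 320, 280, 350, 310]   # words per minute
post_rate = [262, 298, 335, 275, 365, 300]
pre_comp = [7, 8, 6, 9, 7, 8]               # comprehension items correct
post_comp = [8, 8, 7, 9, 6, 9]

rate_residuals = residual_change(pre_rate, post_rate)
comp_residuals = residual_change(pre_comp, post_comp)
print(round(pearson_r(rate_residuals, comp_residuals), 2))
```

A correlation near zero between the two sets of residuals would correspond to the pattern reported here, in which changes in rate and changes in comprehension are unrelated.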

Since both male and female subjects and 
two experimenters participated, a two-way 
analysis of variance was used to test for the 
effects due to the experimenters as well as for 
sex. Neither analysis yielded a significant
result. 

The overall results are consistent with 
findings in other studies that subjects who 
perceived that they had a choice among al- 
ternatives performed more effectively than 
subjects who lost such a choice. The implica- 
tions of these findings for clinical procedures 
are clear, Treatments that emphasize negoti- 
ation about therapy objectives and use par- 
ticipation of the client in the treatment pro- 
cess (Kanfer, 1975) would be expected to 
work more efficiently toward the therapeutic 
goals than clients on whom these goals are 
imposed or for whom the parameters of a 
procedure are not discussed. Champlin and 
Karoly (1975) have reported a preliminary 
investigation in which clients who participated 
in negotiating contract objectives ‘showed sig- 
nificantly greater activity in the change pro- 
gram than those on whom contract conditions 
Were imposed. 

The present data do not permit a clear
inference as to whether this difference was
due to reactance (i.e., a decline in reading
rate) in the lost-choice subjects or an enhancement
(a positive motivational effect) in
the free-choice group. Failure to obtain statistical
differences between the no-choice
group and the lost-choice group by the
Scheffé test permits no firm conclusion in this
matter. An analysis of within-group changes
over trials also fails to clarify this issue, since
only the nonvolunteer lost-choice group
showed a significant change, that is, a decrease
in reading rate (p < .03). However,
inspection of the data suggests that both
effects may have taken place, possibly interacting
with the characteristic of the subject
population. In contrast to the average increases
in reading rates among subjects in the
free-choice groups, both volunteer and non- 
volunteer subjects in the lost-choice groups 
showed an overall decrease in reading rate on 
the posttraining test. In the no-choice group, 
the rate of volunteers tended to increase and 
the rate of nonvolunteers tended to decrease 
after the training session. Variation among 
subjects and the size of the sample may have 
contributed to the failure to clarify this ques- 
tion in the present experiment. 

The difference in change between volunteer 
and nonvolunteer subjects just missed significance
at the traditional .05 level. However,
the strong trend (p = .06) supports suggestions
from earlier studies and extends the
findings by Gordon (1976) that persons who 
seek out a particular change experience appear 
to benefit more from it than those who are 
assigned to it by others. Thus, the implica- 
tions for clinical practice lie in the need for 
improvement of clinical techniques that would 
help a person become involved in a change 
program, even when initially referred by some- 
one else. 

The marginally significant difference be- 
tween volunteers and nonvolunteers could be 
viewed as a reflection of the differences in 
experience with or enthusiasm about a read- 
ing skills program. The questionnaire data 
indicated that 31% of the nonvolunteers and
20% of the volunteers had previously expe- 
rienced some speed-reading courses. Further, 
67% of the subjects in each group indicated 
that they believed speed-reading courses can 
be effective. Thus, differential expectations of 
training effectiveness or past experience could 
not have contributed to the differences in 
reading rate changes that were obtained for 
the two populations. 

An initial questionnaire assessed the sub- 
ject’s motivation to learn speed-reading tech- 
niques by asking how much time and money 
they would be willing to invest in a course. 
Volunteer and nonvolunteer subjects reported 
on the average that they would spend $24 
and $29, respectively, and that they would 
practice reading techniques 4.6 and 3.6 hours
per week, respectively. Therefore, the two 
populations did not differ appreciably in their 
motivation to enhance their speed-reading 
skills. 

In rating the experimenters’ courtesy, competency,
manners, and experimental conduct,
all subjects tended to give favorable judgments.
All group means were practically
identical. On a 5-point scale (with 1 as the
most favorable rating), the overall mean was
1.87. The present analogue to treatment is not
complete. In clinical practice deprivation of
treatment choice is rarely explicit. On the 
other hand, the analogue is more conserva- 
tive, since agency-referred clients are often 
sent for treatment of a problem that they do 
not accept as requiring any personal change. 
In this study, both groups reported interest in 
working toward better reading skills. Thus the 
additional imposition of “the problem” in 
many clinical cases should yield stronger ef- 
fects of the volunteer variable than was ob- 
tained here. 

The overall findings of this experiment are 
consistent with the hypothesis that increased 
freedom of choice enhances performance in a 
change program. Within the limitations of the 
present study, this effect was more strongly 
demonstrated for the freedom to decide among 
training methods than for the difference in 
the source of referral to the experiment. 
Further tests of these hypotheses are needed 
to indicate whether generalization is war- 
ranted from the laboratory findings to long- 
term programs with clients who present a 
disturbing problem situation.
To investigate and compare the effects of progressive relaxation training and
meditation training on autonomic arousal in alcoholics, 30 subjects were selected
from a population of alcoholics in a Veterans Administration hospital
substance-abuse program. The subjects were randomly assigned to one of the
following three experimental conditions: (a) progressive relaxation training
group, (b) meditation training group, or (c) quiet rest control group. All groups
met for 3 weeks during which state anxiety, blood pressure, heart rate, and
spontaneous galvanic skin responses were measured. The measures were designed
to assess the treatment effects following the first training session and at
the end of the total training period. The results indicate that both progressive
relaxation training and meditation training are useful for reducing blood pressure
in alcoholics. In addition, significant differences between the groups in the
effectiveness of the relaxation procedures were found. Meditation training induced
blood pressure decreases at an earlier point in the 3-week training period
and effected decreases in systolic blood pressure that progressive relaxation
training did not. These results support the idea of considerable specificity of
response to relaxation techniques.


Recent authors (Benson, Beary, & Carol,
1974; Bernstein & Borkovec, 1973) have
discussed the usefulness of progressive relaxation
training and meditation training as treatments 
for stress-related disorders. In general, these 
authors have portrayed the techniques as po- 
tentially effective treatments for a number of 
problems including hypertension, insomnia, 
tension headaches, and generalized anxiety. 

This article is based on a doctoral dissertation sub-

Alcoholism and other forms of substance
abuse are also disorders for which relaxation
techniques may have utility. Preliminary support
for this possibility has been provided by
studies assessing the effects of meditation on
self-reports of substance usage (Benson, 1974;
Benson & Wallace, 1972; Shaffi, Lavely, &
Jaffe, 1974, 1975; Winquist, 1973). Although
the mechanism by which relaxation techniques 
might affect drinking behavior is unknown, 
two possibilities appear tenable. First, relaxa- 
tion techniques might serve as a direct sub- 
stitute for the reduced arousal that follows 
alcohol consumption. If this were the case, 
such techniques might be useful for alcoholics 
as coping techniques in stressful situations. 
Second, relaxation techniques might lead to a 
generalized state of lowered arousal. Such a 
state of lowered arousal would, in turn, tend 
to deprive alcohol of its hypothesized tension- 
reducing capacity, since a relatively lowered 
state of arousal would already exist. As a 
consequence, less direct reinforcement of 
drinking behavior would theoretically occur as 
a result of the use of alcohol. 

Further credence in the possible role of 
relaxation techniques in the treatment of
alcohol abuse is provided by studies investigating
the relationship between anxiety, stress,
and alcoholism. Steffen, Nathan, and Taylor 
(1974) found significant negative correlations 
between blood alcohol levels and electromyo- 
graphic response in alcoholics, suggesting a 
tension-reducing effect of alcohol. Miller, Her- 
sen, Eisler, and Hilsman (1974) compared 
alcoholics and social drinkers in a simulated 
interpersonal situation requiring assertive be- 
havior. The alcoholics significantly increased 
their operant responses to obtain alcohol, but 
the social drinkers did not, suggesting that 
alcoholics may be more inclined to drink in 
response to social stress. Higgins and Marlatt 
(1975) studied the fear of interpersonal eval- 
uation as a determinant of alcohol consump- 
tion in male social drinkers and found that 
subjects who expected to be evaluated in a 
second study drank significantly more alcohol 
than subjects who did not expect to be eval- 
uated. The results of this study are suggestive 
of the possible role of stress, particularly when 
interpersonal in nature, in the development 
and maintenance of problematic drinking. 

Although the tension-reduction hypothesis 
of alcoholism has not been firmly established 
(Cappell & Herman, 1972), the above studies 
are suggestive of possible dynamics involving 
anxiety, stress, and/or fear of interpersonal 
evaluation. Accordingly, relaxation techniques 
with reported success in reducing arousal 
(Benson et al., 1974; Bernstein & Borkovec, 
1973) are worthy of consideration as a treat- 
ment strategy for alcoholics. 

The purpose of the present study is to in- 
vestigate the usefulness of relaxation tech- 
niques for reducing self-reported anxiety and 
autonomic arousal in alcoholics. Two diver- 
gent techniques, progressive relaxation train- 
ing and meditation training, were included in 
the design for comparative purposes, since 
they may vary considerably in their effective- 
ness and in the dependent measures that they 
affect. 


Method 
Subjects 


Subjects (N = 30) for this study were drawn from 
a population of male alcoholics on a substance abuse 


J. PARKER, G. GILBERT, AND R. THORESON 


unit of a Veterans Administration hospital. The 
mean age for the subjects was 45.1 years, and the 
mean educational level was 11.3 years. The subjects 
were generally of middle socioeconomic status, and 
the mean number of years of heavy drinking, as 
reported by the subjects, was 12.0. Subjects with
evidence of psychosis, severe cerebral dysfunction,
serious physical impairment, or illiteracy were
omitted. In addition, subjects receiving psychotropic
medications or having had previous relaxation or
meditation training were not included. After the
remaining patients were judged by the unit physician
to be detoxified and after they had agreed to
remain in the hospital for the duration of a 1-month
treatment program, the State-Trait Anxiety Inven- 
tory (STAI), developed by Spielberger, Gorsuch, and 
Lushene (1970), was administered. Those patients 
with trait anxiety raw scores above 30 were selected 
for the study. The cutoff score was used because 
subjects with initially low anxiety would not permit 
an accurate assessment of the usefulness of the re- 
laxation techniques. Two subjects were dropped on 
the basis of the cutoff score. 


Groups 


The selected patients were assigned by means of 
a table of random numbers to one of the following 
three treatment conditions: 

1. Progressive relaxation training (PRT) group. 
The PRT treatment condition was operationalized 
by using the seven muscle group methodology of 
Bernstein and Borkovec (1973), which involves the 
systematic tensing and relaxing of various muscles. 
Subjects (n = 10) in this condition received 5 min- 
utes of instruction in the rationale and exercises of 
PRT and then were exposed to 3 weeks of PRT 
practice, with the group meeting three times per 
week for ½ hour each session. The first and last
training sessions were individual sessions with the 
experimenter for the purpose of measurement. The 
middle sessions were conducted in a group setting 
with the 15-minute PRT instructions presented on a 
tape recorder and played in a quiet room with the 
lights turned low. 

2. Meditation training (MT) group. The MT 
treatment condition was operationalized by using the 
methodology of Beary and Benson (1974). This 
methodology involves a comfortable position, a quiet 
environment, a passive attitude, and the silent repetition
of the word "one" on every exhalation. Subjects
(n = 10) in this condition received 5 minutes of in- 
structions in the rationale and procedures of MT 
and then were exposed to 3 weeks of MT practice, 
with the group meeting three times per week for 
½ hour each session. The first and last training
sessions were individual sessions with the experimenter
for the purpose of measurement. The middle ses- 
sions were conducted in a group setting, with MT 
instructions presented on a tape recorder and played 
in a quiet room with the lights turned low. The 
first 2 minutes of the tape reviewed the MT technique
and were followed by 13 minutes of silence,
during which time the meditation was practiced. 

3. Quiet rest control (QR) group. Subjects (n = 10)
in this condition received 5 minutes of instruction
in the desirability and benefits of relaxation,
although no specific relaxation techniques were 
taught. Subjects were told simply to sit quietly with 
their eyes closed and to allow themselves to relax. 
They were then exposed to 3 weeks of QR sessions, 
with the group meeting three times per week for 
½ hour each session. The first and last QR sessions
were individual sessions with the experimenter for 
the purpose of measurement. The middle sessions 
were conducted in a group setting, with the QR 
instructions presented on a tape recorder and played 
in a quiet room with the lights turned low. The 
first 2 minutes of the tape reviewed the QR instruc- 
tions and were followed by 13 minutes of silence, 
during which time the quiet rest occurred. 
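
The random assignment to the three conditions can be illustrated with a short sketch. This is an illustration only, not the authors' procedure: the study used a printed table of random numbers, whereas the sketch substitutes a seeded pseudorandom shuffle. The function name `assign_to_groups`, the subject IDs 1 through 30, and the seed value are assumptions introduced for the example.

```python
import random


def assign_to_groups(subject_ids, groups=("PRT", "MT", "QR"), seed=None):
    """Randomly split subjects into equal-sized groups.

    Hypothetical helper: a seeded shuffle stands in for the table of
    random numbers used in the study.
    """
    subject_ids = list(subject_ids)
    if len(subject_ids) % len(groups) != 0:
        raise ValueError("subjects must divide evenly among the groups")
    rng = random.Random(seed)
    rng.shuffle(subject_ids)
    size = len(subject_ids) // len(groups)
    # Consecutive slices of the shuffled list become the groups.
    return {
        name: sorted(subject_ids[i * size:(i + 1) * size])
        for i, name in enumerate(groups)
    }


# 30 selected subjects, 10 per condition (the seed is arbitrary).
assignment = assign_to_groups(range(1, 31), seed=1977)
for name, members in assignment.items():
    print(name, members)
```

Fixing the seed makes the allocation reproducible, which plays roughly the same role as recording the starting point used in a random number table.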


Procedure 


Subjects received a brief orientation interview by 
the experimenter, and permission for inclusion in a 
research study was obtained. The first session, an 
individual session with the experimenter, was con- 
ducted in a quiet room with the subject seated com- 
fortably in a recliner facing away from the physio- 
logical recording equipment. The first part of the 
initial training session (5 minutes) was devoted to 
specific instruction in the particular treatment that 
the subject was to receive, and a 10-minute pretreat- 
ment measurement period immediately followed dur- 
ing which the physiological measures and the state 
anxiety data were collected. After this measurement 
period, the first 15-minute training tape was pre- 
sented, and a 10-minute posttreatment measurement 
period followed in which the physiological measures 
and the state anxiety data were again collected. A 
brief period was provided after the second measure- 
ment period for questions from the subjects or fur- 
ther instructions, if needed. 

The remaining training sessions, except for the final 
one, were conducted in a group setting over a 3-week 
treatment period. Each group meeting was ½ hour
in duration, with all groups meeting collectively for 
the first 15 minutes to control for motivational and 
expectancy variables. During this collective meeting, 
the rationale for relaxation therapy was briefly re- 
viewed, and subjects were instructed to practice twice 
daily between group sessions. 

The 15-minute tape-recorded training session fol- 
lowed, with each experimental group meeting sep- 
arately in a quiet, dimly lighted room. The final 
training session, an individual session with the experimenter,
again involved the collection of pretreatment
and posttreatment measures.

Subjects who did not attend at least seven training
sessions were dropped from the study. One subject
from the QR group who had medical complications
during his hospitalization was dropped for this reason.
Another subject in the PRT group who discharged
himself from the hospital against medical
advice was also dropped.


Measures 


The following measures were designed to assess 
whether or not the treatments had immediate ef- 
fects following the initial training session. To control 
for fluctuations due to the subjects’ hospital activ- 
ities, these measures were all collected in the morn- 
ings between the hours of 8:00 a.m. and 10:00 a.m. 

1. The state anxiety portion of the STAI was ad- 
ministered during the pretreatment measurement pe- 
riod and the posttreatment measurement period of 
the initial training session. 

2. Arterial blood pressure readings were taken on 
a standard medical manometer during the pretreat- 
ment measurement period and the posttreatment 
measurement period of the initial training session.

3. Heart rate measures were taken using the radial 
pulse method during the pretreatment measurement 
period and the posttreatment measurement period of 
the initial training session.

4. Skin conductance measures (spontaneous gal- 
vanic skin response; GSR) were recorded on a 
Lafayette Datagraph Model 76016 for a 2-minute 
interval during the pretreatment measurement period 
and the posttreatment measurement period of the 
initial training session. Spontaneous skin conductance 
fluctuations greater than .6 kΩ were scored by an
independent observer unaware of the subjects’ treat- 
ment condition. Beckman silver-silver chloride elec- 
trodes were attached to the first and third fingers 
of the left hand using Beckman electrode paste, and 
a 5-minute rest period was arranged during the pre- 
treatment measurement period before the skin con- 
ductance measures were taken. 

To assess the total training effects of the 3-week 
treatment program, STAI, blood pressure (BP), heart 
rate, and spontaneous GSR measures were also col- 
lected at the end of the final training session. 


Results 


The results of this study were analyzed in 
a two-way analysis of variance (Groups ×
Trials) design with repeated measures on one 
factor. The pretreatment group means for 
state anxiety, BP, heart rate, and spontaneous 
GSR were not significantly different. 
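
The Groups × Trials design with repeated measures on trials can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' computation: the function name `mixed_anova` is hypothetical, equal group sizes are assumed, and the scores are synthetic values generated only to show the shape of the analysis. With 3 groups of 10 subjects and 2 trials, the sketch yields the same degrees of freedom as the tests reported below, (1, 27) for trials and (2, 27) for the interaction.

```python
import random


def mixed_anova(data):
    """Groups x Trials ANOVA with repeated measures on trials.

    data[g][s][t] is the score of subject s in group g on trial t.
    Assumes the same number of subjects in every group.
    Returns {effect: (F, df_effect, df_error)}.
    """
    a, n, b = len(data), len(data[0]), len(data[0][0])

    def mean(values):
        values = list(values)
        return sum(values) / len(values)

    grand = mean(x for grp in data for subj in grp for x in subj)
    group_m = [mean(x for subj in grp for x in subj) for grp in data]
    trial_m = [mean(subj[t] for grp in data for subj in grp) for t in range(b)]
    cell_m = [[mean(subj[t] for subj in grp) for t in range(b)] for grp in data]
    subj_m = [[mean(subj) for subj in grp] for grp in data]

    # Between-subjects sums of squares.
    ss_groups = n * b * sum((m - grand) ** 2 for m in group_m)
    ss_subjects = b * sum((subj_m[g][s] - group_m[g]) ** 2
                          for g in range(a) for s in range(n))
    # Within-subjects sums of squares.
    ss_trials = a * n * sum((m - grand) ** 2 for m in trial_m)
    ss_inter = n * sum((cell_m[g][t] - group_m[g] - trial_m[t] + grand) ** 2
                       for g in range(a) for t in range(b))
    ss_error = sum((data[g][s][t] - cell_m[g][t] - subj_m[g][s] + group_m[g]) ** 2
                   for g in range(a) for s in range(n) for t in range(b))

    df_groups, df_subjects = a - 1, a * (n - 1)
    df_trials, df_inter = b - 1, (a - 1) * (b - 1)
    df_error = a * (n - 1) * (b - 1)

    ms_subjects = ss_subjects / df_subjects  # error term for groups
    ms_error = ss_error / df_error           # error term for trials, interaction
    return {
        "groups": (ss_groups / df_groups / ms_subjects, df_groups, df_subjects),
        "trials": (ss_trials / df_trials / ms_error, df_trials, df_error),
        "interaction": (ss_inter / df_inter / ms_error, df_inter, df_error),
    }


# Synthetic pre/post scores for 3 groups of 10 subjects (not the study's data).
rng = random.Random(0)
synthetic = [[[rng.gauss(45, 5), rng.gauss(45 - drop, 5)] for _ in range(10)]
             for drop in (10, 9, 5)]
for effect, (f_value, df1, df2) in mixed_anova(synthetic).items():
    print(f"{effect}: F({df1}, {df2}) = {f_value:.2f}")
```

The groups effect is tested against subject-to-subject variation within groups, while the trials effect and the interaction are tested against the Trials × Subjects residual, which is why both of the latter share 27 error degrees of freedom in this design.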


Session 1 Effects 


State anxiety. The results from the 3 × 2
analysis of variance (ANOVA) on the STAI
measures over Session 1 found the main effect
for trials to be significant, F(1, 27) = 43.49,
p < .001, indicating that the combined groups
reported less anxiety following Session 1, but
no differences between the groups were found
(see Table 1).


Table 1

Mean State Anxiety Raw Scores Over the
Initial Training Session and the 3-Week
Training Period

                 Measurement period
                 Session 1          Post final
Group            Pre      Post      session
Relaxation       45.5     35.2      28.8
Meditation       44.7     35.8      30.1
Control          46.5     41.6      36.6

Blood pressure. The results from the 3 × 2
ANOVA on the systolic blood pressure measures
over Session 1 found the interaction
(Groups × Trials) to be significant, F(2, 27)
= 4.68, p < .05 (see Table 2).

For the systolic BP interaction, the group
mean systolic BPs were significantly different
from one another following their respective
treatments, F(2, 30) = 3.71, p < .05. A Newman-Keuls
probe found the MT group to be
significantly lower than the PRT group (r =
3, df = 30, p < .05; see Winer, 1962, p. 80).
The performance of the QR group, however,
was not significantly different from that of
the PRT group or the MT group. The MT
mean systolic BPs were significantly lower at
the end of Session 1, F(1, 27) = 11.52, p <
.01. The PRT and QR means were not
significantly lower at the end of Session 1.


Table 2

Mean Blood Pressures Over the Initial
Training Session and the 3-Week Training
Period

                 Measurement period
                 Session 1          Post final
Group            Pre      Post      session

Systolic
Relaxation       118.0    117.5     115.0
Meditation       109.5    104.5     102.5
Control          109.5    110.5     119.0

Diastolic
Relaxation       84.0     81.5      78.0
Meditation       77.0     71.5      69.5
Control          71.5     76.0      80.5

The results from the 3 × 2 ANOVA on the
diastolic BP measures over Session 1 found
the interaction (Groups × Trials) to be
significant, F(2, 27) = 8.81, p < .01 (see Table
2).

For the diastolic BP interaction, the MT
diastolic BP mean was significantly lower at
the end of Session 1, F(1, 27) = 10.11, p <
.01. In contrast, the QR diastolic BP mean
was significantly higher at the end of Session
1, F(1, 27) = 6.77, p < .05.

Heart rate. The results from the 3 × 2
ANOVA on the heart rate measures over Ses- 
sion 1 found the main effect for trials to be 
significant, F(1, 27) = 26.70, p < .001, indi- 
cating that the combined groups showed de- 
creased heart rate following Session 1, but 
no differences between the groups were found 
(see Table 3). 

GSR. The results from the 3 × 2 ANOVA
on the spontaneous GSR measures over Ses- 
sion 1 found the main effect for trials to be 
significant, F(1, 27) = 15.67, p < .001, indi- 
cating that the combined groups showed de- 
creased spontaneous GSRs following Session 
1, but no differences between the groups were 
found (see Table 4). 


Total Training Effects 


State anxiety. The results from the 3 × 2
ANOVA on the STAI measures collected at
the beginning of the first session and the end
of the final session found the main effect for
trials to be significant, F(1, 27) = 50.80,
p < .001, indicating that the combined groups
reported less anxiety at the end of the final
training session than at the beginning, but no
differences between the groups were found
(see Table 1).


Table 3

Mean Heart Rates/Min Over the Initial
Training Session and the 3-Week Training
Period

                 Measurement period
                 Session 1          Post final
Group            Pre      Post      session
Relaxation       88.6     76.1      80.4
Meditation       86.8     80.6      76.9
Control          88.4     83.9      87.0

Blood pressure. The results from the 3 × 2
ANOVA on the systolic BP measures collected
at the beginning of the first session and
the end of the final session found both the
main effect for groups, F(2, 27) = 3.74, p <
.05, and the interaction (Groups × Trials),
F(2, 27) = 11.16, p < .001, to be significant
(see Table 2).

For the systolic BP analysis, the group systolic
BP means were significantly different
from one another at the end of the final training
session, F(2, 38) = 7.54, p < .01. A
Newman-Keuls probe found that the MT
group was significantly lower than both the
PRT group (r = 2, df = 38, p < .01) and the
QR group (r = 3, df = 38, p < .01). The MT
mean systolic BP was significantly lower at
the end of the final training session, F(1, 27)
= 7.38, p < .05. The QR mean systolic BP
was significantly higher at the end of the final
training session, F(1, 27) = 13.59, p < .001.

The results from the 3 × 2 ANOVA on the
diastolic BP measures collected at the beginning
of the first session and the end of the
final session found the interaction (Groups ×
Trials) to be significant, F(2, 27) = 12.90,
p < .001.

For the diastolic BP interaction, the PRT
and MT mean diastolic BPs were significantly
lower at the end of the final training session,
F(1, 27) = 5.58, p < .05, and F(1, 27) =
8.72, p < .01, respectively. The QR mean
diastolic BP was significantly higher at the
end of the final training session, F(1, 27) =
12.55, p < .01.


Table 4

Mean Spontaneous Galvanic Skin Responses
Over the Initial Training Session and the
3-Week Training Period

                 Measurement period
                 Session 1                     Post final
Group            Pre            Post           session
Relaxation       [illegible]    [illegible]    2.5
Meditation       [illegible]    2.7            3.0
Control          3.3            1.8            4.8

Note. Measures are the number of fluctuations
greater than .6 kΩ per 2 minutes.

Heart rate. The results from the 3 × 2
ANOVA on the heart rate measures collected 
at the beginning of the first session and the 
end of the final session found the main effect 
for trials to be significant, F(1, 27) = 11.31, 
p < .01, indicating that the combined groups 
had lower heart rates at the end of the final 
training session, but no differences between 
the groups were found (see Table 3). 

GSR. The results from the 3 × 2 ANOVA
on the spontaneous GSRs collected at the be- 
ginning of the first session and the end of the 
final session found no significant effects (see 
Table 4). 
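
The direction and size of the blood pressure changes can be read off the group means in Table 2 with simple subtraction. The sketch below is offered only as an aid to reading the table: the names `TABLE_2` and `mean_changes` are hypothetical, and differences between group means are descriptive only; they do not reproduce the F tests reported above, which require the individual scores.

```python
# Group means transcribed from Table 2, in the order
# (Session 1 pre, Session 1 post, post final session).
TABLE_2 = {
    "systolic": {
        "Relaxation": (118.0, 117.5, 115.0),
        "Meditation": (109.5, 104.5, 102.5),
        "Control": (109.5, 110.5, 119.0),
    },
    "diastolic": {
        "Relaxation": (84.0, 81.5, 78.0),
        "Meditation": (77.0, 71.5, 69.5),
        "Control": (71.5, 76.0, 80.5),
    },
}


def mean_changes(table):
    """Return change in group means over Session 1 and over total training."""
    changes = {}
    for measure, groups in table.items():
        changes[measure] = {
            group: {
                "session_1": post - pre,        # pre to post within Session 1
                "total_training": final - pre,  # Session 1 pre to final session
            }
            for group, (pre, post, final) in groups.items()
        }
    return changes


changes = mean_changes(TABLE_2)
for measure, groups in changes.items():
    for group, c in groups.items():
        print(f"{measure:9s} {group:10s} "
              f"session 1: {c['session_1']:+.1f}  total: {c['total_training']:+.1f}")
```

Negative values indicate a decrease in mean blood pressure relative to the Session 1 pretreatment reading.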


Discussion 
Conclusions from Session 1 Analyses 


With regard to whether PRT and MT have 
effects over a single treatment session, the 
results indicate that on several dependent 
variables, they were not significantly superior 
to control procedures. Although PRT and MT 
led to reports of less anxiety, slower heart 
rates, and fewer spontaneous GSRs, control 
procedures appeared to produce similar 
changes. These findings suggest that simply 
instructing subjects to “rest quietly” on their 
own is as effective as PRT or MT for induc- 
ing decreased self-reported anxiety, lowered 
heart rates, and decreased spontaneous GSRs. 

The control procedures and PRT, however, 
were not as effective when systolic BP and 
diastolic BP were the dependent variables. 
The analysis of systolic BP over Session 1 indicates
that the systolic BP of the MT group
decreased significantly during the first session,
whereas that of the PRT and QR
groups remained approximately the same.
Moreover, the analysis of diastolic BP changes 
over Session 1 indicates that the MT group 
decreased significantly during the first session,
whereas the PRT group remained approximately
the same, and the QR group increased
significantly. In summary, PRT and
MT are not more effective than control procedures
when self-reported anxiety, heart
rates, and spontaneous GSRs are measured. 
However, MT is significantly more effective 
than PRT and control procedures when sys- 
tolic BP and diastolic BP are measured. 


Conclusions from Total Training 
Period Analyses 


With regard to whether PRT and MT have 
effects over a 3-week training period, the re- 
sults are generally analogous to the Session 1 
analyses. All three groups showed decreases 
in self-reported anxiety and heart rates, but 
statistically significant differences between the 
groups were not found. The data suggest that 
subjects who are told to rest quietly relax as 
well as PRT and MT subjects when state 
anxiety scores and heart rates are the depen- 
dent measures. No significant training effect 
was demonstrated for the spontaneous GSR 
measure, indicating that the Session 1 effects 
did not last over the entire training period. 

Group differences were again found on the 
systolic BP and the diastolic BP measures. 
The systolic BP analysis showed that the MT 
group decreased significantly, the PRT group 
remained approximately the same, and the 
QR group increased significantly. The dia- 
stolic BP analysis showed that both the PRT 
and the MT groups decreased significantly 
while the QR group increased significantly. In
summary, PRT and MT do not reduce self-reported
anxiety or heart rates more than quiet
rest does over a 3-week training period, but
they do affect systolic and diastolic BP in the
direction of decreased arousal.


Implications 


The results of this study suggest that per- 
sons have considerable “control” over the re- 
laxation process, even when formal relaxation 
techniques are not being used. In terms of 
heart rates and spontaneous GSRs, the QR 
condition effected significant decreases on several
measures. This finding is in contrast to
the findings of several other studies (Beary 
& Benson, 1974; Wallace, 1970; Wallace & 
Benson, 1972) and may stem from the fact 
that expectancy, motivation, and attention effects
were carefully controlled in the present
research. Since the groups in this study met 
collectively when the rationale for relaxation 
was explained, the QR subjects had the same 
“expectancy” for benefits as did the experi- 
mental groups. Accordingly, the QR subjects 
were provided with a set that elicited a strong 
“desire” to relax. 

A perusal of the typical research in the area 
of relaxation shows that such careful control 
of expectancy and attention is not usually the 
case. For example, in the typical ABA designs, 
trained meditators are instructed to “sit with 
eyes closed,” then to “meditate,” and then to 
“sit with eyes closed” again. The independent 
variable of meditation is manipulated in such 
designs, but expectancy is too. Subjects are
quite probably given subtle cues indicating 
that they are supposed to “try” in the medita- 
tion condition, at least more so than when 
“sitting with eyes closed.” In the Wallace and 
Benson (1972) research, the subjects were 
fully aware of the purpose of the study and 
had an investment in proving that meditation 
was effective. The present study suggests
that expectancy and motivation are important
variables that must be controlled in relaxation
research, since subjects can effect considerable
decreases in arousal on certain dependent
variables with only positive expectancies
and motivations operating.

A second implication of this study is that a 
single dependent variable is not likely to be 
sufficient for satisfactorily evaluating the ef- 
fects of a relaxation procedure. Both systolic 
and diastolic BP differentiated between the 
groups in this study, but heart rates, sponta- 
neous GSRs, and self-reports did not. Several 
studies in the literature of both relaxation 
training and meditation have used self-reports 
exclusively, and such reliance on a single de- 
pendent measure can sometimes be misleading. 
In the present study, systolic and diastolic BP 
measures were the most sensitive to differences 
between the groups. 

A third implication is that PRT and MT 
differ substantially in their efficiency for inducing
decreased autonomic arousal as measured
by BP. In the systolic and diastolic BP
analyses over Session 1, the MT group showed
significant decreases, whereas the PRT group
did not. In the systolic and diastolic BP analyses
across the total training period, the MT
group continued to show significant decreases, 
whereas the PRT group showed a significant 
decrease in diastolic BP only. On the basis 
of these findings, MT is a more efficient re- 
laxation technique than is PRT when BP is 
a relevant variable. 

A fourth, most intriguing, implication of 
this research is that efforts at relaxation can 
sometimes be stressful. In the analysis of 
diastolic BP over Session 1, the QR group 
showed a significant increase in arousal. Al- 
though heart rate, spontaneous GSR, and state 
anxiety scores decreased in association with 
the QR procedure, diastolic BP increased. 
This finding leads to the hypothesis that “cog- 
nitive activity” may increase in a stimulus- 
deprived situation as was found in the QR 
condition. Whereas the experimental groups 
had an activity to perform (either mental or 
physical exercises), the control group did not. 
All three groups appear to have received the 
benefits of a sedentary physical state, but 
only the experimental groups received the 
benefits associated with restricted attention. 
The restricted attention apparently reduced 
the likelihood of experimental subjects think- 
ing about current problems and/or anxiety- 
arousing situations as seemed to occur with 
the QR control subjects. The experimental 
groups may have escaped such cognitive ac- 
tivity by the demands of the relaxation task. 
The findings of this study suggest that the 
MT procedure was the most effective experi- 
mental condition in a number of the analyses, 
and the concept of restricted attention may 
help explain these results. 

Finally, with regard to the question of 
whether relaxation techniques are potentially 
useful in the treatment of alcoholism, the data
are encouraging. The alcoholic subjects in this 
study clearly experienced decreases in auto- 
nomic arousal and self-reported anxiety, with 
MT tending to be the most efficient treatment 
condition. With only a 5-minute instruction
period, significant decreases in arousal were
obtained during the first training session in 
both the PRT and MT groups. In general, 
these decreases were maintained over a 3-week 
training period with only group support. Such
benefits come at a small cost in terms of staff
time, which makes the relaxation techniques 
practical for substance abuse programs. Also, 
the alcoholics in this study accepted the re- 
laxation procedures with interest and coopera- 
tion. None of the subjects refused to partici- 
pate or expressed dissatisfaction with the 
relaxation procedures. 

As a cautionary note, demonstrating the 
efficiency of relaxation techniques for reducing 
autonomic arousal in alcoholics is not equiva- 
lent to demonstrating their usefulness as direct 
treatments for alcoholism. Long-term studies 
are required to assess whether alcoholics can 
be trained to use relaxation techniques on a 
regular basis over an extended period of time. 
Moreover, will such extended usage result in 
a generalized reduction in autonomic arousal 
and/or decreased rates of alcohol consumption?


Sex and Worker Acceptance of a Former Mental Patient


Amerigo Farina, Pauline J. Murray, and Thomas Groh 
University of Connecticut 


This is the fifth in a series of studies measuring the acceptance accorded to 
former mental patients. The procedure was to send a confederate in the guise 
of a job applicant to be evaluated by a worker already on the job. The workers 
were told either that the applicant was an ex-mental patient or that he was an 
ordinary applicant, and for each condition the confederate was calm for half 
of the subjects and nervous for the rest. The studies revealed that women are 
more accepting of former patients than men and that men are more accepting 
of female than male ex-patients. Nervous applicants were rejected by workers
of both sexes.


Persons who experience difficulty in ad- 
justment are viewed in a grossly unfavorable 
way by virtually all members of our society 
(Nunnally, 1961). Opinions are especially 
negative toward those who are hospitalized, 
apparently because hospitalization indicates 
very serious mental problems (Lawner, 1966). 

This leads to the gloomy conclusion that
those who are least able to deal with thorny
interpersonal situations and are hospitalized
for this reason will find their problems magnified
when they return to the community.
Research into various facets of interpersonal 
interaction shows that this is precisely what 
happens (Farina, Gliha, Boudreau, Allen, & 
Sherman, 1971; Farina, Holland, & Ring, 
1966; Farina, Thaw, Lovern, & Mangone, 
1974).

A particularly important area for community
adjustment is employment, since self-esteem
and social status are closely tied to
the job one has. Moreover, having enough
money for even basic necessities such as food
will typically entail working, and if an ex-patient
cannot find a job, the patient may be
forced to return to the hospital. The research
that has been done on job finding has been
limited to male subjects. However, that research
makes it amply clear that males are
denied jobs if they have been in a mental 
hospital. Employers will openly say that they 
do not like hiring ex-mental patients (Ol- 
shansky, Grob, & Malamud, 1958). With ac- 
tual behaviors, it was found that employment 
interviewers estimated a male applicant’s 
probability of getting a job as significantly 
lower, and they were reliably less friendly 
toward him when he revealed a history of 
mental illness in comparison to when he did 
not (Farina & Felner, 1973). 

In view of these data, some wholly unex- 
pected findings were obtained in a series of 
four studies, all using the same procedure 
(Farina, Felner, & Boudreau, 1973; Farina 
& Hagelauer, 1975). Female department store 
clerks met a female confederate in the guise 
of a job applicant and were asked to evaluate 
her as a potential co-worker. Half were told 
that the applicant was an ex-mental patient, 
and the rest were told that she was an ordi- 
nary job seeker. Unlike all prior studies of 
males, the women were no less accepting of 
the ex-mental patient than of the normal ap- 
plicant. The study was replicated with fe- 
male workers in a hospital, using a different 
female confederate, and the identical results 
were obtained. There followed another replica- 
tion at the same hospital, but this time the 
subjects were male workers who met a male 
confederate. It was then found that males
strongly rejected the applicant with a history
of mental illness. To determine if it was sex 
of the subject or sex of the mental patient 
that accounted for the differences, a fourth 
replication was carried out. A new group of 
female department store clerks met a different 
male confederate, and no difference was found 
in degree of acceptance accorded to the appli- 
cant whether he was perceived as an ex-pa- 
tient or as normal. 

Attitude studies hardly explain these clear 
sex differences in acceptance of former mental 
patients. Nunnally (1961) found no sex dif- 
ferences in attitudes in any of the numerous 
studies that he has done. In a later attitude 
study (Farina et al., 1973), it was found that 
males in comparison to females expressed a 
greater liking for and a greater willingness to 
work with ex-mental patients. This kind of 
incongruity between attitudes and behaviors 
is not peculiar to the topic of mental illness. 
It has been found in a number of areas (e.g., 
Farina & Holzberg, 1967; Kutner, Wilkins, & 
Yarrow, 1952). 

Thus, prior behavioral studies have shown 
that females are perfectly willing to accept 
either a male or a female as a co-worker even 
if the applicant has been mentally ill. Male 
workers, however, strongly reject a male 
applicant if he has a history of mental illness. 
How males would act toward a female appli- 
cant who has been in a mental hospital has 
not been determined, and this is one purpose 
of the present study. In the prior studies an 
additional variable, being tense and nervous 
or being calm, was also investigated. With 
half of the subjects in the mentally ill condi- 
tion and half in the control condition, the 
“applicant” acted visibly tense and nervous. 
Calm and relaxed behavior was displayed 
with the remaining subjects. In all four stud- 
ies the nervous applicant was unambiguously 
rejected. This manipulation is also carried out 
in this investigation to determine if tense peo- 
ple are disliked generally or whether this 
effect, too, is sex related. 


Method 


The subjects of the study were 48 males employed 
in the physical plant division of a state university. 
The nature of the work they did was quite variable, 
ranging from supervision of heating plant operations 
to janitorial work. They were contacted by a super- 
visor and were informed that the possibility of using 
workers to evaluate job applicants was being in- 
vestigated. Therefore, they were told that workers 
were needed to interview and give their opinion of a 
potential co-worker, and they were asked to volun- 
teer their help during regular working hours. Those 
who volunteered were contacted, and a time was set 
for the interview to take place. All but 1 man who 
volunteered actually became subjects of the study, and 
12 were randomly assigned to each of the four cells. 
The single exception was a man who expressed the 
conviction that we were trying to fire people like 
him, and he was excused from the study. 

The workers were seen in offices belonging to 
the physical plant division. It was explained to 
them that workers already on the job knew that 
job best and, therefore, they would meet a job 
applicant and evaluate how well the new person 
would do if placed in the same department in 
which they worked. They were further told that 
the university wanted to see how disadvantaged 
people (former mental patients) would do on cer- 
tain jobs. To find out, they were told that some 
workers would meet former mental patients who 
were interested in working for the university and 
that other workers would meet ordinary applicants to 
provide comparison data by means of which the 
suitability of ex-patients could be judged. To avert 
suspicion in the event that the employees compared 
notes, they were told that one applicant might be 
seen by several workers. 

Each subject was asked to talk to the applicant 
and to form a basic impression about her. This was 
stressed as most important, since later, in private, 
he was to indicate how he judged the applicant 
would do if hired. He was also asked to describe 
the essentials of his work to help her decide about 
accepting the job, if it were offered. As part of a 
brief background statement about the applicant, 
half of the workers were then told that the person 
they were about to meet was a former mental pa- 
tient, whereas the rest were informed that she was 
an ordinary job applicant. Actually, everybody met 
the same confederate,¹ a female undergraduate stu- 
dent in her early 20s. She was introduced with dif- 
ferent names, and she changed clothing several 
times in a given day to avert suspicion. The workers 
generally seemed pleased to participate in the 
project and appeared to believe the experimenter. 

The confederate related the same personal history 
in all conditions. She reported having graduated from 
high school and working as a clerk in a hardware 
store. She also said that she had worked in a 
grocery store and had been employed as a secretary. 
However, with half of the workers in the mental 
patient condition and half in the control condition, 
she behaved in a calm, relaxed manner, whereas 
with the rest she was tense and anxious. The be- 
havior selected to indicate anxiety was the same as 
was used in the prior four studies. In the anxious 
condition the confederate seldom looked the em- 
ployee in the eye, only occasionally stealing a glance. 
She kept her head down while frequently wringing 
her hands, and periodically she swallowed as if her 
throat were dry. To avoid possible changes in be- 
havior as a function of believing the worker con- 
sidered her normal or blemished by a history of 
mental illness, she was kept ignorant of what the 
subjects had been told. 


¹ We would like to thank Maureen Kohler for 
doing this work. 

After the confederate had described her work ex- 
perience, the employee was instructed to describe his 
job. The applicant was then thanked and dismissed, 
and a postexperimental questionnaire was read to 
the worker, who responded to the items verbally. 
None of the workers seemed suspicious about any 
aspect of the study. We believe that participation in 
the study during regular working hours and at the 
request of the supervisor helped to avert skepticism. 
We further believe that the effort being made by the 
University during that period to hire from popula- 
tions underrepresented by the employees also helped. 
The true nature of the study was never revealed, 
both to insure subject naiveté and because disclosure 
seemed more harmful than nondisclosure. 


Results and Discussion 


A central purpose of the present study was 
to determine the reception that would be given 
to a female former mental patient by male 
workers. To do this, the 15 items in the post- 
experimental questionnaire were analyzed 
using two-factor analyses of variance, with 
one factor being history of the applicant 
(mental illness-no mental illness) and the 
other component being the confederate be- 
havior (tense-calm). For one of these items, 
a significant main effect (p < .05) for mental 
illness was found. When asked to describe the 
applicant’s assets, the workers perceived her 
as having fewer assets if she had been in a 
mental hospital than if she had not. Although 
this finding was close to chance level (1 sig- 
nificant item in 15), it was anticipated that 
not all items would be equally revealing 
about unfavorable dispositions toward ex- 
patients. In our society people are expected 
to be helpful toward unfortunate others, like 
mental patients, rather than doing them 
harm. Consequently, as in the prior studies, 
the workers were not expected to express dis- 
favor openly, such as by saying they would 
not get along with the applicant in the men- 
tal patient condition. However, the expression 
of negative feelings was expected on items on 
which it was not obvious that the respondent 
was registering personal disfavor. One item 
viewed as especially subtle was the one in- 
quiring about the assets possessed by the 
applicant. This reasoning is described in 
detail, and empirical support for it is pre- 
sented in another publication (Farina, Chap- 
nick, Chapnick, & Misiti, 1972). Therefore, 
even though only one item differed between 
conditions, the fact that it was a subtle item 
suggests that the history of hospitalization 
did lead the workers to reject the female 
applicant. However, this rejection is much 
milder than that shown toward a male appli- 
cant in a previous study (Farina et al., 1973). 
There are also some significant interactions 
which indicate that these subjects have an 
unfavorable view of a female former patient. 
In the calm condition the confederate was 
described as significantly more tense when 
she was presented as a patient than when 
she was described as an ordinary job appli- 
cant. On the other hand, when nervous and 
tense, it was the normal applicant who was 
rated as more tense (p < .05). It seems that 
the normal applicant is expected to be calm, 
and when she is tense her nervousness is very 
salient. In contrast, the ex-mental patient is 
expected to be nervous, and she is seen as 
tense even when objectively the tension is 
not there. A similar pattern was found for 
the item asking how well the applicant would 
do the job if hired. A reliable interaction 
(p < .05) indicates that the normal person 
is expected to do better when calm, whereas 
a former patient is expected to do better 
when nervous. Possibly, better performance 
is anticipated and people are liked more 
when they behave in accordance with stereo- 
typic beliefs. One of the studies reported by 
Farina et al. (1973, p. 366) strengthens this 
possibility, since the same results were found. 
The findings concerning the calm-tense 
variable were consistent and quite clear. 
When tense, the confederate is expected to 
get along less well with other workers (p < 
.05), she is perceived as having fewer assets 
(p < .001) and more liabilities (p < .001), 
she is judged to be less reliable (p < .05), 
and she is thought to be less well adjusted 
(p < .001) and valuable (p < .05) than 
when she is calm. 
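
The earlier remark that one significant item out of 15 is close to chance level can be made concrete with a short calculation. The sketch below assumes, purely for illustration, that the 15 questionnaire items are tested independently at the .05 level. That assumption is ours and is unlikely to hold exactly for items answered by the same respondents, and the function names are hypothetical.

```python
# Illustrative sketch only: treats the 15 item-level tests as independent,
# each run at alpha = .05. Neither assumption is stated in the report.

def expected_false_positives(n_tests: int, alpha: float) -> float:
    """Expected number of nominally significant tests when no effect exists."""
    return n_tests * alpha


def prob_at_least_one(n_tests: int, alpha: float) -> float:
    """Probability that one or more tests reach significance by chance alone."""
    return 1.0 - (1.0 - alpha) ** n_tests


if __name__ == "__main__":
    print(expected_false_positives(15, 0.05))
    print(round(prob_at_least_one(15, 0.05), 3))
```

Under these assumptions, a little under one significant item is expected by chance, and the probability of at least one such item is slightly above one half. This is why the single main effect is interpreted with caution.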

Before considering the meaning of these 
results, we need to take note of the meth- 
odology used in this and the similar prior 
studies, since this bears on the conclusion we 
can draw. In each study, the subjects were a 
reasonably random sample of workers in that 
setting, and so we believe that other workers 
drawn from comparable populations would 
behave like our subjects if they were to face 
the same conditions. However, only one con- 
federate was used in each study. The major 
problem with such a procedure was clearly 
recognized by Brunswik (1947) and force- 
fully described by Hammond (1948). In 
brief, if we want to determine how workers 
respond to an ex-mental patient, we need 
both a representative sample of workers and 
a representative sample of ex-mental patients. 
Of course, we can conclude that workers 
react unfavorably to at least one person— 
the confederate we used. But another indi- 
vidual, whose physique and personality are 
different, might not elicit the same reaction. 
Data pertinent to this issue were reported by 
Farina, Thaw, Felner, and Hust (1976). 
Those researchers examined how the social 
impact of stigmatizing conditions (including 
mental illness) was influenced by individual 
differences among people. Four confederates 
played the role of a normal or stigmatized 
person with subjects drawn from the same 
population. A given subject saw only one 
confederate in only one of the three roles. It 
was found that the effect of the stigmatizing 
conditions was reliably influenced by the in- 
dividual characteristics of each confederate. 

On the other hand, Farina et al. (1976) 
also found response patterns that were the 
same for all confederates; for example, when 
perceived as mentally retarded, each con- 
federate was given reliably less painful shocks 
than when perceived as normal. Thus, indi- 
vidual differences among people do not neces- 
sarily invalidate all findings obtained with 
just one person. More important, the present 
study is the fifth in a series, all using identical 
procedures. The five experiments contained 
five different confederates, two males and 
three females, and were done in three differ- 
ent places of employment. Yet, the findings 
are extremely consistent. For the calm-tense 
variable, we find that the confederate, when 
nervous, is always evaluated much less favor- 
ably. The mentally ill—normal results are 
also consistent, with males evaluating the 
confederate in the ex-mental patient condition 
less favorably than in the control condition 
and women responding the same in the two 
conditions. 

What conclusions can we reasonably draw 
from this and the earlier studies? There ap- 
pear to be two rather clear sex differences in 
the way that mental patients are treated. 
First, men appear to be unfavorably disposed 
toward members of either sex if they have 
been mentally ill, whereas women fully ac- 
cept such a person. And second, focusing on 
the victim of the mental illness, a male is 
treated more poorly than a female even when 
they have identical psychiatric histories. The 
results from this series of five studies are 
surprising in view of the past studies that 
uniformly report that there are no sex dif- 
ferences of the sort that we have found. 
Leaving inconsistencies between the present 
and past studies aside for the moment, how 
are we to understand our findings? There 
are reports in the literature that suggest 
explanations. 

Parsons and Bales (1955) have asserted 
that women are more concerned with ongoing 
interpersonal relationships than are men, 
whereas men focus more on goals in the 
future. The confederate acted in the same 
way both as a normal and as an ex-mental 
patient; women, perhaps, respond similarly 
in the two conditions because they are more 
influenced by the immediate behavior than 
by the cues about future threats from the 
former patient. On the other hand, men are 
more concerned about the future and may 
have rejected the applicant with the psy- 
chiatric history because of fears such as of a 
disruption in their work careers. As for the 
preference men show for female in comparison 
to male ex-mental patients, a possible ex- 
planation is that males are thought more 
likely to be aggressive and disruptive once 
hired. It has been reported that males are 
viewed as more disposed than females to 
react to stress with aggressive behavior 
(Coie, Pennington, & Buckley, 1974). How- 
ever, the evidence in support of these ex- 
planations is of uncertain value, since it is 
based on surveys of opinions. It appears 
likely that it is because attitudinal measures 
were used in the prior studies of social re- 
actions to mental patients that the sex dif- 
ferences we found were so unexpected. By 
now it should be very clear that the atti- 
tudes that people express and what they 
actually do are not necessarily consistent. 
Hence, if we are interested in behavior, we 
should measure behavior as directly as 
possible. 

The finding that nervous people are so 
unequivocally and strongly rejected has both 
theoretical and practical implications. On 
the theoretical level, it seems important to 
know why tense people are responded to so 
unfavorably. Perhaps such a response is lim- 
ited to circumstances in which someone is to 
be judged as a worker and means only that 
anxious people are expected to do a poor 
job. But it seems likely that this reaction is 
much more general, and it may mean that 
nervous people evoke memories of negative 
events or promise to bring trouble in the 
future. Whatever is responsible, we can ex- 
pect that tense people will be less readily 
accepted than calm individuals, and we can 
improve their social relationships if we can 
reduce the visibility of their tension. If men- 
tal patients are especially tense, as is widely 
believed, it might be particularly important 
to their readjustment at home if they can be 
helped to look more calm. Conceivably, the 
nervousness displayed by ex-patients is in 
part responsible for the difficulties that they 
encounter in finding a job. 


Prospects for Faking Believable Deficits 
on Neuropsychological Testing 


Robert K. Heaton, Harold H. Smith, Jr., Ralph A. W. Lehman, 
and Arthur T. Vogt 
University of Colorado School of Medicine 


This study compared the results of 16 volunteer malingerers with those of 16 
cooperative, nonlitigating head-trauma patients on the Wechsler Adult Intelli- 
gence Scale, the Halstead-Reitan battery, and the Minnesota Multiphasic Per- 
sonality Inventory (MMPI). The overall level of ability impairment shown by 
the malingerers equaled that of the head-injury group, but different patterns 
of strengths and deficits were produced by the two groups on testing. The 
malingerers also showed more severe personality disturbance on the MMPI. 
The test protocols were sent to 10 neuropsychologists, who made “blind” judg- 
ments as to whether each was probably produced by a malingerer or by a real 
head-injury patient. Neuropsychologists’ diagnostic accuracies ranged from 
chance-level prediction to about 20% better than chance. Discriminant func- 
tions based on the neuropsychological test results and the MMPI, respectively, 
correctly classified 100% and 94% of subjects in both groups. In another large 
sample of head-injury patients, those who were involved in court actions and/or 
gave clinical evidence of faking were more likely to be classified as malingerers 
by the discriminant functions. 


Neuropsychological tests are widely used in 
clinical settings to help diagnose brain lesions. 
Their role in this context is usually ancillary. 
That is, they help predict the presence and 
nature of neurologic conditions, which must 
then be confirmed by procedures involving 
more cost, risk, and/or discomfort. On the 
other hand, because they directly measure 
abilities affected by brain damage, neuropsy- 
chological tests have a more definitive role 
when the questions being asked deal specifi- 
cally with the behavioral consequences of neu- 
rologic conditions. Such consequences can be 
important in themselves, not merely as symp- 
toms contributing to a medical diagnosis. For 
the many brain-damaged patients whose con- 
ditions are neither grossly incapacitating nor 
immediately life threatening, it is important 
to know which abilities have been affected, 
the severity of the various deficits, and their 
probable impacts on the patients’ everyday 
functioning (e.g., on social relationships and 
potential for working and living indepen- 
dently). In some civil and criminal court pro- 
ceedings, there is also a need to define as 
precisely as possible the effects of brain lesions 
on adaptive abilities. In such cases a dollar 
value is to be placed on the disability resulting 
from a head injury, or legal competency is to 
be decided. 

To be valid for any of the purposes men- 
tioned above, neuropsychological testing re- 
quires adequate effort on the part of the 
patient. Most people are inclined to do their 
best on tests, and most patients have more 
to gain by appearing capable than by em- 
phasizing deficits. This is not always true, 
however, particularly when the test results 
are to be used to justify compensation or other 
claims. Thus, in testifying before the courts 
as an expert witness, the clinical neuropsy- 
chologist is frequently asked a difficult ques- 
tion: “Is it possible to exaggerate deficits on 
these tests or even to fake deficits that do 
not exist?” In some cases one can argue con- 
vincingly that the particular pattern of test 
results makes good sense neurologically, and 
that it is therefore unlikely that the patient 
was malingering. However, few (if any) clini- 
cians have had experience with known ma- 
lingerers, and no research has shown whether 
malingerers’ neuropsychological test scores can 
be distinguished from those of nonmalingering 
brain-damaged patients. 

The Minnesota Multiphasic Personality In- 
ventory (MMPI) is often used in conjunction 
with neuropsychological tests to help deter- 
mine the types and degrees of emotional dis- 
turbances that may be associated with neuro- 
logic impairment (Boll, Heaton, & Reitan, 
1974; Dikmen & Reitan, 1977). MMPI scales 
and formulas have been developed for detect- 
ing conscious malingering and other test-tak- 
ing attitudes that may affect MMPI profile 
validity or serve as moderator variables in 
profile interpretation (Anthony, 1971; Cofer, 
Chance, & Judson, 1949; Gough, 1947, 1950, 
1954; Osborne, 1970). It is not known 
whether patients tend to display the same 
test-taking attitudes on the MMPI and on 
neuropsychological tests; if they do, MMPI 
signs of malingering might alert clinicians to 
the possible invalidity of neuropsychological 
protocols produced by the same patients. Also, 
there is some evidence that the MMPI may 
be helpful in identifying patients with func- 
tionally based neurological complaints (Shaw 
& Matthews, 1965). 

The present study compares the results of 
some volunteer malingerers with those of non- 
litigating head-injured patients on the MMPI 
and a detailed battery of neuropsychological 
tests. In addition, these test results were sent 
to 10 neuropsychologists to determine whether 
they could judge which protocols were pro- 
duced by malingerers and which were pro- 
duced by the patients with real head injuries. 


Method 
Subjects 


The head-injury group was composed of 13 males 
and 3 females, with a mean age of 26.7 (SD = 6.5) 
years and a mean of 11.9 (SD = 1.6) years of edu- 
cation. All of these patients had been referred for 
clinical neuropsychological evaluations, and all were 
considered to have put forth adequate effort on the 
testing. All had documented histories of traumatic 
head injuries, followed by at least 12 hours of un- 
consciousness, and had no history of any other 
neurological illness. At the times of testing, all of 
the patients had residual neurological deficits, but 
none had peripheral injuries to their upper extrem- 
ities that would interfere with test performance. 
Also, none of these individuals were involved in 
civil or criminal court actions, and none were ap- 
plying for disability income support. 

The malingering subjects were recruited by the 
second author from his neighborhood, college classes, 
and church. These subjects were told that they 
would be given $25 for taking the neuropsychologi- 
cal test battery and a $5 bonus if they were suc- 
cessful fakers. Actually, all subjects were paid the 
$30 regardless of how much they faked. Twenty 
subjects agreed to participate in the study as ma- 
lingerers, but only 16 gave any evidence of actually 
malingering on the tests. The other 4 individuals 
earned normal scores on the entire neuropsycho- 
logical test battery and were eliminated from the 
study.¹ There are two reasons for omitting them. 
First, the normal test scores would not have justi- 
fied compensation in a real court case. Second, the 
purpose of this study was to learn whether patterns 
of neuropsychological deficits due to head injury 
can be distinguished from those faked by ma- 
lingerers. The distinction between real deficits and 
no deficits is of little interest in this context. 

The malingerers who were retained in the study 
included 11 males and 5 females. Their mean age 
of 24.4 years (SD = 7.5) was not significantly dif- 
ferent from that of the real head-injury group, 
t(30) = .94, p > .05. Also, their mean of 12.9 years 
(SD = 2.4) of education was comparable to that 
of the head-injury group, t(30) = 1.39, p > .05. To 
provide an estimate of their actual intelligence, prior 
to their testing as malingerers, these subjects were 
administered the Shipley-Hartford Intelligence Scale 
(Shipley, 1940) under standard conditions; that is, 
they were asked to do their best. Using the system 
proposed by Paulson and Lin (1970) for predicting 
Wechsler Adult Intelligence Scale (WAIS) Full Scale 
IQ values from Shipley scores, we obtained a group 
mean of 113.2 (SD = 7.5). 
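
The group comparisons on age and education can be approximately re-derived from the reported means and standard deviations. The following is a minimal sketch that assumes a pooled-variance independent-samples t test, which is consistent with the reported 30 degrees of freedom (16 + 16 - 2) but is not named in the text. The function name `pooled_t` is hypothetical.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Independent-samples t statistic with pooled variance; returns (t, df).

    Assumption: the reported t values come from a pooled-variance test.
    """
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error, df


if __name__ == "__main__":
    # Age: head-injury group (26.7, SD 6.5) versus malingerers (24.4, SD 7.5).
    t_age, df_age = pooled_t(26.7, 6.5, 16, 24.4, 7.5, 16)
    # Education: malingerers (12.9, SD 2.4) versus head-injury group (11.9, SD 1.6).
    t_edu, df_edu = pooled_t(12.9, 2.4, 16, 11.9, 1.6, 16)
    print(round(t_age, 2), df_age)
    print(round(t_edu, 2), df_edu)
```

Because the inputs are the rounded summary values, the recomputed statistics agree with the reported ones only to within rounding: about .93 against the reported .94 for age, and 1.39 for education.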


Tests 


Neuropsychological evaluations were administered 
by experienced technicians who were trained and 
supervised by the first author. The evaluations con- 
sisted of all 11 subtests of the WAIS (Wechsler, 
1955) and all of the tests normally included in the 
Halstead-Reitan Neuropsychological Test Battery 
for adults. These latter tests, which are described 
in detail in Reitan and Davison (1974), include 
the Category Test; the Tactual Performance Test; 
the Speech-Sounds Perception Test; the Seashore 
Rhythm Test; the Finger Oscillation Test; Parts A 
and B of the Trail Making Test; the Reitan-Kløve 
Sensory-Perceptual Examination, with the Tactile 
Form Recognition Test replacing Tactile Coin Rec- 
ognition as the measure of dystereognosis; the 
Aphasia Screening Test, including the testing of 
visuoconstructive abilities; measures of grip strength, 
obtained with a hand dynamometer; the Grooved 
Pegboard and Static Steadiness Tests from the 
Kløve-Matthews Motor Steadiness Battery; and the 
Reitan-Kløve Lateral Dominance Examination. The 
MMPI was also administered to provide objective 
personality measures that might be differentially 
associated with real brain damage versus deliberate 
attempts to exaggerate pathology. 


¹ One of these four individuals had started to fake, 
but he was so obvious about it that the technician 
severely reprimanded him and threatened to call the 
attorney who had supposedly referred him for test- 
ing. From that point on, the subject earned above 
average scores on all tests given. 

Numerous published studies have shown that the 
neuropsychological test battery used in this study 
is sensitive to focal and diffuse cerebral lesions 
caused by diverse neurologic conditions. (See re- 
views of this literature in Kløve, 1974, and Russell, 
Neuringer, & Goldstein, 1970.) In addition, this 
battery is well suited to the task of cataloging resid- 
ual strengths and deficits after a traumatic head 
injury, because it samples broadly the adaptive 
abilities that can be affected by cerebral lesions, 
such as sensory, motor, cognitive, language, visuo- 
spatial, as well as other mental abilities. An addi- 
tional advantage of the test battery approach is 
that it permits analysis of both the level and the 
pattern of test performance in making diagnostic 
inferences (Reitan, 1966). Given the test scores from 
the Halstead-Reitan battery, experienced clinicians 
have been able to use these complementary diag- 
nostic methods to infer not only the presence and 
location of cerebral lesions but also their etiologies 
(Filskov & Goldstein, 1974; Reitan, 1964). Even 
though it may be very difficult for a clinician to 
discern whether one or a few poor test scores have 
been faked, a consideration of the pattern of 
strengths and deficits shown on more comprehen- 
sive testing may make such discrimination possible; 
that is, to fool the clinician on the test battery, 
the malingerer must show a pattern of results that 
is similar to patterns of scores earned by head- 
injury patients with real deficits. 


Judges 


Ten neuropsychologists provided independent 
“blind” ratings as to whether they thought each test 
protocol was produced by a malingerer or by a 
nonmalingering head-injury patient.² The second 
and fourth authors participated as judges. How- 
ever, at that time they had no knowledge of the 
numbers of subjects in each group and no previous 
exposure to the test results of any of the subjects. 
The 10 judges differed greatly with respect to pre- 
vious experience in interpreting the Halstead-Reitan 
battery: The range was 8 weeks to 18 years full- 
time equivalent, with 5 of the judges having 4 or 
more years of experience. Judges with varying ex- 
perience were sought in order to assess the possi- 
ble role of experience in increasing diagnostic ac- 
curacy. 


Procedure 


The technicians who tested the malingering sub- 
jects were not told about this study until after the 
data collection phase was completed. To replicate 
the usual testing situation as closely as possible, it 
was considered necessary for the technicians to test 
these subjects as real clinical patients. Therefore, 
these subjects were scheduled as litigating patients, 
referred by one of two Denver attorneys who had 
previously sent real patients for testing.³ 

In preparation for their neuropsychological eval- 
uations, the volunteer malingerers were asked to 
pretend that they had suffered head injuries in ac- 
cidents caused by other persons. Subjects were to 
consider themselves involved in litigation to deter- 
mine how much financial compensation they would 
obtain from the persons responsible for the acci- 
dents or from the insurance companies involved. 
They were told to imagine that their everyday func- 
tioning (e.g., in school and vocational activities) 
had been much worse since their accidents, that 
their potential earning powers had been substan- 
tially reduced, and that they deserved all the money 
that the courts would allow them. It was explained 
that their psychological test results would help de- 
termine how large their settlements would be. They 
were encouraged to fake the most severe disabil- 
ities that they could, without making it obvious to 
the examiner that they were faking. They knew 
that it was necessary for the technicians to think 
that they were real clinical patients. The subjects 
were told nothing about the test battery beyond 
what is usually told to real patients—that they 
would be tested for a full working day; that the 
tests cover a variety of sensory, motor, and cog- 
nitive functions; and that the tests measure dis- 
abilities that result from brain injuries. They were 
also given some of the other background informa- 
tion that real trauma patients have: (a) a story 
about the accident, duration of coma and hospitali- 
zation, whether there was a skull fracture, whether 
seizures developed and, if so, what the seizures were 
like and how they were being treated (this infor- 
mation was taken from the files of real head-trauma 
patients) and (b) a very general description of the 
neurological lab tests and clinical exams that they 
would have received if they had been real trauma 
patients. 


² We are grateful for the assistance of Elgan Baker, 
Gordon Chelune, Charles Cleeland, Igor Grant, 
Robert Ivnik, Charles Matthews, Homer Reed, and 
James Reed, who served as clinical judges. 
³ We appreciate the permission granted by Nell 
Hillyard and Gerald McDermott to have malingering 
subjects scheduled for testing under their names. 

The examiner technicians questioned whether 7 of 
the 16 malingerers had put forth optimal effort on 
one or more tests. The other 9 malingerers were 
successful in convincing the examiners that they had 
given their best performances during all of the 
testing. 

Each test protocol sent to the neuropsychologist 
judges contained the following: (a) the subject’s 
age, sex, years of education, present/most recent 
occupation, and time elapsed since the subject’s real 
or pretended head injury; (b) the 3 IQ values and 
11 subtest scaled scores on the WAIS; (c) the 
Halstead Impairment Index and scores on the Cate- 
gory Test (total error score, plus scores on the 
seven individual groups), Trail Making Test (time 
and error scores), Tactual Performance Test (times 
and numbers of blocks placed during all three trials 
plus Memory and Location scores), Speech-Sounds 
Perception and Seashore Rhythm Tests, Lateral 
Dominance Examination, and for each hand on the 
Finger Oscillation, hand dynamometer, Grooved 
Pegboard, and Static Steadiness tests; (d) xeroxed 
data sheets with individual responses recorded on 
the Aphasia Screening Test and Sensory-Perceptual 
Examination; and (e) T scores for the 3 validity 
scales and 10 standard clinical scales of the MMPI. 

Judges were told of the general design of the 
study and were provided details of the subject se- 
lection procedures and instructions given to the 
paid malingerers. They knew that some of the sub- 
jects were malingerers and that some were non- 
malingering head-injury patients, but they were not 
told how many of each were in the total group. 
Judges were asked to consider each protocol, one 
at a time, and to make their judgments about the 
protocol without reference to the others. Furthermore,
each judge was asked to consider the 32 protocols in a
different random order. After reviewing each protocol,
the judge made two decisions. The first was whether the
protocol was probably produced by a malingerer or by a
genuine head-injured patient. Second, the degree of
confidence with which this first judgment was made was
to be rated according to a 4-point scale: very sure,
sure, fairly sure, unsure.


Results⁴


Table 1 presents the means, standard deviations,
and t-test results for the malingering versus
head-injury group comparisons on all test measures.
The malingerers did as badly as the real head-injury
patients in overall level of ability test performance;
that is, there were no significant differences between
the groups on the three WAIS IQ values or on the two
neuropsychological summary measures (Average
Impairment Rating and Halstead Impairment Index).
Also, the mean Full Scale IQ obtained by the
malingerers when asked to fake was 17 points lower
than their actual mean IQ as estimated with the
Shipley-Hartford scale, t(30) = 6.39, p < .001. These
results suggest that the malingering group did in fact
fake deficits during neuropsychological testing.
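
As a rough check on how the tabled t values relate to the tabled means and standard deviations, the following sketch computes an independent-samples t statistic from summary statistics alone. It assumes a pooled-variance test with 16 subjects per group; the article does not state which t-test variant was used, and the function name `pooled_t` is purely illustrative. Because the published means and standard deviations are rounded, recomputed values can differ slightly from those in Table 1.

```python
import math


def pooled_t(mean_a, sd_a, mean_b, sd_b, n_a=16, n_b=16):
    """Independent-samples t (pooled variance) from summary statistics.

    Returns the absolute t value and its degrees of freedom.
    The default group sizes follow the study's 16 + 16 design.
    """
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / df
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return abs(mean_a - mean_b) / standard_error, df


# Tactual Performance Test, Location component (values as printed in Table 1):
# malingerers M = 5.2, SD = 2.1; head injuries M = 2.8, SD = 1.8.
t_location, df = pooled_t(5.2, 2.1, 2.8, 1.8)
print(f"t({df}) = {t_location:.2f}")
```

With equal group sizes the pooled and unpooled (Welch) statistics coincide, so the choice of variant matters only for the degrees of freedom.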

Although there were minimal group differ- 
ences in overall level of performance, the ma- 
lingerers did differ from the real head-injury 
group in the pattern of strengths and deficits 
shown on testing. The real head-injury group 
did significantly worse than the malingerers 
on the Category Test; Part B of the Trail 
Making Test (error component); and on the 
Total Time, Memory, and Location compo- 
nents of the Tactual Performance Test. The 
malingerers did worse on the Speech-Sounds 
Perception Test, the Finger Oscillation Test, 
finger agnosia, sensory suppressions, hand 
dynamometer, and WAIS Digit Span. In ad- 
dition, the malingerers showed (faked) more 
emotional disturbance on the F scale and on 
six clinical scales of the MMPI. 

The neuropsychologist judges correctly 
classified from 50.0% to 68.8% of the subjects 
in this study. Sensitivity, or true positive 
rate for real head injuries, ranged from 
43.8% to 81.3%. Specificity, or true negative 
rate for malingerers, ranged from 25.0% to 
81.3%. Three somewhat different measures 
of diagnostic accuracy are more directly rele- 
vant to the clinical situation: the probability 
that the subject has a real head injury if the 
judge says he/she does; the probability that 
the subject is faking if the judge says he/she 
is; and the judge’s overall efficiency, or total 
correct classification rate for the combined 
population of head-injury patients and ma- 
lingerers. These three measures vary not only 
with the sensitivity and specificity of the 
judge but also with the prevalence of ma- 
lingerers in the population being studied. In 
this study the prevalence rate of malingering 
was set arbitrarily at 50%, but in general 


4 Gary Zerbe’s assistance with some of the data analyses is gratefully acknowledged.

Table 1
Neuropsychological and Personality Test Means, Standard Deviations, and t-Test Results
for Head Injury-Malingerer Group Comparisons

                                        Malingerers         Head injuries
Variable                                M         SD        M         SD        t
Neuropsychological summary measure
  Average impairment ratingᵃ            1.91      .6        1.97      .7        .28
  Halstead Impairment Index             .59       .2        .68       .3        1.04
WAIS
  IQ
    Full Scale                          96.2      9.7       97.2      11.5      .27
    Verbal                              98.1      10.3      100.4     10.7      .64
    Performance                         94.1      11.6      93.1      14.6      .23
  Scaled score
    Information                         10.1      2.5       10.2      2.3       .15
    Comprehension                       9.6       3.1       10.1      2.9       .53
    Arithmetic                          9.2       2.5       9.8       2.6       .56
    Similarities                        10.6      1.5       10.6      2.7       .08
    Digit Span                          7.0       2.9       9.5       2.5       2.64*
    Vocabulary                          10.3      1.1       10.2      1.9       .11
    Digit Symbol                        6.7       2.5       7.2       3.0       .51
    Picture Completion                  9.0       2.1       10.0      2.0       1.37
    Block Design                        10.4      2.6       9.0       2.9       1.48
    Picture Arrangement                 9.2       2.1       8.1       3.2       1.23
    Object Assembly                     9.4       2.6       9.7       3.2       .24
Halstead-Reitan battery
  Category Test (errors)                46.1      20.1      67.4      27.2      2.52*
  Trail Making Test
    Part A (sec)                        55.4      25.2      49.8      32.3      .54
    Errors                              .12       .3        .31       [?]       1.28
    Part B (sec)                        109.3     54.3      140.8     72.1      1.40
    Errors                              .81       1.0       2.00      [?]       2.19*
  Tactual Performance Test
    Total time/block (min)              .5        .4        1.2       1.3       2.23*
    Memory                              8.1       1.1       6.2       1.7       3.73***
    Location                            5.2       2.1       2.8       1.8       3.48**
  Speech Sounds Perception (errors)     23.8      12.6      10.6      7.3       3.64***
  Seashore Rhythm Test (correct)        21.4      4.5       23.8      4.1       1.60
  Finger Oscillation Test
    No./20 sec                          63.1      [?]       [?]       [?]       [?]*
  Tactile Form Recognitionᵇ             [?]       [?]       [?]       [?]       [?]
    Time (sec)                          [?]       [?]       12.8      32.5      .83
  Finger agnosia (errors)               7.2       5.6       3.5       [?]       2.23*
  Finger Tip Writing (errors)           6.5       4.9       5.9       7.3       .26
  Suppressions (number)                 10.6      7.4       4.1       5.8       2.80*
  Crosses (rating)ᵃ                     3.1       1.0       2.7       .9        1.09
  Aphasia (errors)ᵃ                     11.2      11.9      [?]       6.1       .45
Added motor tests
  Hand dynamometer (kg)ᵇ                45.8      20.8      76.4      30.5      3.32
  Grooved Pegboard, timeᵇ               [?]       [?]       6.3       7.3       1.30
  Hole-type steadinessᵇ
    Sec                                 [?]       11.6      26.5      39.3      1.60
    Hits                                [?]       60.3      99.1      84.0      .92
MMPI (T scores)
  Validity scale
    Lie                                 49.4      8.2       52.4      10.7      .91
    Validity                            79.9      18.9      64.1      12.2      2.81**
    Defensiveness                       48.3      7.4       53.6      9.7       1.73
  Clinical scale
    Hypochondriasis                     78.3      15.4      60.9      13.8      3.37**
    Depression                          85.2      15.1      73.2      18.1      2.04
    Hysteria                            73.6      9.0       61.7      10.6      3.43**
    Psychopathic Deviate                69.1      13.7      67.9      10.7      .27
    Masculinity-Femininity              55.9      15.9      61.3      9.6       1.16
    Paranoia                            71.4      13.9      61.1      11.9      2.26*
    Psychasthenia                       84.2      13.0      66.8      18.0      3.15**
    Schizophrenia                       93.9      21.2      73.6      22.4      2.64*
    Mania                               63.5      10.0      65.1      10.0      .44
    Social Introversion                 73.3      11.2      55.4      10.1      4.74***

Note. n = 32 (16 malingerers and 16 head-injury patients). WAIS = Wechsler Adult Intelligence Scale;
MMPI = Minnesota Multiphasic Personality Inventory. [?] = entry illegible in the source.
ᵃ Ratings are defined in Russell, Neuringer, and Goldstein (1970).
ᵇ Scores were summed for both sides of the body.
* p < .05.
** p < .01.
*** p < .001.


clinical or litigating populations the prevalence
may be much different. Table 2 presents the ranges of
these three diagnostic accuracy figures estimated for
our 10 judges at three prevalence rates of malingering:
30%, 50%, and 70%. The computations involved are
described in Galen and Gambino (1975). In general, the
success of the judges ranged from chance-level
prediction to prediction rates about 20% better than
chance. For most of the judges, sensitivity exceeded
specificity; these judges would do best under the
lowest malingerer prevalence conditions. Conversely, in
this study two judges correctly classified more
malingerers than head-injury patients, and they would
do better whenever the prevalence of malingering is
higher. The far-right column of Table 2 shows how
changing prevalence rates can affect judges’ overall
efficiencies.

Table 2
Ranges of Diagnostic Accuracies Expected From 10 Neuropsychologists Adjusted
for Prevalence of Malingering at Three Levels

                                                          Efficiency change
Rate               P(I/+)      P(M/-)      Efficiency     from P(M) = .30
P(M) = .30
  Judges’ range    .70-.85     .30-.50     .55-.70        -
  Chance           .70         .30         .50            -
P(M) = .50
  Judges’ range    .50-.70     .50-.69     .50-.69        -.10 to +.08
  Chance           .50         .50         .50            .00
P(M) = .70
  Judges’ range    .27-.50     .62-.85     .40-.70        -.20 to +.15
  Chance           .30         .70         .50            .00

Note. P(M) = prevalence of malingerers in the population being studied; P(I/+) = probability that the
patient has a real head injury, if the judge says he/she does; P(M/-) = probability that the patient is
faking if the judge says he/she is; efficiency = overall proportion of correct classifications to be expected
from the judge, given a specified P(M).
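
The dependence of these three figures on prevalence can be illustrated with a short sketch. It applies the standard Bayes-rule definitions of predictive value, which is assumed here to correspond to the Galen and Gambino (1975) computations cited above; the function name and the example sensitivity and specificity values are hypothetical and are not taken from any individual judge.

```python
def diagnostic_accuracy(sensitivity, specificity, p_malingering):
    """Predictive values and efficiency for a judge at a given prevalence.

    sensitivity   : proportion of real head injuries called real
    specificity   : proportion of malingerers called malingerers
    p_malingering : P(M), prevalence of malingerers in the population
    """
    p_injury = 1.0 - p_malingering
    true_pos = sensitivity * p_injury                 # injured, called injured
    false_pos = (1.0 - specificity) * p_malingering   # malingerer, called injured
    true_neg = specificity * p_malingering            # malingerer, called malingerer
    false_neg = (1.0 - sensitivity) * p_injury        # injured, called malingerer
    return {
        "P(I/+)": true_pos / (true_pos + false_pos),
        "P(M/-)": true_neg / (true_neg + false_neg),
        "efficiency": true_pos + true_neg,
    }


# A chance-level judge (sensitivity = specificity = .50) at the three
# prevalence rates considered in Table 2.
for prevalence in (0.30, 0.50, 0.70):
    chance = diagnostic_accuracy(0.50, 0.50, prevalence)
    print(prevalence, {name: round(value, 2) for name, value in chance.items()})

# A hypothetical judge whose sensitivity exceeds specificity.
example = diagnostic_accuracy(0.75, 0.50, 0.50)
```

For a chance-level judge the two predictive values simply track the base rates while efficiency stays at .50. The sketch does not guard against a judge who never gives one of the two verdicts, in which case the corresponding predictive value is undefined.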

The relationship between the judges’ total correct
classification rates in this study and their amounts
of previous experience in neuropsychology was not
statistically significant, r(8) = .27. The median
percentage of agreement for all possible pairs of
judges was only 56% (range = 31%-75%). The
relationship between confidence ratings (unsure to
very sure) and prediction accuracy was computed for
each judge using a point-biserial correlation; that is,
confidence ratings for correct versus incorrect
judgments were compared. These correlations ranged
from -.13 to .46 (df = 30). There was a nonsignificant
tendency for more experienced judges to assess more
accurately their likelihood of being correct in their
subject classifications; a product-moment correlation
of .24 (df = 8) was obtained between judges’ years of
experience in neuropsychology and their point-biserial
correlations for confidence ratings versus
classification accuracy. The order in which cases were
reviewed did not affect the probability of the cases
being correctly identified by the judges,
r(30) = -.20. Also, there was a nonsignificant
correlation between subjects’ severity of “impairment”
on the neuropsychological test battery (Average
Impairment Rating) and subjects’ likelihood of being
correctly identified by the judges, r(30) = .17.
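
A point-biserial correlation of the kind described above is simply a Pearson correlation in which one variable is dichotomous. The sketch below shows the computation for a single judge. The numeric coding of the confidence scale (1 = unsure through 4 = very sure) and the example ratings are assumptions made for illustration only; they are not data from the study.

```python
import math


def point_biserial(correct, confidence):
    """Pearson correlation between a 0/1 variable and a numeric rating.

    correct    : 1 if the judgment was correct, 0 if incorrect
    confidence : confidence rating for the same judgment
    """
    n = len(correct)
    mean_x = sum(correct) / n
    mean_y = sum(confidence) / n
    covariance = sum((x - mean_x) * (y - mean_y)
                     for x, y in zip(correct, confidence))
    ss_x = sum((x - mean_x) ** 2 for x in correct)
    ss_y = sum((y - mean_y) ** 2 for y in confidence)
    return covariance / math.sqrt(ss_x * ss_y)


# Hypothetical ratings for one judge over eight protocols.
correct = [1, 1, 0, 1, 0, 0, 1, 1]
confidence = [4, 3, 2, 4, 1, 2, 3, 3]   # assumed coding: 1 = unsure ... 4 = very sure
r_pb = point_biserial(correct, confidence)
print(f"r_pb = {r_pb:.2f}")
```

A positive value means the judge tended to be more confident on the protocols that were classified correctly. The function is undefined when a judge is always right, always wrong, or gives the same rating throughout.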

Because the head-injury and malingering 
subjects did show different patterns of per- 
formance on testing, it seemed possible that 
these differences might permit classification 
of subjects with a greater degree of accuracy 
than that achieved by the judges. Two step- 
wise discriminant function analyses (Nie, 
Hull, Jenkins, Steinbrenner, & Bent, 1975) 
were performed, one using the neuropsycho- 
logical variables and the other using the 
MMPI variables listed in Table 1. The optimal
neuropsychological discriminant function cutoff
correctly classified all subjects in both groups. The
optimal MMPI function cutoff missed only one subject
in each group.⁵

It was not possible to recruit more malingerers
for direct cross-validation of the discriminant
functions. However, a number of head-injury patients
not included in the function development had received
clinical evaluations in our neuropsychology
laboratory, and their results were available for
indirect cross-validation of the functions. Of these
patients, 12 had complete neuropsychological data
only, 14 had complete MMPI results only, and 62 had
both types of data complete. Forty-one of these 88
patients were known to be involved in civil or
criminal court actions at the times of testing, and 5
litigators and 1 nonlitigator were considered by the
examiners to have been obviously faking on the tests.
Thus, 42 of the head-injury patients had reason to
exaggerate pathology and/or gave strong clinical
evidence of doing so. Another 42 were known not to be
involved in court cases and were rated by the
examiners as having put forth adequate effort on
testing. Excluded from consideration were 2 patients
for whom it was uncertain whether they were litigating
and 2 who had had cerebrovascular accidents in
addition to head injuries.

The malingering discriminant function formulas
were applied to the remaining 84 patients. Of the 42
patients who were involved in court cases or who had
given strong clinical evidence of exaggerating
deficits, 27 (64.3%) were classified as being
malingerers by one or both formulas. Of the 42 who had
no obvious evidence of or reason to exaggerate
pathology, only 11 (26.2%) were classified as
malingerers by one or both formulas. The resulting
chi-square of 12.95 (df = 1, p < .001) indicates that
obvious fakers and patients involved in civil or
criminal court proceedings were significantly more
likely to be called malingerers by the formulas.


Discussion 


It is difficult to guess what proportion of
clinical patients exaggerate deficits on
neuropsychological testing. However, the question of
exaggeration probably should be considered more often
than it is, particularly when there are obvious
reasons for the patient not to do his or her best. It
is likely that the prevalence of malingering is
especially high among patients involved in
compensation litigation or competency disputes in
criminal trials. This group comprises half of all
head-injury referrals to our laboratory. In addition,
there are probably other social and personality
factors that cause some patients to fake or exaggerate
disability. For example, for some individuals the
expectation of special care or attention in the
family, greater concern from the treating physician,
or release from performance expectations might serve
as strong inducement for emphasizing real or imagined
deficits.

5 Due to the length of these discriminant function
formulas, they are not given here. They can be
obtained from the first author.

The overall correct classification rates obtained
by the neuropsychologist judges in this study are
rather modest, but it is unlikely that any of them
have had previous experience with known malingerers.
It would seem that the judges’ prior experience was
only indirectly relevant to the questions asked in
this study, and this may explain the low correlations
between amount of experience and accuracy of
judgments. Furthermore, judges were deprived of such
potentially useful information as details of the
patients’ injuries, findings of neurological clinical
and laboratory tests, how the recovery periods had
gone, how well the patients had done in life prior to
their injuries, behavioral observations made by the
neuropsychological technicians, and even some test
data (verbatim responses on the WAIS). All of this
information is usually available in the clinical
situation and might contribute to the diagnostic
accuracy of the clinician. However, the primary
concern of the present study was the value of the test
scores in making clinical judgments, and our judges
were provided with the same information used in
previous studies of clinical interpretation with the
Halstead-Reitan battery (Filskov & Goldstein, 1974;
Reitan, 1964).

Our group comparisons on the neuropsychological
test battery reveal that malingerers can show
significant abnormalities on testing, but that the
patterns of their strengths and deficits differ from
those produced by genuine head-injury patients. These
malingerers did especially poorly on motor and sensory
tests, but they did relatively well on several of the
cognitive tests that are most sensitive to brain
damage. The malingerers also displayed a greater range
and degree of apparent personality disturbance on the
MMPI, and they tended to obtain high scores on the
Validity (F) scale. This suggests that the MMPI, in
combination with the neuropsychological test battery,
has some use in identifying patients who have general
tendencies to feign or exaggerate symptoms. The
clinician should be doubly wary when a litigation
patient shows a “suspicious” pattern of
neuropsychological test scores (one that is unusual
from a neurological point of view or that resembles
the pattern shown by known malingerers) and also gives
an MMPI profile of questionable validity.

The results of our discriminant function 
analyses suggest that group differences in 
patterns of neuropsychological and MMPI 
scores are sufficiently reliable to be used in 
predicting group membership. However, in
considering the discriminant functions generated in
this study, it is emphasized that any value they may
have is restricted to the head injury-malingerer
distinction. Two patients who had suffered strokes as
well as
head injuries serve as good examples of this. 
They were both classified as malingerers by 
the neuropsychology function, due to severe 
sensory-motor deficits, which are much more 
typical of cerebrovascular accidents than of 
traumatic head injuries. 

The need to cross-validate or improve our 
discriminant functions is also emphasized, 
particularly in view of the small subject 
groups on which they were developed. In 
such research, the design used in the present 
study could be extended by including coop- 
erative, nonlitigating head-injured patients 
who are asked to make themselves look as 
impaired as possible on the testing. In the 
clinical situation, many malingerers will have 
some real deficits to “build on.” This may 
make them more difficult to identify than 
were our normal malingerers, and the sug- 
gested extension of the present study would 
therefore be of value. Nevertheless, the re- 
sults of our indirect cross-validation of the 
discriminant functions suggest that these 
functions probably do have some reliability
in detecting those real head-injury patients
who exaggerate their pathology. 

On virtually all ability tests, the subject is 
told what is required in order to do well. At 
the same time, it usually becomes obvious 
what a bad performance entails, for example, 
be slow, make errors, fail to solve the prob- 
lem. Therefore, neuropsychological tests 
would seem intrinsically vulnerable to faking. 
Faking certain “objective” aspects of the 
physical neurological exam (muscle wasting, 
asymmetrical or pathological reflexes, nystag- 
mus) would be more difficult, but sensory 
and mental status testing by the neurologist 
should be just as easily faked as are neuro- 
psychological tests. Clear discrepancies be- 
tween objective and subjective parts of the 
neurological exam may give the malingerer 
away, or at least arouse suspicion. However, 
there may be no objective findings, and the 
rather “spotty” (mild and inconsistently lo- 
calized) symptoms presented by many pa- 
tients with histories of head injuries make 
the real versus faking distinction much more 
difficult. In cases in which this judgment has 
to be made primarily on the basis of fakable 
symptoms and complaints of mental distur- 
bance, any patterns of such symptoms that 
are characteristic of malingerers probably 
would be more reliably identified by stan- 
dardized and comprehensive neuropsycho- 
logical testing than by nonstandardized and 
briefer clinical examinations. Nevertheless, 
until more is known about fakers’ patterns 
of performance, caution is warranted in inter- 
preting test results of patients who may have 
reason to exaggerate pathology. It is also pos- 
sible that malingerers who have been expertly 
coached will be able to simulate more successfully
the deficits of genuinely brain-damaged patients.
Under these circumstances discriminant functions of
the type developed here will be less reliable in
assigning patients to the nonmalingering category.
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Validity of Self-Reports in Three 
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This study examined whether population type (voluntary outpatient, voluntary 
inpatient, coerced outpatient) and question type (alcohol, nonalcohol, demo- 
graphic) differentially affected the validity of alcoholics’ self-reports. Three dis- 
tinctly different populations of alcoholics independently completed life history 
questionnaires. The veridicality of subjects’ answers was assessed using official 
records and documents. Generally, alcoholics in this study gave highly valid 
self-reports, a result that parallels the findings of earlier studies. Question type 
differentially affected the validity of subjects’ interview answers, as significantly 
fewer invalid answers were given to demographic questions than to alcohol and 
nonalcohol questions. Population type, however, did not significantly affect the 
validity of self-reported life history information. Invalid interview answers were 
more often overreported than underreported when compared with official records. 


Despite limited empirical evidence, skepticism
about the validity of alcoholics’ self-reports has
abounded (reviewed in Sobell, 1976). Yet, in the
alcoholism field, self-reports are a major source of
data. From a practical standpoint, self-reports are
convenient and economical, they obviate checking
official record sources, and they often provide
information when further verification is impossible.
Given their widespread use, it seems highly unlikely
that self-reports will be abandoned as a primary
source of information.

Although the alcoholism field has long relied on
self-reported data, only recently has the validity of
alcoholics’ self-reports been examined (Armor, Polich,
& Stambul, 1976; Sobell, 1976; Sobell & Sobell, 1975;
Sobell, Sobell, & Samuels, 1974; Sobell, Sobell, &
VanderSpek, Note 1). These investigations have found
that most verifiable self-reported life history data
from alcoholics are quite valid. However, beyond
consideration of the overall validity of alcoholics’
self-reports, little is known about the validity of
subjects’ answers as a function of population type
(i.e., outpatients vs. inpatients, females vs. males,
chronics vs. problem drinkers, voluntary vs. coerced,
etc.). Will all alcoholics be equally self-disclosing,
cooperative, and, moreover, truthful in answering
different types of questions? For example, chronic
alcoholics might have more difficulty in accurately
remembering certain events simply because they may
have more events to recall than other populations of
alcoholics. Similarly, alcoholics coerced into
treatment might be relatively less self-disclosing
about their drinking behavior if they anticipate that
a more severe drinking history is suggestive of a need
for more treatment.

The present study examined whether the validity
of self-reports differs between and within three
different populations of alcoholics. The investigation
was limited to examining self-reports that could be
verified by official records.


Association, Inc. 0022-006X/78/4605-0901$00.75 
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Method 
Subjects 


Three distinctly different groups of male alcoholics 
served as subjects in this study: (a) voluntary out- 
patient alcoholics (V/OP) who were clients at the 
Orange County, California, Department of Mental 
Health Alcoholism Services (OCAS; n = 14); (b)
coerced outpatient alcoholics (C/OP) who were court
referred for treatment at the OCAS (n = 12); and
(c) voluntary inpatient alcoholics (V/IP) who were 
hospitalized at the San Diego County, California, 
Detoxification Center (n = 13). Subjects were further 
selected in accordance with the following criteria: 
(a) no evidence of alcohol withdrawal symptoms or 
alcohol intoxication at the time of the interview; 
(b) no evidence of organic brain syndromes or a 
primary diagnosis other than alcoholism; and (c) 
voluntary participation in the study. No subject 
cleared for the study refused to participate. 

Since the three groups of subjects were specifically 
chosen to differ from one another, statistical analy- 
ses comparing descriptive characteristics between 
groups of subjects were not performed. However, 
demographic variables clearly reflected differences 
between the three groups. Subjects in the V/IP 
group were older than subjects in the other two 
groups (M age: V/IP=44.5 years, V/OP = 38.4 
years, C/OP = 34.2 years), reported longer drink- 
ing problem histories (M years: V/IP = 16.8, n = 
12; V/OP=8.6; C/OP=3.6), and had more al- 
cohol-related arrests (M arrests: V/IP = 19.5, 
V/OP =4.5, C/OP = 4.5). Although few subjects 
reported alcohol-related hospitalizations, subjects in 
the V/OP group reported a mean of .79 such
hospitalizations compared to a mean of .38
hospitalizations for Group V/IP. No subject in Group
C/OP reported any alcohol-related hospitalizations.
Subjects in Group V/IP also had about three times as
many nonalcohol arrests as subjects in the other two 
groups (M nonalcohol arrests: V/IP = 4.0, C/OP = 
1.3, V/OP = 1.2). The largest observed difference 
among groups was in terms of ethnicity. Even 
though two of the groups had similar percentages of 
Caucasian subjects (V/IP = 92.3%, V/OP = 92.9%), 
only 50.0% of C/OP subjects were Caucasian. 
Finally, all groups of subjects had a mean education 
of about 12 years. 


Procedure 


Group interviews were conducted separately with 
subjects in each of the three experimental groups. 
In the group setting, each subject was asked to 
complete a questionnaire and return it to the
investigator (L. Sobell) when finished. The
questionnaire contained 35 verifiable questions about
drinking and life history information. 

At the time of the interview, subjects were not 
aware that their answers would be compared with 
official records. However, after the questionnaire was
completed, all subjects were debriefed as to the
nature of the study. Subjects were also assured that
their interview answers would be confidential, would
not be recorded in their clinic chart, and would not
influence the kind of treatment received. [...]
subjects were further assured that participation in
the study would not affect their legal status, nor
would the court or the treatment agency have access
to their answers.


Official Record Sources


Interview answers were compared with the following
official records: (a) driver records from the
California Department of Motor Vehicles; (b) arrest
records [commonly known as “rap sheets,” these
records were those of the California Bureau of
Criminal Identification and Investigation (CII) and
the Federal Bureau of Investigation (FBI). In
California, all legal agencies must furnish the Bureau
of CII with daily reports of virtually all misdemeanor
and felony arrests within their jurisdiction and the
eventual disposition of such charges. FBI records
contain essentially the same information but on a
national basis. The information provided by these
records constituted a known minimum number of arrests
for any given individual]; (c) inpatient
hospitalizations at Orange County, California, Medical
Center, the sole public hospital serving the county
(for Groups V/OP and C/OP only); (d) inpatient
hospitalizations at San Diego County, California,
Detoxification Center, the [...] public detoxification
facility in San Diego County (for Group V/IP only);
(e) inpatient hospitalizations at Metropolitan State
Hospital (Norwalk, California), the state hospital
serving both Orange and San Diego Counties; and (f)
inpatient hospitalizations at all other California
state hospitals prior to July 1969 (these
hospitalizations were listed on the CII record until
July 1969).


Dependent Measures 


Each of the 35 verifiable interview questions
was grouped into one of three mutually exclusive
categories: (a) alcohol questions (n = 7), questions
about alcohol-related behaviors; (b) nonalcohol
questions (n = 19), questions about behaviors not
directly related to drinking; and (c) demographic
questions (n = 9), questions about personal
identifying information (i.e., name, date of birth,
age, [...] color, social security number, etc.). To be
scored as valid, interview answers had to be identical
to the record data. Invalid answers to all alcohol
questions and 16 of the 19 nonalcohol questions were
further evaluated as either (a) overreports
(information that was reported in the interview but
that [...]

The questionnaire also included two questions to
evaluate whether subjects who responded affirmatively
to fictitious questions would also tend to overreport
other aspects of their behavior. The first question
asked subjects if they had ever used a drug called
“bindro.” This fictitious drug (previously used in a
study by Petzel, Johnson, & McKillip, 1973) was
disguised among a list of 11 other “real” drugs. The
second fictitious question asked subjects how many
times they had been hospitalized for alcohol-related
problems at “Garden View Rehabilitation Hospital.”
This question, too, was disguised among a series of
questions about admissions to actual hospitals for
alcohol-related problems.


Results 
Nonresponse Bias 


Nonresponse bias, the failure of subjects to 
respond to questions, was not an important 
factor in this study. All subjects in Groups 
V/OP and V/IP answered all interview 
questions, whereas subjects in Group C/OP 
answered 98.7% of all interview questions. 
(Two subjects each failed to answer one 
question, and one subject failed to answer 
four questions.) 


Validity of Interview Answers 


The two groups of alcoholics who were in
treatment voluntarily, Groups V/OP and V/IP, validly
answered 87.8% and 80.0%, respectively, of all
verifiable questions. Alcoholics coerced to
participate in treatment (Group C/OP) validly
answered 83.6% of all verifiable questions. The
overall percentages of invalid answers for each type
of question for each of the three groups of subjects
were (a) for Group V/OP, alcohol = 18.4%,
nonalcohol = 11.3%, and demographic = 9.5%; (b) for
Group C/OP, alcohol = 9.5%, nonalcohol = 20.3%, and
demographic = 13.9%; and (c) for Group V/IP,
alcohol = 25.3%, nonalcohol = 22.7%, and
demographic = 10.3%. Only two subjects responded
positively to the fictitious questions. Both subjects
were in Group V/IP, and each reported having been
hospitalized one time at Garden View Rehabilitation
Hospital.

The hypothesis that population type and question
type would differentially affect the validity of
subjects’ interview answers was tested using a 3 × 3
repeated measures analysis of variance. Population
type was treated as an independent factor with three
levels (V/OP, C/OP, V/IP), and question type was
treated as a dependent factor with three levels
(alcohol, nonalcohol, and demographic). Since there
were unequal numbers of the three types of questions,
subjects’ raw scores for number of invalid answers
were converted into proportions of invalid answers for
each of the three question types. Questions that were
not answered (missing data) or reported by subjects as
“unknown” were scored as invalid. A few subjects
answered some questions by giving a range (e.g., 4-7
public drunk arrests). These answers were converted to
a single value by obtaining the arithmetic mean of the
range. Consequently, not all interview answers were
integers. Interview answers that were given as a range
were not scored as invalid if they were within ±.5
of the official record information.
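
The scoring rules just described can be summarized in a brief sketch. It is an illustration under stated assumptions rather than the authors’ actual procedure: the names `is_valid` and `proportion_invalid` are hypothetical, unanswered or “unknown” items are represented as `None`, a range is represented as a (low, high) pair, and the ±.5 tolerance is treated as inclusive.

```python
from typing import Sequence, Tuple, Union

# An answer is an exact value, a (low, high) range, or None for
# missing/"unknown" responses (representation assumed for illustration).
Answer = Union[str, float, Tuple[float, float], None]
Record = Union[str, float]


def is_valid(answer: Answer, record: Record) -> bool:
    """Score one interview answer against the official record."""
    if answer is None:
        return False                      # missing or "unknown": invalid
    if isinstance(answer, tuple):
        low, high = answer
        midpoint = (low + high) / 2       # arithmetic mean of the range
        return abs(midpoint - record) <= 0.5
    return answer == record               # otherwise must match exactly


def proportion_invalid(answers: Sequence[Answer],
                       records: Sequence[Record]) -> float:
    """Proportion of invalid answers within one question type."""
    invalid = sum(not is_valid(a, r) for a, r in zip(answers, records))
    return invalid / len(answers)


# Hypothetical answers from one subject to four numeric questions.
answers = [3, None, (4, 7), 2]
records = [3, 1, 5, 4]
print(proportion_invalid(answers, records))
```

Converting counts to proportions in this way puts the 7 alcohol, 19 nonalcohol, and 9 demographic questions on a common scale before they enter the analysis of variance.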

The analysis of variance indicated a sig- 
nificant main effect for question type, F(2, 
52) = 3.61, p < .05, with no significant main 
effect for population type and no significant 
interaction. A simple effects analysis was 
performed to explore the significant main 
effect for question type. A significant simple 
effect, F(2, 52) = 21.67, p < .01, indicated 
that fewer invalid answers were given to 
demographic questions (M = 4.3%) than to 
alcohol (M = 7.0%) and nonalcohol (M = 
7.0%) questions. This finding was not surprising,
as most demographic questions required
answers that were personally descriptive
and fixed (e.g., name, date of birth,
social security number, eye color, etc.). The
two demographic questions answered least 
validly across all subjects were hair color and 
height. The answers to these questions were
compared with data taken from the subjects’ 
driver records. Since these two variables were 
most susceptible to real change over time,
higher invalidity on these variables could 
reflect actual change. 


Direction of Discrepancy 


Demographic questions were excluded from 
analysis of direction of discrepancy, because 
invalid answers for eight of the nine questions
could not be categorized as either overreports
or underreports (e.g., date of birth,
ethnicity, eye color, etc.). Three nonalcohol
questions were also excluded because they
only required a yes or no answer. Further, 
alcohol and nonalcohol questions that were
not answered (missing data) or that subjects
answered as unknown were not included
in these analyses because direction of discrepancy
could not be determined.

Table 1 presents the percentage of valid, 
overreported, and underreported interview 
answers given by each of the three groups of 
subjects for alcohol questions and nonalcohol 
questions. The percentages of valid answers
for combined alcohol and nonalcohol ques- 
tions (n = 26) were very similar for both 
outpatient groups (C/OP = 84.0%, V/OP = 
86.8%). Of the invalid answers given by sub- 
jects in each of these two groups, approxi- 
mately two thirds were overreported. Al- 
though the data for the inpatient group of 
subjects (V/IP) followed the same propor- 
tional distribution of overreports and under- 
reports as the outpatient groups, the 
absolute percentage of overreports and under- 
reports was about twice that of the out- 
patient subjects. Subjects in the inpatient 
group gave the lowest percentage of valid 
answers (68.0%) to questions for which di- 
rection of discrepancy could be determined. 

One-sample t tests (two-tailed) were performed
for each group of subjects to determine
if the mean proportion of overreported
answers differed significantly from chance
(p = .5) as compared to underreported answers.
The mean proportions of overreports
and underreports for each group were as follows:
C/OP: overreports = .65, underreports
= .35; V/OP: overreports = .74, underreports
= .26; and V/IP: overreports = .56,
underreports = .44. For the two outpatient
groups, C/OP and V/OP, the mean proportion
of overreported answers was significantly
greater than chance, t(11) = 1.93, p < .05,
and t(13) = 4.08, p < .05, respectively. On
the other hand, the mean proportion of overreported
answers for the inpatient group,
V/IP, did not differ significantly from chance,
t(12) = .64, p > .05.
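
A one-sample t test of a mean proportion against the chance value of .5 can be sketched as follows. The function name and the input values are hypothetical and are not data from this study; the sketch only shows how the statistic and its degrees of freedom are obtained.

```python
from math import sqrt
from statistics import mean, stdev

def one_sample_t(values, mu=0.5):
    """Return (t, df) for a one-sample t test of mean(values) against mu."""
    n = len(values)
    se = stdev(values) / sqrt(n)  # standard error of the mean
    return (mean(values) - mu) / se, n - 1

# Hypothetical per-subject proportions of overreported answers.
t, df = one_sample_t([0.6, 0.7, 0.8])
```

The resulting t is then compared with the critical value for the given degrees of freedom, which for the groups reported above are 11, 13, and 12.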


Discussion 


The overall validity rates for combined
alcohol and nonalcohol questions for both
groups of outpatient alcoholics are comparable
to those obtained in other studies (Ball, 1967;
Guze, Tuason, Stewart, & Picken, 1963;
Sobell & Sobell, 1975; Sobell et al., 1974).
In fact, the percentages of valid, overreported, 
and underreported interview answers for both 
groups of outpatient alcoholics in this study 
are nearly identical to results reported by 
Sobell and Sobell (1975) for a similar population.
More importantly, the self-reports of
alcoholics who were coerced (court referred) 
into treatment were found to be as valid as 
those given by alcoholics who voluntarily 
entered treatment. The finding that all three 
groups of subjects gave more overreported 
than underreported interview answers is also 
consistent with earlier research (Ball, 1967; 
Guze et al., 1963; Knupfer, in press; Sobell, 
1976; Sobell & Sobell, 1975; Sobell et al.,
1974). 

In the last few years there has been a 
proliferation of studies using fictitious drug 
questions to assess overreporting of drug use. 
In all of these studies, only negligible per- 
centages of fictitious drug use have been re- 
ported [Haberman, Josephson, Zanes, & Elli- 
son, 1972 (1%); Whitehead & Smart, 1972 
reviewed two studies reporting less than 1% 
and one study reporting 7.5%; Petzel et al.,
1973 (3.8%); Single, Kandel, & Johnson, 
1975 (< 1%)]. Even though no subject in 
the present experiment reported using the 
fictitious drug bindro, two V/IP subjects re- 
ported being hospitalized at the bogus Garden 
View Rehabilitation Hospital. Given this 
paucity of results, the utility of fictitious 
questions to identify subjects who gave in- 
valid self-reports cannot be evaluated. 

Finally, although different populations of
alcoholics in this study generally gave quite
valid self-reports, this finding cannot necessarily
be extrapolated to conclude that alcoholics
are basically honest in their daily interactions.
For instance, alcoholics who are not
interviewed within the context of an alcohol
treatment program and who are not guaranteed
the confidentiality of their answers might
engage in denial, misrepresentation, and/or
lying, especially if they do not view themselves
as having alcohol-related problems. It
is notable, however, that individuals who were
court referred and coerced into an outpatient
alcohol treatment program did not give more
invalid self-reports than voluntary outpatients
or inpatient alcoholics.


Reference Note


The aims of this study were (a) to develop a reliable measure of preferences
among types of controlled drugs and (b) to examine the correspondence be- 
tween the most preferred drug and the drug most frequently used. One hundred 
thirty active multiple drug abusers rated their preferences among 11 combina- 
tions of controlled drugs and common methods of administration using the 
method of paired comparisons. Edwards’ coefficient of consistency indicated 
that preferences were highly consistent (.92) and therefore internally reliable. 
Nearly half of the respondents most preferred drugs other than the type that 
they most frequently used, and their preferences were related to the method of 
administration. The results suggest that preference is one among several
determinants of drug use.


Attempts to identify personality character- 
istics that predispose drug abusers to con- 
tinued use of controlled drugs have produced 
generally inconclusive or conflicting results. 
Drug-related variables such as physiological 
addictiveness, cost, and availability as well as 
social variables such as setting, peer pressure, 
and acceptability seem to have stronger effects 
than personality in determining how fre- 
quently various drugs are used. 

Several recent articles (Crain, Ertel, & Gor- 
man, 1975; Penk & Robinowitz, 1976; Un- 
gerer, Harford, Brown, & Kleber, 1976) have 
suggested that personality characteristics are 
related primarily to drug preference rather 
than actual drug use. Preference among alter- 
native drugs can be conceptualized as an in- 
tervening variable that results from interac- 
tions among the users’ psychological charac- 
teristics (including personality, attitudes, and 
expectations), situational variables, and differences
in the psychopharmacological effects
of specific types of drugs. Preference may be
one of several factors that determine frequency
of use. If preference mediates the relationship
between personality and drug-taking
behaviors, personality constructs are expected
to be more strongly related to drug
preference than to frequency of use. This formulation
could account for previous inconsistent
findings and could contribute to the
development of our understanding of the personality
dynamics underlying compulsive drug
use.

Although several personality correlates of
drug preference have been reported, the specific
relationships between preference and use
have not been examined. Some investigators
(Baer & Corroda, 1973; Henriques, Arsenian,
Cutter, & Samaraweera, 1972) have assumed
that the most frequently used drug is the most
preferred. Since there is no evidence to the
contrary, the possibility that preference and
use are synonymous cannot be discounted,
but the hypothesized differential relationships
between (a) personality and preference and
(b) personality and drug use imply that preference
and use are only moderately related.
The explanatory value of the drug preference
construct is negligible if the measures of preference
and use are highly correlated.

One limitation of drug preference research
is that reliabilities of the preference measures
generally are unknown. If preferences among
drugs are not well measured, results that seem
to confirm the predicted moderate degree of 
correspondence between preference and use 
may be the product of error rather than an 
indication of true differences between con- 
structs. The single-item rating scales typically 
used to measure preference seem to be par- 
ticularly susceptible to unreliable responding 
that could artifactually attenuate the relation- 
ship between preference and use. 

The present research examined the drug 
preferences of a group of active multiple drug 
abusers who were referred for treatment. 
Drug preference was measured by a paired- 
comparisons methodology that included six 
drug types (i.e., amphetamine, barbiturate,
cocaine, hallucinogen, cannabis, opiate) and 
encompassed the three most common methods 
of drug administration (i.e., oral, intranasal, 
intravenous). The paired-comparisons method 
allows a direct test of consistency (Edwards, 
1957) that can be interpreted as an index 
of internal reliability of the preference scores. 
The relationship between the drug preference 
construct and drug use was assessed by com- 
paring the most preferred drug with the drug 
used most frequently during the preceding 
60-day period. 


Method 
Subjects 


A drug preference inventory was administered to 
130 persons seeking treatment for drug dependence 
at the Connecticut Mental Health Center in New 
Haven. Seventy-two percent of the sample were 
males, and 63% were white. Their ages ranged from 
17 to 29, with a mean of 22.5 years. 


Current Drug Use 


The type of drug used most frequently during the
preceding 60 days was determined by self-report and,
in most cases, was corroborated by medical examination
and/or thin-layer chromatography urinalysis results.
For the majority of respondents, the most frequently
used drug was heroin (60.0%), followed by
marijuana (17.6%), barbiturates (8.4%), amphetamines
(7.6%), hallucinogens and psychedelics (5.0%),
and other opiates (1.4%).


Drug Preference


The drug preference inventory contains all possible
[n(n − 1)/2] pairwise comparisons of 11 combinations
of types of controlled drugs and commonly
used methods of administration: smoking cannabis,
snorting opiates, ingesting opiates, shooting opiates,
snorting cocaine, shooting cocaine, ingesting amphetamines,
shooting amphetamines, ingesting barbiturates,
shooting barbiturates, and ingesting hallucinogens.
Each of the 55 preference pairs is rated on a
6-point scale, which ranges from −3 to +3. There
is no indifference point. The inventory also includes
a self-report measure of whether the respondent has
ever used each of the 11 drug categories.
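
As a brief check on the item count, the 11 categories listed above can be enumerated pairwise. The variable names are introduced here for illustration, and the set of allowed ratings assumes that the 6-point scale from −3 to +3 simply omits zero, consistent with the absence of an indifference point.

```python
from itertools import combinations

CATEGORIES = [
    "smoking cannabis", "snorting opiates", "ingesting opiates",
    "shooting opiates", "snorting cocaine", "shooting cocaine",
    "ingesting amphetamines", "shooting amphetamines",
    "ingesting barbiturates", "shooting barbiturates",
    "ingesting hallucinogens",
]

# Every unordered pair of categories is one inventory item.
pairs = list(combinations(CATEGORIES, 2))
n = len(CATEGORIES)
n_items = n * (n - 1) // 2  # n(n - 1)/2 pairwise comparisons

# Assumed response options: six points, no zero (no indifference point).
VALID_RATINGS = [-3, -2, -1, 1, 2, 3]
```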


Procedure 


The testing procedure was presented as voluntary 
research directed toward helping the clinics improve 
the quality of treatment. Applicants were guaranteed 
confidentiality and were assured that their performance
would not influence their treatment status. Three
applicants declined to participate. 


Results 
Previous Drug Use 


Responses to the ever used items indicated 
that at some time during their lives, 94% of 
the respondents had smoked cannabis, 61% 
had snorted cocaine, 61% had shot cocaine, 
70% had ingested opiates, 56% had snorted 
opiates, 63% had shot opiates, 48% had in- 
gested amphetamines, 77% had shot ampheta- 
mines, 66% had ingested hallucinogens, 43% 
had ingested barbiturates, and 72% had shot 
barbiturates. Seventy-six percent had used 
more than 4 of the 11 drug categories, and 
44% had used at least 9. The self-reports in- 
dicate that the majority of respondents ac- 
tually had used a variety of controlled drugs, 
and their preferences were based at least in 
part on personal experience with their effects. 


Consistency of Preferences 


The individual preference matrices were
tested for transitivity using the coefficient of
consistency (zeta). For the 55-item matrix, a
ζ of .42 indicates greater than chance consistency
at the .05 level of confidence, and a
ζ of .64 is significant at the .001 level. The
mean ζ (.92) indicated that preferences were
highly consistent and transitive. Seven respondents
were excluded from subsequent
analyses because their preference data yielded
consistency coefficients that were less than .53
(p > .01). Two of the excluded respondents
had consistency coefficients greater than .42. 
The higher cutoff score was selected to mini- 
mize the possibility that unreliability of pref- 
erence contributed to any lack of correspon- 
dence between preference and use. 
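
The text cites Edwards (1957) for the coefficient of consistency but does not give the formula. A minimal sketch, assuming the coefficient is Kendall's coefficient of consistence computed from the number of circular triads, is shown below; the function names are hypothetical, and `wins` is assumed to hold, for each category, the number of other categories it was preferred to.

```python
def circular_triads(wins):
    """Number of circular (intransitive) triads in a complete paired comparison."""
    n = len(wins)
    return n * (n - 1) * (2 * n - 1) / 12 - sum(a * a for a in wins) / 2

def zeta(wins):
    """Coefficient of consistency: 1 = fully transitive, 0 = maximally circular."""
    n = len(wins)
    d = circular_triads(wins)
    # The maximum possible number of circular triads depends on whether n is odd or even.
    denom = n ** 3 - n if n % 2 else n ** 3 - 4 * n
    return 1 - 24 * d / denom

# A perfectly transitive respondent over 11 categories has win counts 0..10.
perfect = zeta(list(range(11)))
# A maximally inconsistent respondent has every category preferred 5 times.
worst = zeta([5] * 11)
```

Under this assumption, a respondent's ζ falls as the number of circular triads among the 55 judgments rises, which is why a low ζ was treated as evidence of unreliable preference data.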


Preferred Drugs and Methods 
of Administration 


Preference scores ranging from 0 to 10 were
calculated for each of the 11 drug categories 
by counting the number of times the cate- 
gory was preferred to all others. When the 
data were grouped within drugs, the most 
preferred drug was opiates (43%) followed 
by cocaine (19%), cannabis (19%), amphet- 
amines (8%), hallucinogens (6%), and bar- 
biturates (4%). When the methods of ad- 
ministering the drugs were compared, 56% of 
the respondents preferred intravenous cocaine 
to intranasal cocaine, 61% preferred intra- 
venous opiates to intranasal opiates, 59% pre- 
ferred intravenous opiates to oral opiates, 
78% preferred intravenous barbiturates to 
oral barbiturates, and 53% preferred intra- 
venous amphetamines to oral administration 
of amphetamines. These results show that for 
the four injectable drugs, the majority of re- 
spondents preferred intravenous to other 
methods of administration. 
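
The counting rule for preference scores can be sketched as follows. The sign convention (a positive rating means the first member of the pair was preferred) and all names are assumptions made for illustration; the three-category example is hypothetical.

```python
def preference_scores(categories, ratings):
    """Count how many times each category was preferred.

    ratings maps a pair (a, b) to a nonzero rating; by assumption a positive
    rating means a was preferred and a negative rating means b was preferred.
    """
    scores = {c: 0 for c in categories}
    for (a, b), rating in ratings.items():
        scores[a if rating > 0 else b] += 1
    return scores

# Hypothetical three-category example; with 11 categories scores range from 0 to 10.
cats = ["A", "B", "C"]
example = preference_scores(cats, {("A", "B"): 2, ("A", "C"): 3, ("B", "C"): -1})
```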


Preferences and Use 


When the most preferred drug was com- 
pared with the currently used drug, 46% of 
this sample most frequently used a drug other 
than the one that they most preferred. Forty- 
five of the 75 opiate users preferred opiates 
to all other drugs. One of the 11 barbiturate 
users preferred barbiturates. (Three preferred
cocaine, and 4 preferred opiates.) Six of the 
10 amphetamine users preferred ampheta- 
mines. Twelve of the 21 cannabis users pre- 
ferred cannabis, and 3 of the 6 hallucinogen 
users preferred hallucinogens. Thus, 60% of
the opiate users preferred opiates. (Thirteen
percent preferred cannabis, and 25% preferred
cocaine to opiates.) Forty-six percent
of those who most frequently used drugs other
than opiates preferred their current drug.
(Seventeen percent preferred opiates, 9% preferred
cocaine, and 9% preferred hallucinogens.)
Cocaine was the drug least often identified
as the most frequently used drug, but
it was the most frequently preferred alternative
to the currently used drug (17.5%), followed
by cannabis (8%) and opiates (7%).
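
The percentages in this section can be checked against the counts reported in the text. The sketch below uses only those counts; the variable names are introduced here. The total of 123 equals the 130 respondents minus the 7 excluded for low consistency.

```python
# (users of the drug, users who also most preferred it), as reported in the text
reported = {
    "opiates": (75, 45),
    "barbiturates": (11, 1),
    "amphetamines": (10, 6),
    "cannabis": (21, 12),
    "hallucinogens": (6, 3),
}

total = sum(users for users, _ in reported.values())
matched = sum(match for _, match in reported.values())

# Share of the sample whose most used drug was not the most preferred drug.
pct_mismatch = 100 * (total - matched) / total

# Match rates for opiate users and for users of all other drugs.
opiate_users, opiate_matched = reported["opiates"]
pct_opiate_match = 100 * opiate_matched / opiate_users
pct_nonopiate_match = 100 * (matched - opiate_matched) / (total - opiate_users)
```

Rounded to whole percentages, these values agree with the 46%, 60%, and 46% figures given above.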


Discussion 


The method of paired comparisons yielded
internally consistent ratings of drug preference.
Since zeta measures transitivity, the results
demonstrate that drug preferences are
transitive. Transitivity implies that, for example,
if intravenous cocaine is preferred to
intravenous opiates, and intravenous opiates
are preferred to smoking cannabis, then intravenous
cocaine is preferred to smoking cannabis.
The high coefficient of consistency indicates
that the method of paired comparisons
is a reliable procedure for measuring drug
preference. Since reliability of self-reports is
particularly problematic among drug abusers
in treatment, this finding suggests that the
drug preference inventory would produce reliable
preference scores in other populations
as well.

A scant majority of the drug abusers in
this study most frequently used the type of
drug that they most preferred. The discrepancy
between preference and use was greater
among the nonopiate users than among the
opiate users. However, nearly 40% of the
opiate users preferred either cocaine or marijuana
to opiates, whereas only 17% of the
nonopiate users preferred opiates to all other
drugs.

The results indicate that preference and
use are more independent than has been believed
previously. The assumption that the
most frequently used drug is the de facto
drug of choice (Henriques et al., 1972) was
not confirmed. Many drug abusers compulsively
use a drug other than the one that they
most prefer. The reasons for the less than
perfect correspondence between preference
and use are not fully evident, but the high
coefficients of consistency obtained for the 
drug preference inventory demonstrate that 
error in measuring preference was not respon- 
sible for attenuating the relationship. 

Some of the reasons for using drugs other 
than the most preferred type may be physio- 
logical or economic. For example, when the 
supply of a nonaddictive preferred drug is 
limited temporarily, less preferred but physio- 
logically more addictive drugs may be sub- 
stituted. As a combined result of tolerance 
and withdrawal effects that induce increased 
use of addictive drugs, use of the less pre- 
ferred drug may remain higher than that of 
the preferred drug after the supply of the 
preferred drug expands. This process might 
account for the current addictions of some of 
the opiate users who continued to prefer non- 
addictive cocaine and marijuana. 

In the dosages used by many street addicts,
heroin functions primarily to avoid
withdrawal rather than to achieve euphoria.
Cocaine is rarely available in quantities sufficient
to produce tolerance effects, and tolerance
to cannabis does not develop. Consequently,
preferences for cocaine and marijuana
by opiate users might derive in part
from their continued ability to produce euphoria
that can no longer be attained with
opiates.

The results are consistent with the hypothesis
that preference is one among several determinants
of drug use, but they do not exclude
the possibility of a reciprocal causal
relationship between preference and use. Prolonged
frequent use of a most preferred drug
may lead to its devaluation in the drug preference
structure. For individuals whose use
of drugs is motivated by sensation seeking
(Zuckerman, Bone, Neary, Mangelsdorff, &
Brustman, 1972), for example, increased familiarity
with the effects of the most frequently
used drug may enhance preference
for more exotic, less frequently used drugs.
Experience with the long-term negative effects
of heroin addiction also could be expected to
produce increased preference for nonaddictive
drugs. Preference may affect use primarily
during the acquisition phase of compulsive
drug use, whereas greater effects of drug use
on preference could be expected during the
maintenance phase. Longitudinal investigations
of preference and use in nonaddict populations
would be helpful in clarifying the complex
interrelationships among these variables.

The results show that within each of the 
four types of injectable drugs the majority of 
respondents preferred intravenous to other 
methods of administration. This finding indi- 
cates that preference depends in part on the 
method of administration, but the reasons 
for the popularity of the intravenous method 
remain a matter for speculation. Gossop and
Connell’s (1975) explanation for the higher 
evaluation of drugs by oral abusers than by 
intravenous abusers may apply here. Although 
intravenous administration affords more im- 
mediate, potent, and economic effects as com- 
pared with oral and intranasal administration, 
it carries a higher probability of long-term 
adverse consequences such as abscesses, 
thrombosed veins, hepatitis, and overdose. 
Since most of these relatively youthful mul- 
tiple drug abusers had not yet experienced 
the physical liabilities that accompany exten- 
sive intravenous drug use, their method pref- 
erences may have been influenced primarily 
by the short-term advantages of injecting 
drugs. 

Internal evidence from the individual pref- 
erence matrices suggested that method of 
administration may interact with the type of 
drug to determine preference. One common 
response pattern showed, for example, that 
intravenous opiates were preferred to intra- 
venous cocaine, but intranasal cocaine was 
preferred to intranasal opiates. Since method 
of administration seems to have interactive 
effects as well as main effects on preference, 
research concerning drug use and preference 
should include distinctions among the differ- 
ent methods of administration using more 
comprehensive arrays of drug types and meth- 
ods than have been investigated heretofore. 

The independence of drug preference and 
frequency of use suggests several implications 
for the treatment of drug abusers. In cases in 
which the most preferred and most used drugs 
differ, knowledge of the patient’s drug preferences
could be useful in formulating an optimal
course of rehabilitation. Although most
heroin users prefer heroin to all other drugs,
for example, the addiction to heroin by a 
minority may obscure equally severe problems 
involving more preferred nonopiate drugs. Re- 
habilitative therapies that concentrate on the 
most frequently used drug may be less than 
successful if the existence of the preferred 
drug is not discovered and given a commen- 
surate degree of clinical attention. Accurate 
measures of the applicant’s drug preferences, 
administered on a routine basis, would be use- 
ful in diagnosing these otherwise latent drugs 
of abuse. 

The special case of opiate users who prefer 
nonopiate drugs such as cocaine and ampheta- 
mine is particularly problematic. These in- 
dividuals may have become addicted to heroin 
in conjunction with the use of their preferred 
drug. Some of them are misidentified as con- 
firmed heroin addicts and become enrolled in 
programs in which they are maintained for 
indefinite periods on methadone, a highly ad- 
dictive synthetic opiate, when detoxification 
from heroin and treatment primarily for de- 
pendence on the preferred drug might be a 
more effective therapy. The drug preference 
inventory could be used to identify prospec- 
tive methadone patients whose preferred drugs 
are nonopiates. 

Many rehabilitative programs rely on tech- 
niques intended to modify personality and 
other intrapsychic characteristics that are be- 
lieved to underlie compulsive drug use. The 
rationale for these practices may be invalid 
to the extent that it depends on the assump- 
tion of a direct relationship between personal- 
ity characteristics and drug use. These psy- 
chotherapeutic techniques also risk the danger 
of emphasizing the drug most frequently used
to the point of missing the significance of secondary
drugs of abuse that may be more
highly preferred. If a causal chain exists from 
personality characteristics through drug pref- 
erence to drug use, then therapy might more 
effectively curtail compulsive drug use by at- 
tempting to change drug preferences rather 
than characteristics that are more remotely 
implicated in drug abuse.


Neuropsychological Stability in Multiple Sclerosis


Robert J. Ivnik 
Mayo Clinic and Mayo Foundation 
Rochester, Minnesota 


The neuropsychological performances of 14 patients who had multiple sclerosis 
(MS) and who received repeated testings spaced over time by at least 1 year 
were compared with identical evaluations of 14 patients who had neurological 
involvement but not MS. Subjects in each group were individually matched on 
chronological age at first testing, length of test-retest interval, sex, and years 
of formal education. Performance decrements attributable to the demyelination 
process of MS were primarily manifested on tasks requiring motor proficiency 
or complex sensory discriminations. Tests of higher order cognitive functions 
(e.g., abstractions, speech perception) were less adversely affected, except for 
measures having significant motor components. Preliminary Minnesota Multiphasic
Personality Inventory data are also presented. The results indicate relative
preservation or only mild deterioration for most intellectual abilities despite
worsened motor-sensory functioning.


In recent years, the psychological test per- 
formances of patients with multiple sclerosis 
(MS) have been experimentally differentiated 
from the performances of either patients with 
neurological involvement but not MS or nor- 
mal control patients. Matthews, Cleeland, and 
Hopper (1970) compared the test scores of 
30 patients who had MS with those of a 
neurological group that excluded MS subjects 
on an extensive battery of neuropsychological 
tests and found significantly poorer function- 
ing for the patients with MS on tests that 
demanded motor skill, speed, and coordination.
Reitan, Reed, and Dyken (1971) reported
similar findings when patients who had
MS were compared with a neurologically
normal control group. Reitan et al. also noted
less striking but statistically significant deficits
on tests of verbal and auditory perception,
measures of incidental recall for geometric
figures and spatial relations, and tests
of sensory-perceptual alertness. General information,
verbal communication, and comprehension
measures were least affected by
MS. In each study, relatively good preserva- 
tion of reasoning and logical-analytic skills 
was apparent. The validity of the studies by 
Matthews et al. and Reitan et al. is supported 
by Goldstein and Shelly’s (1974) successful 
partial replication of each. Finally, Beatty
and Gange (1977) provided correlative evi- 
dence suggesting that memory functions also 
may suffer the effects of demyelination, but 
this hypothesis at present is tentative and 
requires additional experimental verification. 
MS is a disease manifested by exacerba- 
tions and remissions of clinical symptoms as- 
sociated with a slowly progressive deteriora- 
tion of general functioning. Even though 
neuropsychological studies have defined dis- 
tinguishing behavioral-performance features 
of the disease, the relationship between the 
extent of neuropsychological impairment and 
the clinical progression of the disease has not 
been described. To approach this question, 
Ivnik (1978) compared the neuropsychological
performances of three groups of patients:
those whose duration of MS was 1-5 years,
6-10 years, or greater than 10 years. Neuropsychological
tests generated surprisingly few
significant findings. Ivnik observed that the
individual rate at which MS progresses is so
varied that large-group comparative statistics
may be an inappropriate experimental design
for examining the disease process. To more
rigorously examine the stability of neuropsychological
performances over time for patients
with MS, the research reported herein contained
repeated examinations of the subject
population using each patient as his or her
own control. Patients with MS who were
seen for extensive neuropsychological examinations
on two occasions were compared with
a population of patients with non-MS neurological
involvement who also received repeated
evaluations.


Inc. 0022-006X/78/4605-0913$00.75


Table 1
Neurological Diagnoses of 14 Non-MS
Neurological Control Patients

Patient   Diagnosis
1         Left cerebral atrophy; history of alcohol abuse
2         Major motor seizures secondary to subarachnoid hemorrhage
3         Posttraumatic syndrome
4         S/P encephalitis
5         Somatosensory-evoked seizures of unknown etiology
6         S/P neurosurgical repair of depressed skull fracture
7         S/P neurosurgical repair of epidural hematoma
8         S/P skull fracture with cerebral contusion
9         Major motor seizures of unknown etiology
10        Major motor seizures of unknown etiology
11        Parkinson’s disease
12        Mixed seizure disorder of unknown etiology
13        Partial-complex seizure of unknown etiology
14        Mild mental retardation; radiculopathy; S/P myasthenic syndrome

Note. MS = multiple sclerosis; S/P = status/post.


Method 
Subjects 


The testing protocol of every patient with MS
who had received repeated examinations in the
Neuropsychology Laboratory, University of Wisconsin,
was reviewed. Only patients with definite diagnoses
of MS were considered; patients described as
“possible MS,” “probable MS,” “MS suspect,” or in
any other equivocal manner were excluded. The
referring neurologist’s judgment that the diagnosis
was certain determined group inclusion. If a patient’s
diagnosis was equivocal at the time of initial testing
but definite when reevaluated, he or she was included
for further consideration. A minimal test-retest
interval of 1 year was required. Persons not
receiving complete neuropsychological evaluations
were excluded unless the missing data reflected a
patient’s physical disability due to the neurological
illness. For example, if a patient’s manipulative
skills were so deteriorated that he or she was unable
to perform the test of fine motor coordination, the
patient was included in the study and the worst
possible score was assigned for the missing test. If,
however, data were missing for reasons unrelated to
the ability being assessed (e.g., a patient did not
receive the Category Test of nonverbal abstraction
abilities because of poor visual acuity), no theoretic
data were supplied. There were few instances in
which data were supplied, because one criterion for
selection in the study was having completed most of
the neuropsychological measures.

Determination of the non-MS neurological control
group required review of all neuropsychological
patient protocols on file. Persons with incomplete
test protocols, only one neuropsychological examination,
or primary diagnoses of nonneurological conditions
were excluded from further consideration.
Each patient with MS was directly matched to a
non-MS neurological control patient with regard to
sex, chronological age when first tested, education,
and number of months between testings.

The decision to use only non-MS neurological
patients as controls was made to distinguish the
stability of MS from that of other neurological disorders.
From a clinical viewpoint, a neurological
control group composed of diseases that are most
frequently included in the differential diagnoses with
MS (e.g., spinal cord lesions) would have been
optimal, but practical concerns made this impossible.
The final MS and neurological control groups included
14 patients each. Neurological diagnoses of
control patients are shown in Table 1.


Neuropsychological Tests 


The following measures served as dependent variables in this study.

Wechsler Adult Intelligence Scale (WAIS). Verbal, Performance, and Full
Scale IQs (Wechsler, 1955) were compared for each experimental group.
Individual WAIS subtest performances also were analyzed.

Wide Range Achievement Test (WRAT). Word recognition, spelling, and
arithmetic abilities were tested and scored for grade level equivalents
(Jastak & Jastak, 1965).

Category Test. A test of abstraction and concept formation ability
using 208 visual stimuli was presented on a screen (Halstead, 1947).
The number of errors was recorded.

Speech Perception Test. This is an auditory dis- 
crimination task in which the subject underlines on 
an answer sheet one of four nonsense syllables that 
most closely corresponds to the stimulus presented 
via a tape recorder (Halstead, 1947). The test was 
scored for number of errors. 

Tactual Performance Test (TPT). A 10-block 
form board was given under a blindfolded condition. 
Average time (in minutes per block) for three con- 
secutive trials (preferred hand, nonpreferred hand, 
both hands) was recorded, as was the total time for 
the three trials. The number of blocks that the sub- 
ject could recall for shape (memory component) and 
correct location (location component) on a posttest 
drawing of the board and blocks was also noted 
(Halstead, 1947). 

Trail Making Test. A paper-and-pencil test was 
given in which on Part A a subject connects as 
rapidly as possible 25 numbered circles distributed 
on a sheet of paper (Armitage, 1946). On Part B, 
half of the circles are numbered and half are let- 
tered, and the subject connects the circles by alter- 
nating between these two sequences. The number of 
seconds to complete each part was the dependent 
measure. 

Seashore Rhythm Test. A test of the subject’s 
ability to make same-different discriminations be- 
tween 30 pairs of rhythmic patterns was presented 
with a tape recorder (Seashore, Lewis, & Saetveit, 
1960). The number of correct responses was recorded.

Imperception Test. Standardized examination was given for eliciting
tactile, auditory, and visual imperception or suppression (or both)
errors (Reitan, [year illegible]).

Finger Agnosia Test. The patient was required to identify by touch
alone fingers of his or her hand after they were lightly touched with a
pencil point; 20 trials were given on each hand, and the number of
errors was recorded (Reitan, 1959b).

Fingertip Number Writing Test. The patient was blindfolded and was
required to identify numbers written on his or her fingertips; 20
trials were given on each hand, with the number of errors being noted
(Reitan, 1959b).

Tactile Form Discrimination Test. The subject identified by touch alone
one of four plastic shapes placed in his or her hand, indicating his or
her choice by pointing to a display panel. Time and error scores were
recorded for each hand (Reitan, 1959b).

Sandpaper Roughness Discrimination Test. The subject was blindfolded
and was presented with four blocks, each of which was covered with a
different roughness of sandpaper. The subject was required to arrange
(as quickly as possible) the blocks in order of increasing roughness.
Time and error scores were computed for each hand.
Grooved Pegboard Test. A manipulative dexterity task using a pegboard
containing 25 holes with randomly positioned slots (Lafayette
Instrument Co., Model 4202) was given. Before they could be inserted,
the pegs, with an edge along one side, needed to be rotated to match
the hole. Time scores were recorded for each hand.


Table 2

Means and Standard Deviations of Matching Variables for MS Group and
Non-MS Neurological Control Group

Variable                               MS             Control
Chronological age at first testing     38.0 ± 9.28    37.0 ± 9.32
Education (years)                      12.5 ± 2.65    12.1 ± 2.80
No. months between testings            37.0 ± 28.6    34.4 ± 29.7

Note. MS = multiple sclerosis. Sex was directly matched. There were no
significant differences.


Maze Coordination Test. A modified 2706 A 
Maze (Lafayette Instrument Co.) was placed on a 
stand in a vertical position at the subject's midline. 
The subject was required to go through the maze 
with an electric stylus, trying not to touch the 
sides. The stylus was attached to a time clock and 
counter. Cumulative time of contact with the maze 
and cumulative error scores were recorded for each 
hand. 

Static Steadiness Test. The subject inserted an 
electric stylus into the holes of a conventional hole- 
type steadiness test (Lafayette Instrument Co., 
Model 4605 C). The subject was asked to keep the 
stylus in each hole for 15 sec. Cumulative time and 
counter scores were recorded for each hand. 

Finger Tapping Speed Test. The subject was re- 
quired to tap (as fast as possible) his or her index 
finger on a counter apparatus. The mean of five 10- 
sec trials was recorded for each hand (Halstead, 
1947). 

Wisconsin Impairment Index. A composite score was computed for each
subject, ranging from 0 to 1.0, based on the frequency with which the
subject exceeded specified cutoff points on 10 tests routinely used in
the neuropsychological examination. Six of 10 Impairment Index measures
were represented by tests from Halstead’s battery. The other four
measures were (a) time score on Part A plus Part B of the Trail Making
Test exceeding the cutoff point suggested by Reitan (1958); (b) two or
more definite dysphasic symptoms on a modified and extended version of
the Halstead-Wepman Aphasia Screening Test (Halstead & Wepman, 1949);
(c) distorted reproductions of square, triangle, or Greek cross figures
on the Aphasia Screening Test; and (d) occurrence of two or more errors
on one body side in tests of tactile finger identification and
fingertip number writing perception, or the presence of two or more
lateralized errors on sensory imperception testing in tactile,
auditory, or visual modalities.
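
As a rough illustration of how a composite of this kind can be computed, the sketch below treats the index as the proportion of the 10 component measures on which a subject exceeded the cutoff. The function name and the example pattern of results are hypothetical; the actual cutoff points are not given in the text and are not reproduced here.

```python
from typing import List


def impairment_index(exceeded_cutoff: List[bool]) -> float:
    """Return the proportion of component measures scored as impaired.

    exceeded_cutoff holds one True/False flag per component measure,
    True meaning the subject exceeded the cutoff on that measure.
    The index described in the text uses 10 measures and ranges
    from 0 to 1.0.
    """
    if len(exceeded_cutoff) != 10:
        raise ValueError("expected flags for exactly 10 measures")
    return sum(exceeded_cutoff) / len(exceeded_cutoff)


# Hypothetical subject who exceeds the cutoff on 4 of the 10 measures.
flags = [True, True, False, False, True, False, False, True, False, False]
print(impairment_index(flags))
```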
Table 3
Raw Score Descriptive Statistics on All Dependent Measures

                                        MS testing                      Control
                                  First           Second          First           Second
Measure                           M       SD      M       SD      M       SD      M       SD

WAIS
  Full                            103.4   10.8    99.7    13.3    98.4    15.7    100.4   15.0
  Verbal                          107.7   10.8    104.1   13.4    99.4    18.6    100.9   17.4
  Performance                     97.7    12.8    94.0    14.6    96.6    15.3    9?.7    13.4
WAIS subtests
  Information                     11.5    2.5     11.0    2.4     9.9     2.6     10.5    2.6
  Comprehension                   12.0    2.8     11.5    3.1     10.5    3.4     10.1    2.8
  Arithmetic                      11.4    ?       10.8    3.7     9.7     4.0     10.2    4.3
  Similarities                    11.1    2.1     10.6    2.7     10.5    2.8     10.6    2.9
  Digit Span                      10.5    1.5     9.2     2.5     9.4     3.2     8.6     3.6
  Vocabulary                      11.0    2.6     10.3    2.6     9.7     3.1     10.1    3.2
  Digit Symbol                    7.5     2.4     5.9     2.6     6.8     2.2     7.3     ?
  Picture Completion              10.0    1.8     9.9     2.5     9.4     2.5     10.0    2.3
  Block Design                    9.7     1.6     8.3     3.4     9.8     3.1     9.6     3.0
  Picture Arrangement             8.8     2.7     8.1     2.3     9.2     2.4     8.0     2.4
  Object Assembly                 7.6     3.1     7.7     3.3     8.7     2.9     10.0    2.0
WRAT (grade level)
  Reading                         ?       ?       11.3    3.7     9.5     4.5     9.6     4.3
  Spelling                        8.7     3.0     8.6     2.8     ?       2.9     7.2     ?
  Arithmetic                      7.8     2.9     7.6     3.4     6.0     2.9     5.3     1.8
Impairment Index                  .55     .27     .68     .23     .64     .23     .53     ?
Category Test (errors)            52.3    24.1    51.1    21.8    58.4    20.9    49.4    21.8
Seashore Rhythm Test
  (number correct)                24.1    3.1     23.4    4.2     22.2    4.2     22.5    ?
Speech Perception (errors)        5.8     3.4     6.1     2.3     10.1    7.0     7.9     ?
Trails A (seconds)                40.1    11.8    55.6    29.5    56.0    44.7    51.9    22.1
Trails B (seconds)                94.9    32.0    178.2   126.7   135.4   96.9    137.1   ?
TPT
  Dominant (min/block)            ?       ?       ?       ?       ?       ?       ?       ?
  Nondominant (min/block)         1.9     2.4     ?       ?       ?       ?       ?       ?
  Both (min/block)                2.1     3.0     1.6     2.5     1.8     2.6     ?       ?
  Total Time (minutes)            39.1    24.0    47.6    28.6    45.9    35.5    31.4    29.0
  Memory                          5.6     2.3     ?       ?       5.7     2.9     6.3     1.4
  Location                        ?       ?       ?       ?       ?       ?       ?       ?
Imperception Testing (errors)
  Tactile                         ?       ?       ?       ?       ?       ?       ?       ?
  Auditory                        ?       ?       ?       ?       ?       ?       ?       ?
  Visual                          ?       .53     .86     1.61    .07     .27     .29     ?
  Total                           .64     1.15    2.36    3.32    1.00    1.30    1.93    ?
Finger Agnosia (errors)
  Right                           ?       ?       ?       ?       ?       ?       ?       ?
  Left                            ?       ?       ?       ?       ?       ?       ?       ?
  Total                           2.79    2.89    2.42    3.34    2.79    3.89    2.29    ?
Fingertip Number Writing (errors)
  Right                           2.9     4.1     5.6     6.2     2.9     2.8     2.2     2.8
  Left                            2.6     4.0     4.1     5.6     2.0     3.9     ?       ?
  Total                           ?       ?       9.7     10.5    4.9     6.5     3.4     4.6
Sandpaper (sec)
  Right                           33.7    8.2     52.7    28.0    47.9    25.0    38.6    17.0
  Left                            43.3    20.1    55.4    31.8    39.4    20.0    35.5    13.7
  Total                           78.6    30.1    109.1   70.0    90.6    51.1    ?       30.7
Sandpaper (errors)
  Right                           .57     0.94    1.00    1.30    1.14    1.87    ?       ?
  Left                            ?       1.68    1.14    1.92    1.00    1.88    ?       ?
  Total                           1.28    2.30    2.14    3.58    2.14    3.63    ?       ?
Tactile Forms (sec)
  Right                           28.8    23.6    38.1    25.7    23.9    ?       20.2    6.1
  Left                            31.1    23.1    40.6    28.8    21.6    6.4     17.4    5.2
  Total                           60.1    44.6    78.7    53.2    47.5    13.0    40.6    13.8
Tactile Forms (errors)
  Right                           ?       ?       ?       ?       ?       ?       ?       ?
  Left                            .86     2.07    1.86    2.63    .07     ?       .00     .00
  Total                           1.43    2.47    2.64    3.30    .36     1.08    .64     2.13
Finger Tapping (count)
  Dominant                        43.4    9.0     37.2    7.6     39.7    ?       ?       ?
  Nondominant                     36.2    ?       29.8    10.9    35.0    ?       ?       ?
Dynamometer (kg)
  Dominant                        44.7    9.0     39.4    9.8     ?       ?       ?       ?
  Nondominant                     38.1    14.1    33.8    15.1    35.9    ?       ?       ?
Pegboard (sec/peg)
  Dominant                        4.3     1.8     5.6     2.5     ?       ?       ?       ?
  Nondominant                     5.9     3.9     25.6    40.3    ?       ?       ?       ?
Mazes (sec)
  Dominant                        4.31    4.24    6.21    6.52    4.39    6.57    2.39    3.41
  Nondominant                     14.61   25.65   22.06   34.01   6.75    5.70    4.65    6.80
Mazes (count)
  Dominant                        27.1    27.0    37.6    36.8    23.3    24.0    15.9    18.0
  Nondominant                     106.1   253.9   182.6   347.7   42.6    38.4    25.4    30.2
Static Steadiness (sec)
  Dominant                        22.62   18.00   34.93   25.33   22.82   9.36    23.62   14.79
  Nondominant                     34.62   27.35   42.39   30.65   32.82   20.10   31.91   19.57
Static Steadiness (count)
  Dominant                        109.1   75.9    139.3   67.7    155.9   90.5    126.3   68.7
  Nondominant                     112.1   53.3    259.7   320.8   155.9   73.4    143.7   64.7

Note. WAIS = Wechsler Adult Intelligence Scale; WRAT = Wide Range
Achievement Test; TPT = Tactual Performance Test. A question mark
indicates a value that is illegible in the source.

Minnesota Multiphasic Personality Inventory 
(MMPI). Ten patients in each group also com- 
pleted the MMPI at each testing. Their profiles 
were compared for each scale as a related brief study. 


Data Analysis 


Normative data for many of the above-described tests have been provided
by Kiernan and Matthews (1976). The availability of these norms permits
the transformation of raw scores on individual tests into T scores
(M = 50, SD = 10), thereby providing a mechanism for comparing the
experimental subjects’ performances against “normal expectation.”
T-score conversions also allow for direct comparison of performance
levels in various ability domains because raw score measurement units
(e.g., kilograms for grip strength, IQ for intelligence) are
transformed to a common scale. Further, these conversions provide the
opportunity for a graphic display of the test results. A disadvantage
of Kiernan and Matthews’ results is that normative data on sensory
examinations were not included. The dependent variables to which the
T-score conversions apply include Verbal, Performance, and Full Scale
IQs from the WAIS, Wisconsin Impairment Index, Category Test, Seashore
Rhythm Test, Trail Making Test, Speech Perception Test, Tactual
Performance Test, and all tests of motor-steadiness proficiency.
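
A T-score conversion of the kind described above can be sketched as follows. The formula T = 50 + 10z is the standard definition of a T score; the sign flip for measures on which lower raw scores are better (errors, times) is an assumption made here so that higher T always means better performance, since the text does not state how direction was handled. The normative mean and standard deviation in the example are hypothetical and are not taken from Kiernan and Matthews (1976).

```python
def t_score(raw: float, norm_mean: float, norm_sd: float,
            higher_is_better: bool = True) -> float:
    """Convert a raw score to a T score (M = 50, SD = 10).

    norm_mean and norm_sd are the normative mean and standard deviation
    for the test. When higher_is_better is False (e.g., error counts or
    completion times), the z score is negated so that a higher T score
    always indicates better performance (an assumption of this sketch).
    """
    if norm_sd <= 0:
        raise ValueError("normative SD must be positive")
    z = (raw - norm_mean) / norm_sd
    if not higher_is_better:
        z = -z
    return 50.0 + 10.0 * z


# Hypothetical norms: grip strength with M = 45 kg, SD = 8 kg.
print(t_score(37.0, 45.0, 8.0))

# Hypothetical norms: a timed test with M = 60 s, SD = 20 s.
print(t_score(100.0, 60.0, 20.0, higher_is_better=False))
```

Because every measure ends up on the same scale, a T score of 30 (2 SDs below normal expectation) means the same thing for a timed motor task as for an IQ.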

Analyses of variance (ANOVA) (2 × 2) were computed with repeated
measures on the second variable (i.e., testings). On measures for which
the ANOVA identified a trend (.05 < p < .10) or a significant
difference (p < .05), post facto t tests for correlated means were
computed for each group across testings. The comparisons were of
specific interest when the ANOVA identified a significant interaction,
because the t-test analyses provided information as to which patient
groupings (i.e., either one or both) primarily influenced the
interaction.
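
The t test for correlated means used in the follow-up comparisons can be sketched with the standard library alone. This is a generic textbook formulation (mean difference divided by its standard error, with n - 1 degrees of freedom), not the authors' own computation, and the scores in the example are invented for illustration.

```python
import math
from statistics import mean, stdev
from typing import List, Tuple


def correlated_t(first: List[float], second: List[float]) -> Tuple[float, int]:
    """Return (t, df) for a t test on correlated (paired) means.

    first and second hold each subject's score at the first and second
    testing, in the same subject order.
    """
    if len(first) != len(second):
        raise ValueError("both testings need the same subjects")
    if len(first) < 2:
        raise ValueError("at least two subjects are required")
    diffs = [b - a for a, b in zip(first, second)]
    sd = stdev(diffs)  # sample SD of the difference scores
    if sd == 0:
        raise ValueError("all difference scores are identical")
    t = mean(diffs) / (sd / math.sqrt(len(diffs)))
    return t, len(diffs) - 1


# Invented T scores for four subjects at two testings.
t, df = correlated_t([48.0, 52.0, 45.0, 50.0], [44.0, 50.0, 40.0, 49.0])
print(round(t, 3), df)
```

The resulting t would be compared against the critical value for df = n - 1; with 14 patients per group, as in the study, df would be 13.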


Results


Both patient groups were comparable in chronological age at first
testing, length of test-retest interval, sex, and years of formal
education (Table 2).

Figure 1. Multiple sclerosis (M.S.) and control group performances at
first and second testings on dependent measures, with T-score
conversions. [Results of a 2 × 2 analysis of variance (ANOVA) on each
variable are shown at bottom, and only those analyses that showed
p < .10 significance are listed. Post facto correlated t tests
comparisons across testings of the MS, control, or both groups also are
given on those measures in which ANOVAs identified a significant test
or interaction effect.]

Figure 2. Multiple sclerosis and control group performances on
sensory-perceptual examinations.

Evaluation of descriptive raw data of each subject group at each
testing before any unit of measurement (i.e., T-score) conversions or
statistical analyses revealed extremely impaired mean values and large
standard deviations on several of the measures, particularly motor
performances (Table 3). These extreme values most frequently
represented spurious exaggeration of group mean values partly because
one or more patients could not complete the test owing to neurological
dysfunction, and the patient therefore was assigned the worst score
possible for that measure. T-score conversions served to attenuate the
undue influence of such artificially extreme values.

Analyses of conversion of raw data to T scores revealed significant
differences between the groups (see Figure 1). On measures in which
significant differences were found, the MS patient group’s performances
were either unchanged or worsened over time, but they never
significantly improved. In contrast, the performance of the non-MS
neurological controls was either consistently unchanged or improved,
but it never significantly worsened.


The number of imperception errors in each sensory modality (i.e.,
tactile, auditory, visual, and total number of errors across
modalities) failed to distinguish MS patients from controls (see
Figure 2). The Sandpaper Roughness Discrimination and Finger Agnosia
examination results also failed to show significant group differences
when the number of errors was used as the dependent measure; however,
when the time taken to complete the Sandpaper Roughness Discrimination
Test was examined, a significant interaction effect for the right hand
(p < .01) was evident, and left-handed performances approached
significance (.05 < p < .10). The performance of the MS group on this
measure noticeably deteriorated bilaterally, whereas the control group
clearly improved. There was a significant group effect (p < .05) in the
sensitivity of number of errors on the Tactile Forms Discrimination
Test to MS, but no testing or interaction effects were apparent (see
Figure 2). The time component of the Tactile Forms Discrimination Test
identified a significant right-handed interaction effect (p < .05) and
a significant left-handed group effect (p < .01). Right-handed
performances analyzed across groups and left-handed performances on
interaction analyses approached significance (.05 < p < .10) on this
measure. These results suggest fairly good sensitivity to MS
deterioration for the error and time components of the Tactile Forms
Discrimination Test. Finally, the Fingertip Number Writing Test
generated consistent interaction effects for number of errors on both
the right and left hands, along with the total number of errors for
this test. This interaction was chiefly caused by the worsened
performance of the MS group and by the mild improvement demonstrated by
the controls.

Figure 3. Stability of Minnesota Multiphasic Personality Inventory
(MMPI) performances for multiple sclerosis (M.S.) patients versus
neurological controls.

Because WAIS Verbal, Performance, and Full Scale IQs yielded
significant interaction effects, analyses were also computed on the
individual WAIS subtests (Table 3). Among the verbal subtests,
significant interactions without significant group or testing effects
were obtained on the Information (p < .01) and Vocabulary (p < .05)
subtests; Digit Span showed a testing effect (p < .01), but no group or
interaction effects were apparent. Among the performance subtests,
Digit Symbol also yielded a significant interaction effect (p < .01)
without group or testing differences; Picture Arrangement attained a
significant (p < .05) testing effect only; and Block Design approached
significance (.05 < p < .10) on the analyses of tests across groups.

The WRAT failed to show significant dif- 
ferences for any academic skill. 

The test-retest MMPI profiles for each 
experimental group revealed some statistically 
significant scale differences (see Figure 3). 


Discussion 


This study introduces a number of issues regarding the changes over
time for cognitive and motor-sensory function in MS. MS deterioration
continues to be most apparent on measures that demand coordinated motor
skills and on cognitive ability tests having a significant motor
component. The Grooved Pegboard Test (fine motor coordination and
manual dexterity) was the most poorly performed test in the MS group on
initial testing and showed the greatest decrement over time. Kinetic
steadiness (Maze Coordination Test) was also definitely impaired,
whereas finger tapping speed and static steadiness generally were
borderline or only mildly impaired (i.e.,
less than 2 SDs below normal expectation). 
Grip strength decreased slightly among MS 
patients but remained average. Among tests 
of “higher order” abilities, only the Trail 
Making Test, which is motorically dependent, 
showed significantly worsened scores for MS 
patients. The TPT was poorly executed by 
both the MS and control subjects; however, 
control subjects showed improvement, where- 
as the performances of the MS patients were 
essentially unchanged. The worsening on 
motor proficiency measures for patients with 
MS was consistent with previous research 
(Goldstein & Shelly, 1974; Matthews et al., 
1970; Reitan et al., 1971). 

An unexpected finding was that the motor deterioration for MS patients
appeared to be more evident on dominant than nondominant hand measures.
This suggestion evolves from the post facto t test analyses in which
dominant hand test-retest comparisons on the pegboard, finger tapping,
and static steadiness measures either approached or achieved
significance, but nondominant hand performances on these same tests did
not. On several of these motor tests (e.g., finger tapping and
steadiness measures), the control subjects showed bilateral
improvement. The clinical or neuropathological (or both) significance
of this observation is uncertain, but it raises the possibility that
deterioration associated with MS is more apparent, and subsequent
referral for neuropsychological reevaluation is therefore pursued, as
the efficiency of dominant hand performance is affected, suggesting a
possible sampling bias among the MS patients.

Several of the tests of sensory discrimination abilities also appear to
be sensitive to MS. The number of errors on the Fingertip Number
Writing Test increased dramatically in the follow-up MS data, as did
the time scores on the Sandpaper Roughness Discrimination Test. The
stereognostic abilities of the MS patients were significantly worse
than those of control subjects, both in initial and in follow-up
testing, but no statistically significant deterioration was identified.
In considering these data, it must be appreciated that an unknown
degree of the motor-sensory deterioration seen in MS may reflect plaque
formation below the level of the cerebral hemispheres.

In addition to the worsened performances 
over time on motor-dependent measures and 
several sensory-perceptual tests, the MS 
group consistently showed either unchanged 
or worsened scores on all other tests of adap- 
tive abilities. This observation was in sharp 
contrast to the opposite tendency for non-MS 
neurological control subjects to improve on 
many of these same tasks. Improvement by 
the control group may indicate improvement 
of adaptive skills, a “practice effect,” or both. 
The fact that the mean test-retest interval 
was almost 3 years suggests that any practice 
effect would most likely be minimal. The 
probability that the test-retest differences 
represent a true improvement in neuropsy- 
chological function for the control group is 
supported by the fact that several of the 
neurological disorders of the control group 
involved neurosurgical procedures from which 
recovery of function over time could reason- 
ably be anticipated. 

Neuropsychological decrements over time 
were also evident on several tests of higher 
order cognitive skills. Verbal, Performance, 
and Full Scale IQs on the WAIS achieved 
statistical significance, as did the Wisconsin 
Impairment Index. The WAIS measures were 
primarily influenced by worsening of MS, 
whereas the Impairment Index reflected not 
only the exacerbation of adaptive ability 
deficits in the MS group but also improved 
performances by the non-MS neurological 
controls. The worsened Impairment Index 
ratings for MS patients seem to be largely 
attributable to the changes evident on the 
Trail Making Test and the TPT, each of 
which is heavily motorically dependent. The 
Seashore Rhythm and the Speech Perception 
tests yielded no significant findings, but the 
Category Test results demonstrated a mild 
improvement over time that was most promi- 
nent for the control subjects. 

It is somewhat surprising that WAIS varia- 
bles should prove to be sensitive to the effects 
of demyelination when many other measures 
that are commonly considered to be more 
sensitive indices of brain dysfunction (Rei- 
tan, 1959a) yielded negative or equivocal 
results. These unanticipated psychometric 
findings are further highlighted by the ob- 
servation that the Information, Vocabulary, 
and Digit Symbol WAIS subtests were the 
measures that proved to be most sensitive to 
MS deterioration. The worsening of Digit 
Symbol scores in the MS sample may be re- 
lated to this test’s dependency on a visual- 
motor response, but worse scores on the In- 
formation and Vocabulary subtests cannot be 
explained on a visual-motor basis. Further- 
more, these two subtests are commonly con- 
sidered to be relatively insensitive to neuro- 
logical disorders (in the absence of aphasia). 
Several WAIS subtests that are more often 
sensitive to neurological dysfunction failed to 
yield significant decrements for the MS group. 
The Digit Symbol, Picture Arrangement, and 
Block Design subtests either approached or 
achieved statistical significance on the analy- 
ses of tests across groups, with slightly wors- 
ened functioning evident on these measures 
regardless of neurological diagnosis. Even 
though it is of interest to note that many of 
the individual WAIS subtests, in addition to 
the summary IQ scores, achieved statistical 
significance, the clinical import of these find- 
ings may be minimal, because the absolute 
mean values on these measures were not strik- 
ingly changed in either diagnostic group. 
Although other tests of complex cognitive 
functions yielded statistically significant re- 
sults, the actual level of impairment was not 
strikingly changed when the various neuro- 
logical patient performances were compared 
against expectations based on normative data 
(Figure 1). 

One ability domain that Beatty and Gange 
(1977) suggested as being particularly sensi- 
tive to demyelination is memory. Several 
different tests requiring attention-memory 
skills in this study found no support for 
significant memory impairment, but it must 
be remembered that the tests used here were 
brief clinical assessment procedures and were 
not the more rigorous techniques used in the 
experimental learning-memory literature. 

The MMPI data are interesting for their implications for future
research, but the current sample size was too small to permit any
conclusions. Group profiles suggest an in- [...] MS. Neurological
dysfunction in general appears to be associated with worsening
depression, increased sensitivity to perceived or real criticism, and
increased denial. Finally, there is the suggestion that MS patients may
be less anxious than non-MS neurological controls. Future research into
the MMPI correlates of MS will hopefully correlate group or individual
(or both) MMPI profiles with clinical-behavioral status at the time of
testing.

This study offers some encouragement to 
the MS patient and to professionals active in 
the patient’s therapeutic and social-vocational 
planning. The generality of any conclusion 
drawn from this research is limited to those 
MS patients who remain capable of taking 
neuropsychological examinations. Neverthe- 
less, the knowledge that one can reasonably 
expect to retain much of the premorbid intel- 
lect over an extended time interval may serve 
to keep the realistic anticipation of worsened 
motor skills and sensory-perceptual [...] 
in a broader and less catastrophic 
personal-social-vocational perspective. 


The Generalized Expectancy for Success Scale— 
A New Measure 


Bobbi Fibel and W. Daniel Hale 


University of Massachusetts 


A new measure of generalized expectancy for success was assessed for its psy- 
chometric properties. Three samples of Caucasian, middle-class college students 
participated in the study. The first sample (n= 100; 59 females, 41 males) 
received a preliminary version of the Generalized Expectancy for Success Scale 
(GESS). Item analysis yielded 30 items that were substantially correlated with 
the total score but were not significantly related to social desirability. The 
second sample (n = 104; 63 females, 41 males) received the 30-item GESS 
twice at a 6-week interval. The third sample (n = 103; 69 females, 34 males) 
received the GESS, the Marlowe-Crowne Social Desirability Scale, the Internal- 
External Locus of Control Scale, the Self-Rating Depression Scale, the Depres- 
sion Inventory, and the Hopelessness Scale. Results indicate that the GESS has 
acceptable test-retest reliability, high internal consistency, and minimal rela- 
tionship with social desirability. Predicted relationships between high generalized 
expectancy for success, depressive symptomatology, and internality were sup- 
ported. Factor analysis indicated that GESS scores are a function of one general 
factor. Further construct validation is reviewed, and implications for future use 
of the GESS are discussed. 


One of the key concepts of Rotter’s learning theory (Rotter, 1954;
Rotter, Chance, & Phares, 1972) that has been the subject of
considerable study in recent years is that of generalized expectancies.
The two generalized expectancies that have received the most attention,
and for which there are reliable and valid measures, are expectancies
regarding internal-external control of reinforcements (Lefcourt, 1976;
Phares, 1976; Rotter, 1966, 1975; Strickland, 1977) and interpersonal
trust (Rotter, 1967). The purpose of the present ongoing investigation
is to construct and validate a measure of a different generalized
expectancy: the generalized expectancy for success. This construct can
be defined as the expectancy held by an individual that in most
situations he/she will be able to attain desired goals. According to
social learning theory, an individual’s behavior potential is a
function of reinforcement value and expectancies that are determined by
the person’s reinforcement history for related situations. Therefore,
when other factors are held constant, the behavior potential for an
individual with a high expectancy for success should be greater than
that of an individual with a low expectancy for success. Further, since
situations vary in the extent to which a person’s reinforcement history
is relevant, expectancies for success may vary along a continuum from
relatively specific to general, as a function of the degree of
situational novelty or ambiguity. Numerous studies (Dickstein &
Kephart, 1972; Feather, 1966; Feather & Saville, 1967; Rosenthal &
Jacobson, 1966; Tyler, 1958) have demonstrated that individuals
experimentally given a high expectancy for success on a task or set of
tasks are indeed more likely to perform more successfully than those
given a low expectancy for success. Unlike these experimentally induced
task-specific expectancies, however, situations seldom provide
individuals with explicit expectancies for success. More commonly,
individuals face relatively unfamiliar or ambiguous circumstances for
which no highly specific expectancy has been provided or formulated. A
person’s behavior in such situations is still largely influenced by
his/her expectancy for success, but this expectancy increasingly
becomes a function of generalized expectancy as the degree of novelty
or ambiguity increases. Just as the construction of a measure of
internal-external locus of control allows researchers to move from
situationally induced skill versus chance expectancies for control to
internal-external control as an individual difference variable, the
Generalized Expectancy for Success Scale (GESS) allows researchers to
explore individual differences as a function of generalized
expectancies for success.

A valid and reliable measure of a general- 
ized expectancy for success can facilitate the 
study of factors in the development of such 
expectancies, situational characteristics that 
influence expectancies, and the impact of a 
generalized expectancy for success on a vari- 
ety of goal-oriented behaviors and other theo- 
retically related cognitive constructs. Thus, 
the GESS can potentially enhance prediction and clarify issues of
theoretical importance. The development of such a scale depends not
only on firm grounding in psychological theory but also on adherence to
sound psychometric principles and extensive construct validation.


Method 
Subjects 


Three samples were obtained, each from large undergraduate psychology
classes at a large, northeastern university. Students, predominantly
middle-class Caucasians, were given the option to participate in
studies of their own choosing for bonus academic points. A preliminary
version of the GESS was administered to the first sample (n = 100; 59
females, 41 males) during a class period. The second sample (n = 104;
63 females, 41 males), solicited in the same manner, was group tested
in the 4th and 10th weeks of the semester during the first 20 minutes
of the class periods. In the third sample (n = 103; 69 females, 34
males), subjects volunteered for one of several small group-testing
sessions based on the preliminary description given by the
experimenters, one male and one female Caucasian graduate student.

Test Construction 


Initially, an attempt was made to construct items
that both sampled across situational domains (such
as public, private, familial, interpersonal, and work
related) and did not specify criteria for success. One
hundred fifty items were constructed by the experi-
menters. The 150 items were then screened for face
validity by three psychologists. One hundred four
items were selected and subsequently administered to
the first sample of 100 subjects. An item analysis
yielded 30 items that were substantially correlated
with total score (r > .50) but were not significantly
related to social desirability (p > .10) as measured by
the Marlowe-Crowne Social Desirability Scale
(Crowne & Marlowe, 1960). These 30 items consti-
tute the current version of the measure (see Ap-
pendix). All items begin with the same stem phrase:
“In the future I expect that I will . . .” which is
printed at the top of each page. Responses to items
are in Likert format. Subjects are instructed to circle
a number on a 5-point scale from 1 (highly im-
probable) to 5 (highly probable) for each item.
Seventeen items are phrased in the positive or suc-
cess direction and 13 in the negative or failure direc-
tion. Items are randomly ordered. The scale is scored
additively and in the direction of success, such that
a high total scale score indicates a high expectancy
for success.
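
The additive scoring rule just described can be illustrated with a minimal sketch. It assumes that failure-direction items are reverse-scored as 6 minus the circled number, which is the usual way of scoring a 5-point item "in the direction of success" but is not spelled out in the text. The set of 13 failure-direction items is inferred from the wording of the Appendix items, and the function and constant names are hypothetical.

```python
# Items worded in the failure direction. ASSUMPTION: this set is inferred from
# the wording of the Appendix items; the text only states that there are 13.
NEGATIVE_ITEMS = frozenset({1, 2, 4, 6, 7, 8, 14, 15, 17, 18, 24, 27, 28})
N_ITEMS = 30
SCALE_MIN, SCALE_MAX = 1, 5


def score_gess(responses):
    """Return the total GESS score for a dict mapping item number -> rating (1-5)."""
    if set(responses) != set(range(1, N_ITEMS + 1)):
        raise ValueError("expected ratings for items 1 through 30")
    total = 0
    for item, rating in responses.items():
        if not SCALE_MIN <= rating <= SCALE_MAX:
            raise ValueError(f"item {item}: rating must be between 1 and 5")
        # ASSUMPTION: reverse-score failure-direction items (1 <-> 5, 2 <-> 4)
        # so that a high score always reflects a high expectancy for success.
        if item in NEGATIVE_ITEMS:
            rating = SCALE_MIN + SCALE_MAX - rating
        total += rating
    return total
```

Under these assumptions a respondent who circles 3 on every item receives a total of 90, and totals fall between 30 and 150.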


Procedure 


The second sample of subjects (n = 104) was run
during fall 1975 and received only the GESS on two 
occasions for test-retest reliability purposes. The 
third sample of subjects, run in groups of 10-15 
during spring 1975, received the GESS, the Mar- 
lowe-Crowne Social Desirability Scale, Rotter’s In- 
ternal-External Locus of Control (I-E) Scale (Rot- 
ter, 1966), the Self-Rating Depression Scale (Zung, 
1965), Beck's Depression Inventory (Beck, 1967), 
the Hopelessness Scale (Beck, Weissman, Lester, & 
Trexler, 1974), and a questionnaire assessing sui- 
cidal ideation (Crepeau, Note 1). Responses to the 
30 GESS items were intercorrelated, and the re- 
sulting matrix was factored by the principal com- 
ponents method. Components were rotated to or- 
thogonal simple structure by means of Kaiser's 
(1958) varimax method. Minimum eigenvalue for
factor rotation was 1.50. 


Results 


The test-retest correlation coefficient of 
the GESS using scores taken at a 6-week in- 
terval from subjects in the second sample

Table 1
Correlations Between the GESS and Selected
Other Measures for the Present Study

                              Males           Females
Scale                        n      r        n      r

Social Desirability         34    .15       69    .26*
Self-Rating Depression      26   −.58**     58   −.48**
Depression Inventory        25   −.61**     57   −.54**
Hopelessness                26   −.69**     59   −.31**
Locus of Control            32   −.10       67   −.27*

Note. GESS = Generalized Expectancy for Success
Scale.
* p < .05.
** p < .01.


who were present for both administrations
(n = 74; 46 females, 28 males) was .83 over-
all (.89 for males and .80 for females). Means
and standard deviations on the GESS were 
not significantly different for this sample 
from those obtained as a function of group 
testing with additional measures, nor were 
differences in responding found as a function 
of sex. Consequently, data from the second 
and third samples were combined for analy- 
ses of psychometric properties (n = 207, 132 
females, 75 males). The possible range of 
total scores is 30-150, with higher scores 
indicating a high expectancy for success. Ac-
tual total scores ranged from 65 to 143 for
females and from 81 to 138 for males. The
mean score for females was 112.32 (mode = 
112, Mdn = 113.14) and for males, 112.15 
(mode = 109, Mdn = 112.88). The respect- 
ive standard deviations were 13.80 and 13.24. 

Two measures of internal consistency were
computed. The split-half reliability coefficient
for odd versus even items, using the Spear-
man-Brown correction formula, was .90 for
[...]. The correlation between the first 15
and the last 15 items, again using the
Spearman-Brown correction formula, was .82
for females and .83 for males. Because all items
have a single stem, the high internal con-
sistency is not surprising. However, it should
be noted that these reliability coefficients
also occur across items that reflect a number
of diverse areas.
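
As an illustration of the odd-even computation, the following sketch correlates the two half-scale totals and applies the Spearman-Brown correction, r_full = 2r / (1 + r). The function names and the input layout (one list of already reverse-scored item ratings per subject) are assumptions made for the example, not a description of the original analysis.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman_brown(r_half):
    """Step a half-test correlation up to an estimate for the full-length test."""
    return 2 * r_half / (1 + r_half)


def split_half_reliability(item_scores):
    """Odd-even split-half reliability.

    item_scores: one list per subject holding that subject's item ratings in
    item order (hypothetical layout; ratings assumed already reverse-scored).
    """
    odd_totals = [sum(row[0::2]) for row in item_scores]   # items 1, 3, 5, ...
    even_totals = [sum(row[1::2]) for row in item_scores]  # items 2, 4, 6, ...
    return spearman_brown(pearson_r(odd_totals, even_totals))
```

A first-half versus last-half coefficient would follow the same pattern, with the totals taken over `row[:15]` and `row[15:]`.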
Correlations with other measures were com-
puted separately for males and females in the
third sample. For both sexes it was found
that scores on the GESS were correlated
negatively and significantly with scores on
the Zung Self-Rating Depression Scale, the
Beck Depression Inventory, and the Beck
Hopelessness Scale (see Table 1). Individuals
with low expectancies for success were more
likely to report depressive symptomatology
and to report themselves as feeling hopeless
about impending life events. Even though
items with high social desirability bias were
eliminated from the original item pool, a low
but significant correlation between scores on
the GESS and scores on the Marlowe-Crowne
Social Desirability Scale was found for fe-
males (r = .25, p < .02) but not for males
(r = .15, p > .10) in the third sample. Also,
high GESS scores were significantly corre-
lated with internality as measured by Rot-
ter’s I-E scale for females but not for males.

A factor analysis was also computed using
the varimax rotation method, which yielded
four factors. Variance accounted for by each
factor was 63.9%, 13.4%, 12.7%, and 10.1% for
Factors 1-4, respectively.

Items loading on Factor 1 reflect an indi-
vidual’s sense of general efficacy (Items 4, 8,
9, 10, 12, 13, 15, 16, 21, and 22). For ex-
ample, the two items with the highest loading
within Factor 1 were Item 4, “be unable to 
accomplish my goals” (.56), and Item 21, 
“succeed at most things I try” (.55). 

Factor 2 was composed of Items 14, 17, 24,
25, 26, 29, and 30. The content of these
items primarily involves long-range career-
oriented expectancies. Items with the highest
loading within Factor 2 were Item 26, “at-
tain the career goals I have set for myself”
(.56), and Item 30, “achieve recognition in
my profession” (.53).

Factor 3 contained items related to per-
sonal problem solving (3, 5, 6, 11, 19, 20, 23,
28). Item 5, “have a successful marital rela-
tionship” (.59), and Item 23, “be very suc-
cessful working out my personal life” (.51),
loaded highest among the items in the third
factor.

Factor 4 consisted of Items 1, 2, 7, 18, and 
27 with Item 1, “find that people don’t seem 
to understand what I am trying to say” (…),



EXPECTANCY FOR SUCCESS SCALE 


Table 2
Summary of Correlations Between the GESS and Measures of Depressive Affect and Cognition
from Other Studies

                                         Males            Females            Total
Scale                                   n      r         n      r         n      r

Self-Rating Depression
  Strickland (Note 2)                  50   −.55***     50   −.43***
  Crepeau (Note 1)                     67   −.14**     107   −.62***
  Koerner (1977)                                                         120   −.47***
    With SD partialed out                                                120   −.44***
Multiple Affect Adjective Checklist
  Strickland (Note 2)
    Anxiety                            50   −.33**      50   −.27*
    Depression                         50   −.34**      50   −.20
    Hostility                          50   −.03        50   −.04
  Crepeau (Note 1)
    Depression                         67   −.66***    107   −.45***
  Koerner (1977)
    Anxiety                                                              120   −.43***
    Depression                                                           120   −.37***
    Hostility                                                            120   −.41***
Suicidal Ideation
  Crepeau (Note 1)                     67   −.33**     107   −.48***

Note. GESS = Generalized Expectancy for Success Scale.
* p < .05.
** p < .01.
*** p < .001.


and Item 18, “find that no matter how hard 
I try, things just don’t turn out the way I
would like” (.54), having the highest load- 
ings on the fourth factor. Although effort 
rather than outcome seems to characterize 
several items of Factor 4, this is not a con- 
sistent theme. In addition, all of the items in 
Factor 4 are phrased negatively, which sug- 
gests a possible overriding response bias. The 
final factor, then, is not easily interpretable. 

Although the factor analysis did yield a 
moderately interpretable factor structure, 
other results militate against assuming the 
simple factor structure noted above. First, 
Factor 1 accounted for a disproportionately 
high percentage of the variance. In addition, 
15 items, half of the scale items, loaded 
greater than +.30 on at least two and in some
cases three factors. The considerable overlap 
in loadings on the four factors suggests that 
the factors are not independent. The small 
sample size and lack of uniformly interpreta- 
ble factors further limit the validity of a 


simple structure. Based on current data, the 
presence of one general factor is tentatively 
reasonable. Subsequent factor analyses using 
larger samples and separate analyses for 
males and females may prove more con- 
clusive. 


Discussion 


Results indicate that the GESS has an 
acceptable test-retest reliability, high in- 
ternal consistency, and a minimal relationship 
with social desirability. Preliminary factor 
analysis did not yield strong evidence of a 
simple subscale structure. It appears that 
GESS scores are largely a function of one 
factor reflecting a sense of general efficacy. 

A number of theoretical approaches to 
depression—learned helplessness (Seligman, 
1975), social learning theory (Phares, 1972), 
and Beck’s (Beck, 1967, 1976) model among 
them—focus on the importance of the de- 
pressive’s negative cognitive set. To estab-

Table 3
Summary of Correlations Between the GESS and Measures of Internal-External Control of
Reinforcement from Other Studies

                                              Males            Females            Total
Scale                                        n      r         n      r         n      r

Strickland (Note 2) (Crandall I-E) a
  I-E for positive events                   50   [illeg.]    50   [illeg.]
  I-E for negative events                   50   −.17        50   [illeg.]
  I-E Total                                 50    .07        50   [illeg.]
Crepeau (Note 1) (Collins’ I-E)
  Difficult/Easy World                      67   −.35**     107   [illeg.]
  Just/Unjust World                         67   −.20*      107   [illeg.]
  Predictable/Unpredictable World           67   −.30**     107   −.36[?]
  Politically Responsive World              67   −.16       107   −.18
  Personal Control                          67   −.42***    107   −.54***
  I-E Total                                 67   −.4[?]***  107   −.48***
Koerner (Note 3) (Collins’ I-E)
  I-E Total                                                                   120   −.41***

Note. GESS = Generalized Expectancy for Success Scale. I-E = internal-external.
a Crandall’s scale is scored in internal direction.
* p < .05.
** p < .01.
*** p < .001.

lish the construct validity of the GESS, an
assessment of its scores in relation to mea-
sures of depressive cognition is crucial. As
expected, the GESS was significantly related
to measures of depression, with persons who
express high expectancies for success being
less likely to report themselves as depressed.
This relationship has been further corrobo-
rated with samples of college students (see
Table 2; Koerner, 1977; Crepeau, Note 1;
Strickland, Note 2). The significant negative
correlations with the Hopelessness Scale ob-
tained in the present investigation provide
further support for [...] the GESS, since Beck
[...] demonstrated that [...] frequency of suicidal
[...] students correlated negatively and
significantly with GESS scores. Thus, low
scores on [...]

Anxiety, frequently a concomitant of de-
pression (Beck, 1967, 1976) and an antici-
pated correlate of low generalized expectancy
for success, related negatively and signifi-
cantly to GESS scores in both Strickland’s
(Note 2) and Koerner’s (1977) studies. An
individual with a low generalized expectancy
for success tends to report greater anxiety as
measured by the Multiple Affect Adjective
Checklist (MAACL; Zuckerman, Lubin, &
Robins, 1965). Additionally, Koerner re-
ported a significant negative correlation be-
tween scores on the Hostility subscale of the
MAACL and GESS scores. As in the female
sample of the present study, social desirabil-
ity was positively correlated with GESS scores
for the combined male and female samples of
Koerner’s study (r = .29, p < .01). How-
ever, after partialing out the effects of social
desirability, the relationship between GESS
and depression scores remained significant
(r = −.44, p < .001).

Numerous studies have demonstrated a
positive relationship between an individual’s
belief in internal control of reinforcement and
successful coping behaviors (Lefcourt, 1976;
Phares, 1976; Strickland, 1977; Gilmore,
Note 3). In the present study, high GESS
scores were related to internality for female 
subjects but not for males. Other studies by 
Strickland (Note 2), Crepeau (Note 1), and 
Koerner (1977) demonstrated the relation- 
ship between GESS and internality for both 
sexes (see Table 3). Crepeau used the Collins 
(1974) and Levenson and Miller (1976) 
subscales derived from the factor analyzed 
I-E scale to measure locus of control. Crepeau 
found GESS scores negatively and signifi- 
cantly related to each of the five subscales 
(difficult/easy world, just/unjust world, pre- 
dictable/unpredictable world, politically re- 
sponsive world, and personal control) for 
female subjects and all but the fourth sub- 
scale (politically responsive world) for male 
subjects (trend, p < .10). Using an academic 
achievement measure that distinguishes be- 
tween positive and negative events of locus 
of perceived responsibility (Crandall, Note 
4), Strickland reported significant negative 
correlations between GESS scores and total 
I-E scores and internality for positive but 
not negative events for female subjects. For 
males, GESS scores were negatively and sig- 
nificantly correlated with I-E scores on the 
positive, but not the negative events, sub- 
scale. In summary and as expected, GESS 
scores and a belief in internal control of 
reinforcement appear to be related both at a 
general level and across specific dimensions. 
These relationships seem somewhat attenu- 
ated among male subjects. Discriminant 
validity is demonstrated by the low and gen- 
erally insignificant correlations between GESS 
scores and scores on the Social Desirability 
Scale and the MAACL Hostility subscale. 
Support for the construct validity is pro- 
vided by Fibel (1976). She investigated the 
relationship between an individual’s general- 
ized expectancy for success, task-specific 
expectancies for success, and differential re- 
sponses to a learned helplessness paradigm 
with college females. Data analysis showed a 
significant positive correlation between GESS 
scores and specific expectancies for success
in novel and ambiguous situations and rela- 
tively lower correlations as specific situational 
information was acquired. Thus, as postu- 
lated, one’s specific or immediate expectancy 
increasingly becomes a function of one’s
generalized expectancy as the degree of
novelty or ambiguity is amplified. 

The choice of a measure of generalized 
versus specific expectancy for success must 
be determined by the level of analysis de- 
sired. Measures of task-specific expectancies 
will be of greater predictive utility when the 
level of analysis is task focused. For example,
successful performance on mechanical tasks is
better predicted by a measure of expectancy tailored
to mechanics than by a generalized measure.
Similarly, measures of specific expectancies 
within a single need area, such as academic 
achievement, are preferable when one’s pre- 
dictive purposes are tied to that need area. 
A generalized measure will be most useful 
when the level of analysis is broadly defined 
as an assessment across need areas and situ- 
ations or in novel or ambiguous circum- 
stances. 

Several cogent issues remain unresolved.
The degree of relationship between GESS 
and measures of other personality variables 
such as self-esteem has not been investigated. 
The validity of this instrument for popula- 
tions other than college students must be 
demonstrated. Additional factor analyses are 
needed to support the unidimensionality of 
the measure. The influence of social desira-
bility on GESS scores, while moderate for a
measure of a culturally highly valued con-
struct, must be taken into account in future
work with the scale. Nonetheless, at this
point, the GESS appears to be theoretically
well-founded and empirically sound and to show
promise of predictive utility. Additionally,
its briefness and ease of administration fur-
ther enhance its value. Anticipated are
relationships between the GESS and achieve- 
ment, assertiveness, risk taking, persuasi- 
bility, interpersonal skills, and social com- 
petence. Further, a measure of this possibly 
potent cognitive-mediating variable may have 
implications for predicting psychological 
well-being, particularly vis-a-vis depressive 
symptomatology that is typically charac- 
terized by negative expectations and low 
motivation. 


Reference Notes 


1. Crepeau, J. Effects of stressful life events and
locus of control on suicidal ideation. Unpublished
research, University of Massachusetts, Amherst,
1977.

2. Strickland, B. R. Aspiration behavior and depres-
sion. Unpublished research, University of Massa-
chusetts, Amherst, 1976.

3. Gilmore, T. M. Locus of control as a mediator of 
adaptive behavior in children and adolescents. 
Unpublished article, York University, Toronto, 
Canada, 1976.

4. Crandall, V. An adult measure of intellectual
achievement responsibility. Unpublished scale,
1977.


References 


Beck, A. T. Depression: Causes and treatment. Phila-
delphia: University of Pennsylvania Press, 1967.

Beck, A. T. Cognitive therapy and the emotional
disorders. New York: International Universities
Press, 1976.

Beck, A. T., Weissman, A., Lester, D., & Trexler, L.
The measurement of pessimism: The Hopelessness
Scale. Journal of Consulting and Clinical Psy-
chology, 1974, 42, 861-865.

Collins, B. E. Four separate components of the
Rotter I-E scale: Belief in a difficult world, a
just world, a predictable world, and a politically
responsive world. Journal of Personality and Social
Psychology, 1974, 29, 381-391.

Crowne, D. P., & Marlowe, D. A. A new scale of
social desirability independent of psychopathology.
Journal of Consulting Psychology, 1960, 24, 349-
354.

Dickstein, L. S., & Kephart, J. L. Effect of explicit
examiner expectancy upon WAIS performance.
Psychological Reports, 1972, 30, 207-212.

Feather, N. T. Effects of prior success and failure on
expectations of success and subsequent perform-
ance. Journal of Personality and Social Psychology,
1966, 3, 287-298.

Feather, N. T., & Saville, M. R. Effects of amount
of prior success and failure on expectations of
success and subsequent task performance. Journal
of Personality and Social Psychology, 1967, 5,
226-232.

Fibel, B. L. Contingency of reinforcement and level
of success in a learned helplessness paradigm among
college females. Unpublished master’s thesis, Uni-
versity of Massachusetts, Amherst, 1976.

Kaiser, H. F. The varimax criterion for analytic
rotation in factor analysis. Psychometrika, 1958,
23, 187-200.

Koerner, F. E. The effects of depression and sex on
aggressive affect and behavior toward the self and 
toward others. (Doctoral dissertation, University 


BOBBI FIBEL AND W. DANIEL HALE 


of Massachusetts, 1977). Dissertation Abstracts 
International, 1977, 38, 1887B-1888B. (University 
Microfilms No. 77-22,025)

Lefcourt, H. M. Locus of control: Current trends in
theory and research. Hillsdale, N.J.: Erlbaum,
1976.

Levenson, H., & Miller, J. Multidimensional locus
of control in sociopolitical activists of conserva-
tive and liberal ideologies. Journal of Personality
and Social Psychology, 1976, 33, 199-208.

Minkoff, K., Bergman, E., Beck, A. T., & Beck, R.
Hopelessness, depression, and attempted suicide.
American Journal of Psychiatry, 1973, 130, 455-
459.

Phares, E. J. A social learning theory approach to
psychopathology. In J. B. Rotter, J. E. Chance, &
E. J. Phares (Eds.), Applications of a social learn-
ing theory of personality. New York: Holt, Rine-
hart & Winston, 1972.

Phares, E. J. Locus of control in personality. Mor-
ristown, N.J.: General Learning Press, 1976.

Rosenthal, R., & Jacobson, L. Teachers’ expec-
tancies: Determinants of pupils’ IQ gains. Psy-
chological Reports, 1966, 19, 115-118.

Rotter, J. B. Social learning and clinical psychology.
Englewood Cliffs, N.J.: Prentice-Hall, 1954.

Rotter, J. B. Generalized expectancies for internal
versus external control of reinforcement. Psycho-
logical Monographs, 1966, 80(1, Whole No. 609).

Rotter, J. B. A new scale for the measurement of
interpersonal trust. Journal of Personality, 1967,
35, 651-665.

Rotter, J. B. Some problems and misconceptions
related to the construct of internal versus external
control of reinforcement. Journal of Consulting
and Clinical Psychology, 1975, 43, 56-67.

Rotter, J. B., Chance, J. E., & Phares, E. J. Appli-
cations of a social learning theory of personality.
New York: Holt, 1972.

Seligman, M. E. P. Depression and learned help-
lessness. In D. Rosenhan & P. London (Eds.),
Theory and research in abnormal psychology.
New York: Holt, Rinehart & Winston, 1975.

Strickland, B. R. Internal versus external control of
reinforcement. In T. Blass (Ed.), Personality and
social behaviors. Hillsdale, N.J.: Erlbaum, 1977.

Tyler, B. B. Expectancy for eventual success as a
factor in problem solving behavior. Journal of
Educational Psychology, 1958, 49, 166-172.

Zuckerman, M., Lubin, B., & Robins, S. Validation
of the Multiple Affect Adjective Check List in
clinical situations. Journal of Consulting Psy-
chology, 1965, 20, 594.

Zung, W. K. A self-rating depression scale. Archives
of General Psychiatry, 1965, 12, 63-70.



Appendix 



The Hale-Fibel Generalized Expectancy for Success Scale 


This is a questionnaire to find out how people 
believe they will do in certain situations. Each 
item consists of a 5-point scale and a belief state- 
ment regarding one’s expectations about events. 
Please indicate the degree to which you believe 
the statement would apply to you personally by 
circling the appropriate number. [1 = highly 
improbable, 5 = highly probable.] Give the an- 
swer that you truly believe best applies to you 
and not what you would like to be true or think 
others would like to hear. Answer the items care-
fully, but do not spend too much time on any
one item. Be sure to find an answer for every 
item, even if the statement describes a situation 
you presently do not expect to encounter. Answer 
as if you were going to be in each situation. 
Also try to respond to each item independently 
when making a choice; do not be influenced by 
your previous choices. 


In the future I expect that I will 


1. find that people don’t seem to understand
what I am trying to say.

2. be discouraged about my ability to gain the
respect of others.

3. be a good parent.

4. be unable to accomplish my goals.

5. have a successful marital relationship.

6. deal poorly with emergency situations.

7. find my efforts to change situations I don’t
like are ineffective.

8. not be very good at learning new skills.

9. carry through my responsibilities success-
fully.

10. discover that the good in life outweighs the
bad.

11. handle unexpected problems successfully.

12. get the promotions I deserve.

13. succeed in the projects I undertake.

14. not make any significant contributions to
society.

15. discover that my life is not getting much
better.

16. be listened to when I speak.

17. discover that my plans don’t work out too
well.

18. find that no matter how hard I try, things
just don’t turn out the way I would like.

19. handle myself well in whatever situation I’m
in.

20. be able to solve my own problems.

21. succeed at most things I try.

22. be successful in my endeavors in the long
run.

23. be very successful working out my personal
life.

24. experience many failures in my life.

25. make a good impression on people I meet
for the first time.

26. attain the career goals I have set for myself.

27. have difficulty dealing with my superiors.

28. have problems working with others.

29. be a good judge of what it takes to get ahead.

30. achieve recognition in my profession.


Assessing the Impact of Life Changes:
Development of the Life Experiences Survey 


Irwin G. Sarason, James H. Johnson, and Judith M. Siegel 
University of Washington 


This article describes the development of a new instrument, the Life Experi-
ences Survey, for the measurement of life changes. It was designed to eliminate
certain shortcomings of previous life stress measures and allows for separate 
assessment of positive and negative life experiences as well as individualized 
ratings of the impact of events. Several studies bearing on the usefulness of 
the Life Experiences Survey are presented, and the implications of the findings 


are discussed. 


During recent years, numerous studies have 
investigated the relationship between life 
stress and susceptibility to physical and psy- 
chological problems. Most of these studies 
have been based on the assumptions that (a) 
life changes require adaptation on the part 
of the individual and are stressful, and (b) 
persons experiencing marked degrees of life 
change during the recent past are susceptible 
to physical and psychiatric problems. 

There is considerable evidence that a re- 
lationship exists between life stress, opera- 
tionally defined in terms of self-reported life 
changes, and physical illness (Dohrenwend & 
Dohrenwend, 1974b). Rahe and Lind (1971) 
have reported a relationship between life stress 
and sudden cardiac death. Theorell and Rahe
(1971) and Edwards (1971) have provided 
data suggestive of a link between life stress 
and myocardial infarction. Holmes (1970) 
and Rahe (1968) both found a relationship 
between life stress and major and minor health
changes, and Wyler, Masuda, and Holmes
(1971) have shown that life change is related 
to seriousness of chronic illness. 

There also have been studies of non-health- 
related correlates of life change that have 
yielded positive results. For example, signifi- 
cant negative relationships between life stress 
and academic (Harris, 1973) and teacher 
(Carranza, 1973) performance have been 
found. Several researchers have demonstrated 
a relationship between extent of life changes 
and psychiatric symptomatology (Dekker & 
Webb, 1974; Paykel et al., 1969). Vinokur
and Selzer (1975) and others (e.g., Constan- 
tini, Braun, Davis, & Iervolino, 1973) have 
also found life stress to be related to the oc- 
currence of depression, anxiety, and tension. 
A comprehensive review of the life stress lit- 
erature and a consideration of methodological 
issues in this area of research has been pre- 
sented by Rabkin and Struening (1976). 

Questions of both a methodological and 
theoretical nature can be raised concerning 
present methods of assessing life changes. By
far the most widely used instrument in life
stress research is the Schedule of Recent Ex-
periences (SRE; Holmes & Rahe, 1967). This
is a self-administered questionnaire containing 
a list of 43 events to which subjects respond 
by checking those events that they have ex- 
perienced during the recent past (previous 6 
months or 1 year). To determine the scoring
weights for specific events, Holmes and Rahe 
(1967) had a large group of subjects rate 


Copyright 1978 by the American Psychological Association, Inc. 0022-006X/78/4605-0932$00.75 


each of the 43 items with regard to the
amount of social readjustment that the various 
events required. The item marriage (assigned 
a value of 500) was used as an arbitrary stan- 
dard or anchor point for making ratings. Mean 
values were obtained for each of the items. 
These mean values (divided by the constant 
of 10) were taken to represent the average 
amount of social readjustment required by the 
events. The values, termed life change units, 
when summed yield a total life stress score. 
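
The arithmetic of this scoring scheme can be sketched as follows. Only the anchor value for marriage (500) and the division by 10 come from the description above; the other event names and mean ratings are hypothetical placeholders, as are the function names.

```python
ANCHOR_DIVISOR = 10  # mean readjustment ratings are divided by this constant


def life_change_units(mean_rating):
    """Convert a mean social-readjustment rating into life change units."""
    return mean_rating / ANCHOR_DIVISOR


def total_life_stress(mean_ratings, events_checked):
    """Sum the life change units of the events a respondent checked."""
    return sum(life_change_units(mean_ratings[event]) for event in events_checked)


# Only the marriage anchor (500) comes from the text; the remaining events and
# values are HYPOTHETICAL and do not reproduce the published SRE weights.
mean_ratings = {
    "marriage": 500.0,
    "hypothetical event A": 730.0,
    "hypothetical event B": 200.0,
}
```

With these illustrative values, a respondent who checked marriage and hypothetical event B would receive 50 + 20 = 70 life change units, regardless of how that respondent personally experienced either event.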

Although the development of the SRE rep- 
resents a valuable initial attempt at the quan- 
tification of the impact of life change, its 
adequacy has been questioned on several 
counts (Rabkin & Struening, 1976). The SRE 
was based on the assumption that life changes 
per se are stressful regardless of the desira- 
bility of the events experienced. Therefore, 
both desirable and undesirable events are com- 
bined in determining the life stress score. On 
the other hand, several writers have ques- 
tioned the logic of combining positive and 
negative events (Brown, 1974; Mechanic, 
1975; Sarason, De Monchaux & Hunt, 1975). 
It has been argued that undesirable events 
(e.g., death of a close family member) may 
have a very different, and possibly a more 
detrimental, effect on individuals than positive 
events (e.g., outstanding personal achieve-
ment). It seems reasonable, therefore, to con-
sider conceptualizing life stress primarily in
terms of events that exert negative impacts.

Vinokur and Selzer (1975) have provided 
information that bears on this issue. These 
investigators used a specially modified version 
of the SRE, which yielded separate values 
for positive and negative life change. Several 
stress-related measures such as self-ratings of 
depression, anxiety, and tension were used, 
as well as measures of aggression, paranoia, 
and suicidal proclivity. The study provided 
support for a relationship between life changes 
and several of these measures but only when 
using a measure of undesirable events. Posi- 
tive change was not found to be systematically 
related to the personality measures. Vinokur
and Selzer (1975) concluded that 


it seems reasonable to reject the notion that adjust-
ment to change per se is the crucial determinant of
life stress and its sequelae. Instead, it appears that
the contribution of life events to psychological im-
pairment is mediated by stress that is evoked by
some undesirable aspect of the events rather than
by change per se (pp. 333-344).

Similar evidence that psychological difficulties 
are related to undesirable, but not desirable 
events has been provided by Mueller, Ed- 
wards, and Yarvis (1977). It would seem 
necessary to take this desirability-undesira- 
bility dimension into account in the assess- 
ment of life change. 

Even though it might be advisable to cate- 
gorize events as being desirable or undesirable 
for purposes of assessment, there are some 
difficulties with this approach. For example,
events may vary in terms of their desirability 
depending on the circumstances and percep- 
tions of the respondent. To illustrate, “preg- 
nancy” may be a highly desirable event for 
a woman who wants a child, but it may be 
viewed as quite undesirable by an unwed 
teenager. Given the fact that individuals per-
ceive events differently, it is important
to individualize ratings of the desira-
bility of the events that they experience.

A related issue concerns the quantification 
of life changes. Because individuals vary in 
how they are affected by events, the values 
derived from group ratings (such as those 
used with the SRE) may not accurately reflect 
the impact that events have on particular in- 
dividuals. Problems inherent in applying 
group-derived values to individual cases be- 
come obvious when it is noted that certain 
classes of events listed in the SRE can be 
quite ambiguous. For instance, if a subject 
responds to the item major change in financial 
status, it is uncertain if the response refers 
to a major change in a positive or negative 
direction. It is not clear that the life change 
unit associated with major change in financial 
status is as appropriate to the person who 
has recently become bankrupt as to the per- 
son who has recently inherited a large sum 
of money. Thus, even though life change units 
do seem to provide a quantitative measure 
of overall life change, in some cases, they 
may not reflect the actual amount of stress 
resulting from the experiencing of specific 
events. Findings bearing on this issue have 
recently been reported by Yamamoto and 
Kinney (1976). These investigators found life
stress scores, based on self-ratings of the
stressfulness of events, to be better predictors 
than scores derived by using mean adjustment 
ratings similar to those used with the SRE. 

Bearing in mind the methodological issues 
mentioned above, it would appear that a mea- 
sure of life stress should possess three char- 
acteristics. First, it should include a list of 
events experienced with at least some degree 
of frequency in the population being inves- 
tigated. Second, it should allow for ratings, 
by respondents themselves, of the desirability 
or undesirability of the events. Third, it 
should allow for individualized ratings of the 
personal impact of the events experienced. 
The present article describes a new measure 
of life stress, the Life Experiences Survey 
(LES), constructed according to these guide- 
lines and describes the results of several stud- 
ies bearing on its usefulness. 


Development of the LES 


The LES is a 57-item self-report measure 
that allows respondents to indicate events that 
they have experienced during the past year. 
The scale has two portions: Section 1, de- 
signed for all respondents, contains a list of 
47 specific events plus three blank spaces in 
which subjects can indicate other events that 
they may have experienced. The events listed 
in Section 1 refer to life changes that are 
common to individuals in a wide variety of 
situations. The 10 events listed in Section 2 
are designed primarily for use with students, 
but they can be adapted for other populations.
Section 2 deals specifically with changes ex- 
perienced in the academic environment. Sec- 
tion 1 is appropriate for use with subjects 
drawn from the general population, whereas 
both sections are relevant to a student population.
(In the present research, responses to
items of Sections 1 and 2 were combined in 
deriving life change scores as this research 
was conducted with college students.) 

The LES items were chosen to represent
life changes frequently experienced by individuals
in the general population. Many of
the items are based on existing life stress measures,
particularly the SRE. Others were included
because they were judged to be events
that occur frequently and that potentially
might exert a significant impact on the lives
of persons experiencing them. Thirty-four of 
the events listed in the LES are similar in 
content to those found in the SRE (Holmes 
& Rahe, 1967). In the construction of the 
present scale, however, certain items were 
made more specific. For example, the SRE 
contains the item pregnancy, which may be 
endorsed by women but perhaps not by a man 
whose wife or girlfriend has become pregnant. 
The present scale allows both men and women 
to endorse the item of pregnancy in the fol- 
lowing manner: Female: Pregnancy; Male: 
Wife’s/girlfriend’s pregnancy. The SRE in- 
cludes the item Wife begins or stops work, an 
item that fails to assess the impact on women 
whose husbands begin or cease working. The 
present scale lists two items: Married male: 
Change in wife’s work outside the home (be- 
ginning work, ceasing work, changing to a 
new job, etc.), and Married female: Change 
in husband’s work (loss of job, beginning a 
new job, etc.). Examples of events not listed
in the SRE but included here are male and 
female items dealing with abortion and more 
general items such as serious injury or illness 
of close friend, engagement, breaking up with 
boyfriend/girlfriend, and so forth. Nine of 
the 10 school-related items are unique to the 
LES. Finally, some of the events from the 
SRE thought to be of relatively little conse- 
quence (e.g., Christmas, vacation, etc.) were 
not included, and certain other events were 
reworded to simplify responding.

The format of the LES calls for subjects
to rate separately the desirability and impact
of events that they have experienced. Thus,
they are asked to indicate those events experienced
during the past year (0-6 months or
7 months-1 year)1 as well as (a) whether
they viewed the event as being positive or
negative and (b) the perceived impact of the
particular event on their life at the time of
occurrence. Ratings are on a 7-point scale
ranging from extremely negative (-3) to extremely
positive (+3). Summing the impact
ratings of those events designated as positive
by the subject provides a positive change
score. A negative change score is derived by
summing the impact ratings of those events
experienced as negative by the subject. By
adding these two values, a total change score
can be obtained, representing the total amount
of rated change (desirable and undesirable)
experienced by the subject during the past
year. Although the findings cited earlier
(Mueller et al., 1977; Vinokur & Selzer,
1975) suggest that this total change score
might be less predictive of health-related
variables than an index of negative change,
this measure was used in the present research
to provide further information concerning the
relationships between negative change, change
per se, and stress-related dependent variables.
(The LES is presented in the Appendix.)

1 Although the LES provides for the assessment
of life change occurring during two 6-month intervals,
all analyses to date have involved change scores
based on the entire preceding 12-month time period.
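
The scoring rule described above can be sketched in a few lines of Python. This is only an illustration, not the authors' scoring procedure: the function name `score_les` and the `LesScores` container are hypothetical, and the sketch assumes that the negative change score is reported as a magnitude (a non-negative number), which is what the tabled means and the additive total change score suggest.

```python
from typing import Iterable, NamedTuple


class LesScores(NamedTuple):
    """Hypothetical container for the three change scores of one respondent."""
    positive: int
    negative: int
    total: int


def score_les(ratings: Iterable[int]) -> LesScores:
    """Score one respondent from signed impact ratings of experienced events.

    Each rating is an integer from -3 (extremely negative) to +3
    (extremely positive). Events that were not experienced are omitted.
    """
    positive = 0
    negative = 0
    for rating in ratings:
        if not -3 <= rating <= 3:
            raise ValueError(f"rating out of range: {rating}")
        if rating > 0:
            positive += rating
        elif rating < 0:
            # Assumption: negative change is accumulated as a magnitude.
            negative += -rating
    # Total change is the sum of the positive and negative change scores.
    return LesScores(positive, negative, positive + negative)
```

For a respondent who rated six events as 3, 1, -2, -3, 0, and 2, this sketch gives a positive change score of 6, a negative change score of 5, and a total change score of 11.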
For any new instrument it is necessary to
obtain certain kinds of information. Normative
data should be provided that include information
about the effects of demographic
variables (e.g., sex). Evidence should also be
presented concerning the instrument’s stability
over time and correlations with relevant
dependent measures. Finally, in the case of
self-report scales, it should be demonstrated
that measures derived from the instrument
do not simply reflect the effects of response
sets such as the tendency to “fake good.” The
instrument’s scores should not be highly correlated
with factors such as social desirability.


Normative Data and an Examination 
of Sex Differences 


The first study undertaken with the LES
obtained general information concerning the 
responses of college students to the instrument 
and investigated the possibility of differences 
in response due to sex. 

The LES was administered in class to 345
students enrolled in introductory psychology
courses at the University of Washington during
the fall quarter of 1975. Values were obtained
for positive, negative, and total life
change scores. Means and standard deviations
were derived separately for males (n = 174)
and females (n = 171) on each of these measures,
and tests for sex differences were conducted.2
Data from Section 1 and Sections 1
and 2 combined are presented separately in
Table 1. The table shows that there were no
significant differences between males and females
on any of the three life change measures.
It can also be seen that the life change
scores of this sample of college students are
generally low. Higher values might well have
been obtained if subjects from the general
population had been surveyed. Finally, it can
be noted that the results of this and a number
of other studies with the LES have shown
that the positive and negative life change
scores are essentially uncorrelated.

Table 1
Means and Standard Deviations of Male and
Female Respondents on the Life Experiences
Survey (LES)

                 Malesᵃ             Femalesᵇ
LES score        M        SD        M        SD

Positive         9.74     8.07      9.57     6.66
                 6.87     5.97      6.71     5.51

Negative         6.22     6.28      7.04     7.90
                 4.66     4.36      5.64     6.43

Total           15.97    11.08     16.61    10.23
                11.53     8.01     12.35     8.82

Note. In each case figures in top rows are derived
from responses to Parts 1 and 2 combined.
Figures in the bottom rows are derived from Part 1
only.
ᵃ n = 174.
ᵇ n = 171.


Reliability of the LES 


Two test-retest reliability studies of the
LES have been conducted. Both involved subjects
drawn from undergraduate psychology
courses with a 5- to 6-week time interval between
test and retest. There were 34 subjects
in the first study and 58 in the second. Responses
were scored for positive, negative,
and total life changes in each case. Pearson
product-moment correlations were computed
to determine the relationships between scores
obtained at the two testings. Test-retest correlations
for the positive change score were
.19 and .53 (p < .001). The reliability coefficients
for the negative change score were .56
(p < .001) and .88 (p < .001). The coefficients
for the total change score were .63
(p < .001) and .64 (p < .001).

2 Data concerning the mean ratings of these events;
the frequency of endorsement of various events; and
percentile ranks for positive, negative, and total
change scores can be obtained without charge from
the authors.

Although the findings of the two studies 
reported here vary to some extent, perhaps 
due to the relatively small sample sizes, they 
suggest that the LES is a moderately reliable instrument,
especially when the negative and
total change scores are considered.3 It should
be noted that test-retest reliability coefficients 
found with instruments of this type are likely 
to underestimate the reliability of the mea- 
sure. That is, with a time interval of 5-6 
weeks, subjects may actually experience a 
variety of events, both positive and negative, 
that may be reflected in responses given at 
the time of retesting. As these changes re- 
flect the actual occurrence of life changes, 
rather than simply inconsistencies in report- 
ing, it would be inappropriate to consider the 
total variability in responding as error. As 
subjects generally seem to report somewhat 
higher levels of positive than negative change 
on the LES, it seems possible that the lower 
reliability estimates found with the positive 
change measure may be due, in part, to the 
greater likelihood of positive changes occur- 
ring within the time interval between test and 
retest. 


Correlates of the LES 


To the extent that the LES measures life 
stress, its scores should correlate with relevant 
personality indices. Further, an analysis of 
the correlational patterns should provide in- 
formation concerning whether life stress is 
more usefully conceptualized in terms of nega- 
tive life change or total life change. 


Anxiety, Academic Achievement, Social
Desirability, and the LES 


A group of 100 male and female students
drawn from introduction to personality
courses were administered the LES, the State-Trait
Anxiety Inventory (Spielberger, Gorsuch,
& Lushene, 1970), and a short form of
the Marlowe-Crowne Social Desirability Scale
(Strahan & Gerbasi, 1972). Academic transcripts
were available for 75 of these students.
The correlations among life change scores,
anxiety, and grade point average (GPA) are
presented in Table 2.

Inspection of these correlations shows that
the total and negative change scores correlate
significantly and in a positive direction with
state and trait anxiety, whereas the positive
change score is not significantly related to
either measure. Tests for significance of the
difference between correlations suggested that
positive and negative change scores differ significantly
in their correlations with state anxiety
(p < .01). Although negative change
scores were significantly correlated with trait
anxiety and positive change scores were not,
the difference between these correlations was
not significant. Significant correlations between
negative change and anxiety have also
been found in data collected as part of two
other investigations. For a sample of naval
personnel (N = 76), correlations of .46 (p <
.001) and .40 (p < .001) were found with
state and trait anxiety, respectively. With
college students (N = 82), a correlation of
.24 (p < .05) has also been found between
negative change and anxiety as measured by
the Multiple Affect Adjective Checklist
(Zuckerman & Lubin, 1965).

With regard to GPA, positive, negative, and
total change scores were all found to be negatively
correlated with GPA. Even though the
correlation between positive change and GPA
was smaller than the correlations between
negative and total change scores and this
measure, the differences between these correlations
were not significant. These results are
consistent with other studies that have found
significant relationships between life stress
(assessed by other measures) and measures
of anxiety (Constantini et al., 1973) and academic
achievement (Carranza, 1973).

Table 2
Correlations Between Life Change Scores, Anxiety, and Academic Achievement

                                            Anxiety
                                                              Grade point
LES life change scores                  Traitᵃ     Stateᵃ     average

Positive                                 .04        .03       -.21
Negative                                 .29        4g        -.38r
Total                                    AF. S Tira           -.40**
Balance (negative - positive events)    -.21*      -.36**      .18

Note. LES = Life Experiences Survey.
ᵃ n = 97.

3 It should be noted that in addition to the two
reliability studies reported here, reliability data h [...]
.61 (p < .05), [...] .82 (p < .001) were obtained for [...]
and total change scores, respectively.

As it seemed reasonable that the effects of
positive change might, in part, ameliorate the
stress produced by negative experiences, a
balance or subtractive score (negative - positive)
was also computed for each subject and
was correlated with the dependent measures.
As can be seen in Table 2, in no case was
the balance score more predictive than the
negative change score alone, although differences
between correlations were not significant.
These results are similar to those reported
by Mueller et al. (1977) and Vinokur
and Selzer (1975), who have found such a
balance score to be less predictive of stress-related
variables than measures of negative
life change.

The relationships between life change scores
and the social desirability measure were nonsignificant.
Correlations between positive,
negative, and total change scores and social
desirability were -.05, .05, and .01, respectively.
This suggests that responses to the
LES are relatively free from the influence of
social desirability response bias.


Personal Maladjustment and the LES


To determine the relationship between life
change and personal maladjustment, the LES
and the Psychological Screening Inventory
(PSI) were administered to 75 male and female
volunteers drawn from introduction to
personality courses at the University of Washington.
The PSI (Lanyon, 1970, 1973) is a 130- 
item true—false inventory that yields scores on 
five subscales: Alienation (Al), Social Noncon- 
formity (Sn), Discomfort (Di), Expression 
(Ex), and Defensiveness (De). The Al scale 
was designed for “assessing similarity to psychiatric
patients,” and the Sn scale, for “assessing
similarity to incarcerated prisoners.”
The Di scale appears to be a measure of neu- 
roticism, the Ex scale is a measure of the 
introversion-extraversion dimension, and the 
De scale is a measure of test-taking attitude. 
Correlations between positive, negative, and 
total life change scores and the five PSI scales 
are presented in Table 3. The table shows 
that negative life change is significantly re- 
lated to scores on the Sn and Di scales. These 
findings suggest a relationship between nega- 
tive change, as assessed by the LES, and cer- 
tain types of personal maladjustment. Al- 
though two PSI scales were correlated with 
negative change only, the PSI Ex scale was 
found to correlate significantly with the posi- 
tive change score. Thus, it would appear that 
extraverted individuals experience greater de- 
grees of positive change than do introverted 
persons. The results obtained here are similar 
to those obtained by Constantini et al. (1973) 
in their investigation correlating life stress 
scores, derived from the Holmes and Rahe 
(1967) scale, with PSI scores. The fact that
in the present study PSI measures of personal
maladjustment as well as certain of the measures
considered earlier (e.g., anxiety) correlate
with negative but not with positive change
provides further support for the notion that
life stress may best be conceptualized in terms
of negative change.

Table 3
Correlations Between Life Change and PSI
Scores

                              PSI
Life
change        Al       Sn       Di       Ex       De

Positive      .14      .03     -.07      .28**    .06
Negative      107/4    .20*     .23*    -.02     -.16
Total         .03      .15     -.10      .18     -.06

Note. PSI = Psychological Screening Inventory;
Al = Alienation; Sn = Social Nonconformity; Di
= Discomfort; Ex = Expression; De = Defensiveness.
* p < .05.
** p < .02.


Depression, Locus of Control, and the LES 


Scores on the LES, the Beck Depression In- 
ventory (Beck, 1967), and the Internal-Ex- 
ternal (I-E) Locus of Control Scale (Rotter, 
1966) were obtained for a sample of 64 (34
males, 30 females) college students drawn
from undergraduate psychology courses. Correlations
between life change scores and these
two measures are presented in Table 4. The
table reveals a significant relationship between 
negative change and scores 
pression Inventory, 
evidence presented 
(1975), who found 


lated to self-ratings of depression. An addi- 
tional finding of interest 


who report having experienced high levels of 


A Study of Counseling Center Clients 


In addition to the findings presented above,
life change scores have also been obtained
from a group of students receiving treatment
at a university counseling center for psychological
problems. Based on earlier findings of
a relationship between negative life change
and measures of personal maladjustment, it
was predicted that this group would differ
from a randomly selected group of college students
in their negative change scores but not
in terms of positive change. The counseling
center sample consisted of 18 students (16
females and 2 males). For purposes of comparison,
LES records of 18 (16 females and


change 
groups are presented in 
Table 5. 


No significant differences were obtained for
the positive and total change scores. The
counseling center clients did, however, display
significantly higher negative change scores
than did the comparison group, t(34) = 2.21,
p < .05. In order to rule out the possibility
to the random


between group means was found for negative,
t(34) = 2.89, p < .01, but not for positive
or total life change. These findings provide
additional support for a relationship between
negative life change as assessed by the LES
and problems of a psychological nature.


Table 4
Correlations Between Life Change, Depression,
and Locus of Control

Life change       Beck           Locus of
score             depression     control

Positive          -.15           -.05
Negative           .24*           .32**
Total              .06            .17

* p < .05.
** p < .02.


Table 5
Life Change Scores for Normals and Counseling Center Clients

                                       Change
                       Positive          Negative          Total
Group                  M       SD        M       SD        M       SD

Normals                10.55   8.26       9.61   9.59      20.16   11.48
Counseling center       8.33   5.83      16.61   9.37      24.94   10.91

Note. n = 18 for both groups.
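
As a check on the reported group comparison, the t statistic for negative change can be approximated from the means and standard deviations in Table 5 alone. The sketch below assumes a pooled-variance (Student's) t test, which is consistent with the reported 34 degrees of freedom (18 + 18 - 2). The article does not name the test explicitly, and the function name `pooled_t` is hypothetical.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Pooled-variance t statistic and degrees of freedom from summary data."""
    df = n1 + n2 - 2
    # Weighted average of the two sample variances.
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    # Standard error of the difference between the two means.
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se, df


# Negative change scores from Table 5: counseling center clients vs. normals.
t_negative, df_negative = pooled_t(16.61, 9.37, 18, 9.61, 9.59, 18)
```

Under this assumption the statistic comes out at roughly 2.2 on 34 degrees of freedom, in line with the value reported in the text. Small discrepancies are to be expected because the tabled means and standard deviations are rounded.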


A Comparison of the LES and SRE 
Approaches as Measures of 
Life Change 


If the LES represents an improvement over 
the SRE, it should be possible to demonstrate 
that measures derived from the LES are more 
highly related to relevant dependent measures 
than are SRE scores. Further analyses of some 
of the data already reported, along with anal- 
yses of additional data, were undertaken to 
provide some basis for comparing these two 
indices of life stress. The comparisons were
accomplished by scoring only the 34 items 
of the LES that are common to the Holmes 
and Rahe (1967) measure. These items were 
scored to yield four measures. Three of these 
measures were LES positive, negative, and 
total life change scores derived in the man- 
ner described earlier. A fourth measure was 
derived by applying the life change units used 
with the SRE to each of the 34 items. It was 
thus possible to derive a measure comparable 
to the SRE based on responses to these events. 
Although these measures were based on 34 
rather than the entire 43 items of the Holmes 
and Rahe scale, it was felt that they would 
provide an adequate basis for comparing the 
LES and SRE scoring procedures. Based on 
the findings reported earlier, it was predicted 
that the LES negative change score would be 
more predictive of dependent measures than 
would the Holmes and Rahe measure. No pre- 
dictions were made regarding the LES posi- 
tive and total change scores. 

In one comparative study, 69 female subjects
from undergraduate human sexuality
courses were given the LES, the Beck Depression
Inventory, and the State-Trait Anxiety
Inventory. The four life change measures
were derived as outlined above. One some- 
what surprising finding was that no signifi- 
cant correlations were found between any of 
the four life change measures and anxiety. 
Given the rather consistent finding of a re- 
lationship between negative change and anx- 
iety reported earlier, these results might best 
be attributed to the rather select nature of 
the sample studied. Significant findings were, 
however, obtained for correlations with the 
Beck Depression Inventory. Correlations be- 
tween positive, negative, and total LES scores 
and depression were .02, .37 (p < .01), and
.24 (p < .05), respectively. The correlation
between the life change unit score, similar to 
that used with the SRE, and depression was 
.17 (ns). The difference between the corre- 
lations obtained with the LES negative change 
score and the Holmes and Rahe score was 
significant, t(66) = 2.31, p < .05.
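
A comparison of two correlations that share a variable and come from the same sample can be sketched with Hotelling's t test for dependent correlations, whose n - 3 degrees of freedom match the 66 reported here for 69 subjects. Whether this is the exact test the authors used is an assumption. The test also requires the correlation between the two predictors (here, the LES negative change score and the life change unit score), which the article does not report, so the value 0.80 below is purely hypothetical and the resulting statistic is not expected to reproduce the published one.

```python
import math


def hotelling_t(r_xy, r_zy, r_xz, n):
    """Hotelling's t for the difference between dependent correlations.

    r_xy and r_zy are the correlations of two predictors x and z with the
    same criterion y; r_xz is the correlation between the two predictors.
    Returns the t statistic and its degrees of freedom (n - 3).
    """
    # Determinant of the 3 x 3 correlation matrix of x, y, and z.
    det = 1 - r_xy ** 2 - r_zy ** 2 - r_xz ** 2 + 2 * r_xy * r_zy * r_xz
    df = n - 3
    t = (r_xy - r_zy) * math.sqrt(df * (1 + r_xz) / (2 * det))
    return t, df


# .37 and .17 are the reported correlations with depression (n = 69).
# 0.80 is a hypothetical predictor intercorrelation, not a reported value.
t_diff, df_diff = hotelling_t(0.37, 0.17, 0.80, 69)
```

The statistic grows as the two predictors become more highly intercorrelated, which is why the unreported intercorrelation matters for reproducing the published value.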

A second comparative study of the LES 
and SRE measures concerned the relationship 
between these measures and the scores on the 
PSI. As in the original analysis (which in- 
cluded the entire LES), two PSI adjustment 
measures were found to be significantly cor- 
related with life change when only 34 items 
were scored, Sn and Di (neuroticism). Corre- 
lations between change scores and these mea- 
sures are presented in Table 6. As can be 
seen, the LES negative change scores corre- 
lated significantly with both measures of ad- 
justment (Sn and Di), whereas no significant 
relationships were found between these two 
measures and the life change unit score. Al- 
though the differences between these corre- 
lations did not reach statistical significance, 
the pattern of results does seem to support the
superiority of the LES measure of negative
change.

Table 6
Correlations Between LES Change Scores,
Life Change Unit Scores (34 items), and
Psychological Screening Inventory (PSI)
Scale Scores

                         PSI
LES life
change score        Sn        Di

Positive            .02      -.04
Negative            .26*      .20*
Total               .18       .12
Unit                .14       AS

Note. LES = Life Experiences Survey; Sn = Social
Nonconformity; Di = Discomfort.
* p < .05.


Discussion 


The results of the studies reported here suggest
that the LES may be a useful research
and, perhaps also, clinical tool. They indicate
that negative and total change scores, derived
from this scale, are reasonably reliable over
a 5- to 6-week time interval, although the
positive change score appears to be less stable.
Support for the usefulness of the scale is provided
by the findings showing that the negative
life change score is significantly related to
a number of stress-related dependent measures.
In addition, scale responses appear to
be relatively free from social desirability
biases, and the measure is capable of differentiating
college students who have sought
help for adjustment problems from those who
have not.


sirable change made by the LES. The results
show that positive and negative life change
same dependent measures. This suggests that
the separate assessment of positive and nega- 
tive change by the LES represents a step 
forward in assessing relationships between life 
changes and diverse dependent measures. It 
seems possible that life stress is most accu- 
rately conceptualized in terms of negative life 
changes rather than in terms of positive or 
total change. Our findings and those reported 
by others suggest that it is the negative change 
measure that should be used if one’s purpose 
is to determine degree of “life stress.” 

Although the results reported here empha- 
size the role of negative change, it should be 
pointed out that the failure to find signifi- 
cant correlations between positive change and 
the dependent measures may be related to the 
lower reliability of the positive change score 
rather than to the unstressful nature of posi- 
tive life change. The findings of Mueller et 
al. (1977) and Vinokur and Selzer (1975), 
which are consistent with the present results, 
however, support conclusions emphasizing the 
importance of negative life changes. 

A major consideration in the assessment of
life stress concerns the nature of the relationships
obtained between life change scores and
stress-related dependent variables. One might
question, for example, whether relationships
such as those reported in this article and
found elsewhere in the literature reflect the
effects of life stress on individuals or simply
reflect the effects of specific variables on the
reporting of life change. Regarding life stress
research in general, one might also question
whether persons experiencing high levels of
life stress are actually more susceptible to the
development of physical and/or psychological
problems or whether persons who already
manifest such difficulties are more prone to
experience life change. Thus, the directionality
of the relationships obtained in life stress
studies is often unclear. This makes it difficult
to draw firm cause-effect conclusions.
Although authors such as Brown (1972) have
made a strong case for the causal role of life
stress, and even though most research in the
area seems to be based on the assumption
that change plays a causal role, definitive
answers regarding cause-effect relationships
must ultimately come from longitudinal studies
that are more complex than those typically
found in the life stress literature.

Although it is not possible, based on available
research findings, to resolve this directionality
issue, some data are
available regarding the degree to which life
stress scores may themselves be influenced by
the psychological state of the respondent at
the time of testing. In a recent study by Siegel,
Johnson, and Sarason (Note 1), the effects of
an experimentally induced depressive state on
responding to the LES were investigated. The
subjects, who had previously completed the
LES, were randomly assigned to one of three
experimental conditions: neutral, elation, and
depression. By using an affect induction procedure
developed by Velten (1968), it was
possible to induce transient states of elation
and depression in these subjects. Subjects
were then given the LES a second time. Although
a manipulation check indicated that
the affect induction procedure did result in
elation and depression in the two experimental
groups, these mood states had no effect on the
number of life changes reported or on any of
the LES scores. These results suggest that
the significant correlations between the LES
and depression do not result from the effects
of the depressive mood state on responding
to the LES. These results might be interpreted
as being consistent with the notion that
a causal relationship exists between negative
events and depression. However, additional
data are needed to draw firm conclusions.
(Although mood state does not influence responding
per se, depressed individuals as a
result of their condition may actually experience
more negative changes, thus resulting in
a correlation between change and depression.)
The results do suggest, however, that responses
to the LES are not unduly influenced
by the mood state of the respondent.

Finally, in considering the assessment of 
life change and its effect on individuals, it 
would seem necessary to take into account 
the role of variables in addition to life stress. 
For example, it may be noted that even 
though significant relationships between 
change scores and dependent measures were 
found in this research, the magnitude of the 
correlations was in most instances low, suggesting
that life stress accounts for a relatively
small proportion of the variance reflected
in the measures. This finding of significant
but low correlations is consistent with
the results of other life stress studies. It thus 
seems appropriate to question whether these 
findings reflect the inadequacy of present life 
stress measures or if it is, in fact, reasonable 
to expect such measures to correlate highly 
with stress-related variables. Dohrenwend and 
Dohrenwend (1974a) have pointed out that 
it is likely that the effects of life stress differ 
from person to person depending on their in- 
dividual characteristics. Some persons may be 
greatly affected by even moderate levels of 
life change, whereas others may be affected 
very little by relatively high levels. If this 
is the case, it may not be unreasonable to ex- 
pect correlations of the low magnitude that 
have typically been obtained. Perhaps we can 
expect to find stronger relationships only as 
variables determining the effects of life change 
are taken into account. 

Unfortunately, relatively little research has 
been directed toward investigating the role 
of moderator variables, although the research 
that has been conducted is provocative. Nuck- 
olls, Cassel, and Kaplan (1972) investigated 
the relationships between life stress and preg- 
nancy and birth complications. No significant 
relationships were found among these vari- 
ables when all subjects were considered. How- 
ever, when mothers were divided into those 
who displayed high and low levels of “psy- 
chosocial assets,” significant results were ob- 
tained. Subjects showing high levels of both 
life change and psychosocial assets (support 
systems in their environment) did not show 
evidence of increased complications. Those 
who displayed high levels of life change and 
low levels of psychosocial assets did have an 
increased frequency of such complications. 

The importance of moderator variables has 
also been suggested by the results of a study 
conducted by Johnson and Sarason (in press) 
in which the relationships among life change 
and measures of anxiety and depression were 
examined as a function of locus of control 
orientation (Rotter, 1966). It was predicted 
that a relationship between negative change 
and depression and anxiety would be found
for externally oriented subjects (who presumably
see themselves as having little control
over environmental events) but not for
internally oriented subjects (who tend to per- 
ceive themselves as capable of exerting con- 
trol over environmental events). The results 
were in line with this prediction, thus suggest- 
ing that life stress may affect individuals dif- 
ferently depending on the degree of their per- 
ceived control over events. In one other study, 
Smith, Johnson, and Sarason (1978) found 
the relationship between life stress and a mea- 
sure of psychological adjustment to vary as 
a function of subjects’ scores on a measure 
of sensation seeking (Zuckerman, Kolin, 
Price, & Zoob, 1964). Thus, the effects of 
life stress may also be mediated by self-re- 
ported “optimal level of stimulation.” 

It would appear, then, that one’s perception 
of control over environmental events, sensa- 
tion-seeking status, and degree of psycho- 
social assets may all mediate the effects of life 
stress. It seems likely that there are also other
individual difference variables that moderate 
the effects of life changes, and research de- 
signed to identify them is needed. The LES, 
which possesses sufficient reliability and cor- 
relates with a variety of relevant dependent 
measures, could be used in studies aimed at 
identifying moderator variables and their ef- 
fects. The format of the LES allows for the 
individualized rating of the impact of events 
plus the availability of separate measures of 
positive and negative change. This makes it 
especially appropriate for use in future re- 
search concerning how people deal with the 
stresses and strains of modern life. 
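
A minimal scoring sketch may make the separate positive and negative change measures concrete. It assumes, as a working rule rather than something spelled out in this section, that the positive change score is the sum of the positive impact ratings, the negative change score is the sum of the absolute values of the negative ratings, and the total change score is the sum of the two. The function name `score_les` and the example responses are hypothetical.

```python
def score_les(ratings):
    """Score one respondent's checked events.

    ratings maps an event label to an integer impact rating from -3 to +3.
    Assumed rule: positive and negative ratings are summed separately,
    and the total is the sum of the two magnitudes.
    """
    for event, rating in ratings.items():
        if not isinstance(rating, int) or not -3 <= rating <= 3:
            raise ValueError(f"rating for {event!r} must be an integer from -3 to +3")
    positive = sum(r for r in ratings.values() if r > 0)
    negative = sum(-r for r in ratings.values() if r < 0)
    return {
        "positive_change": positive,
        "negative_change": negative,
        "total_change": positive + negative,
    }


# Hypothetical respondent who checked four events.
example = score_les({
    "Marriage": 3,
    "New job": 2,
    "Death of close friend": -3,
    "Change of residence": 0,
})
```

Events rated 0 count as experienced but contribute nothing to any of the three scores under this rule.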
Appendix
The Life Experiences Survey 


Listed below are a number of events which sometimes bring about change in the
lives of those who experience them and which necessitate social readjustment.
Please check those events which you have experienced in the recent past and
indicate the time period during which you have experienced each event. Be sure
that all check marks are directly across from the items they correspond to.

Also, for each item checked below, please indicate the extent to which you
viewed the event as having either a positive or negative impact on your life at
the time the event occurred. That is, indicate the type and extent of impact
that the event had. A rating of −3 would indicate an extremely negative impact.
A rating of 0 suggests no impact either positive or negative. A rating of +3
would indicate an extremely positive impact.


Section 1

Response columns for each item: time period (0 to 6 mo; 7 mo to 1 yr) and impact rating (−3, −2, −1, 0, +1, +2, +3).

1. Marriage
2. Detention in jail or comparable institution
3. Death of spouse
4. Major change in sleeping habits (much more or much less sleep)
5. Death of close family member:
   a. mother
   b. father
   c. brother
   d. sister
   e. grandmother
   f. grandfather
   g. other (specify)
6. Major change in eating habits (much more or much less food intake)
7. Foreclosure on mortgage or loan
8. Death of close friend
9. Outstanding personal achievement
10. Minor law violations (traffic tickets, disturbing the peace, etc.)
11. Male: Wife/girlfriend’s pregnancy
12. Female: Pregnancy
13. Changed work situation (different work responsibility, major change in working conditions, working hours, etc.)
14. New job
15. Serious illness or injury of close family member:
   a. father
   b. mother
   c. sister
   d. brother
   e. grandfather
   f. grandmother
   g. spouse
   h. other (specify)
16. Sexual difficulties
17. Trouble with employer (in danger of losing job, being suspended, demoted, etc.)
18. Trouble with in-laws
19. Major change in financial status (a lot better off or a lot worse off)
20. Major change in closeness of family members (increased or decreased closeness)
21. Gaining a new family member (through birth, adoption, family member moving in, etc.)
22. Change of residence
23. Marital separation from mate (due to conflict)
24. Major change in church activities (increased or decreased attendance)


25. Marital reconciliation with mate
26. Major change in number of arguments with spouse (a lot more or a lot less arguments)
27. Married male: Change in wife’s work outside the home (beginning work, ceasing work, changing to a new job, etc.)
28. Married female: Change in husband’s work (loss of job, beginning new job, retirement, etc.)
29. Major change in usual type and/or amount of recreation
30. Borrowing more than $10,000 (buying home, business, etc.)
31. Borrowing less than $10,000 (buying car, TV, getting school loan, etc.)
32. Being fired from job
33. Male: Wife/girlfriend having abortion
34. Female: Having abortion
35. Major personal illness or injury
36. Major change in social activities, e.g., parties, movies, visiting (increased or decreased participation)
37. Major change in living conditions of family (building new home, remodeling, deterioration of home, neighborhood, etc.)
38. Divorce
39. Serious injury or illness of close friend
40. Retirement from work
41. Son or daughter leaving home (due to marriage, college, etc.)
42. Ending of formal schooling
43. Separation from spouse (due to work, travel, etc.)
44. Engagement
45. Breaking up with boyfriend/girlfriend
46. Leaving home for the first time
47. Reconciliation with boyfriend/girlfriend

Other recent experiences which have had an impact on your life. List and rate.
48.
49.
50.
Section 2: Student Only

51. Beginning a new school experience at a higher academic level (college, graduate school, professional school, etc.)
52. Changing to a new school at same academic level (undergraduate, graduate, etc.)
53. Academic probation
54. Being dismissed from dormitory or other residence
55. Failing an important exam
56. Changing a major
57. Failing a course
58. Dropping a course
59. Joining a fraternity/sorority
60. Financial problems concerning school (in danger of not having sufficient money to continue)
Young Adult Schizophrenics:
Prediction of Outcome and Antecedent Childhood Factors 


James D. Roff 
Eastern Michigan University 


Raymond Knight 


Brandeis University 


A young adult sample of acute schizophrenics was followed through record 
sources into middle age. Antecedent childhood information was also obtained. 
Three aspects of schizophrenia (psychotic thinking, affectivity, and social com- 
petence) were assessed in terms of long-term stability and prediction of out- 
come criteria. Measures of psychotic thinking were found to lack both stability 
and predictive validity. In contrast, a combined measure of affectivity and 
social competence was stable over time and was significantly related to eventual 
outcome. Childhood factors were also related to adult outcome variables. Im- 
plications for research definitions of schizophrenia are discussed. 


Strauss, Carpenter, and Bartko (1974) 
have emphasized three underlying processes or 
dimensions in schizophrenia: positive symp- 
toms, negative symptoms, and disordered 
social relationships. Positive symptoms primar- 
ily involve psychotic thinking, more specifi- 
cally, the presence of delusions and hallucina- 
tions. Negative symptoms refer to the absence 
of an attribute. Flat or blunted affect is the
principal example of a negative symptom, that 
is, the absence of normal or appropriate affect. 
Disordered social relationships are related to 
social competence as measured by the Phillips 
scale (Phillips, 1953) and other comparable 
scales.

Strauss et al. (1974) have stated that posi- 
tive symptoms appear to be prognostically 
neutral within a schizophrenic sample. Also, 
they consider that negative symptoms may be 
confounded with chronicity and possess re- 
duced prognostic value in the absence of 
chronicity. Disordered social relationships 
have well-established prognostic validity for 
schizophrenics as reflected by studies that 
have used process-reactive or good—poor pre- 
morbid distinctions. 

The present study used a sample of young 
adult schizophrenics, who in terms of length 
of hospitalization and duration of psychosis
were acute patients. This allowed the assess- 
ment of variables selected to measure psy- 
chotic thinking, affectivity, and social com- 
petence as prognostic indicators prior to the 
effects of extended hospitalization. Preschizo- 
phrenic information was available from child 
guidance clinic records. Details of the child- 
hood data analysis have been reported by 
Roff, Knight, and Wertheim (1976). This 
prior study found four childhood factors that 
were then related to a global measure of 
long-term adult outcome. The present study, 
in contrast, has focused on factors found dur- 
ing the young adult period and investigated 
their prognostic significance in terms of sub- 
sequent clinical status during the middle adult 
period. In addition to a global measure of 
outcome, two factor scales provided more dif-
ferentiated outcome criteria. The childhood
factors, as well as the young adult factors, 
were assessed with a special interest in the 
possible differential prediction with regard to 
the two outcome factor scales. 

Previous prognostic studies have indicated 
a number of significant variables (Nameche, 
Waring, & Ricks, 1964; Robins & Guze, 1970; 
Stephens, Astrup, & Mangrum, 1967; Vail- 
lant, 1964). These studies suggest that mea- 
sures of social competence, both work and 
social adjustment, and measures of affectivity 
would be most appropriate given the data 
available. It was also considered desirable to
include measures of psychotic thinking, par- 
ticularly those that reflect severity. 

Related to the prognostic question is the 
problem of defining schizophrenia for research 
purposes. In this study, a hospital diagnosis 
of schizophrenia has been used. This is rec- 
ognized as an overly inclusive definition but 
has the advantage that few probable schizo- 
phrenics would be initially excluded. Outcome 
ratings were made that amounted to a rediag- 
nosis of each case based on follow-up informa- 
tion. In addition, the New Haven Schizo- 
phrenia Index (NHSI), which was developed 
as a checklist definition of schizophrenia for 
research use (Astrachan et al., 1972), was in- 
cluded. The NHSI is similar to other check- 
lists for schizophrenia that are heavily 
weighted with items that reflect aspects of 
psychotic thinking. It is desirable for defini- 
tions of schizophrenia to involve aspects of 
the syndrome that possess long-term stability. 
The concept of process schizophrenia carries 
with it the notion that a stable underlying 
condition exists. This study focuses on the 
related problems of prediction of outcome and 
the search for definitional aspects of schizo- 
phrenia that have long-term stability. 


Method 


Subjects were 45 males who received a hospital 
diagnosis of schizophrenia while in the military ser- 
vice. A few cases were diagnosed as schizoid per- 
sonality in service but were subsequently diagnosed 
as schizophrenic by the Veterans Administration 
(VA). Postservice follow-up was possible through
the use of centralized VA files. Preservice information 
was independently obtained from child guidance 
clinic records. Three separate records sources pro- 
vided data on the same individuals for childhood 
(M age = 10.9), young adult (M age = 21.7), and
middle adult (M age = 43.7) periods.

Postservice VA data were used to establish out- 
come groups to be used as predictive criteria for the 
service and childhood information. Each case was
assigned a position on a 6-point scale, which re- 
flected severity of impairment and certainty of pro- 
cess schizophrenia. Outcome ratings reflected the in- 
dependent judgments of the two principal investiga- 
tors. A consensus was reached for those cases with 
initial disagreement. Outcome 1 involved recovered
cases (n = 2); Outcome 2 cases had minimal signs
of schizophrenia at follow-up but had neurotic or
acting-out character deficits (n = 16); Outcome 3
cases were paranoid with emotionally unstable per-
sonalities (n = 6); Outcome 4 cases had severe schiz-
oid personalities but had minimal thought disorder
(n = 10); and Outcomes 5 and 6 were considered
process schizophrenics, with the former improved
for substantial periods (n = 9) and the latter unim-
proved or deteriorated (n = 4). Independent clinical
judgments that were used as the basis for the con- 
sensus outcome ratings had an interrater agreement 
of 76%. (For further information about the deriva- 
tion of outcome ratings, see Roff et al., 1976.) Out- 
come ratings were significantly correlated with other 
indices of outcome—length of hospitalization, em- 
ployment history, marital status, and level of VA 
compensation for psychiatric disability. 

Service information was abstracted from military 
service records, with raters blind to both childhood 
and postservice information. Six variables were se- 
lected as the best available composite measures or 
global judgments of thought disorder, affectivity, and 
social competence. These variables included concep- 
tual disorganization (Hautaluoma, 1971), a factor 
scale with item composition similar to a factor of 
the same name reported by Lorr, Klett, and Mc- 
Nair (1963); the NHSI (Astrachan et al., 1972); 
global clinical judgments of severity of thought dis- 
order and severity of affect deficit; and ratings of 
service social adjustment and service work perform- 
ance. The global clinical judgments were indepen- 
dently made by the two principal investigators on 
5-point scales. The average of the two scores was 
used. Two research assistants independently used the 
factor scale and NHSI. The assistants also rated 
social adjustment and work performance on 3-point 
scales (good, fair, poor). Again, the average of two 
scores was used. Acceptable levels of interrater agree-
ment were found for all of the measures, with
reliability coefficients ranging from .70 to .97 and
a median reliability of .82.

These six service variables were intercorrelated 
and then factor analyzed with significant factors 
rotated to an orthogonal, varimax solution. Scales 
were constructed by a summation of standard scores
for relevant variables significantly associated with the 
obtained factors. 
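
Scale construction by summed standard scores can be sketched as follows. This is an illustration only: the ratings are hypothetical, the function names are invented for the example, and the use of the population standard deviation is an assumption, since the text does not say which form was used.

```python
from statistics import mean, pstdev


def standardize(values):
    """Convert raw scores on one variable to z scores (population SD assumed)."""
    m = mean(values)
    s = pstdev(values)
    if s == 0:
        raise ValueError("variable has no variance and cannot be standardized")
    return [(v - m) / s for v in values]


def summed_z_scale(variables):
    """Sum z scores across variables, returning one scale score per case.

    variables is a list of equal-length lists, one list per variable,
    with cases in the same order in every list.
    """
    z_columns = [standardize(column) for column in variables]
    return [sum(case) for case in zip(*z_columns)]


# Hypothetical ratings for three cases on two variables loading on one factor.
affect_deficit = [1, 2, 3]
social_adjustment = [2, 4, 6]
scale_scores = summed_z_scale([affect_deficit, social_adjustment])
```

Standardizing before summing gives each variable equal weight in the scale regardless of the number of points on its original rating scale.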

The same variables, but from postservice data, 
were used to construct additional parallel scales. In
addition, the outcome rating, derived from post- 
service information, served as predictive criteria. 

The childhood data had been previously analyzed
with four factors reported (Roff et al., 1976). Fac-
tor scores were computed for factors of unsocialized
aggressiveness, low IQ-poor school achievement,
neuroticism, and schizoid syndrome.

The resulting set of variables was then analyzed
in terms of longitudinal prediction. The set con-
tained four childhood factor scores, two service
scales, two postservice scales, and the outcome rat- 
ing. Additionally, the stability of the two service 
scales was evaluated by means of their correlation 
with the two postservice scales. 




Results 


Service data were factor analyzed to deter- 
mine a limited number of dimensions that 
would provide factor scales for use as vari- 
ables in the prediction of postservice outcome 
measures. Table 1 shows the result of the 
factor analysis for the service data. The first 
factor was labeled Psychotic Thinking, with 
significant loadings for the variables of con- 
ceptual disorganization, NHSI, and global 
clinical judgments of severity of thought dis-
order. The second factor, labeled Affect/Social
Competence, had significant loadings for glo- 
bal clinical judgments of affect deficit, service 
social adjustment, and service work perform- 
ance. The scale scores, derived from the three 
variables loading significantly on each factor, 
retained the independence of the factor solu- 
tion (r = .17). The first two factors accounted 
for 72% of the variance. The loading of the 
affect variable and the social competence vari- 
ables on the same factor was particularly note- 
worthy along with the relative independence 
of the two factors when an oblique rotation 
was selected. 

The factor structure for the postservice 
data was less differentiated. This was reflected 
by the fact that the two postservice scales, 
using similar variables, were significantly cor- 
related (r = .52). The postservice scales were 
highly related to the outcome ratings (psy- 
chotic thinking and outcome = .77, affect/ 
social competence and outcome = .73). At 
follow-up, thought disorder, affectivity, and 
social competence measures were all signifi-
cantly related to each other. In contrast, dur-
ing the young adult period, the thought dis-
order measures were relatively independent of
affectivity and social competence measures,
whereas the latter two measures were signifi-
cantly correlated.


Table 1
Orthogonal Factors for Service Variables

                                     Factor
Variable                             1        2
Work performance                    −.05     .62*
Social adjustment                    5       .43*
Conceptual disorganization          .90*     .08
New Haven Schizophrenia Index       .77*    −.29
Clinical judgment
  Thought disorder                  .65*     .36
  Affect                            .32      .93

*p < .01.


Table 2
Relationship Between Service and Postservice Scales

                                    Postservice
Service                             1        2        Outcome
1. Psychotic thinking              .24      .11      .12
2. Affect/social competence        .44*     .51*     .65*

*p < .01.

Table 2 shows the relationship of two ser- 
vice scales to the two postservice scales and 
to the outcome ratings. Clearly, it was the
service scale measuring affect deficit and social
competence that possessed both long-term
stability (r = .51) and predictive validity
(r = .65). In contrast, the Psychotic Thinking 
scale was neither stable nor predictive. Sur- 
prisingly, the Affect/Social Competence scale 
using service information was a better pre- 
dictor of severity of psychotic thinking than 
the service Psychotic Thinking scale, which 
contained the same variables. Table 2 clearly 
demonstrates the unfavorable prognostic sig- 
nificance of the Affect/Social Competence fac- 
tor during the service period, and the Psy- 
chotic Thinking factor failed to significantly 
predict any of the postservice measures. 
Childhood factor scores were related to the 
outcome measures, with special interest di- 
rected to differential relationships with the 
two postservice factor scales. Correlations be- 
tween the childhood factor scores and the 
postservice variables are presented in Table 
3. Schizoid syndrome was significantly related 
to high scores on the Psychotic Thinking and 
Affect/Social Competence scales and to poor 
outcome. Unsocialized aggressiveness was re- 
lated to more favorable outcome and more 
favorable scores on the Affect/Social Com- 
petence scale but was not significantly related 
to level of psychotic thinking. Low IQ — poor 
school achievement was related to unfavora- 
ble scores on the Affect/Social Competence 
scale. The pattern of correlations for the low
IQ-poor school achievement factor was all
the more remarkable given the significant cor-
relation between the two postservice factor
scales. Level of neurotic symptoms did not
predict any of the postservice variables. In
general, the childhood variables predicted
postservice better than service variables.


Table 3
Relationship Between Childhood Factor Scores and Postservice Scales

                                                Postservice
                                      Psychotic     Affect/social
Childhood                             thinking      competence      Outcome
Unsocialized aggressiveness           −.12          Sone            - a
Low IQ-poor school achievement         .06          .38             .27
Neurotic symptoms                     −.03          TU              .04
Schizoid syndrome                      Re           .33             Seite

*p < .05.

Classifying cases according to their com- 
bined score on the childhood schizoid syn- 
drome and the service Affect/Social Compe- 
tence scale produced the data in Table 4. 
Comparing the number of cases above and 
below the mean for each outcome group, Out- 
comes 1 and 2 combined were significantly 
different from Outcomes 5 and 6 combined. 
Outcomes 3 and 4 were more evenly distrib- 
uted on their scores. Table 4 suggests that the 
poor outcome cases not only had high scores 
on the affect/social competence factor in ser- 
vice, but they also tended to have a poor pre- 
service history as measured by the childhood 
schizoid syndrome factor. This indicates a 
long-standing deficit that existed prior to ex- 
tended hospitalization. 

Although the number of cases was small, 
Outcome 3 cases were compared with Out- 
come 4 cases, using a point-biserial correlation 
with outcome dichotomized. Outcome 4 cases 
were strongly related to low IQ-poor school
achievement childhood factor scores (r_pbis =
.68). Outcome 3 cases were related to child-
hood unsocialized aggressiveness factor scores
(r_pbis = .37). These relationships were con-
sistent with the hostile-paranoid clinical pic-
ture for Outcome 3 and the severe schizoid
or inadequate personalities characteristic of 
Outcome 4. Outcome 3 cases were not sig-
nificantly different from Outcomes 1 and 2.


Outcomes 5 and 6 had higher scores on the 
childhood schizoid syndrome when compared 
with Outcome 4 cases but were not signifi- 
cantly different on the other three childhood 
variables. Outcomes 5 and 6 were consistently 
more disturbed than Outcome 4 cases both in 
service and postservice in terms of psychotic 
thinking and affect/social competence. 
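
For readers unfamiliar with the point-biserial correlation used above to compare Outcome 3 with Outcome 4 cases, a minimal sketch follows. The data are hypothetical, the function name is invented for the example, and the formula shown (difference in group means over the population standard deviation, times the square root of pq) is the textbook form, assumed here rather than confirmed as the authors' computation. It is algebraically the Pearson correlation between a 0/1 group code and the continuous score.

```python
from math import sqrt
from statistics import mean, pstdev


def point_biserial(group, scores):
    """Point-biserial correlation between a 0/1 group code and continuous scores."""
    ones = [s for g, s in zip(group, scores) if g == 1]
    zeros = [s for g, s in zip(group, scores) if g == 0]
    if not ones or not zeros:
        raise ValueError("both groups must contain at least one case")
    n = len(scores)
    p = len(ones) / n   # proportion of cases coded 1
    q = len(zeros) / n  # proportion of cases coded 0
    sd = pstdev(scores)  # population SD of all scores
    return (mean(ones) - mean(zeros)) / sd * sqrt(p * q)


# Hypothetical data: 1 = one outcome group, 0 = the other; scores on a childhood factor.
outcome_code = [0, 0, 1, 1]
factor_scores = [1.0, 2.0, 3.0, 4.0]
r_pbis = point_biserial(outcome_code, factor_scores)
```

A positive value indicates that cases coded 1 score higher on the factor; which outcome group is coded 1 is arbitrary and only determines the sign.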

Table 5 summarizes the major findings that 
relate childhood and service factors to out- 
come. The neurotic and schizoid childhood 
factors and the Affect/Social Competence and 
Psychotic Thinking service factors were con- 
sistent in their relationships across all three 
outcome measures. In contrast, the unsocial- 
ized aggressiveness and low IQ — poor school 
achievement childhood factors were not sig- 
nificantly related to Psychotic Thinking scale 
scores but were significantly related, in oppo- 
site directions, to the Affect/Social Compe- 
tence scale; and the former was significantly 
related to more favorable global outcome rat- 
ings while the latter approached significance 
in the unfavorable direction. 


Table 4
Outcome by Schizoid Syndrome Plus Affect/Social Competence Scale

                  Scale score
Outcome       Above M     Below M
1, 2             2           14
3                2            4
4                4            6
5, 6            12            1

Note. χ²(3) = 18.87, p < .001.
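
The test of independence reported in the table note can be sketched from the printed cell counts. This is an illustrative calculation, not the authors' procedure: it applies the standard Pearson chi-square formula without a continuity correction, and the function name is invented for the example. The result should land near the published value; a small difference could reflect rounding or the transcription of the counts.

```python
def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


# Cell counts as printed in Table 4: columns are above M and below M.
table4 = [
    [2, 14],   # Outcomes 1, 2
    [2, 4],    # Outcome 3
    [4, 6],    # Outcome 4
    [12, 1],   # Outcomes 5, 6
]
stat, df = chi_square_independence(table4)
```

Several expected frequencies in this table fall below 5, so the chi-square approximation is rough at this sample size.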
Table 5
Childhood and Young Adult Factors in Relation to Outcome

Prognostic
significance    Childhood factors                      Young adult factors
Unfavorable     Schizoid syndrome                      Affect/social competence
                Low IQ-poor school achievementᵃ
Neutral         Neurotic symptoms                      Psychotic thinking
Favorable       Unsocialized aggressivenessᵇ

ᵃ Unfavorably related to affect/social competence factor at outcome but neutral with respect to psychotic
thinking and of borderline significance in relation to global outcome ratings.
ᵇ Neutral with regard to psychotic thinking factor at follow-up.


Discussion 


As with most schizophrenic samples, it is 
important to consider sample limitations. 
These include, in addition to sample size, a 
male sample, individuals with childhood prob- 
lems (disturbed preschizophrenics), and the 
exclusion of cases rejected for military ser- 
vice. On the other hand, comparisons have 
been restricted to within-sample differences 
for a group that has had similar experiences 
in terms of child clinic contact, military ser- 
vice hospitalization, and the effects of being 
labeled schizophrenic. Results were consistent 
with the claim of Strauss et al. (1974) that 
positive symptoms are prognostically neutral 
within a schizophrenic sample. Affect deficit, 
primarily flat affect, measured the principal 
negative symptom. In this study, flat or 
blunted affect was not a product of extended 
hospitalization. Its appearance early in a sub- 
ject’s career was an unfavorable prognostic 
sign. Disordered social relationships were re- 
lated to the clinical judgments of affect deficit. 
It is plausible that affect deficit arises as a 
result of a history of impaired social relation- 
ships. Flat affect may be a function of chro- 
nicity measured not in duration from the onset 
of psychotic thinking or positive symptoms 
but from the onset of extreme social malad- 
justment. 

The childhood schizoid factor contained a 
measure of peer adjustment during childhood
and a schizoid scale that included symptoms 
of apathy, flat affect, and seclusiveness. In 
other words, the schizoid syndrome included 
the childhood variables most similar to those 
in the Affect/Social Competence scale for the 
adult periods. Unsocialized aggressiveness in 


childhood was most common in Outcomes 1, 
2, and 3. The primary difference between these 
groups was the significantly higher level of 
psychotic thinking at follow-up for the Out- 
come 3 cases. Otherwise, Outcome 3 cases had 
a developmental background very similar to 
the more favorable Outcomes (1 and 2). Out- 
come 4 cases presented a contrast to Outcome 
3 at follow-up with lower levels of psychotic 
thinking but had higher scores on the Affect/ 
Social Competence scale. During childhood, 
the Outcome 4 cases were less aggressive than 
Outcome 3 cases and more inadequate as re- 
flected by lower IQ and poorer school achieve- 
ment. These results suggest that Outcome 3 
cases might be considered poor outcome mem- 
bers of a larger group with generally favora- 
ble outcome, whereas Outcome 4 cases might 
have more favorable outcomes from a group 
with generally poor outcome. 

Attempts to define schizophrenia that are 
heavily weighted with psychotic thinking such 
as the NHSI appear less promising than al- 
ternative measures of affectivity and social 
competence. Social competence has been ade-
quately assessed by existing scales. A recent
scale developed by Chapman, Chapman, and 
Raulin (1976) to measure anhedonia may 
provide a more convenient measure of affec- 
tivity. This study suggests that the combina- 
tion of a social competence measure with one 
of affectivity may assess an important defini- 
tional aspect of schizophrenia with both long- 
term stability and prognostic relevance. It 
should be noted that measures of psychotic 
thinking during the postservice period were 
highly related to measures of affectivity, so- 
cial competence, and outcome. It was during 
the service period that psychotic thinking
was a poor substitute for the other measures. 
Of course, it is the early stage in which prob- 
lems of definition and prediction are most 
challenging. 

The results of this study do indicate the 
stability of some behaviors of schizophrenic 
subjects over extended periods of time. It ap-
pears that adequate assessment of these stable
aspects of schizophrenia should be incorpo-
rated into research definitions of schizophrenia.
Blood Alcohol Level Discrimination by Alcoholics:
The Role of Internal and External Cues 


David Lansky, Peter E. Nathan, and David M. Lawson 
Rutgers—The State University 


Two groups of four chronic alcoholic subjects lived in Rutgers’ Alcohol Be- 
havior Research Laboratory for separate 2-week periods. During that time, 
subjects were taught to attend either to internal or to external cues to blood 
alcohol level (BAL). During a single training session, subjects received feed- 
back on actual BAL following each of their BAL estimates. During pretraining 
and posttraining sessions, assessments of BAL estimation accuracy were ob- 
tained in the absence of feedback. Prior to training, both groups of alcoholics 
were equally inaccurate in estimating BAL. During training, when accurate 
BAL feedback was provided, estimation accuracy increased significantly for 
both groups. Once feedback of actual BAL was removed during the posttraining 
test session, however, only externally trained subjects maintained the ability to 
estimate BAL accurately. It was concluded that unlike the nonalcoholic sub- 
jects studied by Huber, Karlin, and Nathan, the alcoholic subjects of this re- 
search did not learn to discriminate BAL on the basis of internal feelings and 
sensations nearly as adequately as they did when they referred to external cues. 
These findings have important implications for the clinical application of BAL
discrimination training.


In recent years the widespread conviction 
that abstinence constitutes the only legiti- 
mate treatment goal for alcoholism has come 
under increasing scrutiny. Acceptance of ab- 
stinence-oriented treatment goals for all al- 
coholics has been challenged by findings that 
some alcoholics can acquire and maintain pat- 
terns of moderate social drinking without ac- 
companying “loss of control” over intake 
(Armor, Polich, & Stambul, 1976; Davies, 
1962; Pattison, 1968; Popham & Schmidt, 
1976) and by the apparent success of alco- 
holism treatment programs with controlled 
drinking as an explicit treatment goal (Lovi- 
bond & Caddy, 1970; Pomerleau, Pertschuk, 
& Stinnet, 1976; Sobell & Sobell, 1973, 1976). 


This study was supported by National Institute on 
Alcohol Abuse and Alcoholism Grant AA00259-07 
to Peter E. Nathan. We thank John Miller, Depart- 
ment of Statistics, Rutgers—The State University, 
and the staff of the Alcohol Behavior Research Lab-
oratory for their essential help in this project.

Requests for reprints should be sent to Peter E.
Nathan, Alcohol Behavior Research Laboratory, Rut-
gers—The State University, New Brunswick, New
Jersey 08903. 


Several studies on the utility of controlled 
drinking-oriented treatment for alcoholism in- 
corporate blood alcohol level (BAL) discrimi- 
nation training as a component. The most 
frequently used BAL discrimination training 
method involves the delivery of accurate feed- 
back on BAL under training conditions de- 
signed to sensitize the individual to the 
affective and physiological (“internal”) con- 
comitants of different BALs. Caddy and Lovi- 
bond (1976), Vogler, Compton, and Weiss- 
bach (1975), and Wilson and Rosen (1975) 
have concluded that this way to train BAL 
discrimination gives the alcoholic the ability 
to monitor level of intoxication and then to 
use this newfound skill to maintain more 
moderate BALs. 

A review of the basic research findings rele- 
vant to this treatment approach, however, 
failed to confirm that alcoholics can in fact 
acquire and maintain the ability to discrimi- 
nate BAL on the basis of internal cue training 
alone. For example, the positive results of 
studies by Lovibond and Caddy (1970), Vog- 
ler et al. (1975), and Paredes, Jones, and 
Gregory (1974) are all difficult to interpret 
unequivocally, since BAL discrimination accuracy
was not assessed in any of the three studies prior
to training or after it was terminated. As a result,
it is impossible to know whether the alcoholic
subjects of these investigations actually did
(a) improve their capacities to estimate BAL
accurately and (b) maintain this accuracy when
external cues to intoxication (e.g., drink strength,
veridical BAL feedback) were removed.

Table 1
Means for Demographic Characteristics and Drinking Histories of Subjects in the
Internal and External Training Groups

                                            Training
Variable                               Internal  External     t

Age (years)                              36.25     35.75     .141
No. years of education                   11.75     10.75     .880
No. alcohol-related hospitalizations
  or treatment programs                   1.50      1.75     .205
No. years of problem drinking            15.00      9.50    1.166

Note. n = 4 per group. All t tests are nondirectional tests for independent samples. df = 6.

An investigation designed to study the impact
of some of these factors on BAL estimation
was recently published by Silverstein,
Nathan, and Taylor (1974). Alcoholics were
asked to provide BAL estimates on the basis 
of internal feelings and sensations both when 
accurate feedback on BAL was provided (dur- 
ing internal cue training itself) and when it 
was not (during pretraining and posttraining 
sessions). Although discrimination accuracy 
improved from the initial baseline period to 
the subsequent period of training, removal of 
feedback during a final posttraining phase of 
the study resulted in a return to pretraining 
levels of estimation accuracy. In essence, then, 
this initial pre-post comparison of BAL dis- 
crimination by alcoholics did not support the 
view that alcoholics can learn to discriminate 
BAL on the basis of internal cues alone.

In contrast, the results of BAL discrimi- 
nation studies of nonalcoholic social drinkers 
(e.g., Bois & Vogel-Sprott, 1974; Huber, Kar- 
lin, & Nathan, 1976) suggest that this subject 
population can learn to discriminate BAL on 
the basis of subjective feelings and sensations.
In fact, Huber et al. (1976) found that social
drinkers can learn to discriminate BAL 


equally well when trained to attend either to 
internal cues (feelings and sensations) or to 
external ones (BAL—dose relationships). 

To date, however, no study of alcoholic 
subjects has rigorously explored the efficacy 
of BAL discrimination training focused on ex- 
ternal cues nor has external cue training yet 
been compared directly to internally focused 
training. Nonetheless, as noted above, clinical 
studies of BAL discrimination continue to re- 
port attempts to train alcoholics to discrimi- 
nate BAL via internal cues in the absence of 
empirical evidence that they can in fact do 
so. This critical research lacuna emphasizes 
the importance of a direct comparison of these 
two BAL discrimination training methods with 
alcoholics.

The present study was designed to effect 
just such a comparison. To this end, alcoholic 
subjects were selected for an essential replica- 
tion of the Huber et al. (1976) study, which 
compared the efficacy of external cue training
and internal cue training with nonalcoholic
social drinkers. It was hypothesized that un-
like the nonalcoholic subjects studied by 
Huber and his colleagues, the alcoholic sub- 
jects in this study would be less well able to 
acquire accurate BAL discrimination on the
basis of internal than external training procedures.


Method 


Subjects 


Subjects were recruited via advertisements placed
in regional newspapers. They were offered $60 per
week to live in the Alcohol Behavior Research Lab-
oratory (described in Nathan, Goldman, Lisman, & 
Taylor, 1972) and participate in the study. 

Four men were selected for each of two training 
groups. All subjects met the following criteria: (a) 
more than 2 years of heavy problem drinking; (b) 
evidence of physical dependence on and tolerance to 
ethanol; (c) good physical health, with no signs of 
liver or kidney damage; (d) no evidence of psycho- 
sis or chronic brain syndrome; (e) not dependent 
on prescription or street drugs at the time of the 
research. A complete physical examination, with
appropriate laboratory tests, was given to each
subject. Demographic characteristics and drinking
histories of the two subject samples are summarized
in Table 1. Nondirectional t tests for independent
samples revealed no
significant differences between the two groups on 
any of these variables. 


Apparatus 


BALs were measured from breath samples by the 
Gas Chromatograph Intoximeter, Mark IV (Intoxim- 
eters Inc., St. Louis, Missouri). 


Procedure 


Each subject participated in a total of three sessions.
From midnight of the day preceding each of
the three sessions, subjects were deprived of all food 
and beverage. The morning of the session, each sub- 
ject received one cup of decaffeinated coffee; lunch 
was served at the end of the session. 

Session 1 (pretraining). Session 1 was a baseline
session, designed to assess subjects’ pretraining BAL
estimation accuracy. All subjects consumed a total
of six drinks containing 80-proof vodka and tomato
juice at 35-minute intervals. Each drink contained
a mixture of either ½, 1, or 1½ ounces of vodka, plus
enough tomato juice to make a total of 6 ounces
of liquid. In this and the other two sessions, these
three dosage levels were randomly distributed, with
the ½-ounce dose given once, the 1-ounce dose given
twice, and the 1½-ounce dose given three times in
each series of drinks. Accordingly, total consumption
of liquid was 36 ounces in a 3-hour period, 7 ounces
of which were vodka. Following each drink, a subject
completed a postdrink questionnaire, designed
to assess his ability to discriminate the amount of
alcohol contained in the drink.
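
The dose schedule described for each session can be checked with a few lines of arithmetic. The Python sketch below is illustrative only: the helper name and the seeded shuffle are assumptions, not the randomization procedure actually used in the study.

```python
import random

# Vodka doses (ounces) in one six-drink series, as described in the text:
# one 1/2-ounce, two 1-ounce, and three 1 1/2-ounce drinks.
DOSES_OZ = [0.5, 1.0, 1.0, 1.5, 1.5, 1.5]
DRINK_VOLUME_OZ = 6.0   # vodka plus tomato juice, per drink
INTERVAL_MIN = 35       # minutes between successive drinks


def make_drink_series(seed=None):
    """Return the six doses in random order (hypothetical helper)."""
    rng = random.Random(seed)
    series = DOSES_OZ[:]
    rng.shuffle(series)
    return series


series = make_drink_series(seed=1)
total_vodka = sum(series)                              # ounces of vodka
total_liquid = DRINK_VOLUME_OZ * len(series)           # ounces of liquid
drinking_span_min = INTERVAL_MIN * (len(series) - 1)   # first to last drink
```

Whatever the order of the drinks, the totals come to 7 ounces of vodka in 36 ounces of liquid, with 175 minutes (roughly 3 hours) between the first and last drink.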

Four BAL estimates were made during the course
of this session. Neither BAL feedback nor training
in BAL estimation was provided during the session.
Subjects were required to make their estimates on a
scale of 0-150, with 0 representing “cold sober” and
150 representing “very high, about as high as you’ve
ever been.” Estimates were scheduled 25 minutes
after the second and fifth drinks and 1 hour and
1½ hours after the sixth drink.

Subjects in all sessions spent the time between
drinks and BAL estimates in individual bedrooms.
Session 2 (training). Sessions 1 and 2 were separated
by a day, during which alcohol was not available.

Drink doses were randomized as they had been 
during Session 1. Ingestion of drinks took place as 
it did during the first session, except that subjects 
were now required to gargle with an anesthetic 
mouthwash prior to each drink; the gargling served 
briefly to anesthetize the subject’s mouth and upper 
throat, thereby interfering with his ability to use 
taste cues to discriminate the amount of alcohol in 
the drink. As in Session 1, a postdrink questionnaire 
was completed immediately after a subject had fin- 
ished each drink. 

Depending on the group to which he was assigned, 
a subject received training in BAL estimation that 
focused either on external or on internal cues.

External training. Each subject in this group re- 
ceived a programmed learning booklet explaining 
BAL-dose relationships. The booklet was prepared 
for each subject individually so that the approximate 
absorption rate of alcohol that he used was based 
on his weight and resultant BALs as observed in 
Session 1. All subjects were taught that they metabo- 
lize alcohol at the rate of about 10 points per hour 
on the 0-150 scale and that an ounce of 80-proof 
vodka produces a rise of 8, 10, or 12 points, depend-
ing on weight and metabolism rate. All subjects were 
allotted sufficient time (2 hours) to complete their 
booklets. The results of written tests, administered 
following completion of the booklet, indicated that 
all subjects had mastered the material. The training 
sequence for the remainder of this session was as fol- 
lows: (a) at bar, gargling and drinks, postdrink 
questionnaire; (b) in room, 20-minute wait; (c) at 
Intoximeter, told alcohol content of immediately pre- 
ceding drink, BAL estimate, BAL feedback; (d) 
drink sequence repeated. 

Feedback from the Intoximeter was converted to 
the subject’s scale of 0-150 (0= 0 mg%; 150 = 300 
mg%). 
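
The external training rules and the feedback conversion lend themselves to a short worked example. The Python sketch below is only an illustration under stated assumptions: the text does not give the booklet's exact computation, so the function names, the additive rise per ounce, and the constant decline starting at the first drink are all hypothetical simplifications.

```python
def mg_percent_to_scale(bal_mg_percent):
    """Convert an Intoximeter reading in mg% to the 0-150 scale (150 = 300 mg%)."""
    return bal_mg_percent / 2.0


def external_cue_estimate(drinks, now_min, rise_per_oz=10.0, decline_per_hour=10.0):
    """Estimate BAL on the 0-150 scale from dose and time information alone.

    drinks: list of (minutes_since_session_start, ounces_of_vodka) tuples.
    now_min: time of the estimate, in minutes since session start.
    rise_per_oz: 8, 10, or 12 points, depending on weight and metabolism.
    Assumption (not from the source): each drink adds its full rise at once,
    and BAL declines at a constant rate from the first drink, floored at zero.
    """
    consumed = [(t, oz) for t, oz in drinks if t <= now_min]
    if not consumed:
        return 0.0
    rise = rise_per_oz * sum(oz for _, oz in consumed)
    hours_elapsed = (now_min - consumed[0][0]) / 60.0
    return max(0.0, rise - decline_per_hour * hours_elapsed)


# Two drinks (1 ounce at 0 minutes, 1 1/2 ounces at 35 minutes), estimate at 60 minutes.
estimate = external_cue_estimate([(0, 1.0), (35, 1.5)], now_min=60)
```

Under these assumptions, a subject taught a 10-point rise per ounce would estimate 15 points one hour into the session: 25 points of rise minus 10 points of metabolism.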

Internal training. Internal training began with 
each subject listening to a standard relaxation tape 
designed to increase awareness of important muscle 
groups and of the sensations that arise from them 
during states of tension and relaxation. The tape was 
played once for 15 minutes. Subjects were asked to 
sit quietly, focus on, then “tune-in” to the bodily 
sensations that they were experiencing. To help sub- 
jects become further aware of internal states, they 
were also asked to complete modified versions of the 
Body Sensation Checklist (Bois & Vogel-Sprott, 
1974) and the Mood Adjective Check List (McNair 
& Lorr, 1964) at this time. Prior to each BAL esti- 
mate, the subjects in this group again went through 
all procedures described above except for listening 
to the relaxation tape. Accordingly, the experimental 
sequence for this group was as follows: (a) At bar, 
gargling and drinks, postdrink questionnaire; (b) in 
room, 5-minute wait, Mood Adjective Check List, 
tune-in instructions, Body Sensation Check List; (c) 
at Intoximeter, BAL estimate, BAL feedback, tune-in 
instructions; and (d) drink sequence repeated. As 
with the external training group, BAL feedback was
converted to the 0-150 scale. 

The drink-estimate sequence was repeated six 
times during this session, with two additional
estimates programmed 1 and 1½ hours following the
last drink. 

Session 3 (testing). This session followed Session 
2 by a day, during which alcohol was unavailable. 
Procedurally, it was identical to the first session with 
two exceptions. First, all subjects were required to 
continue gargling with the anesthetic mouthwash 
prior to each drink. Second, subjects in both groups 
were told the actual alcoholic content of their drinks 
after completing the postdrink questionnaire. This 
was done to standardize testing conditions and to en- 
sure that Session 3 provided a fair test of the efficacy 
of both training procedures. Thus, by making dosage 
information available, externally trained subjects were 
provided the minimal information necessary to appro- 
priately utilize their training, whereas internally 
trained subjects were provided more than sufficient 
information to fully utilize internal training meth- 
ods. As a result of this testing procedure, outcome 
was biased in favor of the internal training method. 
After each BAL estimate, each subject was required 
to fill out a postdecision questionnaire, indicating 
how he had arrived at his particular estimate of 
BAL. Finally, all subjects were told that they would 
receive a $3 bonus for each estimate given this ses- 
sion that was within 5 points of their actual BAL. 


Data Analysis 


BAL estimation accuracy. BAL estimation accuracy
was defined as the absolute difference between
actual and estimated BAL. In order to have com- 
parable data points across sessions, only those four 
BAL estimates in Session 2 (2, 5, 7, and 8) that cor- 
responded in time to those in Sessions 1 and 3 were 
included in the data analysis. Sessional changes in 
error scores were analyzed by means of a three-factor 
analysis of variance (Edwards, 1972), with sessions,
groups, and trials (i.e., estimates) as main effects.
Within-session data were analyzed by means of two- 
factor analyses of variance, with groups and trials 
as main effects. Data analyses included computation 
of partial correlations (Ferguson, 1971) between ac- 
tual BAL and BAL estimates, with the effect of be- 
tween-subject variability partialed out. These anal- 
yses were performed for all subjects within a group 
on a session-by-session basis. Third-session mean
error scores for the two groups were also compared
by means of analysis of covariance (Edwards, 1972),
with mean Session 1 error scores as the covariate.

Discriminability of drink strength. To determine
whether the three alcohol doses could be discriminated
from each other, the drink estimates of both groups
were combined to yield estimates for eight ½-ounce
drinks, 16 1-ounce drinks, and 24 1½-ounce drinks in
each session. These estimates were subjected to
one-factor analyses of variance (Edwards, 1972),
with the three dosage levels analyzed to determine
the single main effect of dose. Relative drink dis- 
crimination accuracy by each of the two training 
groups was analyzed by comparing drink estimation 
error scores (defined as the absolute value of the 
difference between a subject’s estimate of the alco- 
holic content of a drink and actual drink dosage) at 
all three dosage levels. To this end, within-session
error scores were subjected to two-factor analyses
of variance, with groups and trials as main effects.


Results 
BAL Estimation Accuracy 


Raw BAL estimation error scores could not 
be subjected directly to analysis of variance, 
since an Fmax test (Winer, 1962) revealed 
significant heterogeneity of variance across 
sessions (Fmax = 118.62, p < .01). A logarithmic
transformation, log (actual error score + 1.0),
was required to reduce the heterogeneity
of variance.
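
The error score, its transformation, and the Fmax ratio can be sketched in a few lines of Python. This is a minimal illustration, not the authors' computation: the function names are hypothetical, the base of the logarithm is not stated in the text (base 10 is assumed here), and Fmax is taken in its usual sense as the ratio of the largest to the smallest group variance.

```python
import math
from statistics import variance


def error_score(actual_bal, estimated_bal):
    """Absolute difference between actual and estimated BAL (0-150 scale)."""
    return abs(actual_bal - estimated_bal)


def transform(error):
    """Log transform of an error score; base 10 is an assumption."""
    return math.log10(error + 1.0)


def f_max(groups):
    """Ratio of the largest to the smallest sample variance across groups."""
    variances = [variance(g) for g in groups]
    return max(variances) / min(variances)


# Hypothetical error scores for two sessions, before and after transformation.
raw = [[4, 9, 49, 99], [1, 2, 3, 4]]
transformed = [[transform(e) for e in session] for session in raw]
ratio_raw = f_max(raw)
ratio_transformed = f_max(transformed)
```

With made-up scores like these, the variance ratio shrinks sharply after transformation, which is the purpose the transformation serves in the analysis.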

Figure 1 graphs the mean transformed error
score of each group in each session. As this
figure shows, both groups markedly improved
estimation accuracy with training. Mean
transformed error scores, averaged over
groups, were 1.5202 for Session 1, .9548 for
Session 2, and .9374 for Session 3. An analysis
of variance of these data revealed a significant
sessions effect, F(2, 12) = 5.49, p < .02.
Duncan’s multiple-range test for differences
in session means (Edwards, 1972) revealed
significant differences between the
mean error score in Session 1 and those in
Sessions 2 and 3. Mean error scores in Sessions
2 and 3 did not differ from each other.
These results confirm that BAL estimation
accuracy improved significantly for both
groups with training.

Figure 1. Mean transformed blood alcohol level
(BAL) estimation error scores of the alcoholics by
session and group.

Session 1 (pretraining). No significant 
effects were revealed by the analysis of vari- 
ance applied to Session 1 data. Mean trans- 
formed error score for internally trained sub- 
jects was 1.5368; for externally trained sub- 
jects, it was 1.5036. The relationship between 
BAL estimates and actual BAL was also ex- 
amined by computing partial correlation co- 
efficients (see Table 2). Although the corre- 
lation for the external-training group (.47) 
was greater than that for the internal-training 
group (.23), neither of these correlations ap- 
proached significance. If the degree of cor- 
relation between actual and estimated BAL is 
considered a measure of the ability to ac- 
curately monitor changes in actual BAL, then 
this analysis suggests that neither group of 
alcoholics was able to monitor these changes 
to a significant degree during this pretraining
session.
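
One common way to partial out between-subject variability is to center each subject's actual and estimated BALs on that subject's own means and then correlate the pooled deviations. The Python sketch below takes that approach; whether it matches the Ferguson (1971) procedure used by the authors is an assumption, and the function name and data are hypothetical.

```python
import math


def within_subject_correlation(actual, estimated):
    """Correlate actual and estimated BAL with subject means removed.

    actual, estimated: dicts mapping a subject id to a list of values,
    one value per estimate, in the same order for both dicts.
    """
    dx, dy = [], []
    for subject in actual:
        xs, ys = actual[subject], estimated[subject]
        mean_x = sum(xs) / len(xs)
        mean_y = sum(ys) / len(ys)
        # Deviations from the subject's own means remove between-subject level.
        dx.extend(x - mean_x for x in xs)
        dy.extend(y - mean_y for y in ys)
    sxy = sum(a * b for a, b in zip(dx, dy))
    sxx = sum(a * a for a in dx)
    syy = sum(b * b for b in dy)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data: both subjects track changes perfectly, but the second
# subject overestimates by a constant 50 points.
actual = {"s1": [10, 20, 30, 40], "s2": [10, 20, 30, 40]}
estimated = {"s1": [15, 25, 35, 45], "s2": [60, 70, 80, 90]}
r = within_subject_correlation(actual, estimated)
```

In this made-up case the coefficient is 1.0 even though one subject's estimates are far too high, which shows why such a coefficient reflects monitoring of changes in BAL rather than absolute accuracy.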

Session 2 (training). As noted above and 
shown in Figure 1, the BAL estimation ac- 
curacy of both groups improved markedly in 
Session 2. A within-session analysis of variance
revealed no significant effects for this
session; the difference between the mean
transformed error score for internally trained 
subjects of 1.0545 and the score of .8550 for 
externally trained subjects was not significant. 
In addition to these lowered error scores, both 
training groups were more successful at monitoring
actual BAL, as evidenced by increased
correlations between BAL estimates and actual
BAL.
Table 2
Partial Correlations Between Actual and
Estimated Blood Alcohol Level for Each Group
on a Session-by-Session Basis

Session    Internal    External
   1         .23         .47
   2         .66**       .52*
   3        -.06         .91***

*p < .07.
**p < .02.
***p < .001.


Session 3 (testing). An analysis of variance
applied to Session 3 data revealed a significant
group effect, F(1, 6) = 7.69, p < .035.
No other effects reached significance.
The mean transformed error score for the in- 
ternal-training group was 1.1860; it was .6887 
for the external-training group. Mean Session 
3 error scores were also analyzed with mean 
Session 1 error scores as the covariate. This 
analysis also revealed strong group differences, 
F(1, 5) = 6.41, p < .053. 

Group differences in Session 3 estimation 
accuracy appear even more striking when one 
considers correlations between actual and es- 
timated BALs (see Table 2). Although a high 
correlation between these scores was attained 
by internally trained subjects during Session 
2, a similar analysis of Session 3 data failed 
to reveal a similarly high significant correla- 
tion for these subjects. By contrast, the cor- 
relation attained by the externally trained 
group in Session 3 was substantially higher 
than that observed during Session 2. 


Discriminability of Drink Strength 


An analysis of variance applied to within- 
session drink estimation data revealed signifi- 
cant differences between estimates given at 
each dosage level during all three sessions, 
F(2, 45) = 14.53 (Session 1); 4.02 (Session 
2); 15.41 (Session 3), all ps < .03. Duncan’s 
multiple-range test indicates that estimates at
the ½-ounce dose were less than those at the
1-ounce dose, which, in turn, were less than
those at the 1½-ounce dose (all ps < .05).
These data show, then, that even when subjects
gargled with an anesthetic mouthwash
prior to consuming each drink during Sessions
2 and 3, they could discriminate among drinks 
of varying doses. When drink estimation er- 
ror scores of the two groups were compared, 
there were no group differences in drink es- 
timation accuracy during Sessions 1 and 2. 
During Session 3, however, internally trained 
subjects tended to be more accurate (they had 
lower error scores) than externally trained 
subjects, F(1, 6) = 5.73, p < .053. 


Self-Report Measures 


Following each Session 3 BAL estimate, 
subjects were required to describe the means 
by which that estimate had been made. Exter- 
nally trained subjects reported having used 
only the training method that they had been 
taught during Session 2. Internally trained
subjects also reported basing their estimates
on their respective training method; two
of these subjects noted, in addition, that they 
had attended to the number of drinks con- 
sumed as a cue to their estimates. Subjects 
were also asked to describe any difficulties 
that they had experienced with their respec- 
tive training modes. No externally trained 
subject reported any difficulty in this respect. 
By contrast, internally trained subjects re- 
ported that the gargling had detracted from 
the accuracy of their BAL estimates by masking
the taste of their drinks.


Discussion 


For the chronic alcoholic subjects who took 
part in this study, there appears to be little 
doubt that training in BAL discrimination via 
external cue training was more effective than 
training in internal cues. Although all eight
alcoholic subjects demonstrated comparable
pretraining levels of estimation accuracy and
comparable levels of accuracy during a subsequent
training session, externally trained
subjects were significantly more accurate in
their estimates after training ended and feedback
was withdrawn.

The relative superiority of the external 
training mode for these subjects is also reflected
by the results of correlational analyses. As noted
above, we presume that the degree of correlation
between actual and estimated BALs is a direct
reflection of the accuracy with which subjects
monitored changes
in BAL. In this regard, although externally 
trained subjects were more accurate in the 
ability to monitor changes in BAL before 
training began, neither group of subjects was 
able to do so particularly well at this point. 
In Session 2, however, when feedback on ac- 
curacy in the context of training was provided, 
internally trained subjects markedly improved
their monitoring performance; in fact, their 
accuracy at this point was marginally beyond 
that attained by externally trained subjects 
whose Session 2 accuracy was not substan- 
tially improved over that of Session 1. How- 
ever, once training ended and feedback was 
removed (during Session 3), the externally 
trained group substantially improved the ac- 
curacy of its BAL monitoring (to .91), 
whereas the correlation between actual and 
estimated BAL for the internally trained 
group returned to its Session 1 level of essen- 
tially zero. 

The pattern of correlations observed for the 
externally trained group suggests that BAL 
feedback alone did not enhance monitoring 
performance; rather, practice with the exter- 
nal training method itself appears to have 
been largely responsible for the development 
and maintenance of highly accurate monitor- 
ing. Conversely, as soon as internal training 
was begun, accurate tracking was observed im- 
mediately, but it was maintained for only as 
long as veridical feedback was available to 
the internally trained subjects. Thus, monitor- 
ing of BAL on the basis of external cues ap- 
pears to have developed more slowly but to 
have been more enduring, in large part be- 
cause it is less dependent on feedback than 
monitoring by subjects instructed to attend 
to internal cues. 

Several methodological problems may affect 
the generality of these findings. First, inter- 
nally trained subjects were provided drink 
dosage information during Session 3 but not 
during Session 2; this difference in training 
and testing conditions may have interfered 
with the internal discrimination techniques 
that these subjects were taught. This degree 
of impact seems unlikely, however, since self-report
questionnaire data indicated that internally
trained subjects attended primarily
to their feelings and sensations in formulating 
Session 3 BAL estimates and that external 
cues, when available, facilitated this discrimi- 
nation process. In addition, since subjects 
were able to discriminate among drinks of 
varying doses in all three sessions, it is un- 
likely that this procedural variant provided 
information to which internally trained sub- 
jects did not already have access. 

A second methodological question that can 
be raised concerns the specificity of the train- 
ing information provided to the two groups 
of subjects. In this regard, externally trained 
subjects were provided objective information 
concerning relationships between distinct ex- 
ternal cues (i.e., alcohol doses) and BALs. 
By contrast, internally trained subjects were 
required to learn their discrimination task 
using cues that were less distinct and objectifiable.
Some authors (e.g., Lovibond & Caddy,
1970) have attempted to circumvent this 
problem by providing internally trained sub- 
jects with some objective information about 
the relationship between internal cues and 
blood alcohol levels (e.g., “At a BAL of .05,
you will begin to feel a little unsteady”). This
procedure was not followed in the present
study because it was felt that if internal cues
were linked to different BALs for different
subjects, as suggested by Huber et al. (1976),
the provision of standardized information
might have confused individual subjects by
providing them with information that might
not have been relevant in their particular
cases.

The small number of subjects in our sample
limits generalizations that can be drawn
from our data. However, these data substantially
support preliminary findings of the only
other study of BAL discrimination by alcoholics
that programmed pretraining and posttraining
tests of discrimination accuracy
(Silverstein et al., 1974). It would appear to
be a reliable finding, therefore, that alcoholics
are relatively unable to discriminate BAL on
the basis of internal feelings and sensations.

These findings would bear considerable diagnostic
significance if it could also be shown
that nonalcoholics have less difficulty discriminating
BAL on the basis of internal cues.
For example, Huber et al. (1976), Ludwig 
and Wikler (1974), Pattison (1976), and 
Silverstein et al. (1974) have all observed 
that insensitivity to the visceral events that 
accompany alcohol intake, as reflected by 
failure to discriminate BAL on the basis of 
internal cues, may be functionally related to 
an alcoholic’s inability to control his or her 
drinking. As already noted, Huber et al. 
(1976) and Bois and Vogel-Sprott (1974) 
have reported that nonalcoholics can discrimi- 
nate BAL on the basis of internal cues. A 
recent study of nonalcoholics by Maisto and 
Adesso (1977), however, failed to support 
these earlier findings. In their study, nonalco- 
holic social drinkers accurately discriminated 
BAL only when they knew that they were 
consuming alcohol; when misled into thinking 
that they were drinking tonic, the discrimina- 
tion accuracy of these subjects diminished 
substantially. It is not clear, therefore, 
whether the discrimination accuracy of Maisto 
and Adesso’s social drinkers—as well as that 
of subjects studied by Huber et al. (1976) 
and Bois and Vogel-Sprott (1974)—was due 
to discriminated sensitivity to different BALs 
or to other factors such as subjects’ knowl- 
edge of BAL dose—response curves and/or 
their acquiescence to demand characteristics 
of the experimental paradigm. Given these in- 
consistencies in data bearing on BAL dis- 
crimination skills of social drinkers, the di- 
agnostic significance of alcoholics’ apparent 
inability to discriminate BAL remains equivocal,
though their behavioral difference from
nonalcoholics in our BAL training paradigm
remains real. 

The findings reported here bear substantial 
clinical significance. In the first place, despite 
elaborate “internal” training procedures de- 
veloped by other investigators to induce dis- 
criminated sensitivity to a range of BALs, 
our data suggest that a simpler, more effective 
means of accomplishing the same end is to 
train alcoholics to attend to the external cues 
that accompany alcohol intake. More gener- 
ally, alcoholics’ apparent insensitivity to the 
internal cues that accompany alcohol intake 
suggests that therapy designed to enhance [...]
properly approached via interventions that
focus on external rather than internal sources 
of control. This conclusion, in turn, suggests 
that behavioral approaches to alcoholism 
treatment that focus on environmental deter- 
minants of drinking behavior might be more 
effective with problem drinkers than more 
traditional insight-oriented therapies. 


Cuban Value Structure: Treatment Implications

This article discusses the relationship between cultural variables and psycho-
social treatment. It is assumed that in order for psychosocial treatment to be 
acceptable and effective with a client population, it must be sensitive to the 
cultural characteristics of that population. The paradigm of planning therapy 
according to the cultural characteristics of a population is illustrated for Cuban 
immigrant adolescents. To investigate cultural variables, a Value Orientations 
Scale was developed based on the work of Kluckhohn and Strodtbeck using 
325 subjects. Four factorially derived subscales were obtained. When 208 addi- 
tional Cuban immigrant and Anglo-American adolescents were compared along 
the Value Orientations Scale, the Cubans tended to prefer lineality, subjugation 
to nature, present time, and not to endorse idealized humanistic values, whereas 
the Americans tended to prefer individuality, mastery over nature, future time, 
and to endorse idealized humanistic values. The implications of these value 
differences for the delivery of mental health treatment are discussed. 


Cross-cultural conditions have seldom been
investigated as variables of individual differences
related to the appropriateness of different
mental health treatment models. Recently,
however, cultural variables have been
considered as constituting relevant personal
and situational characteristics that require
specific culturally sensitive treatment approaches
(Weidman, 1975).

In general, the issue of matching clients and
treatments to enhance the likelihood
of obtaining desired outcomes has received
extensive discussion and widespread endorsement
in psychotherapy. Paul (1969), for example,
has argued that psychotherapy outcome
research should be directed toward ascertaining
which treatment by whom is most
effective for a person with specific characteristics
and problems in a particular set of circumstances.
This paradigm for treatment outcome
research has received strong support
from Bergin (1971) and Kiesler (1969,
1971), among others.

During the past 18 years, almost 700,000 
Cubans have migrated to the United States. 
Approximately 500,000 have settled in the 
Greater Miami area, comprising about 90% 
of the local Latin population. Establishing
mental health and drug abuse treatment services
for the Cuban community has presented
serious problems for the providers of
these services, because the Cubans did not 
seek treatment from the established Anglo- 
American-oriented programs (cf. Ladner, 
Page, & Lee, 1975). These patterns of health 
care utilization are consistent with those ob- 
served in other Latin groups who, in general, 
underuse Anglo-American-oriented mental 
health services (Padilla & Ruiz, 1973). Con- 
comitant with the immigrant status of the 
Cubans, high levels of behavioral disorders 
were expected to occur as had been found with 
other immigrant groups (Al-Issa, 1970; Berry 
& Annis, 1974; Mezey, 1960). It was urgent, 
therefore, to develop therapeutic models feasible
for attracting and maintaining these
Cubans in treatment.


The Problem 


The present study is based on the assump- 
tion that in order to develop therapeutic 
models that will effectively attract and main- 
tain clients in therapy, their cultural back- 
ground must be understood. Specifically, it is 
postulated that cultural variables constitute 
an important set of client characteristics that 
need to be taken into consideration for de- 
veloping valid statements about the relation- 
ship between adolescent Cuban clients in 
treatment and the appropriateness of treat- 
ment models (cf. Kiesler, 1971). It is fur- 
ther hypothesized that an understanding of 
the cultural differences between Cuban immi- 
grant and Anglo-American adolescents pro- 
vides a conceptual framework for those as- 
pects of a psychosocial therapeutic model 
that enhance the appropriateness (and thus
the effectiveness) of treatment for a Cuban 
immigrant adolescent population vis-à-vis an 
Anglo-American population. As part of a pro- 
grammatic research effort to investigate the 
cultural characteristics of Cubans as well as 
to develop and investigate the treatment
of behavioral disorders in this population, a 
study of Cuban/Anglo-American adolescent 
value differences was conducted. The implica- 
tions of these value differences for the appro- 
priateness of psychosocial treatment models 
are discussed. 


Theory 


Clinical experience in the treatment pro- 
gram and a survey of the literature on cross- 
cultural comparisons of value orientations 
suggested that the theory of value orientations 
developed by Kluckhohn and Strodtbeck 
(1961) would provide a useful framework for 
contrasting cultural differences between Cuban 
immigrants and Anglo Americans. They postulated
that to compare profiles between two
cultures, it is necessary to delineate common 
human problems and to investigate the corresponding
range of variations or ways of responding
to these problems in the two different
cultures. They describe five human
problems common, in general, to all cultures.
The solutions provided by each culture to 
these problems are indicative of world view 
or basic value orientations within that culture. 
From Kluckhohn and Strodtbeck (1961), the 
following definitions of the five basic areas of 
human problems and the range of possible 
solutions to these problems were derived: 

1. Human nature orientation pertains to a 
society’s perception of innate human qual- 
ities in terms of good and evil: (a) good—the 
human being is perceived as being basically 
good but corruptible; (b) evil—the human 
being is perceived as being basically evil but 
perfectible; (c) neutral—the human being is 
perceived as neither good nor evil and subject 
to influence. 

2. Person-nature orientation refers to the 
perceived relationship of people to natural 
and environmental phenomena: (a) subjuga- 
tion to nature—the person is helpless and at 
the mercy of nature’s forces (worldly or other 
worldly); (b) mastery over nature—the per- 
son is seen as capable of controlling nature, 
mainly through technology; (c) harmony with 
nature—person and nature are one, working 
together in harmony. 

3. Activity orientation refers to the nature 
of the behaviors through which a person is 
judged or judges himself or herself: (a) doing 
—the person is judged by what he or she 
achieves and emphasizes success-oriented ac- 
tivities usually including externally measura- 
ble activities; (b) being—this variation em- 
phasizes activities that are an expression of 
existing desires (spontaneous expression), and 
activity is perceived existentially; (c) being 
in becoming—the emphasis in this variation 
is on meditation about one’s self, which leads 
to understanding and self-development. 

4. Time orientation refers to the meaning 
or emphasis placed on a particular time period:
(a) past—the traditions of the past
ought to be maintained or recaptured; (b)
present—emphasis is on present time and
problems; (c) future—emphasis is on a con- 
sideration of the future in solving present 
problems. 

5. Relational orientation refers to the nature
of a person’s relation to other people:
(a) lineal—the way people relate to each
other is determined by their relative positions 
within a hierarchy; (b) collateral—people’s 
relations to each other are determined by a 
horizontal network. In this network all per- 
sons are at the same level and relate to each 
other as “equals” having a place in the net- 
work; and (c) individualistic—people relate 
to others autonomously, not by hierarchical 
or lateral networks. 


Hypotheses 


From anthropological and clinical impres- 
sions obtained through the treatment program, 
the following hypotheses were formulated: 

1. Human nature orientation does not dif- 
fer significantly for Cuban immigrants and 
Anglo Americans. 

2. Person-nature orientation differs significantly
between Cuban immigrants and Anglo
Americans, with the former endorsing subju- 
gation to nature and the latter, mastery over 
nature. 

3. Activity orientation differs significantly 
between Cuban immigrants and Anglo Ameri- 
cans, with the former endorsing being and 
the latter endorsing doing as a preferred ac- 
tivity orientation. 

4. Time orientation differs significantly be- 
tween Cuban immigrants and Anglo Ameri- 
cans, with the former endorsing present and 
the latter endorsing future as a preferred time 
orientation. 

5. Relational orientation differs signifi- 
cantly between Cuban immigrants and Anglo 
Americans, with the former endorsing lineal- 
ity and the latter endorsing individualism as 
a preferred relationship style. 


Method 
Subjects 


There were two samples in this study: Sample 1
was used in the development of the Value Orientations
(VO) Scale; Sample 2 served primarily to test
the differences between Cuban immigrant and
Anglo American adolescents. The 533 participants in
the study were obtained from various educational
institutions, such as high schools, junior colleges, universities,
and continuing education centers; from social
agencies, such as senior citizens activity centers;
and from other frequently used facilities, such as
Cuban medical clinics. All facilities were located in 
the Greater Miami area. 

Sample 1 consisted of 325 persons, including 120 
(37%) males and 205 (63%) females. In terms of 
ethnic background, Sample 1 contained 220 (67.7%) 
Cuban immigrants, 65 (20.0%) Anglo Americans, 
12 (3.7%) non-Cuban Latins, and 28 (8.6%) Black 
Americans. The average age of Sample 1 was 25.1 
years, with a standard deviation of 12.1 and a range 
from 15 to 77 years. 

Sample 2 was comprised of 208 persons, 81 (39%)
males and 127 (61%) females, of whom 56 (27%) 
were Cuban immigrants and 152 (73%) were Anglo 
Americans. Since the majority of the clients in the 
treatment program referred to as “identified patients” 
and labeled as in need of treatment by their families 
were adolescents, Sample 2 was chosen to be repre- 
sentative of this sector of the population in treat- 
ment with respect to age. The average age of Sam- 
ple 2 was 16.4 years, with a standard deviation of 
1.4 and a range from 14 to 22 years. 


Development of the Value Orientations Scale 


Item construction. The first step in the development
of the VO scale consisted of preparing an initial
set of items reflecting the nature of the five
human problems defined by Kluckhohn and Strodt- 
beck (1961) but in a context relevant to the target 
population. Each of the problem situations was fol- 
lowed by three statements presenting three possible 
alternative solutions. The final set of 22 problem 
situations¹ consisted of 9 relational, 4 human nature,
4 person-nature, 3 time, and 2 activity items. Two 
parallel forms were prepared. The first form was in 
Spanish. The second form was devised by translating
the original set of items into English. The technique
of back translation was used to insure the
equivalence of the items (Brislin, 1970). 

For each problem situation, the person was re- 
quired to choose the solution considered best and 
the solution considered worst. The scores for the 
keyed responses were as follows: A response of best 
for an item was given a score of 3, a response of 
worst for an item was given a score of 1. If an 
item was not endorsed as either best or worst, it was
assigned a score of 2. A response for each item could 
thus range from 1 to 3. Each alternative response 
was scored as a separate variable. Thus, three al- 
ternative responses in each of 22 problem situations 
produced 66 variables with a score of 1, 2, or 3 for 
each variable. 
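
The scoring rule just described can be sketched in a few lines of Python. This is only an illustration of the stated rule (best = 3, worst = 1, unchosen = 2); the function names and the representation of a response as a (best, worst) pair of alternative indices are assumptions made for the example and do not come from the original scale materials.

```python
from typing import List, Tuple


def score_problem(best: int, worst: int, n_alternatives: int = 3) -> List[int]:
    """Score one problem situation.

    `best` and `worst` are the (hypothetical) zero-based indices of the
    alternatives the respondent marked as best and worst.
    Returns one score per alternative: 3 = best, 1 = worst, 2 = neither.
    """
    if best == worst:
        raise ValueError("an alternative cannot be both best and worst")
    scores = [2] * n_alternatives  # default: not endorsed as best or worst
    scores[best] = 3
    scores[worst] = 1
    return scores


def score_respondent(choices: List[Tuple[int, int]]) -> List[int]:
    """Flatten the per-problem scores into one variable per alternative."""
    variables: List[int] = []
    for best, worst in choices:
        variables.extend(score_problem(best, worst))
    return variables


# A respondent who answers all 22 problem situations yields 66 variables.
example = score_respondent([(0, 2)] * 22)
print(len(example))
```

With 22 problem situations and three alternatives each, every respondent is represented by 66 variables, each taking the value 1, 2, or 3.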

Scale construction. Rather than assuming that
items should be combined as predicted by Kluckhohn
and Strodtbeck’s (1961) theory of value orientations,
the items were submitted to an empirical test. Following
item construction, the scales were administered
to Sample 1. The item responses of Sample
1 were factor analyzed, using an alpha solution and
an oblique rotation (Harris-Kaiser, Type I).² Four
interpretable factors³ emerged from the analyses,
accounting for 14.52% of the total variance,
with Factors 1, 2, 3, and 4 accounting for 5.31%,
3.50%, 3.21%, and 2.50% of the total variance, respectively.
The factors, although obtained by oblique
rotations, proved to be nearly independent of each
other. Thus, for all practical purposes, the factor
structure obtained can be said to be orthogonal.
Table 1 presents the intercorrelations among the
factors.


¹ Copies of the original Spanish and English versions
of the Value Orientations Scale are available
from the first author.


Table 1
Correlation Matrix of Four Factors

Factor       1        2        3
1
2          -.086
3           .131    -.160
4           .047    -.102    -.012


The following scale descriptions flow from the 
item loadings of each factor: Factor 1 is clearly a 
“relational factor,” consistent with Kluckhohn and 
Strodtbeck’s (1961) relational dimension. The items 
loading greater than .30 on this factor comprise the 
Relational scale. A high score discloses an individ- 
ualistic value orientation in which the locus of re- 
sponsibility for a person’s behavior rests with the 
individual; “relationships are based on individual 
autonomy; reciprocal roles are based on recognition 
of the independence of interrelating members” (Papajohn
& Spiegel, 1971, p. 260). A low score reflects
a belief in lineality in which the locus of accounta- 
bility is defined by the social structure; “relation- 
ships on a vertical dimension are hierarchically or- 
dered; reciprocal roles are based on a dominance- 
submission mode of interrelationship” (Papajohn & 
Spiegel, 1971, p. 260). 

Factor 2 is mixed, including primarily relational 
items in addition to person-nature, activity, and 
human nature items. The items loading above .30
on this factor comprise the Idealized Humanistic 
scale. A high score is an endorsement of idealized 
humanistic values, including a belief in collaterality, 
egalitarian social systems, and a growth-oriented life- 
style in search of harmony, peace, and spiritual de- 
velopment. A low score indicates low endorsement 
of these idealized humanistic values and greater per- 
sonal concern. 

Factor 3 is also a mixed v;
relationship between
time orientation. The item:
overcome the natural forces and harness them for
human benefit. A low score reveals a fatalistic ac- 
ceptance of life’s circumstances and a belief that 
little can be done to counteract the forces of nature 
to which human beings are subjugated. Temporal 
focus is on the present, whereas the future is seen 
as being unpredictable. 

Factor 4 is definitely related to the perception of 
human qualities and impulses. This is consistent with 
Kluckhohn and Strodtbeck’s (1961) human nature 
dimension. Those items loading more than .30 on 
this factor comprise the Human Nature scale. A 
high score reflects a perception of human beings as 
basically selfish, malicious, and evil. A low score 
indicates a perception of human beings as basically 
good although corruptible. 

Reliability and validity. Internal consistency or 
alpha coefficients were calculated for Sample 1 for 
each factor. Factors 1, 2, 3, and 4 yielded alpha 
coefficients of .89, .84, .76, and .72, respectively. 
These coefficients are within the acceptable standards 
for scales that have achieved internal consistency, 
thus insuring their satisfactory levels of reliability. 
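
For readers unfamiliar with the statistic, the following is a minimal sketch of how an internal consistency coefficient can be computed, assuming that "alpha" here refers to Cronbach's coefficient alpha. The function name and the toy data are hypothetical; the article does not report item-level data, so nothing below reproduces the coefficients given above.

```python
from statistics import variance
from typing import List, Sequence


def cronbach_alpha(item_scores: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha for a respondents-by-items score matrix.

    alpha = k / (k - 1) * (1 - sum of item variances / variance of totals)
    Sample variances (n - 1 denominator) are used throughout.
    """
    k = len(item_scores[0])                      # number of items
    columns: List[tuple] = list(zip(*item_scores))  # one tuple per item
    item_variance_sum = sum(variance(col) for col in columns)
    total_variance = variance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


# Toy data: four respondents, three items scored 1 to 3 (illustrative only).
toy = [
    [3, 3, 2],
    [1, 2, 1],
    [2, 2, 2],
    [3, 2, 3],
]
print(round(cronbach_alpha(toy), 3))
```

Alpha approaches 1 when the items rise and fall together across respondents and drops toward 0 when they vary independently.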

Factorial validity was obtained for the VO scale 
by ascertaining its internal statistical structure 
through factor analytic techniques. The factorial com- 
position produced four orthogonal VO subscales with 
high internal consistencies, thus providing high fac- 
torial validity. 


Results 


Value Comparisons 


To compare the value orientations of Cuban 
immigrants and Anglo Americans, the VO 
scale was administered to the persons in Sam- 
ple 2. Their item responses were scored as 
described above in the item construction sec- 
tion. A monotone scaling model for unspecified 
distribution forms, also known as a linear 
model, was adopted to develop the VO scale 
(cf. Nunnally, 1967). In other words, scores 
were obtained for each of the four factorially 
derived subscales of the VO scale by alge- 
braically summing the item scores of all of 
the items that loaded on each factor. Items
loading positively on the factors were added,
and items loading negatively were subtracted.
Estimates of the internal consistency of the
subscales were calculated for Sample 2. The
alpha coefficients for Subscales (Factors) 1,
2, 3, and 4 were .76, .58, .51, and .46, respectively.


² A factor analysis was also conducted for Sample
1 excluding the Black American sample. The factor
structure obtained for this analysis was nearly identical
to the factor structure obtained for the full
sample. Therefore, the factor analysis for the entire
sample was used for this study.

³ The items loading greater than .30 on each of
the factors, and their factor loadings, are available
from the first author. These items also comprise the
Value Orientation scales.


Table 2
Value Comparisons: Means, Standard Deviations, and t Ratios for the Value
Orientation Subscales

                 Cubans            Americans
Factor         M       SD        M       SD         t
Sample 1
1            46.54    9.68     57.78   12.74      6.55***
2            52.67   10.92     50.30   10.25     -1.60
3            51.82   11.47     54.97    9.48      2.22*
4            47.06    9.94     45.44    9.95     -1.15
Sample 2
1            47.5    10.1      50.9     9.7       2.13*
2            47.5     9.5      50.9     9.9       2.27*
3            46.7     9.9      51.3     9.9       2.92**
4            51.4     9.0      49.5    10.1      -1.32

Note. For Sample 1, n = 211 Cubans and 65 Americans; for Sample 2, n = 56 Cubans and 152 Americans.
Standard scores are presented with a mean of 50 and a standard deviation of 10.

* p < .05.
** p < .01.
*** p < .001.
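
The subscale scoring described here, together with the standard-score transformation mentioned in the note to Table 2, can be sketched as follows. This is an illustration under stated assumptions: the function names are hypothetical, the item loadings are reduced to a sign of +1 or -1, and the population standard deviation is used for standardization because the article does not say which denominator was applied.

```python
from statistics import mean, pstdev
from typing import List, Sequence


def subscale_score(item_scores: Sequence[int], loading_signs: Sequence[int]) -> int:
    """Algebraic sum of item scores: add items that load positively (+1),
    subtract items that load negatively (-1)."""
    return sum(sign * score for sign, score in zip(loading_signs, item_scores))


def to_standard_scores(raw: Sequence[float],
                       target_mean: float = 50.0,
                       target_sd: float = 10.0) -> List[float]:
    """Rescale raw subscale scores to a mean of 50 and an SD of 10.

    Assumption: the population SD (pstdev) is used; the source does not
    specify the denominator.
    """
    m = mean(raw)
    sd = pstdev(raw)
    return [target_mean + target_sd * (x - m) / sd for x in raw]


# Illustrative respondent: three items, the second loading negatively.
print(subscale_score([3, 1, 2], [+1, -1, +1]))

# Illustrative raw scores for a small group of respondents.
print([round(s, 1) for s in to_standard_scores([4, 6, 8, 10])])
```

Because the transformation is linear, it changes only the metric of the scores; group comparisons by t test are unaffected by it.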


The Value Orientations Scale 


The scores obtained by Cuban immigrants
and Anglo Americans on the four factorially
derived VO subscales were compared for Samples
1 and 2; t statistics were computed, and
the significance of the obtained differences was
determined using two-tailed tests of significance.
Table 2 presents the means, standard
deviations, and t ratios for the differences between
the subscale scores. The subscale scores
presented in Table 2 were transformed into
standard scores, with a mean of 50 and a
standard deviation of 10. Since Sample 1 was
used to develop the VO scale, the results obtained
with Sample 2 were used to test the
hypotheses. An examination of Table 2 indicates
that in Sample 2, the groups differ significantly
for three of the four VO subscales.
As predicted in Hypothesis 1, there were
no significant, t(206) = -1.32, differences
between Cuban immigrants and Anglo Americans
along the Human Nature dimension
(Factor 4).

The single largest difference, t(206) = 2.92,
p < .01, was obtained for the Person-Nature
and Time subscale (Factor 3). As predicted 
in Hypotheses 2 and 4, Anglo Americans 
tended to value mastery over nature and pre- 
ferred to plan for the future, whereas Cubans 
tended to endorse a subjugation to nature 
orientation and a present-time orientation. 
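
As a rough check on how such a t ratio follows from the summary statistics in Table 2, the sketch below computes an independent-samples t from group means, standard deviations, and sample sizes. It assumes a pooled-variance (Student) t test, which the article does not state explicitly, and the function name is hypothetical. Because the published Sample 2 means and standard deviations are rounded to one decimal, the result should be expected to land near, not exactly on, the reported value.

```python
from math import sqrt
from typing import Tuple


def pooled_t(m1: float, sd1: float, n1: int,
             m2: float, sd2: float, n2: int) -> Tuple[float, int]:
    """Independent-samples t (pooled variance) from summary statistics.

    Returns (t, degrees of freedom). The sign follows group 2 minus group 1,
    matching the Table 2 convention of Americans minus Cubans.
    """
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    standard_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m2 - m1) / standard_error, df


# Sample 2, Factor 3: Cubans (M = 46.7, SD = 9.9, n = 56)
# versus Anglo Americans (M = 51.3, SD = 9.9, n = 152).
t, df = pooled_t(46.7, 9.9, 56, 51.3, 9.9, 152)
print(f"t({df}) = {t:.2f}")
```

The degrees of freedom (206) match the article, and the computed ratio falls within a few hundredths of the reported 2.92, which is consistent with rounding in the tabled values.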

It was not possible to test Hypothesis 3 di- 
rectly with respect to the differences in ac- 
tivity orientations between Cuban immigrants 
and Anglo Americans, since none of the factors
included a sufficient number of activity
items. Moreover, the small number of these 
items loading on the factors resulted from an 
artifact in the development of the original 
VO scale, which included only two human 
problems purporting to tap activity value 
orientations. 

As expected from Hypothesis 5, Cuban immigrants
and Anglo Americans differed significantly,
t(206) = 2.13, p < .05, along the
Relational subscale (Factor 1): Anglo Ameri- 
cans tended to value individuality over lineal- 
ity in interpersonal relations, whereas the 
converse was true for Cubans.

A significant, t(206) = 2.27, p < .05, and
unexpected difference emerged for the Ideal- 
ized Humanistic Value subscale (Factor 2): 
Anglo Americans tended to endorse idealized 
humanistic values, whereas Cubans tended not 
to endorse these idealized humanistic values. 

The scores obtained by Cuban immigrants 
and Anglo Americans on the four factorially 
derived VO subscales were also compared for 
Sample 1. An examination of Table 2 shows 
that in Sample 1, the groups differed significantly
on two of the four VO subscales.

As with Sample 2, Cuban immigrants and
Anglo Americans in Sample 1 (a) did not
differ significantly, t(274) = -1.15, along the
Human Nature dimension (Factor 4) as predicted
in Hypothesis 1; (b) differed significantly,
t(274) = 2.22, p < .05, in the Person-Nature
and Time (Factor 3) subscale scores
in the direction predicted by Hypotheses 2 and
4; and (c) differed significantly, t(274) =
6.55, p < .001, along the Relational subscale
in the direction predicted by Hypothesis 5.
Contrary to the findings obtained with Sample
2, Cuban immigrants and Anglo Americans
in Sample 1 did not differ significantly, t(274)
= -1.60, along the Idealized Humanistic Value
subscale.


Activity Items 


To test Hypothesis 3, the response scores 
obtained by Sample 2 Cuban immigrants and 
Anglo Americans on each item solution to the 
two activity problem situations were compared.
The sample used in these comparisons
included Sample 2 plus 120 additional high
school students of Cuban or Anglo-American
background. t statistics were computed, and
the significance of the obtained differences
was determined using a two-tailed test of
significance. The results indicate that Cuban
immigrants endorsed both items reflecting a
“doing” orientation, t(326) = 5.55, p < .0005;
t(326) = 2.41, p < .02, significantly more
frequently than Anglo Americans, whereas
Anglo Americans endorsed both items reflecting
a “being” orientation significantly more
frequently than Cuban immigrants, t(326) =
3.11, p < .002; t(326) = 1.98, p < .05. There
were no differences between Cuban immigrants
and Anglo Americans in their endorsement
of the two items indicative of a being-in-becoming
orientation, t(326) = .80; t(326)
= .21. These findings are contrary to the prediction
in Hypothesis 3.

The item solutions to the two activity problem
situations provided by Sample 1 Cuban
immigrants and Anglo Americans were also
compared using t statistics and two-tailed tests
of significance. Again as with Sample 2, Sample
1 Cuban immigrants endorsed both items
reflecting a doing orientation significantly
more frequently than Anglo Americans,
t(274) = 2.02, p < .05; t(274) = 2.02, p <
.05; Anglo Americans tended to endorse both
items reflecting a being orientation more frequently
than Cuban immigrants, t(274) =
1.85, p < .07; t(274) = 1.79, p < .08; and
there were no significant differences between
the groups in their endorsement of “being-in-becoming”
items, t(274) = .17; t(274) = .01.


Factor Structure of Sample 2 


To ascertain the generalizability of the VO 
scale to Sample 2, the item responses of Sam- 
ple 2 were also factor analyzed, using an alpha 
solution and an oblique rotation (Harris- 
Kaiser, Type I). Three of the four factorially 
derived VO subscales obtained from Sample 
1 were discernible in the factors that emerged 
from Sample 2. 

Factor 1, comprising the Relational sub- 
scale, proved to be the strongest factor in 
both Samples 1 and 2, accounting in each case 
for the largest proportion of the total factor 
variance. Of the 17 items loading on the Re- 
lational subscale, 14 items (82%) also loaded 
on the first factor of Sample 2. 

Factor 2, comprising the Idealized Humanistic
Value subscale, emerged as the second
strongest factor in both Samples 1 and 2. Of
the 13 items loading on the Idealized Humanistic
Values subscale, 7 items (54%) also
loaded on the same factor for Sample 2. An
apparent difference between these factors was
observed, however. For Sample 1, the factor
that emerged was essentially unipolar, measuring
low to high idealized humanistic values.
The factor that emerged from Sample 2
was clearly bipolar, ranging from idealized
humanistic values on the one hand to a “life
is a jungle” value orientation on the other, 
with an emphasis on the evil qualities of 
people and the need for self-protection as a 
survival measure. 

Factor 3 of Sample 1, comprising the Mas- 
tery over Nature/Future Time versus Sub- 
jugation to Nature/Present Time subscale, 
emerged also as a factor in Sample 2. Six of 
the nine items (67%) of this subscale loaded 
on a factor of Sample 2. In this instance, the 
factor that emerged for Sample 2 was clearly 
a mastery over nature/future time versus 
subjugation to nature/present time factor. 

It was not possible to identify a factor for
Sample 2 that appeared comparable to the
Human Nature subscale measured by Factor 
4 of Sample 1. 


Discussion 


The VO scale was developed using Sample 
1, and the hypotheses were tested using Sam- 
ple 2. However, to ascertain the stability of 
the results across both samples, each sample 
was factor analyzed separately, and the value 
comparisons between Cuban immigrants and 
Anglo Americans were also conducted separately
for each sample. The first two factors
emerged strongly in the factor structure of 
both samples; the third factor emerged from 
Sample 1 and was replicated partially in Sam- 
ple 2; and the fourth and weakest factor that 
emerged from Sample 1 was not identifiable 
in the factor structure of Sample 2. These differences
in factor structure are not surprising,
since the factors with the highest eigenvalues
replicated better across samples, and
the factor with the lowest eigenvalue failed to 
replicate across samples. 

With one exception, the differences in value
orientations between Cuban immigrants and
Anglo Americans held for both samples. The
only exception occurred along the idealized
humanistic value dimension. Whereas Sample
2 Anglo Americans were significantly higher
than Sample 2 Cuban immigrants on this
factor, the same two groups in Sample 1 did
not differ significantly along this dimension.
Since Sample 2 subjects were younger (M
age = 16.4) than Sample 1 subjects (M age
= 25.1), it is suggested that the relative shift
on idealized humanistic values between the 
groups in the two samples may have resulted 
from the differences in age between the samples. 

It is interesting to note that Cuban immi- 
grants and Anglo Americans may diverge on 
many culturally related variables other than 
nationality. For example, these groups may 
vary on religious affiliation, child-rearing cus- 
toms, family structure, and psychological 
variables such as need for approval, locus of 
control, and field dependence. The present 
study did not attempt to control for these 
variables. To have singled out any one or a
combination of these variables for analyses 
would have been artificial, since, in fact, these 
variables and many others contribute to the 
differences in basic value orientations observed 
between the two cultural groups examined. 

The differences between the two cultural 
groups may have been caused, however, by 
variables that are not necessarily culture re- 
lated. For example, Casavantes (1970) ar- 
gued that Mexican Americans value the pres- 
ent only as a function of their lower socio- 
economic status and not as a cultural value. 
Since socioeconomic status data were availa- 
ble for the subjects in Sample 2, the Cuban 
immigrants and Anglo Americans of Sample 
2 were compared along this variable, and they 
were found not to differ significantly. Hence, 
the findings of the present study do not ap- 
pear to have been caused by socioeconomic 
differences as Casavantes would suggest. 

Future research should address cultural dif- 
ferences among Cuban subgroups. Differences 
may exist, for example, between the sexes, 
among age groups, or between clinical and 
nonclinical samples. These differences between 
Cuban subgroups may also have important 
clinical implications. 


Clinical Implications 


The differences in basic value orientations 
between Cuban immigrant and Anglo-Ameri- 
can adolescents may have implications for the 
delivery of mental health services to these 
populations. If these value orientations are in- 
deed as basic as Kluckhohn and Strodtbeck 
(1961) postulated, then they must also have
implications for personality and psychosocial
development, a notion derived from the work
of Papajohn and Spiegel (1971) and Ramirez
and Castañeda (1974) among others.

As suggested earlier, clients with specific 
psychosocial characteristics require treatment 
approaches matched to their idiosyncratic 
styles. Following this premise, it would seem 
that to achieve desired psychotherapeutic out- 
comes with clinical Cuban populations, it is 
necessary to identify treatment models that 
are specifically matched to the culturally de- 
termined characteristics arising from the value 
structure of this population. Therefore, the 
Cuban adolescents’ preference for lineality, 
subjugation to nature, present time, and doing 
orientations as well as their low endorsement 
of idealized humanistic values must be taken 
into consideration when designing a psycho- 
social service delivery system for them. Many 
traditional Anglo-American treatment services 
are based on a model of a growth-oriented, 
self-actualizing individual who is ready to take 
control over his or her own destiny. In con- 
trast, clinical experience at the Spanish Fam- 
ily Guidance Clinic suggests that the provider 
of psychosocial treatment services to the 
Cuban immigrant must be ready to take 
charge of the therapist-client relationship, to 
validate hierarchical structures in the client’s 
life context, and to intervene on behalf of the 
client within the client’s life context to re- 
store ecological order. 

The most important feature of a psycho- 
social treatment model that is sensitive to the 
cultural characteristics of the Cuban immi- 
grants is to validate their preference for a 
lineal style of relationships. This relationship 
style may receive the support of the therapist 
in various phases of the treatment. First, the 
therapist must relate to the client hierarchically,
recognizing that the therapist’s role is
perceived by the client as a position of authority.
With this recognition, the therapist
assumes responsibility and further takes
charge of the therapist-client relationship.
Second, the therapist validates the young
Cuban client’s preference for lineality by enlisting
in the treatment process the naturally
occurring hierarchical systems in the client’s
life context. Clearly, the most significant naturally
occurring hierarchical system is the
family. Other authority figures such as teach- 
ers, school counselors, and even probation 
officers may also be important and may need 
to be included in the therapeutic plan. Many 
instances of dysfunction in young clients are 
also accompanied by the breakdown of the 
lineal structural relational patterns within the 
family. This is frequently manifested in the 
young person’s open rejection of the parents’ 
executive role in the family. Interestingly 
enough, even in these instances, clinical ex- 
perience suggests that desired therapeutic out- 
comes are reached most expediently by restor- 
ing the lineal-hierarchical relational structure 
in the family. Once the family’s natural lineal
milieu is restored, and the parents’ role as the
family’s executive system is reaffirmed, then
the family is taught the skills necessary to
negotiate the youngster’s differentiation (tran- 
sition from lineality to individuality) from 
the family. The lineal family structure is sup- 
ported, so that within this basic culturally 
sanctioned framework, the process of negotiating
the youngster’s differentiation from the
family may take place. 

In preparing a psychosocial treatment plan 
for a Cuban immigrant client, the therapist 
should also consider the Cuban’s sensitivity 
to environmental social pressures. The high 
levels of need for approval of Cubans (Tholen, 
1974) and field dependence of Latins (Ramirez
& Castañeda, 1974) in general have been
documented. Because of the strong influence of 
environmental social pressures on the Cuban 
client’s well-being, it becomes particularly im- 
portant that the etiology of psychosocial dys- 
functions be conceptualized within an eco- 
logical framework (Auerswald, 1971), since 
ecological theory takes into consideration the 
effects of the interaction between client and 
psychosocial systems on the client’s functioning.
Further, as the present study indicates,
Cuban clients tend to perceive themselves as
unable to control or modify their environmental
circumstances (see also Santisteban,
1975). For this reason, when environmental
pressures or tensions seem to be a source of
client dysfunction, as is frequently the case
(Scopetta, King, & Szapocznik, Note 1), it is
necessary that therapeutic interventions restructure
the interactions of the client with
his/her environment when these are sources
of client functional impairment (Aponte,
1974).

The treatment of the Cuban client must 
also be present oriented. The Cuban client is 
usually mobilized for treatment by the onset 
of a crisis (Scopetta et al., Note 1) and ex- 
pects the therapist to provide immediate prob- 
lem-oriented solutions to the crisis situations. 
In general, the therapist must develop a treat- 
ment model that capitalizes on crises to pro- 
mote personal growth and the reorganization 
of interpersonal relations. Further, to use 
maximally this characteristic of the Cuban 
population, the culturally sensitive therapist 
is not only cognizant of how to use crises to 
promote growth but also knows how to create 
them for the same purpose. 

It will be recalled that young Anglo Ameri- 
cans are more likely than Cubans to endorse 
idealized humanistic values (Factor 2). Since 
these findings were unexpected, their clinical 
implications are as yet not clear. Nevertheless, 
these findings suggest that young Cuban im- 
migrants are less likely than their Anglo- 
American counterparts to value relationships 
based on goals of the laterally extended group, 
and thus to be mobilized in treatment by peer 
pressure groups. The findings also suggest that 
young Cuban immigrants are less likely than 
their Anglo American counterparts to be moti- 
vated in treatment by a search for personal 
and spiritual growth. In fact, clinical experi- 
ence suggests that the Cubans are motivated 
in treatment by concrete and obtainable ob- 
jectives. Consistent with this interpretation, 
young Cuban immigrants were found to en- 
dorse a doing activity orientation, whereas 
their Anglo American counterparts preferred 
a being activity orientation.

With recognition of the culturally deter- 
mined characteristics of the Cuban popula- 
tion, the Spanish Family Guidance Clinic in 
the Department of Psychiatry of the
University of Miami School of Medicine explored
a variety of treatment approaches. Among 
these, one treatment model seemed particu- 
larly appropriate, ecological structural family 
therapy, first proposed by Aponte (1974), 
who based his treatment model on
Auerswald’s (1971) concepts of ecological therapy
and Minuchin’s (1974) structural family ther- 
apy. The approach of these therapists seems 
to be particularly appropriate for the treat- 
ment of Puerto Ricans (e.g., Minuchin, Mon- 
talvo, Guerney, Rosman, & Schumer, 1967). 

Ecological structural family therapy as 
adopted at the Spanish Family Guidance 
Clinic is based on therapeutic assumptions 
that are matched to the value characteristics 
of the population of Cuban adolescents. Within 
this approach, the therapist relates hierarchi- 
cally to the client and works to restore the 
hierarchical structure in the client’s family. 
The therapist considers the ecological factors 
impinging on the client and actively inter- 
venes to remediate detrimental ecological re- 
lationships based on the notion that the client 
lacks the orientation to do so unassisted. The
therapist is present oriented and intervenes 
to manipulate existing dysfunctional interac- 
tional patterns (cf. Minuchin, 1974) within 
the family and between the family and its 
environment. And, finally, consistent with a 
doing activity orientation, the client is moti- 
vated for treatment through the use of con- 
crete and obtainable objectives. 

Further studies are under way to test the 
effectiveness of ecological structural family 
therapy in the treatment of psychosocial dys- 
functions, including drug and alcohol abuse, 
with a population of Cuban immigrants. 

The procedure outlined in this article may 
have broad implications for the development 
of culturally specific psychosocial treatment 
models. It is suggested that this procedure 
may be applicable to the development of
culture-specific treatment for other cultural
groups. In fact, the VO scale may be used to 
ascertain basic cultural characteristics in client 
populations or for specific clients in treat- 
ment. Based on the findings obtained with the 
VO scale, it is then possible to identify
treatment features that “match” the basic value
orientations of the individual client or of the
client population.


Reference Note


Personality Characteristics of Long-Term Recovered Alcoholics:
A Comparative Analysis


William M. Kurtines, Leah R. Ball, and Gloria H. Wood 


Florida International University 


This study reports data on the personality characteristics of alcoholics in two 
stages of recovery (short and long term). Three samples (total N = 183) were 
used in this study: (a) 60 newly recovered alcoholics (30 males and 30 females 
with at least 3 weeks but less than 4 months of sobriety); (b) 62 long-term 
recovered alcoholics (31 males and 31 females with a minimum of 4 years of 
continuous sobriety and a mean length of sobriety of 8.9 years); and (c) 61 
nonalcoholic controls (30 males and 31 females who reported moderate to in- 
frequent or no use of alcohol). All subjects were administered the California 
Psychological Inventory and a biographical data sheet. A multivariate analysis 
of variance was used to test the significance of group differences on the
personality variables, and a multiple discriminant analysis was conducted to
determine the most discriminating dimensions for differentiating among the three
groups. The results of the analysis clearly indicate the existence of differential
patterns of psychological adjustment at each stage of recovery. 


Early research on the personality charac- 
teristics of alcoholics focused on the identifi- 
cation of a single alcoholic personality type. 
The results of this research effort were gen- 
erally disappointing (Sutherland, Schroeder,
& Tordella, 1950; Syme, 1957), and the need
for alternative conceptualizations is obvious.
Recent research studies have tended to focus 
on the identification and classification of per- 
sonality patterns common to alcoholics 
(Goldstein & Linden, 1969; Lawlis & Rubin, 
1971; Nerviano, 1976; Nerviano & Gross,
1973; Partington & Johnson, 1969; Skinner,
Jackson, & Hoffman, 1974; Whitelock, Pat- 
rick, & Overall, 1971), and the results have 
led to a more realistic reappraisal of alcoholics 
as a heterogeneous treatment population 
(Nerviano, 1976). In general, these studies 
have used multivariate data analysis proce- 
dures to identify distinctive personality pro- 
files among newly recovered alcoholics or 
alcoholic patients undergoing treatment.
Skinner et al. (1974), for example, used a 
multivariate classification strategy and the 
Differential Personality Inventory (DPI; 
Jackson & Messick, 1970) to identify eight 
modal personality profiles among male alco- 
holic psychiatric patients. A more recent 
study by Nerviano (1976), using the Per- 
sonality Research Form (PRF; Jackson, 
1967), used a multivariate approach to 
identify seven common personality patterns 
among male alcoholics undergoing treatment 
in a Veterans Administration hospital. 
Overall, considering the variety of assess- 
ment devices and methodological approaches 
used in these studies, there has been a re- 
markable convergence of results. Skinner, 
Reed, and Jackson (1976), for instance, have
demonstrated the feasibility of generalizing
the modal personality profiles derived from
alcoholics (cf. Skinner et al., 1974) to other
psychiatric samples. Moreover, Skinner et al.
(1976) observed that two particular profiles 
(Type I: defensive-repressive, and Type II: 
impulsive-socially deviant) were notably per- 
vasive across all of the samples used in their 
study, a finding that is strikingly consistent 
with those reported by Nerviano (1976). The 
two most frequently occurring personality 
patterns in the sample of alcoholics used by 
Nerviano (1976) were Type A (high impulse 
control, low autonomy) and Type B (ex- 
tremely low impulse control). 

In view of the substantial body of research 
currently available on the personality patterns 
of alcoholics in the initial stages of recovery, 
it is surprising to find that to date, little re- 
search has been conducted with long-term 
recovered alcoholics. There are, for example, 
no readily available studies on the personality 
characteristics of these individuals. The 
paucity of research on long-term recovered 
alcoholics is particularly striking in light of 
the generally acknowledged difficulty in 
working with alcoholics as a treatment popu- 
lation (Huber & Danahy, 1975; Tamerin & 
Neumann, 1974). The lack of research in 
this area may be due, in part, to the difficulty 
in obtaining data on such persons. Long-term 
recovered alcoholics are not typically identi- 
fied with treatment programs and conse- 
quently are not readily available to the inter- 
ested researcher. Moreover, because of the
social, personal, and occupational stigma as-
sociated with alcoholism, the recovered alco-
holic usually seeks to preserve anonymity.
Thus, although the research literature con-
tains valuable information about the per-
sonality characteristics of newly recovered
alcoholics, there is a conspicuous lack of data
on the characteristics of individuals who have
been able to maintain long-term sobriety.
Such data would provide useful clinical in-
formation concerning the specific types of
personality characteristics associated with
long-term recovery among alcoholics as a
treatment population.
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ery among alcoholics. The samples used in 
the study consisted of alcoholics in two 
stages of recovery and a sample of non- 
alcoholic controls. The California Psycho- 
logical Inventory (CPI; Gough, 1957) and a
biographical data sheet were administered to 
all subjects, and multivariate data analysis 
techniques were used to determine differential 
personality patterns for the three samples. 
Since the research was intended to identify 
factors related to positive adjustment as well 
as clinical symptomatology, the CPI appeared 
to be a logical choice as an assessment de- 
vice. The CPI is designed for use with a 
general population and yields information on 
the adequacy of interpersonal as well as intra- 
personal functioning. 


Method 
Subjects 


Three samples (total N = 183) were used in this
study. Group 1, the short-term recovered alcoholics,
consisted of 60 “dry” alcoholics with at least 3
weeks but less than 4 months of sobriety (30 males
and 30 females). The majority were residents in
halfway houses in the Miami, Florida, area, and
19 were new members from Alcoholics Anonymous
(AA). Group 2, the long-term recovered alcoholics,
consisted of 62 sober alcoholics (31 males and 31
females) with a minimum of 4 years of continuous
sobriety, the mean length of sobriety being 8.9
years. All participants in Group 2 were active mem- 
bers of AA. Group 3, the nonalcoholic controls, con- 
sisted of 61 adult subjects (30 males and 31 females) 
chosen at random from the Miami, Florida, area. 
All members of the control group reported moderate 
to infrequent or no use of alcohol. 

All subjects were between the ages of 30 and 65. 
From the information available, the sober alcoholic 
and the nonalcoholic control groups appeared to be 
matched in terms of socioeconomic status. The dry 
alcoholics, on the other hand, ranked somewhat 
lower in terms of the traditional indices of socio- 
economic status (e.g., occupation, income, etc.).
These differences, however, appear to be related to
the duration of their alcoholism and the recency of
their recovery more than their actual status. The
mean ages of the groups were as follows: dry
alcoholics, 46 years (SD = 8.1); sober alcoholics,
49 years (SD = 11.1); nonalcoholic controls, 49
years (SD = 8.4).


Procedure 


The CPI and a biographical data sheet were ad- 
ministered to all participants individually or in small 
groups. For the sober alcoholics, most of the
administration was conducted at informal meetings of
active members of AA in the Miami area. The CPI 
was administered to the dry alcoholics, with the 
exception of the 19 new members of AA, on site at 
the various halfway houses in the Miami area. For 
the nonalcoholic controls, the CPI was administered 
in small groups to adult volunteers drawn from a 
wide variety of occupations (firefighter, meat cutter, 
pilot, postal worker, teacher, homemaker, etc.) and 
social backgrounds. The CPI was scored for the 18 
standard scales plus an additional scale developed 
by Hogan (1969). 


Results 


A 2 × 3 multivariate analysis of variance
(Clyde, 1969), using raw scores on the CPI
scales as dependent variables, was conducted
to determine the existence of sex and group
differences on the CPI profiles. A multivari-
ate test of significance, using Wilks’ lambda
criterion, indicated a significant main effect
due to sex, F(19, 159) = 9.64, p < .001, but
no significant Sex × Group interaction, F(38,
318) = .899, p > .05. An examination of the
univariate F tests for the main effect of sex
further indicated that males scored signifi-
cantly higher on Well-Being (Wb), F(1,
177) = 4.64, p < .05; and females scored
higher on Achievement via Independence
(Ai), F(1, 177) = 5.30, p < .01, and Fem-
ininity (Fe), F(1, 177) = 149.36, p < .001.


Table 1
Means, Standard Deviations, F Ratios, and Scheffé's S Method for the Samples Listed, Using the
California Psychological Inventory and the Social Maturity Index (SMI)

         Group 1a       Group 2b       Group 3c
Scale    M      SD      M      SD      M      SD      F       Scheffé
Sc       24.6   8.5     27.1   7.0     30.5   7.2     9.5*

ance; Gi = Good Impression; Cm = Communality;


The multivariate analysis of variance also
yielded a significant main effect for group,
F(38, 318) = 3.37, p < .001. More impor-
tantly, the pattern and direction of the sig-
nificant differences obtained for the univariate
F tests conducted for the individual scales
provided strong evidence for the existence of
distinctive personality profiles for each of
the groups. Fifteen of the 19 univariate F
tests were significant at the .001 level, and
31 of the 57 post hoc comparisons were sig-
nificant beyond the .05 level. Table 1
presents the means, standard deviations, and
F ratios for all of the CPI scales for the
three samples. Table 1 also presents the post
hoc comparisons between means using
Scheffé’s S method (Kirk, 1968).
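
The degrees of freedom reported for the multivariate tests can be checked against the design. The following is a minimal sketch, assuming the exact F transformation of Wilks' lambda that holds when the hypothesis has 1 or 2 degrees of freedom, and assuming a 2 × 3 design with 6 cells. The function name and its arguments are illustrative only and do not come from the original analysis program (Clyde, 1969).

```python
def wilks_exact_f_df(n_total, n_cells, n_dvs, hyp_df):
    """Return (df1, df2, error_df) for the exact F transform of Wilks' lambda.

    Assumption: the transform is exact only for hypothesis df of 1 or 2,
    which covers every effect in a 2 x 3 (Sex x Group) design.
    """
    if hyp_df not in (1, 2):
        raise ValueError("this sketch covers hypothesis df of 1 or 2 only")
    error_df = n_total - n_cells          # within-cells degrees of freedom
    base = error_df - n_dvs + 1           # error df adjusted for the number of scales
    return hyp_df * n_dvs, hyp_df * base, error_df


if __name__ == "__main__":
    # 183 subjects, 6 cells (2 sexes x 3 groups), 19 CPI-based scales
    for label, hyp_df in [("sex", 1), ("group", 2), ("sex x group", 2)]:
        df1, df2, error_df = wilks_exact_f_df(183, 6, 19, hyp_df)
        print(f"{label}: F({df1}, {df2}); univariate error df = {error_df}")
```

With N = 183, 6 cells, and 19 scales, the sketch gives F(19, 159) for sex and F(38, 318) for both group and the Sex × Group interaction, with 177 error degrees of freedom for the univariate tests. These agree with the values reported above. The 57 post hoc comparisons likewise correspond to 19 scales times 3 pairs of groups.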

The single most notable finding reported 
in Table 1 is the large number of significant 
differences between the personality profiles of 
all three groups. As can be seen from Table 
1, the personality profiles of alcoholics in 
two stages of recovery (i.e., newly recovered 
and long-term recovered) differed signifi- 
cantly from nonalcoholic controls and from 
each other. The results of the group compari- 
sons can be summarized as follows: First, the 
profile of the sober alcoholics differed signifi- 
cantly from that of the nonalcoholic controls 
on the following nine scales: Capacity for
Status (Cs; p < .01), Sociability (Sy; p <
.01), Responsibility (Re; p < .05), Sociali-
zation (So; p < .001), Self-control (p < .05),
Good Impression (p < .01), Achievement via
Conformance (Ac; p < .01), Intellectual Ef-
ficiency (p < .05), and Empathy (Em; p <
.05). Moreover, the nonalcoholic controls ex-
hibited a more normal profile, scoring higher
than the sober alcoholics on all nine scales.
Second, the dry alcoholics had an extremely
depressed profile in comparison to that of the
nonalcoholic controls. The dry alcoholics
scored significantly lower on all but the fol-
lowing four scales:
Communality, Flexibility,
13 of the 15 significant differences were in
excess of .001. Finally, in comparison to the
dry alcoholics, the sober alcoholics in general
exhibited a more elevated profile, scoring
higher than the dry alcoholics on the follow-
ing seven scales: Wb (p < .05), Re (p <
.001), So (p < .05), Tolerance (p < .05),
and Psychologi-

Figure 1. Group centroids for discriminant functions
I and II.


criterion variable and CPI scale scores as 
predictor variables (Cooley & Lohnes, 1971). 
The analysis yielded two significant discrimi- 
nant functions; and a measure of overall
group differentiation, Wilks’ lambda, indi-
cated that both functions significantly dis- 
criminated between groups (p < .001). The 
centroids of the three groups are plotted in 
Figure 1, with the first significant discrimi- 
nant function serving as the ordinate and the 
second discriminant function serving as the
abscissa. These group centroids are the mean 
scores for the individuals within each group 
for the two significant functions. 

The first and most important dimension 
for differentiating between groups, which ac- 
counted for 69.6% of the between-group 
variance, was a bipolar dimension defined on 
the positive end by So (.56), Re (.41), and 
Em (.37) and defined on the negative end by 
Sy (—.37). This dimension, which was la- 
beled interpersonal values, appears to reflect 
a socially mature, empathic, and articulated 
value system versus a primarily affiliative, 
other-directed value orientation. From the 
post hoc comparisons presented in Table 1, 
it can be seen that the dry alcoholics had 
the lowest scores on So, Re, and Em (the 
CPI “moral” scales) and Sy (a measure of
interpersonal affiliation). More importantly, 
as can be seen from Figure 1, the first function
differentiates among all three groups,
with the nonalcoholic controls scoring high- 
est and the dry alcoholics scoring lowest. 
The second significant function, which ac- 
counted for 30.4% of the between-group 
variance, is a dimension of general profile ele- 
vation. Nine of the CPI scales loaded above
.35 on this function, whereas only four loaded
above .35 on the first function. This dimen- 
sion of general profile elevation is defined 
primarily by Self-Acceptance (Sa) (.58) and 
Wb (.56) on the positive end and by Sy 
(—1.36) on the negative end. High scores on 
Sa indicate the presence of a positive self- 
image and high self-esteem; high scores on 
Wb indicate a sense of well-being and a free-
dom from excessive health concerns. Low Sy 
scores, on the other hand, indicate social de- 
tachment, passivity, and generally poor in- 
terpersonal functioning. This dimension, 
which was labeled interpersonal identity, ap- 
pears to reflect a relatively conflict-free and 
comfortable sense of personal worth versus 
uncertainty in self-evaluation resulting in a 
need for interpersonal supports. The signifi- 
cance of this dimension can be seen from the 
pattern of centroid plots displayed in Figure 
1. The sober alcoholics scored highest on this 
function, whereas the dry alcoholics and the 
nonalcoholic controls obtained about the same 
mean centroid, suggesting that long-term re- 
covered alcoholics display a unique psycho- 
logical adjustment profile. On this dimension, 
long-term recovered alcoholics appeared to 
be relatively nonneurotic but moderately
socially maladjusted. Confirmation of this
interpretation can be found in Table 1. The 
sober alcoholics did not differ significantly 
from the nonalcoholic controls on Sa or Wb, 
but they scored significantly lower than the 
controls on Cs and Sy. 

For the final phase of the analyses, subjects
in all three groups were scored for a CPI-
based “social maturity index” developed by
Gough (1966). The social maturity index
was originally defined by comparing the re-
sponses of a large sample of delinquents and
nondelinquents on the CPI and developing a
six-variable regression equation to distinguish
between the groups. Subsequent research
(Gough, De Vos, & Mizushima, 1968; Hogan,
Mankin, Conway, & Fox, 1970) has estab-
lished the utility of the social maturity index
as a measure of antisocial tendencies. The
constant and the weights for the equation
have been adjusted so that the mean score on
the index in a normal population will be 50.0.
In the original sample, the nondelinquent
mean was 50.4 and the mean delinquent
score was 42.7. Mean scores for the dry al-
coholics, sober alcoholics, and nonalcoholic 
controls used in this study were, respectively, 
45.4 (SD = 3.7), 47.2 (SD = 3.2), and 49.4 
(SD = 3.3). A one-way analysis of variance, 
using social maturity scores as the dependent 
variable, was highly significant, F(2, 180) = 
22.05, p < .001. Post hoc comparisons, using 
Scheffé’s S method, were significant at or 
above the .05 level for all paired comparisons. 
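
The one-way analysis of variance on the social maturity index can be approximately recomputed from the summary statistics given above. The following is a minimal sketch, assuming only the reported group sizes (60, 62, 61), means, and standard deviations; the function name is illustrative, and the raw scores are not available, so the result can only approximate the published F ratio.

```python
def anova_from_summary(ns, means, sds):
    """One-way ANOVA F ratio computed from group sizes, means, and SDs.

    Returns (f_ratio, df_between, df_within).
    """
    n_total = sum(ns)
    grand_mean = sum(n * m for n, m in zip(ns, means)) / n_total
    # Between-groups sum of squares: weighted squared deviations of group means
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m in zip(ns, means))
    # Within-groups sum of squares: pooled from the group variances
    ss_within = sum((n - 1) * sd ** 2 for n, sd in zip(ns, sds))
    df_between = len(ns) - 1
    df_within = n_total - len(ns)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


if __name__ == "__main__":
    # Dry alcoholics, sober alcoholics, nonalcoholic controls
    f_ratio, df_b, df_w = anova_from_summary(
        ns=[60, 62, 61],
        means=[45.4, 47.2, 49.4],
        sds=[3.7, 3.2, 3.3],
    )
    print(f"F({df_b}, {df_w}) = {f_ratio:.2f}")
```

The degrees of freedom come out as (2, 180), matching the report, and the recomputed F ratio is roughly 21, close to the reported 22.05. A gap of this size is within what rounding of the published means and standard deviations to one decimal place could produce.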


Discussion 


This study presents data on the personality 
characteristics of alcoholics in two phases of 
recovery. A multivariate analysis of variance 
was used to determine group differences on 
the personality variables, and a multiple dis- 
criminant analysis was utilized to identify 
the most useful dimensions for differentiating 
between both groups of alcoholics and a 
group of nonalcoholic controls. The overall 
results of the analyses indicate the existence 
of differential patterns of adjustment and 
coping strategies at each stage of recovery. 

The personality profile of the alcoholics in 
the initial stages of recovery showed a marked 
similarity to common personality patterns 
identified in previous research. The profile of 
the newly recovered alcoholics can be sum- 
marized as follows: First, the dry alcoholics 
exhibited an extremely depressed profile, in- 
dicating a generally poor level of adjustment. 
Second, this group was characterized by 
strong antisocial tendencies and impulsive- 
ness, scoring significantly lower on the social 
maturity index than both other groups. Third, 
the dry alcoholics appeared to exhibit a sense 
of interpersonal inadequacy, low self-esteem, 
and feelings of guilt and self-blame. This
overall pattern is notably consistent with 
several of the modal personality profiles re- 
ported in previous research with alcoholic 
patients. In particular, the CPI profile of the 
newly recovered alcoholics in this study bears
a strong resemblance to the Type II per-
sonality (impulsive-socially deviant) identi- 
fied by Skinner et al. (1976) using the DPI, 
and to the Type B personality (extremely 
low impulse control) identified by Nerviano 
(1976) using the PRF and the Sixteen Per- 
sonality Factor Questionnaire. The con- 
vergence of results obtained with such a wide 
range of assessment devices provides strong 
evidence for the pervasiveness of this per- 
sonality type among alcoholics at this stage 
of recovery. 

The long-term recovered alcoholics, on the 
other hand, displayed a unique personality 
profile, differing from both the newly recov- 
ered alcoholics and the nonalcoholic controls. 
The profile of the sober alcoholics can be de- 
scribed as follows: First, the sober alcoholics 
displayed a generally more elevated profile 
than the dry alcoholics but significantly less 
elevated than the controls, a finding which 
suggests that the maintenance of long-term 
sobriety is associated with an overall im- 
provement in general level of adjustment. 
Second, the sober alcoholics appeared to be 
moderately undersocialized in terms of intra- 
personal values. As a group, they obtained a 
midrange score on the social maturity index. 
Moreover, they scored significantly higher on 
Re and So than the dry alcoholics but lower 
than the nonalcoholic controls, suggesting 
that recovery is significantly but not exclu- 
sively related to a restructuring of intraper- 
sonal values. Third, the sober alcoholics dis- 
played a pattern of poor interpersonal func- 
tioning similar to that of the dry alcoholics.
As a group, they scored significantly lower
on Cs and Sy than the controls, indicating a
sense of interpersonal inadequacy and social
inhibition. This generally poor level of inter-
personal functioning stands in sharp contrast
to their generally good level of intrapersonal
functioning. They did not differ significantly
from the controls on CPI scales concerned
with intrapersonal functioning (Sa and Wb),
but the dry alcoholics did. This finding sug-
gests that long-term recovered alcoholics,
although socially inhibited, are relatively
self-accepting, have a strong sense of
well-being, and are free from excessive health
concerns.

The findings reported in this study, al-
though tentative, appear to have implications 
for both treatment intervention and the 
identification of areas of much needed re- 
search. Current research efforts have tended 
to focus on the identification of distinctive 
subgroups of personality types among diag- 
nosed alcoholics and have emphasized the 
heterogeneous nature of the treatment popu- 
lation. The notable success of such research 
efforts has led to valuable suggestions con- 
cerning treatment intervention. Nerviano 
(1976), drawing on his extensive research 
with alcoholic patients, has suggested that 
treatment type and modality be adjusted to 
the individual alcoholic on the basis of 
identified personality typologies. The results 
reported here, while not directly concerned 
with the identification of personality sub- 
types, are consistent with this previous re- 
search and would further extend this sug- 
gestion to include a consideration of stage of 
recovery as an important factor in the deter- 
mination of treatment modality and type. 
For example, one of the clinically most sig- 
nificant findings of this study was that the 
overall personality profile of the long-term 
recovered alcoholics differed significantly 
from both the newly recovered alcoholics and 
the nonalcoholic controls, suggesting a dis- 
tinctive pattern of psychological adjustment 
among long-term recovered alcoholics. The 
most discriminatory dimension for differenti- 
ating between long-term recovered alcoholics 
and the other two groups was defined by 
(a) the general level of profile elevation, (b) 
an absence of psychoneurotic tendencies, 
and (c) poor interpersonal functioning. This 
finding would seem to strongly suggest that 
treatment intervention be adjusted to in- 
clude stage of recovery as well as overall per- 
sonality profile. Finally, the results reported 
here point to the need for more extensive and
systematic research on the personality char-
acteristics of long-term recovered alcoholics.


Cognitive Preparation and Coping Self-Talk:
Anxiety Management During the Stress of Flying


Michel Girodo and Julius Roehl

University of Ottawa and The Royal Ottawa Hospital, Ottawa, Canada

The effectiveness of two cognitive coping strategies, singly and in combination,
was investigated in 56 undergraduate females with a reported fear of flying.
Subjects were assigned to four groups: preparatory information training, self-
statement training, combined (information and self-talk), and pseudotreatment
control and were flown aboard an 11-passenger Twin Otter aircraft for two
flights. One half of the subjects flew with the door to the cockpit open; the
other half flew with the door closed. Each flight encountered a planned unex-
pected missed landing. Self-reports of anxiety were obtained before takeoff,
during the flight, and after landing. Even though the cognitive-coping strategies
were not differentially effective in reducing anxiety during the ongoing stress of
flying, under serious threat (unexpected event), with the cockpit door open,
self-talk and combined subjects coped better than information and control
subjects. With the door closed, all groups increased in anxiety. At final landing,
with the door closed, self-statement-trained subjects increased in their self-
reported anxiety. The results of a 4½-month follow-up on flight apprehension
are discussed in light of the effects of the treatment manipulations.


Stress inoculation training (SIT) refers 
to a cognitive-behavioral treatment package 
(Meichenbaum, 1975) in which patients are 
taught to emit positive coping self-statements 
in learning how to cope with such stressors as 
pain (Turk, Note 1), test anxiety (Meichen- 
baum, 1972), and anger problems (Novaco, 
1976). Basically, the SIT procedure involves 
(a) providing the subject with a theoretical 
framework or rationale for conceptualizing the 
effectiveness of self-talk in coping with a stress 
reaction; (b) teaching the subject a series of 
positive self-statements that focus on (i) 
preparing for a stressor, (ii) confronting and 
handling a stressor, (iii) coping with a feeling 
of being overwhelmed, and (iv) reinforcing 
oneself for having coped; and (c) trying out or
rehearsing coping self-talk. 


The authors are grateful for the assistance provided
by Air Transit and the University of Ottawa in
support of the study.
Requests for reprints should be sent to Michel
Girodo, School of Psychology, University of Ottawa,
Ottawa, Ontario, Canada KIH 6K9.


Coping self-talk procedures have also been
used in connection with the reduction of
psychological stress in surgical patients by
Langer, Janis, and Wolfer (1975). In this
study, patients who were taught cognitive
reappraisal of anxiety-provoking events and
were induced to engage in calming self-talk
and attention diversion coped better with
presurgery and postsurgery stresses than did the
subjects who were given preparatory informa-
tion designed to induce “work of worrying”
and subsequent emotional inoculation for the
stressor (Janis, 1958, 1971). Although the
preparatory information increased arousal, this
effect dissipated over time, and no evidence of
positive postoperative effects was found.
These results are in contrast to the findings of
Melamed and Siegel (1975), who studied the
effects of filmed modeling in reducing anxiety
in children facing hospitalization and surgery.
The results of this study support the contention
that a moderate amount of arousal prior to the
stressor may facilitate coping with the stress
once it presents itself.

Another way of looking at the work of
worrying process focuses on the notion that the
beneficial effects of engaging in work of worry-
ing are obtained when the person engages in
anticipatory problem solving and cognitive
rehearsal of coping with the forthcoming stress-
ful event (Meichenbaum, 1975). The sugges-
tion is that the individual experiences a
moderate level of arousal when covertly re-
hearsing the handling of a stressful situation,
and this prompts the emotional inoculation
necessary to help the person cope better with
the stressor.

The general purpose of the present experi-
ment was to assess the applicability of the SIT
procedure and preparatory information in
coping with a real-life stressor. Specifically, the
experiment sought (a) to examine the relative
effectiveness of (i) the SIT procedure and (ii)
preparatory information, singly and in com-
bination, in coping with a flying experience
among persons who were apprehensive of
flying; (b) to study the effects of these treat-
ment procedures over two trials; and (c) to
examine the effects of an additional stressor in
the form of an “unexpected missed landing”
on coping ability. It was felt that an unexpected
stressful event during the flight might provide
the needed variation to demonstrate greater
differential effectiveness of one treatment over
the other.

An additional feature of the experiment con-
cerned whether subjects had visual access to
the pilot and copilot in the cockpit area. In this
regard, it was hypothesized that subjects who
received preparatory information in the form
of reliance on “danger control authorities”
would experience less anxiety under condi-
tions in which such figures were visually present
and salient, compared to preparatory informa-
tion subjects who did not have access to such
visual information and external reassurances.
It was predicted that subjects trained in the
SIT procedure would not be as reliant on
danger control authorities or on seeking ex-
ternal reassurances compared with individuals
given preparatory information, and, as such,
that they would be less susceptible to the
effects of the availability or nonavailability of
these external cues.


Method 


Subjects 


Subjects were 56 female undergraduate students
ranging in age from 18 to 34 (M = 21). They were
selected on the basis of four screening questionnaire
criteria: (a) They scored 6 or more on a 10-point scale 
designed to assess flight apprehension (i.e., “How do 
you usually feel while flying in an airplane?” Very calm 
and relaxed served as the anchor at one end, and very 
nervous and tense provided the anchor at the other 
end). (b) They had previously flown (M no. flights 
= 5.48). (c) They had never flown in the particular type 
of aircraft used in the present experiment. (d) They had 
indicated on the screening questionnaire that they 
would be willing to serve as a subject in an experiment 
that involved flying. The screening questionnaire was 
completed by 1,526 male and female students enrolled 
at the University of Ottawa, Carleton University, and 
Algonquin College. Of these, 92 females met the four
selection criteria.¹ These subjects were contacted on a
random basis and were scheduled for participation on
1 of 6 weekends according to their availability. 


Aircraft and Apparatus 


The aircraft in which subjects were flown consisted of 
a DeHavilland Twin Otter with a capacity of two crew 
and 11 passengers. This short takeoff and landing 
(STOL) aircraft takes off and lands using a 1,500 foot 
(457.2 m) runway at a 6-degree angle (compared with 
the conventional 3-degree slope). In addition to stand- 
ard-specialized avionics, the STOL aircraft was equipped 
with a computerized air data acquisition system, which 
recorded 43 performance parameters of the flight. Of
interest to study for possible covariate analysis were 
measures of turbulence in the form of vertical devia- 
tion scores obtained every 2 sec. Earphones and a 
hand-held microphone were installed at the rear of the 
cabin to allow for constant communication between the 
cockpit area and the experimenter. A sliding door 
served to separate the pilot and passenger area for half 
of the flights. The seating in the STOL was the same
as that found in modern passenger aircraft, with five 
seats on the left and six seats on the right of the aircraft. 


Design 


Four training groups of 14 subjects each were exposed 
to one of four treatments or treatment combinations. 
One group of subjects was exposed to preparatory in- 
formation; a second group, to the SIT procedure (self- 
talk); a third group, to a combination of the informa- 
tion and self-talk treatments (combined) ; and a fourth 
group of subjects was exposed to films on the history of 
aviation (control). 

Subjects were seen in small groups for training on the
Saturday preceding the Sunday flight. All subjects
participated in two flying trials, the first involving a
45-minute flight from Ottawa to Montreal with a
20-minute stopover, followed by a second identical
trial returning to Ottawa. Subjects in the same treat-
ment groups were assigned seats behind each other on
the same side of the aircraft. This served to reduce
possible social comparison processes; persons visible
to any one subject were essentially strangers in a
different treatment condition. All subjects flew during
both trials with either the door open or closed. Measures
of self-report of anxiety were obtained at six points
throughout the first trial: during “rev up”; after takeoff;
following 10 minutes, 20 minutes, and 30 minutes of
cruising; and after landing. On Trial 2, an additional
measure of self-report of anxiety was obtained following
the “missed approach.” Thus, this formed the basis of a
4 (groups) × 2 (door) × 2 (trials) × 6 or 7 (assessment
periods) factorial design.


¹ The distribution of scores for the 943 females had
a mean of 3.0 with a standard deviation of 2.8. The
cutting score of 6 (z = 1.29) represented the upper 10%
of the population.

On the second trial, the missed landing occurred as 
follows: In its final approach into Ottawa, the aircraft 
descended in a normal landing pattern to an altitude 
of 100 feet (30.4 m). At this point a stall-warning horn 
sounded in the cockpit, the pitch of the propellers was 
changed, and full power was applied to the engines. 
The aircraft rose rapidly, banking to the left until it 
reached an altitude of 2,000 feet (609.6 m). At this 
point, the captain apologized for the missed approach, 
explaining that a light aircraft was taxiing onto the 
runway without authorization from the control tower 
and that they would be making a normal landing in 
about 5 minutes. 
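

The factorial layout described in this section can be tallied in a few lines. The following is a minimal sketch, not part of the original report: it assumes that each 14-subject training group was split evenly between the door open and door closed flights (an inference from the 7-subject cells cited in the Results), and the period labels are illustrative names chosen here.

```python
from itertools import product

GROUPS = ["information", "self-talk", "combined", "control"]
DOORS = ["open", "closed"]
SUBJECTS_PER_GROUP = 14  # stated in the Design section

# Assumption: each group is divided evenly over the two door conditions.
subjects_per_cell = SUBJECTS_PER_GROUP // len(DOORS)

# One between-subjects cell per Group x Door combination.
cells = {(group, door): subjects_per_cell for group, door in product(GROUPS, DOORS)}
total_subjects = sum(cells.values())

# Six assessment periods on Trial 1; Trial 2 adds the missed-approach rating.
TRIAL_1_PERIODS = ["rev up", "takeoff", "cruise 10 min",
                   "cruise 20 min", "cruise 30 min", "landing"]
TRIAL_2_PERIODS = TRIAL_1_PERIODS[:5] + ["missed approach"] + TRIAL_1_PERIODS[5:]

ratings_per_subject = len(TRIAL_1_PERIODS) + len(TRIAL_2_PERIODS)

print(len(cells), subjects_per_cell, total_subjects, ratings_per_subject)
```

Under these assumptions the design has 8 between-subjects cells of 7 subjects each, and every subject contributes 13 in-flight ratings across the two trials.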


Procedure 


Instructions and specific training procedures were 
delivered primarily via audio tape recorder. All subjects 
were assured that their responses would be kept con- 
fidential and that no harm would come to them in the 
normal course of the experiment. The experimenter also 
requested that subjects not disclose details of the 
experiment to anyone until they had received a state- 
ment in the mail describing the purpose and results of 
the experiment and the role they had played. Following
this, subjects were asked to complete a self-report of
anxiety inventory (SRAI) and the initial flight appre-
hension inventory (FLAPI) and were asked to sign a
consent form. 


Self-Talk Training 


This 2½-hour training procedure attempted to follow
as closely as possible the descriptions given by Meichen- 
baum (1975) in outlining the SIT procedure. Briefly, 
as part of this training, the experimenter stressed the 
importance of learning positive self-statements as a 
method for coping with stress. All subjects were asked
to share with other members in the group events in 
which they experienced nervousness or anxiety and to
recall the negative thoughts th[...] situations. Through a [...]
rationale of the SIT and a [...] in the four phases of copi[ng] [...]
sented. Taking a slow de[...] effective strategy for de[...]
component of anxiety. [...] example of a person taki[ng] [...]
self-statements and feelings of anxiety were present,
followed by instructions that negative self-statements 
and anxiety feelings should serve as cues for emitting 
positive-coping self-statements. Following this, subjects 
were given a prepared list of coping self-statements 
(Meichenbaum, 1975). They were then instructed to 
imagine a stressful situation, to prepare their own list 
of coping self-statements, and to memorize this list. 
Following a 10-minute learning period, subjects were 
tested on their recall of the self-statements. To further 
consolidate the training procedure, the tape continued 
with a description of an imaginary horseback ride for a 
person who was fearful of horses. Subjects were asked 
to fantasize and imagine clearly a person making 
positive coping self-statements as the horse was 
approached, touched, mounted, ridden, and dismounted
after a successful riding experience. Following this 
exercise, subjects were asked to share with other 
members of the group their individual self-statements, 
and each subject compiled the final list of positive self- 
statements that she was to use when flying the next 
day. Subjects were instructed to rehearse making 
positive coping self-statements in anticipation of the 
flight of the following day. 


Preparatory Information Training 


As in the self-talk training, this procedure began
with the same request for a discussion of events in 
which subjects experienced nervousness or anxiety. 
Following this, the experimenter expressed the im- 
portance of acquiring prior information about poten- 
tially stressful events as a method for coping with stress. 
The experimenter then illustrated the effectiveness of 
prior information for coping with test anxiety. A tape
recording then described the theoretical rationale of this 
training procedure, drawing examples from the scientific 
literature (e.g., Janis, 1958, 1971). The audiotape 
continued with a detailed description of the events 
that were to take place the following day. This descrip- 
tion included the nature of the transportation to the 
airport, the preboarding procedures, the exterior and
interior of the aircraft, and the takeoff and landing
procedures. Twelve colored slides depicting the STOLport,
aircraft, a view from the ground over the two
cities and during cruising, takeoff, and landing were
shown at appropriate points in the presentation. To
have the subjects view the pilots as “danger control
authorities,” descriptions of the pilots’ training and
experience with emphasis on their skill, expertise,
maturity, and 20 years of service were given. To control
for time with the experimenter, the above presentation
was given twice.


Combined Training 


This training procedure consisted of a 3½-hour com-
bination of the SIT and preparatory information pro-
cedures. To control for equal time with the experi-
menter, the information procedure was presented once.


Pseudotreatment Control


In this condition, as in the other three treatments, 
subjects spent an initial period of time sharing with the 
group how they felt and what they thought of when in a 
stressful situation. This was designed to control for 
possible effects associated with ventilation of affect 
related to fears. Following this, subjects were shown 
three 20-minute films depicting the history of aviation 
in Canada. Following the film presentations, subjects 
were asked to relate to the group what they liked least 
and most about each film. Thus, time with the experi- 
menter and interaction time among subjects was 
equivalent across all conditions. 

To reduce time spent completing scales on the day 
of the flight and after training, subjects in all groups 
were given specific instructions and practice with the 
SRAI that they would be using the following day. 
Subjects were instructed not to talk to each other at 
any time at the STOLport or throughout the flight. 


Dependent Measures²


The SRAI served as the chief dependent measure
in the experiment. This inventory consisted of four
11-point scales anchored at each end from very to not
at all, and was designed to measure cognitive and
somatic components of anxiety after the symptom
clusters of Buss (1966). The four scales measured (a)
calm and relaxed; (b) worried and apprehensive; (c)
anxious and nervous; and (d) tense and trembling.
Item analyses correlating each scale score with total
scale score revealed that essentially the same construct
was being tapped, consistent with coefficients obtained
in previous studies (Girodo, 1974).
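

The article does not spell out how the four scales were combined into a single SRAI score. The following is a minimal sketch of one plausible scoring rule, resting entirely on assumptions made here: each 11-point scale is coded 0 to 10, the "calm and relaxed" scale is reverse-keyed, and the total is the simple sum. The function name `srai_total` is hypothetical.

```python
SCALE_MAX = 10  # assumption: each 11-point scale is coded 0..10


def srai_total(calm, worried, anxious, tense):
    """Sum the four ratings into one anxiety score (hypothetical rule).

    Assumption: "calm and relaxed" is reverse-keyed so that higher totals
    always mean more reported anxiety.
    """
    ratings = (calm, worried, anxious, tense)
    if any(not 0 <= rating <= SCALE_MAX for rating in ratings):
        raise ValueError("each rating must lie between 0 and 10")
    return (SCALE_MAX - calm) + worried + anxious + tense


print(srai_total(calm=7, worried=3, anxious=2, tense=1))
```

Under this rule totals range from 0 to 40, which is at least compatible with the group means reported in Table 1, but the actual scoring key may have differed.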

The SRAI was administered prior to training, im-
mediately following training, at the training site the
morning of the flight, at the STOLport prior to boarding,
during the rev up, following takeoff, after three
10-minute cruising periods, following the unexpected
event on Trial 2, and after landing. The SRAI was
modified slightly on several of these occasions to coin-
cide with the events that had just transpired. Thus,
although subjects were generally asked to indicate
how they felt at present, the estimates obtained for
the three 10-minute cruising periods asked the subjects
to indicate how they had felt during the last 10 minutes
or how they had felt as a result of the takeoff, the un-
expected event (see below), and the landing. Another
modification asked, “How have you been feeling since
yesterday afternoon in anticipation of the flight today?”
Subjects used the SRAI scales to reply to this item
when they met at the training site on the morning of
the flight. Following the missed approach, the experi-
menter stated that he wanted to take advantage of the
situation and record reactions on an additional ques-
tionnaire. This he did by distributing an untitled SRAI
and verbally requested the subjects to indicate on the
scales provided how they had felt as a result of what
had just happened.

Treatment effectiveness questionnaire. This consisted
of an eight-item postexperimental questionnaire de-
signed to measure the effectiveness of self-statements
in coping with the stress of flying. Here, self-talk and
combined group subjects were asked to (a) list the self-
statements that they emitted during the flight; (b)
indicate the percentage of time that they used self- 
talk, and (c) rate the effectiveness of self-talk as a 
coping device on an 11-point scale ranging from not as 
helpful and useful at one end to very helpful and useful 
at the other end. Subjects in the information condition 
were asked to (a) rate the effectiveness of the informa- 
tion given to them on a similar 11-point scale in helping 
them cope with the stress and (b) indicate what kind 
of information might have been beneficial to them in 
coping with the flight. 


Follow-up 


A follow-up letter and questionnaire were mailed to
all subjects 4½ months after the conclusion of the ex-
periment. Although this was designed to assess any
negative or untoward effects of participating in the 
experiment, it also inquired into the current flight 
apprehension of subjects by repeating the FLAPI, 
given at the time of initial screening. 


Results


Training and Preflight Assessment


An analysis of variance performed on SRAI 
scores obtained immediately after training 
indicated that treatment manipulations pro- 
duced significant differences in self-report of 
anxiety between groups, F(3, 52) = 8.83, 
p < .001. Tukey tests indicated that both self- 
talk and combined subjects reported signifi- 
cantly more anxiety (p < .05) immediately 
after training compared with subjects in the 
control and information conditions. The scores 
obtained on the SRAI the morning of the 
flight at the training site failed to yield 
significant group differences; however, the re- 
sults obtained by using the Duncan’s multiple- 
range test (Duncan, 1955) suggested that self- 
talk subjects were still slightly more aroused 
than information subjects (p < .05). A similar 
analysis of SRAI scores obtained at the STOL- 
port prior to departure yielded a similar non- 
significant F; however, again Duncan’s test 
revealed that self-talk subjects were slightly 
more aroused than subjects in the combined
group (p < .05).


² At the STOLport, electrodes were attached to the
subjects’ wrists and left ankle to record heart rate on
cassette. Equipment problems early in the experiment
forced the experimenter to abandon this measure; how-
ever, electrodes were still applied to all subjects on the
remaining flights.


Table 1
Means and Standard Deviations of Self-Report of Anxiety Inventory (SRAI) Scores at
Pretraining and Posttraining and Work of Worrying Scores on Flight Day

                               SRAI
               Pretraining         Posttraining      Work of worrying scores
Group           M       SD          M        SD          M         SD

Information   11.65    7.67        6.43*    4.18        8.60      4.86
Self-talk     14.29    6.93       14.36     6.42       15.50**    5.00
Combined      17.21    7.81       12.29     7.26        8.14      5.33
Control       13.43    7.71        5.29*    3.52        8.29      5.20

* p < .05; significant decreases from pretraining to posttraining.
** p < .001; significantly greater than mean scores from other groups.


An analysis of variance performed on the 
flight anticipation scale that asked how sub- 
jects had been feeling since yesterday afternoon 
in anticipation of the flight showed a signifi- 
cant treatment group effect F(3, 52) = 6.96, 
p < .001. Tukey tests indicated that self-talk 
subjects reported engaging in significantly more 
work of worrying than subjects in the other 
three groups. Table 1 summarizes the results 
of the SRAI and “worry” scales for these 
preflight assessments. 


Flight Assessment 


It was found that although turbulence did 
take place and was perceptible by the pilots, it 
had no significant effect on SRAI reports.³

Trial 1. A 4 (groups) × 2 (door) × 6 (as-
sessment periods) analysis of variance on the
SRAI scores yielded a significant effect for
assessment periods, F(5, 48) = 35.96, p < .001.
Treatment groups did not differ from each
other at either rev up, takeoff, following each
of the three 10-minute cruising periods, or
during landing. An analysis of variance on
the Groups × Door × Cruising Periods pro-
duced a main effect for cruising periods,
F(2, 48) = 8.31, p < .001. A main effect for the
door condition was obtained for the third
cruising period, F(1, 48) = 4.73, p < .03,
and Tukey tests revealed that subjects who
flew with the cockpit door open reported
significantly more arousal (p < .01) than sub-
jects who flew with the door closed. No SRAI
differences were found between groups for the
door condition for the first landing.


Trial 2. A 4 × 2 × 7 analysis of variance
on SRAI scores produced a main effect for the 
door condition, F(1, 48) = 5.48, p < .03, and 
a main effect for assessment periods, F (5, 48) 
= 47.69, p < .001. An analysis of variance con- 
sidering only the treatment groups and the 
rev up and takeoff periods failed to produce 
significant differences. An analysis of variance 
on the groups, door condition, and three 
cruising periods yielded a significant effect for 
door, F(1, 48) = 8.90, p < .004, and for 
cruising periods, F(2, 48) = 11.46, p < .001. 
Even though treatment groups did not differ 
from one another at any of the cruising periods, 
subjects in the door open or door closed condi- 
tions differed significantly from each other 
after the first 10 minutes, F(1, 48) = 10.32,
p < .003; following 20 minutes of cruising,
F(1, 48) = 5.90, p < .02; and following 30 
minutes of cruising, F(1, 48) = 8.20, p < .006. 
Figure 1 illustrates the nature of these findings. 
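

The degrees of freedom quoted with the F ratios above follow from the sample size and the number of design cells. A minimal sketch of that arithmetic, assuming 56 subjects and purely between-subjects error terms; the function names are illustrative and do not come from the article.

```python
from math import prod


def one_way_df(n_subjects, n_groups):
    """Return (between-groups df, within-groups df) for a one-way ANOVA."""
    return n_groups - 1, n_subjects - n_groups


def between_error_df(n_subjects, *factor_levels):
    """Error df for a fully crossed between-subjects design.

    The number of cells is the product of the factor levels.
    """
    return n_subjects - prod(factor_levels)


# Four training groups, 56 subjects: the F(3, 52) tests on preflight scores.
print(one_way_df(56, 4))

# Groups (4) x Door (2): the 48 error df quoted for the in-flight analyses.
print(between_error_df(56, 4, 2))
```

The same rule with an added two-level seating factor gives 56 - 16 = 40, which is the error term quoted for the Group × Seating analysis.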

Subjects who flew with the door open re-
ported significantly more arousal than subjects
who flew with the door closed when con-
fronted by the unexpected event, F(1, 48)
= 4.25, p < .04. At final landing, however, no
significant overall differences were found
between door open or door closed conditions.

Figure 1. Mean self-report of anxiety inventory scores for subjects in the door open and door closed
conditions across flight events for the second trial.


³ A complete report on the way the air data acquisi-
tion system and crew reports of turbulence were treated
in analyses of covariance is available from the first
author.


Since the door conditions produced signifi-
cant differences in SRAI scores, separate post
hoc analyses were performed on the third
cruising period, the unexpected event, and the
final landing. Figure 2 plots the SRAI scores
for subjects in the treatment groups across
these three periods for door open and door
closed conditions.

When the door open condition is examined,
it can be seen that significant differences be-
tween self-talk subjects and control subjects
were obtained during the unexpected event,
t = 3.10, p < .009. Again, in the door
open condition, the changes in SRAI scores
from the last cruising period to the unexpected
event were examined by correlated t tests. It
was found that the SRAI scores of subjects
in the control condition and in the information
condition significantly increased from the last
cruising period to the unexpected event, t(6)
= [...], p < .02, and t(6) = 3.39, p < .002,
respectively. Even though there was a tendency
for the scores of both self-talk and combined
group subjects to increase, this was not
statistically significant. From the unexpected
event to the landing, all groups tended to
show decreases in their self-report of anxiety;
however, only the control subjects’ scores de-
creased significantly from the unexpected
event to the landing, t(6) = 2.53, p < .05.

In the door closed conditions, the scores of
subjects in all four treatment groups increased
significantly but not differentially. Considering
the SRAI scores from unexpected event to
landing, correlated t tests revealed that
although the SRAI scores of the control group
subjects decreased significantly, t(6) = 2.48,
p < .05, together with a nonsignificant ten-
dency for the scores of the combined group
subjects to decrease, the scores of self-talk
subjects actually increased from unexpected
event to landing, t(6) = 4.58, p < .004. In-
formation subjects showed a nonsignificant
increase in self-report of anxiety.

Figure 2. Mean self-report of anxiety inventory scores for subjects in the four treatment groups, in door
open or closed conditions, at the last cruising period, the unexpected event, and the final landing. (From
Stress and Anxiety (Vol. 4) by I. G. Sarason and C. D. Spielberger (Eds.), 1977. Copyright 1977 by
Hemisphere. Reprinted by permission.)


MICHEL GIRODO AND JULIUS ROEHL


In view of the wide variability of the data
at each of the plotted points, complex chi-
square analyses were performed on the number
of subjects whose scores decreased as opposed
to those whose scores stayed the same or in-
creased. From the last cruising period to the
unexpected event, scores for all subjects except
one (self-talk, door open) increased, χ²(3) =
3.06, ns; however, differential changes in SRAI
scores from unexpected event to landing were
obtained, χ²(3) = 18.89, p < .005, in that
scores of 11 of the 14 subjects in the control
group and 11 of the 14 subjects in the com-
bined group decreased from unexpected event
to landing, whereas only 3 of the 14 self-talk
subjects and 7 of the 14 information subjects
obtained lower SRAI scores from unexpected
event to landing. When the door conditions
were examined for chi-square differences, only
the door closed chi-square obtained differences
in number of subjects who either increased or
decreased from unexpected event to landing,
χ²(3) = 15.43, p < .02. Here, 5 of 7 control
subjects and 5 of 7 combined subjects showed
decreases in anxiety, whereas only 3 of 7 in-
formation subjects and none of the self-talk
subjects reported decreases in anxiety. Fisher’s
exact test (Fisher, 1966) revealed that there
were significantly more control (5) and com-
bined (5) subjects’ SRAI scores that decreased
compared with none of the subjects in the
self-talk group, χ²(3) = 7.69, p < .05.
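

The Fisher's exact comparison cited above can be checked by hand from the hypergeometric distribution. The following is a minimal sketch, assuming the comparison was a 2 × 2 table setting one group with 5 of 7 decreases (control or combined, door closed) against the self-talk group with 0 of 7; the function `fisher_exact_2x2` is written here for illustration and is not the authors' procedure.

```python
from math import comb


def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher exact p value for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more probable than the observed one.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denominator = comb(row1 + row2, col1)

    def prob(x):
        # Probability that the first row holds x of the col1 "decreases".
        return comb(row1, x) * comb(row2, col1 - x) / denominator

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    return sum(prob(x) for x in range(lowest, highest + 1)
               if prob(x) <= p_observed * (1 + 1e-9))


# Rows: comparison group, self-talk group. Columns: decreased, did not decrease.
p_value = fisher_exact_2x2(5, 2, 0, 7)
print(round(p_value, 3))
```

With these counts the two-sided probability is 42/2002, about .021, which agrees with the reported p < .05.
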
Seating. An analysis was undertaken for
subjects seated in the first four seats nearest
the cockpit and the remaining six seats to the
rear of the aircraft. On Trial 1, a Group × Door
× Seating analysis of variance failed to reveal
any significant effect for seating or any inter-
action between seating and the door or group
condition. Similar nonsignificant effects for
seating were obtained on SRAI scores for [...].
A Group × Seating interaction was obtained for the landing
on Trial 2, F(3, 40) = 4.39, p < .009. Tukey
tests showed that at the final landing, informa-
tion group subjects who sat in the front of the
aircraft reported the lowest SRAI scores,
whereas information subjects seated in the
rear reported the highest SRAI scores. Subjects
in the combined group and in the self-talk
group who were seated in front had higher
SRAI scores than subjects in those groups
who sat in the rear of the aircraft. Control
group subjects did not differ significantly in
terms of their assigned seating. It was inter-
esting to find that these effects were only
slightly amplified (but not significantly) in the
door open as opposed to the door closed
condition.


Postexperimental Questionnaire


In reply to the question “Do you wish you
had acquired more information and known
what to anticipate today?”, seven of 14
control subjects expressed the need for
more information, compared with 2 of 14 in-
formation subjects and 2 of 14 combined
subjects, χ²(2) = 6.16, p < .05. Fisher’s exact
test revealed that more subjects in the infor-
mation group (12) expressed satisfaction with
the amount of information that they had
received prior to the flight compared with
subjects in the control group (7) (p < .05).
When subjects were asked to rate how effective
they had found the techniques of providing
information about what to anticipate in helping
them cope with the stress of flying, no differ-
ences were found between subjects in the door
closed condition. However, subjects in the
door open condition differed significantly in
their ratings on this question, F(2, 18) = 12.12,
p < .001. Tukey tests revealed that informa-
tion and combined subjects did not differ in
rated effectiveness of acquiring informa-
tion as a coping device. However, control
subjects rated the information that they had
received as significantly less effective in coping
with stress compared with information or com-
bined subjects (p < .05). The self-talk and
combined subjects, overall and in both door
conditions, did not differ on the percentage of
time that they had spent emitting positive
self-statements. The subjects’ ratings on the
effectiveness of emitting positive self-state-
ments in coping with the stress of flying also
revealed no differences between these two
groups. A significant positive correlation
(r = .61, p < .001) was obtained between the
reported percentage of time that subjects spent 
making positive self-statements and the rated 
effectiveness of their making self-statements in 
coping with the flight experience. Also, positive 
correlations were obtained between SRAI 
scores at various assessment periods and the 
reported percentage of time spent making 
positive self-statements. Specifically, positive 
correlations between anxiety scores and self- 
statement use were obtained at the third 
cruising period for Trial 1 (r = .34, p < .04), 
at the first landing (r = .32, p < .05), and for 
the unexpected event (r = .33, p < .04). No 
significant correlations were obtained between 
SRAI scores at each of the assessment periods 
and the reported effectiveness of self-talk. 
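

The chi-square value reported at the start of this section for the request for more information can be reproduced from the counts given there. A minimal sketch, assuming a 3 × 2 table (control, information, combined by wanted more information or not) with 14 subjects per group as stated in the Design section; the function name is illustrative.

```python
def pearson_chi_square(table):
    """Pearson chi-square statistic and df for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df


# Rows: control, information, combined.
# Columns: wanted more information, did not.
observed_counts = [[7, 7], [2, 12], [2, 12]]
chi2, df = pearson_chi_square(observed_counts)
print(round(chi2, 2), df)
```

With 7, 2, and 2 of 14 subjects per group, the statistic comes out at 6.16 on 2 degrees of freedom, matching the value in the text.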


Follow-up 


Fifty-three of 56 (95%) follow-up question-
naires mailed 4½ months after the study was
completed were returned. Subjects in all four
treatment groups reported significantly lower
flight apprehension scores at follow-up com-
pared with their pretraining scores. For sub-
jects who were in the door open condition, the
information, self-talk, and combined groups,
ts(6) = 3.58, 4.46, and 4.70 (p < .01), re-
spectively, reported significant decreases in
their FLAPI scores, whereas control group
subjects failed to show any significant change
in their FLAPI scores from pretraining to
follow-up. For subjects in the door closed
condition, t tests on change scores revealed
significant decreases in flight apprehension for
self-talk subjects, t(6) = 6.30, p < .001; and
combined subjects, t(6) = 3.12, p < .03.
Subjects in the information and control condi-
tions did not report significant changes in
FLAPI scores. Thus, subjects who were ex-
posed to self-statement training (self-talk and
combined) reported significant reductions in
flight apprehension 4½ months after the ex-
periment regardless of the condition that they
were in. It is of interest to note that subjects in
the control group failed to show significant
decrements in flight apprehension from pre-
training to follow-up, especially in view of the
fact that their SRAI scores were markedly
similar to those of the treatment groups at
various points throughout the two trials.


Discussion 


The major findings of the experiment were 
as follows: (a) Generally, subjects who flew 
with the door open showed more anxiety than 
subjects who flew with the door closed. Even 
though all subjects tended to decrease in 
anxiety in the form of possible habituation
effects (e.g., Solyom, Shugar, Bryntwick, & 
Solyom, 1973), door closed subjects habituated 
more rapidly than door open subjects. (b) The 
various treatment manipulations generally 
failed to demonstrate differential coping effec- 
tiveness throughout the normal course of the 
flights. (c) The additional stressor, in the form 
of an unexpected event, produced differential 
increases in anxiety when the cockpit door was 
open, in that self-statement-trained subjects 
coped better than information and control 
subjects. This was not the case in the door 
closed conditions in which self-talk subjects 
did not cope with the subsequent landing as 
well as subjects in other groups. 


Work of Worrying 


No evidence was found to support the 
hypothesis that preparatory information served 
to increase arousal levels either immediately 
after training or prior to the flight. Only SIT- 
trained subjects reported significantly high 
levels of anxiety immediately following training 
as well as prior to the flight itself. Self-
talk subjects also reported greater amounts
of “worry and apprehension” during the 20
hours preceding the flight. It is interesting to
note that combined subjects who also received
self-statement training did not manifest the
same increase in arousal following training, nor did they
report greater work of worrying prior to the
flight. Conceivably, self-talk subjects may
have rehearsed coping self-statements in con-
nection with imaginary and nonspecific flight
events, unlike the combined subjects who
may have rehearsed their self-talk with pre-
dictable events and images depicted on the
slides. Whatever the effects of satisfactory
worry/cognitive preparation on coping with a
future stressor may really be, they were not
obtained in the present study: Recall that
both self-talk and combined subjects coped
comparatively well with the unexpected event
in spite of the fact that combined and informa-
tion subjects failed to reveal any indication of
anticipatory anxiety.


Door Open Condition 


As might be expected, preparatory informa-
tion seemed to be ineffective in coping with an
unexpected missed landing, and subjects in this
condition responded to stress with increased
anxiety similar to the control subjects. Also,
even when danger control authorities were
visible, information subjects did not appear
to find their visible presence a source of reas-
surance to allay their anxieties.


Door Closed Condition 


Here, it appears as if not having visual
contact with the cockpit area can account for
the reasons that the treatment groups signifi-
cantly increased in anxiety from the last
cruising period to the unexpected event. It
appears that in the door closed condition, the
importance of being able to process information
about the significance of a stressor may take
precedence over any attempted coping strategy.
Conceivably, being able to monitor what is
happening or to be able to answer questions
one is raising concerning the significance and
importance of an unanticipated event may
have an overriding influence, and no amount
of self-talk will serve to distract the individual
from this information search process.

The effects of seating arrangement appeared
to be most salient for the information group.
Subjects in the information group seated in
the front of the aircraft reported the lowest
SRAI scores of all treatment groups, and the
information subjects seated at the rear ob-
tained the highest scores. At first glance it
would appear as if subjects exposed to the
information coping strategy were better able
to cope with the second landing if they were
able to process information, confirm expec-
tancies, and rely on cues from danger control
authorities for obtaining reassurance. How-
ever, this seating effect on information sub-
jects was obtained in both the door open and
door closed conditions. Although this is some-
what puzzling, it could suggest that physical
proximity to such danger control authori-
ties is a more important variable than the
availability of visual contact with such figures
when it comes time to cope with such a
stressor. It is interesting to note that since
the effect of proximity to the cockpit area was
not obtained for subjects given other coping
strategies, subjects given preparatory informa-
tion could have been excessively reliant on
this coping strategy in coping with the stress
of the second landing.

The seating factor only became significant 
for the information subjects during the second 
landing. This is understandable if we consider 
that it is only after the missed landing that 
these subjects felt an additional stress in the 
sense of a disconfirmation of the expectancies 
induced by the preparatory information treatment
package. Possibly because they could
no longer rely on the information presented
to them concerning what to anticipate, these
subjects may only at that time have felt a 
need for reassurances or for proximity to 
danger control authorities. 

The follow-up data suggest that the SIT 
and preparatory information procedures singly 
or in combination were equally effective in 
reducing flight apprehension 43 months later. 
This was true only for subjects who flew with 
the cockpit door open, since only subjects 
trained in the SIT procedure who flew with the 
door closed reported long-term decrements in 
flight apprehension. The fact that door closed/ 
information group subjects did not report 
lowered flight apprehension at follow-up may 
relate to an assertion by Averill (1973) in con- 
nection with perceived control over aversive 
events. He suggested that even though in- 
formation about a stressor may have value in 
helping a person cope with a stress, it may 
only be effective if it is validated by experience 
and if such information was found to be 
veridical in reducing objective worry following 
reality testing. It may not be too surprising, 
therefore, to have found that preparatory in- 
formation was not useful in producing long- 
term decrements in flight apprehension espe- 
cially when access to sources of reassurance 
was cut off and after expectations concerning 
the flight events were disconfirmed. 
Theoretical Considerations


Two important theoretical issues are raised 
by the results of the present investigation. 
First, looking only at the anxiety scores ob- 
tained during the two cruising periods, the 
absence of any evidence of a differential 
coping ability across treatment groups sug- 
gests that the SIT procedure is no more effec- 
tive for coping with the ongoing stress of 
flying than preparatory information or when 
persons are left to their resources (i.e., control).
These findings cast some doubt on the theo- 
retical underpinnings of the SIT procedure, 
specifically on the assertion that the act of 
emitting positive self-statements in the pres- 
ence of a stressor can effectuate anxiety 
reduction. 

The second theoretical issue concerns the 
mediating role of cognitions in altering emo- 
tional reactions. Since, in the present experi- 
ment, self-talk/door open subjects did not in- 
crease in anxiety during the unexpected event, 
and since the anxiety scores of all seven of the 
self-talk/door closed subjects increased from 
the unexpected event to the final landing, the 
issue does merit some attention. We can focus 
on the specific question, “What cognitive 
processes can explain the mediating role of 
self-talk in stress reactions?” First, it might 
be useful to determine what processes are 
probably not involved before trying to arrive 
at an understanding of what they might be. 
If we look at the semantic therapies of Beck 
(1970) and Ellis (1962), for example, a 
common denominator of cognitive reorganiza- 
tion serves as the foundation for coping and 
emotional adjustment. Whether it be the cor- 
rection of arbitrary inferences or selective 
abstractions in the former, or the changing 
of irrational beliefs and assumptions in the 
latter, both cognitive therapies work at the 
level of the person’s epistemological base for 
creating new perceptions and understandings. 
As has been explained elsewhere (Girodo, 
1977), in spite of certain similarities, the crucial 
cognitive processes in the SIT procedure may 
have very little to do with modifying the 
subject’s epistemological base. When subjects 
“buy” the persuasive conceptual rationale of 
the SIT package, this does not necessarily 
cast the die for producing emotional responses 
simply by repeating their semantic equivalents.
Indeed, as was found in a coping with
pain study by Girodo and Wood (Note 3), the 
persuasive rationale serves more to motivate 
subjects to use the self-talk technique rather 
than to reorganize cognitions that would make 
the stressor less painful. Inasmuch as atten- 
tional processes were implicated in producing 
differential coping responses in the SIT pain 
experiment, we felt that it might be more than 
simple conjecture to suggest that similar 
attentional processes might be involved in SIT 
subjects’ responses to the unexpected event 
and subsequent landing. 

What is it about monitoring internal states 
and repeating self-statements that had such a 
detrimental effect on SIT door closed subjects 
during the final landing? Although it is con- 
ceivable that these subjects abandoned their 
self-talk strategy after the unexpected event 
(possibly because it may have failed them when 
it was most needed), an alternative explana- 
tion along the lines proffered by Sarason (1975) 
is more compelling. He suggested that self-pre- 
occupation involves the kind of attentional 
activity that can interfere at a variety of points 
with information processing and subsequent 
planning strategies for coping with a stressor. 
The conditions for increased self-preoccupa- 
tions were present in the self-talk treatment in 
that subjects were exposed to (a) a door closed 
condition in which cockpit and pilot reassur- 
ance cues were not available and (b) an indoc- 
trination to a self-statement strategy that 
forced subjects to attend mainly to their 
internal state. We propose that this kind of 
induced self-preoccupation may have interfered 
with information processing and evaluation 
necessary in the planning of a coping strategy 
for dealing with the uncertainty of the second 
landing. Why this increase in anxiety at the 
final landing was not manifested in combined 
subjects is difficult to explain unless we assume 
that the reassurances concerning pilot com- 
petency had an appropriate effect then. 

Finally, recall that significant positive cor- 
relations were obtained between the reported 
percentage of time that subjects were making 
positive self-statements and their anxiety 
scores at three of the assessment periods. 
In light of (a) the previous arguments suggest- 
ing that forced self-talk may interfere with 
appraisal and coping processes, (b) the finding 
that two of these correlations were obtained for
stressors that would normally invite information
processing and/or appraisal processes (i.e.,
the first landing and the unexpected event),
and (c) the absence of a significant correlation
between the subjects’ rated effectiveness of
the SIT procedure and their anxiety scores,
it is reasonable to suggest that the more time
subjects spent making their self-statements,
the more anxious they became at various
points of the flying experience.

In conclusion, we would like to draw attention
to the distinction between semantic
therapies that work on inducing new beliefs or 
correcting cognitive distortions, and thus pro- 
duce emotional change via this epistemological 
base, and the self-talk therapy in which the 
mediating processes may be more inimical. 
We should not lose sight of the fact that the 
conceptual rationale underlying the SIT pro- 
cedure is designed to induce the subject to 
comply with self-talk instructions and that 
they are given for his or her benefit, and not 
for the therapist to come to believe. On the 
basis of the present experiment, we suggest 
that if indeed attention information-processing 
demands mediate in SIT-produced emotional 
reactions, then these can be viewed as method- 
ological artifacts of the semantic therapy. 
On the one hand, maybe we better not let 
patients in on this, for it may be only because 
they believe that their positive self-talk can 
produce their semantic equivalent that they 
cope better on occasion. On the other hand, 
we should not forget that the self-talk may be 
applied so faithfully that it interferes with the 
use of other available coping mechanisms. 
Use of Electromyographic Biofeedback and Cue-Controlled
Relaxation in the Treatment of Test Anxiety 


D. Kenneth Counts, James G. Hollandsworth, Jr., and John D. Alcorn 


University of Southern Mississippi 


The effect of using electromyographic (EMG) biofeedback to increase the 
efficacy of cue-controlled relaxation training in the treatment of test anxiety 
was studied. Forty college undergraduates scoring in the upper third on a self- 
report measure of test anxiety were randomly assigned to one of four treatment 
conditions—EMG-assisted cue-controlled relaxation, cue-controlled relaxation 
alone, attention-placebo relaxation, and no-treatment control. Pre-post self- 
report measures of test anxiety, state anxiety, and trait anxiety were obtained. 
In addition, a performance measure in the form of a mental abilities test was 
administered. Subjects from the three relaxation groups received six 45-minute 
individual sessions over a period of 2 weeks. All treatments were conducted
using audiotape recordings. The results indicate that cue-controlled relaxation
is effective in increasing test performance for test anxious subjects, that EMG
biofeedback does not contribute to the effectiveness of this procedure, and that
self-report measures of anxiety are susceptible to a placebo effect.


The use of relaxation training in the treat- 
ment of test anxiety has met with mixed re- 
sults (Chang-Liang & Denney, 1976; Johnson 
& Sechrest, 1968). Of primary concern has been 
the generalization of the relaxation response 
to stressful situations occurring beyond the 
confines of the treatment room itself (e.g., 
Goldfried & Trier, 1974). Modifications of the 
basic technique, such as cue-controlled relaxa- 
tion, appear to facilitate this generalization 
process (Russell & Sipich, 1973). Furthermore, 
the use of other aids, such as electromyo- 
graphic (EMG) biofeedback, may increase 
the effectiveness of relaxation training itself 
(Haynes, Moseley, & McGowan, 1975). Until 
now, however, there has been no investigation 
of the relative contribution of each of these 
procedures toward improving the test per- 
formance of test anxious students. 


The authors would like to thank Robert E. Agnew
for his valuable assistance on preparing the treatment
tapes.

Requests for reprints should be sent to James G.
Hollandsworth, Jr., Department of Counseling Psychology,
University of Southern Mississippi, Box 272
Southern Station, Hattiesburg, Mississippi 39401.

The technique of cue-controlled relaxation
was first presented as a possible treatment for
test anxiety by Russell and Sipich (1973). This 
approach consists of deep muscle relaxation 
training and the pairing of a cue word such 
as “relax” or “calm” with breath exhalation 
while relaxed. According to the classical 
conditioning paradigm, after a number of these 
pairings, the cue word alone should elicit 
relaxation and a feeling of calmness. Two case 
studies (Russell & Sipich, 1973, 1974) have 
provided support for the use of cue-controlled 
relaxation in treating test anxiety. Russell, 
Miller, and June (1974) also used group cue-controlled
relaxation in the treatment of test
anxious college students. Even though the
results were encouraging, no control group was 
included and the dependent measure was 
restricted to self-reports. A comparison of 
cue-controlled relaxation with systematic de- 
sensitization (Russell, Miller, & June, 1975) 
indicated that the two treatments were
equally effective. Both treatment conditions
were superior to the control group on self-report
measures, but no differences were
noted on the performance measure. Another
comparison of these treatment conditions
obtained equivocal results (Russell, Wise, &
Stratoudakis, 1976). Marchetti, McGlynn, and
Patterson (1977), however, found that the
effects of cue-controlled relaxation failed to 
exceed those of a placebo or no-treatment 
condition on self-report measures and psycho- 
physiological indices of arousal during test 
taking. 

In recent years, EMG biofeedback training 
also has gained popularity as a relaxation 
training method. Essentially, the client receives 
feedback (auditory, visual, or both) on the 
amount of tension in the monitored muscle 
group. This feedback facilitates the relaxation 
process. Haynes et al. (1975) compared the 
effectiveness of frontalis EMG biofeedback 
and two types of verbal relaxation instructions 
in reducing muscular tension. EMG biofeed- 
back was equal to one form and superior to 
the other form of verbal instructions. Similar 
results were found by Reinking and Kohl 
(1975). Canter, Kondo, and Knott (1975) 
found EMG biofeedback superior to verbal 
instructions in reducing tension and con- 
comitantly relieving anxiety symptoms in 
adult psychiatric patients. 

The particular sensory mode of the feedback 
may be important in EMG relaxation training. 
Alexander, French, and Goodman (1975) 
compared the efficacy of auditory and visual 
feedback in EMG biofeedback training. The 
results indicated that auditory feedback may 
be more effective in the induction of muscular 
relaxation than visual feedback. 

The use of EMG biofeedback in conjunction 
with various behavioral treatment strategies 
has yielded encouraging results. In one study 
(Wickramasekera, 1972), EMG biofeedback 
was used in the relaxation phase of the syste- 
matic desensitization of test anxiety. The 
procedure was supported by decreases in client 
reports of test anxiety. Reeves and Mealiea 
(1975) used EMG-assisted relaxation in the 
cue-controlled relaxation treatment of flight 
phobia. Even though the Reeves and Mealiea
study was without adequate controls, the
results indicate that this method has considerable
promise as a treatment strategy for
a wide range of disorders. 

At this point, the majority of research on 
cue-controlled relaxation with test anxiety has
either lacked performance measures, control
conditions, attention-placebo conditions, or a
combination of these factors. Furthermore, a
controlled investigation of the combination of
EMG biofeedback with cue-controlled relaxa- 
tion training has not been reported. 

This study was designed to investigate the 
effectiveness of EMG-assisted cue-controlled 
relaxation in the treatment of test anxiety. 
More specifically, it was hypothesized that 
test anxious subjects receiving cue-controlled 
relaxation training or EMG-assisted cue- 
controlled relaxation training would report 
less test anxiety and demonstrate a greater 
increase in test performance than subjects 
receiving an attention-placebo treatment or no 
treatment. Furthermore, it was hypothesized 
that test anxious subjects receiving EMG- 
assisted cue-controlled relaxation training 
would report less test anxiety and would 
demonstrate a greater increase in test per- 
formance than subjects receiving the cue- 
controlled relaxation procedure alone. 


Method 


Subjects 


The Test Anxiety Scale (TAS) was administered 
to 294 undergraduate students midway through the 
academic quarter. These students were unaware that 
high scorers would be given the opportunity to receive 
treatment for test anxiety. This sample generated a 
TAS mean score of 16.85 with a standard deviation of 
7.97. Sixty-nine students who received a raw score 
of 21 or above, which placed them in the upper third 
of the distribution, were contacted by telephone and 
were asked to participate in the study. A total of 47 
subjects were pretested and stratified according to the 
TAS and Otis-Lennon performance test results. 
Subjects within each stratum were randomly assigned 
to one of four treatment conditions —EMG-assisted 
cue-controlled relaxation, cue-controlled relaxation, 
attention-placebo, or no-treatment control. Seven of 
these 47 subjects failed to complete treatment, with the 
two treatment groups losing 2 each and the attention- 
placebo group losing 3. As a result, the final sample 
consisted of 40 subjects, 10 in each group, with a mean 
age of 20.6 years, ranging from 16 to 35. There were 28 
females and 12 males. Most of the subjects were either 
freshmen or sophomores. Subjects received academic 
credit for participating in the study. 
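
The stratified random assignment described above can be illustrated with a short sketch. The paper does not say how strata were formed, so the code below assumes (hypothetically) that subjects are sorted by TAS and then Otis-Lennon score and cut into consecutive blocks of four, with the four conditions shuffled within each block. The subject scores are invented for illustration and are not data from the study.

```python
import random

CONDITIONS = ["CCR-B", "CCR", "AP", "NT"]


def stratified_assignment(subjects, seed=0):
    """Assign subjects to conditions within strata.

    subjects: list of (subject_id, tas_score, otis_score) tuples.
    Assumption (not stated in the paper): a stratum is a block of
    len(CONDITIONS) neighbours after sorting by TAS, then Otis-Lennon.
    """
    rng = random.Random(seed)
    ordered = sorted(subjects, key=lambda s: (s[1], s[2]))
    assignment = {}
    for start in range(0, len(ordered), len(CONDITIONS)):
        stratum = ordered[start:start + len(CONDITIONS)]
        labels = CONDITIONS[:]
        rng.shuffle(labels)  # random order of conditions inside the stratum
        for (subject_id, _, _), label in zip(stratum, labels):
            assignment[subject_id] = label
    return assignment


# Hypothetical pretest scores for 47 subjects (TAS of 21 or above).
score_rng = random.Random(1)
subjects = [(i, score_rng.randint(21, 37), score_rng.randint(20, 80))
            for i in range(47)]

assignment = stratified_assignment(subjects)
counts = {c: list(assignment.values()).count(c) for c in CONDITIONS}
print(counts)
```

Blocking on the pretest scores before randomizing keeps the four groups comparable at pretest, which is consistent with the nonsignificant pretest differences the authors report.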


Apparatus 


Subjects received treatments in a dimly lit, air- 
conditioned, sound-attenuated room in which there 
was a reclining chair, cassette tape recorder, EMG 
biofeedback device, small table, and lamp. The room 
was fitted with a one-way mirror. A Bio-Dyne MR-200 
electromyographic biofeedback device providing continuous
auditory (fluctuations in tone) feedback was
used in this study. With the exception of two subjects,
the sensitivity-gain adjustment was set at 25 µV. For
two subjects the sensitivity-gain adjustment was set
at 10 µV for the initial session only. Three electrodes,
each with a surface area of approximately 5.65 cm²,
were secured with a rubber headstrap approximately
28 cm apart. The electrodes were centered and placed
horizontally on the subject’s forehead. The frontalis
surface area was prepared with isopropyl alcohol.
Cor-Gel electrocardiogram electrode gel was used as
the contact medium.


Measures 


The advanced level of the Otis-Lennon Mental
Ability Test (Otis & Lennon, 1967) was used as a pre- 
post performance measure. This instrument has been 
found to exhibit strong test-retest reliability (r = .94) 
over a 1-year period (Smith, 1970). Split-half, Kuder-Richardson,
and alternate-forms reliability coefficients
also have been found to range from .92 to .96 (Otis & 
Lennon, 1967). The Otis-Lennon was administered 
using directions modified to increase subject anxiety. 
More specifically, these instructions stated that 
subjects would be rank ordered in terms of the test 
results and would be provided with their individual 
standing in comparison to other students. Following 
the Otis-Lennon, the State-Trait Anxiety Inventory 
(STAI; Spielberger, Gorsuch, & Lushene, 1970) was 
administered. At posttesting these same two measures 
were readministered along with the TAS (Sarason, 
1957). 


Procedure 


Subjects in the two treatment and attention-placebo 
conditions each received six 45-minute individual 
sessions over a period of 2 weeks. Pretesting and post- 
testing occurred within 3 days of initiating and termi- 
nating treatment, respectively. All subjects for these 
three conditions were seated in a reclining chair and 
received relaxation instructions from a cassette tape 
recording. In preparing the tapes, the narrator was 
unaware of the hypotheses, measures, or rationale for 
the study. 

EMG-assisted cue-controlled relaxation group (CCR-B). 
After the subject was seated by the experimenter, 
electrodes were attached over the frontalis muscle in 
accordance with the operating manual. The sensitivity-gain
scale was set at 25 µV, and the subject was asked
to relax comfortably with eyes closed for approximately 
30 sec to establish a baseline of audible feedback. 
Periodic observations were made during the session 
through the one-way mirror to insure that continuous 
feedback was being generated. Once feedback was 
established, the experimenter began the tape recording 
and left the room. The tape included three components:
(a) a brief statement of rationale for EMG biofeedback 
training, (b) instructions for progressive relaxation 
training prepared from Bernstein and Borkovec 
(1973), and (c) instructions for cue-controlled relaxation 
training as presented by Russell and Sipich (197[...]).
More specifically, instructions for the third component
directed the subject to focus on his or her breathing
and to say the word relax with each exhalation. This
was continued for approximately 20 pairings. After
a 60-sec interval during which the subject was instructed
to focus his or her attention on general feelings
of relaxation, 20 additional pairings were rehearsed
prior to terminating the session. For both cue-controlled
relaxation groups, the subject was instructed
to use this relaxation technique when faced with daily
anxiety-evoking situations, including academic tests.

Cue-controlled relaxation group (CCR). Subjects in
this condition were treated in the same manner as
subjects in the preceding condition, with the exception
that no biofeedback apparatus was used. The tape for
this condition was identical to that for the biofeedback
group, except that the first component pi[...]
and all references to biofeedback were deleted.

Attention-placebo group (AP). Subjects in this
condition were treated in the same manner as subjects
in the CCR group. The tape for this condition, however,
consisted primarily of soothing music performed by
Roger Williams. A brief rationale for the use of music
for relaxation purposes was presented, and at four
points during the music the narrator used sugges[...]
imagery unrelated to test taking. There were no
references during this tape to either biofeedback or
cue-controlled relaxation. The AP tape was approximately
37 minutes in length, as compared to 42 minutes
for the two other treatment tapes. All three tapes
included statements encouraging the subjects to
practice their respective treatments daily at home and
to use their respective strategies in anxiety-evoking
situations. After the initial session, the rationale 
component of each of the three tapes was deleted for 
the remaining five sessions. 

No-treatment control group (NT). Subjects in this
group were told that due to facility limitations, treatment
could not be provided at present and that they
had been placed on a waiting list for treatment in the
future.


Statistical Analysis 


Statistical analysis was conducted using gain scores 
(Huck & McLean, 1975) on the TAS, the STAI State 
scale (A-State), the STAI Trait scale (A-Trait) and the 
Otis-Lennon Mental Ability Test. For these dependent
variables, two a priori orthogonal comparisons were
conducted using t tests. The comparisons included the
two treatment conditions combined versus the two 
control conditions combined and the CCR-B condition 
versus the CCR condition. 

To test for main effects, the dependent measure gain
scores were subjected to a series of univariate analyses
of variance. Post hoc comparisons, using Scheffé’s
multiple range test, were conducted for those dependent
variables yielding a significant F ratio. The
level of significance was set at .05 in all cases.
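
A minimal sketch of the analysis described above may help make the two planned comparisons concrete. It assumes the usual textbook form of an a priori contrast, with a pooled within-group error term; the paper does not print its formulas, and the helper names and the tiny pre/post data set below are hypothetical.

```python
from math import sqrt


def gain_scores(pre, post):
    """Posttest minus pretest for each subject."""
    return [after - before for before, after in zip(pre, post)]


def mean(values):
    return sum(values) / len(values)


def pooled_error(groups):
    """Within-group mean square and its degrees of freedom."""
    ss_within = sum(sum((x - mean(g)) ** 2 for x in g) for g in groups)
    df_within = sum(len(g) - 1 for g in groups)
    return ss_within / df_within, df_within


def contrast_t(groups, weights):
    """t statistic for a planned contrast on group means."""
    mse, df = pooled_error(groups)
    estimate = sum(w * mean(g) for w, g in zip(weights, groups))
    se = sqrt(mse * sum(w * w / len(g) for w, g in zip(weights, groups)))
    return estimate / se, df


def are_orthogonal(w1, w2, sizes):
    """Two contrasts are orthogonal when sum(w1 * w2 / n) is zero."""
    return abs(sum(a * b / n for a, b, n in zip(w1, w2, sizes))) < 1e-12


# Hypothetical pre/post scores, three subjects per group
# (group order: CCR-B, CCR, AP, NT).
pre = [40, 41, 42]
groups = [
    gain_scores(pre, [47, 49, 51]),  # CCR-B
    gain_scores(pre, [47, 49, 51]),  # CCR
    gain_scores(pre, [41, 43, 45]),  # AP
    gain_scores(pre, [41, 43, 45]),  # NT
]

treatments_vs_controls = (1, 1, -1, -1)  # CCR-B and CCR vs. AP and NT
ccrb_vs_ccr = (1, -1, 0, 0)              # CCR-B vs. CCR

sizes = [len(g) for g in groups]
print(are_orthogonal(treatments_vs_controls, ccrb_vs_ccr, sizes))
t_combined, df_error = contrast_t(groups, treatments_vs_controls)
t_pair, _ = contrast_t(groups, ccrb_vs_ccr)
print(round(t_combined, 3), round(t_pair, 3), df_error)
```

With four groups of 10 subjects, as in the study, the error term has 36 degrees of freedom, which matches the df reported for the comparisons.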
Table 1
Pre and Post Group Means and Standard Deviations for the Four Dependent Variables

                               Treatment condition
                   CCR-B            CCR             AP              NT
Variable             M      SD       M      SD       M      SD       M      SD
TAS
  Pre             28.2    3.05    26.7    4.27    26.3    3.12    26.4     .99
  Post            24.2    6.88    18.8    7.98    22.2    3.36    26.1   [...]
A-State
  Pre             49.4    8.30    47.5   14.88    50.8    6.37    48.3   12.15
  Post            30.6    7.12    26.6    6.31    34.7    8.08    46.4   10.38
A-Trait
  Pre             48.2    7.13    43.9   12.39    47.3    7.51    44.6    5.30
  Post            42.9    6.94    36.2    7.66    39.7    5.54    40.8    6.11
Otis-Lennon
  Pre             46.9   19.70    46.8   12.39    48.6   16.22    47.1    9.0
  Post            54.9   15.83    54.9   12.32    51.0   16.39    49.5    9.58

Note. CCR-B = electromyographic-assisted cue-controlled relaxation; CCR = cue-controlled relaxation;
AP = attention placebo; NT = no treatment; TAS = Test Anxiety Scale; A-State = State scale of the
State-Trait Anxiety Inventory; A-Trait = Trait scale of the State-Trait Anxiety Inventory; Otis-Lennon
= Otis-Lennon Mental Ability Test.


Results 


Analysis of variance of pretest scores 
yielded nonsignificant F ratios across all 
dependent variables. Pretest and posttest 
group means and standard deviations for the 
dependent variables are presented in Table 1. 
Self-report measures of test anxiety and 
state anxiety indicated that the combined 
experimental conditions were superior to the 
combined placebo and no-treatment control 
conditions. No differences were noted for the 
self-report measure of trait anxiety. Results
from the Otis-Lennon indicated that the combined
treatment conditions made gains on this
performance measure that were significantly 
greater than the combined control conditions. 
There were no differences, however, between 
EMG-assisted and traditional cue-controlled 
relaxation training in terms of any of the 
dependent measures. The t ratios for these a
priori comparisons are presented in Table 2. 
The series of univariate analyses of variance 
of change scores yielded significant F ratios 
for three dependent variables as follows: 
TAS, F(3, 36) = 4.388, p < .01; A-State,


Table 2
A Priori Orthogonal Comparison t Values for the Four Dependent Variables

Comparison                        df       TAS     A-State    A-Trait   Otis-Lennon
CCR-B and CCR combined vs.
  AP and NT combined              36   −2.532*    −3.297**      −.460      4.640***
CCR-B vs. CCR                     36     1.862        .439       .976        − 581

Note. TAS = Test Anxiety Scale; A-State = State scale of the State-Trait Anxiety Inventory; A-Trait
= Trait scale of the State-Trait Anxiety Inventory; Otis-Lennon = Otis-Lennon Mental Ability Test;
CCR-B = electromyographic-assisted cue-controlled relaxation; CCR = cue-controlled relaxation; AP
= attention placebo; NT = no treatment.
* p < .05.




Table 3
Means, Standard Deviations, and Post Hoc Comparisons for the Change Scores
of the Four Dependent Variables

                               Treatment condition
                   CCR-B            CCR             AP              NT
Variable              M      SD        M      SD        M      SD        M      SD
TAS             −4.0a,b    5.59    −7.9b    5.85  −4.1a,b    4.14     −.3a    2.21
A-State          −18.8a   12.76   −20.9a   14.21   −16.1a    8.60    −1.9b    4.33
A-Trait           −5.3a    4.47    −7.7a    6.71    −7.6a    6.27    −3.8a    4.04
Otis-Lennon        8.0a    5.14     8.1a    3.24     2.4b    3.71     2.4b    2.91

Note. Scheffé’s multiple comparison test is at the .05 level. Means with the same subscript (the letter
following the mean) are not significantly different. CCR-B = electromyographic-assisted cue-controlled
relaxation; CCR = cue-controlled relaxation; AP = attention placebo; NT = no treatment; TAS = Test
Anxiety Scale; A-State = State scale of the State-Trait Anxiety Inventory; A-Trait = Trait scale of the
State-Trait Anxiety Inventory; Otis-Lennon = Otis-Lennon Mental Ability Test.


F(3, 36) = 6.429, p < .01; and Otis-Lennon,
F(3, 36) = 7.178, p < .001. Post hoc comparisons
using Scheffé’s test for multiple
comparisons indicated that the CCR condition 
was superior to the no-treatment condition 
but was no different from the CCR-B or AP 
conditions for the TAS. In terms of state 
anxiety the CCR, CCR-B, and AP conditions 
were found superior to the no-treatment 
condition. No differences were found between 
conditions for trait anxiety. Both the CCR and 
CCR-B groups were found to result in signifi- 
cantly greater gains on the Otis-Lennon when 
compared to either control condition, but they 
were not significantly different from each other. 
Without exception, subjects in the experi- 
mental conditions increased performance at 
posttesting. These increases ranged from 2 to 
18 points. For the two control conditions, 
however, these gains were found to range from 
1 to 8 points only. Furthermore, 30% of the 
subjects in the control conditions actually 
demonstrated a decrease in performance at 
posttesting. Means and standard deviations 
for the change scores for the four dependent 
variables by treatment condition are presented 
in Table 3. 
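
As a rough consistency check, the Otis-Lennon statistics can be approximately recovered from the change-score means and standard deviations in Table 3. The sketch below assumes equal group sizes of 10 and the standard one-way analysis of variance and planned-contrast formulas, and it reads the Otis-Lennon row of Table 3 as means of 8.0, 8.1, 2.4, and 2.4. Because the table values are rounded, the results are expected to land close to the reported F(3, 36) = 7.178 and t(36) = 4.640, not to match them exactly.

```python
from math import sqrt

n = 10  # subjects per group
means = {"CCR-B": 8.0, "CCR": 8.1, "AP": 2.4, "NT": 2.4}
sds = {"CCR-B": 5.14, "CCR": 3.24, "AP": 3.71, "NT": 2.91}
k = len(means)

# Within-group mean square: with equal n it is the average group variance.
ms_within = sum(sd ** 2 for sd in sds.values()) / k
df_within = k * (n - 1)

# Between-group mean square from the spread of the group means.
grand_mean = sum(means.values()) / k
ms_between = n * sum((m - grand_mean) ** 2 for m in means.values()) / (k - 1)
f_ratio = ms_between / ms_within

# Planned contrast: CCR-B and CCR combined vs. AP and NT combined.
weights = {"CCR-B": 1, "CCR": 1, "AP": -1, "NT": -1}
estimate = sum(weights[g] * means[g] for g in means)
standard_error = sqrt(ms_within * sum(w * w for w in weights.values()) / n)
t_contrast = estimate / standard_error

print(f"F({k - 1}, {df_within}) is about {f_ratio:.2f}")
print(f"t({df_within}) is about {t_contrast:.2f}")
```

Agreement at this level suggests the summary values in Table 3 and the test statistics in the text describe the same analysis.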


Discussion


As hypothesized, the combined cue-controlled
relaxation conditions were found to
reduce self-reported test anxiety and state
anxiety more than the placebo and no-treatment
conditions. Also, the combined experimental
conditions resulted in a greater
increase in test-taking performance than the
combined control conditions. However, the
hypothesis that EMG-assisted cue-controlled
relaxation would be superior to traditional
cue-controlled relaxation was not supported.
The findings indicate that these approaches
to cue-controlled relaxation may be effective
for the treatment of test anxiety. However,
despite the considerable enthusiasm generated
around the use of EMG biofeedback in relaxa- 
tion training, it appears to add little to the 
effectiveness of cue-controlled relaxation. The 
general assumption that frontalis relaxation 
via EMG training generalizes to other muscle 
groups may not be totally accurate. Alexander
(1975) found little evidence to support the
use of EMG biofeedback alone in inducing 
generalized muscular relaxation. Given the 
potential cost and inconvenience of using EMG 
training, it may not be the treatment of choice 
for approaches using generalized relaxation 
training. The area in which biofeedback is 
making its most important contribution 
appears to be in the field of specific psycho- 
physiological disorders (Budzynski, Stoyva, & 
Adler, 1970; Budzynski, Stoyva, Adler, & 
Mullaney, 1973). 

Of particular interest in this study is the
significant increase in test-taking performance 
for the subjects in the experimental conditions. 
The mean gains on the Otis-Lennon for these 
subjects translated into an increase of between
7 and 24 IQ points, depending on the location
of the scores in the distribution. The mean
gain scores for the combined control conditions
translated into an increase of between 1 and 4
IQ points only. The literature dealing with the
modification of test anxiety is replete with 
research yielding significant differences in 
self-reported test anxiety. Increases on per- 
formance measures, however, are not as 
plentiful. Finger and Galassi (1977) stated 
that a review of the literature revealed that 
performance improvements were obtained in 
only 16 (29.6%) of 54 studies. Although the 
present study resulted in significant increases 
in scores on a test of mental ability, replication 
and further investigation using performance 
measures is needed. 

It is noteworthy that we found no differences 
in the reduction of anxiety between the experi- 
mental treatments and the attention-placebo 
condition on any self-report measure. This 
would emphasize further the need for per- 
formance as well as self-report measures in the 
assessment of test anxiety. 

There are several important questions that 
remain unanswered. For one, what are the 
differential treatment effects for individual 
subjects receiving EMG-assisted cue-controlled 
relaxation? It may be that subjects who are 
trained successfully to modify EMG frontalis 
responses will demonstrate an increase in
performance, whereas those who fail to do so 
will not. Research designed to answer this 
question is needed. 

Implications for further research with cue-controlled
relaxation itself include investigating
the relative contributions of various
approaches to relaxation training and various
forms of the subvocalized cue words. The
possible inclusion of additional subvocalized
statements as a supplement to the cue word
appears promising. In this manner, many of
the task-orienting self-verbalizations recommended
by Meichenbaum (1977) could be
incorporated into the cue-controlled relaxation
treatment. Cognitive behavior modification
approaches, without relaxation, could also be 
compared to cue-controlled relaxation. In this 
manner the relative importance of relaxation 
training in the treatment of test anxiety could 
be assessed. 
This research investigated, in three studies, subjects’ state anxiety arousal in
response to an in vivo vicarious threat to self-esteem. In Studies 1 and 2 stu- 
dents observed a guest speaker who provided the anxiety manipulation. In 
both studies, correlation and median split analyses indicated that high em- 
pathic and low trait anxious subjects reported elevated state anxiety in response 
to the vicarious threat. Even though external subjects reported overall higher 
levels of state anxiety, no differential responsiveness between internal and ex- 
ternal subjects was found. When subjects were matched on initial state anxiety, 
high-empathy subjects were found to have experienced vicarious anxiety, 
whereas subjects low on empathy did not. An analysis of high and low trait
anxious subjects who were matched on initial state anxiety did not reveal
differential responsiveness. In addition to replicating Study 1, Study 2 found that
the Helplessness: an inability to affect change in others factor of locus of
control was significantly negatively related to empathy, and the cognitive
reappraisal styles of reversal (denial, reaction formation) and projection were
related to state anxiety decreases. Study 3 provided evidence for the absence
of a confound.


Imagine an evening “on the town” in which 
an audience is exposed to a theatrical mis- 
fortune—an unprepared understudy. As any- 
one who has been in such a situation knows, 
there will be large differences in the behavior 
of the ticket holders. Some will demand a 
refund, whereas others will feel sorry and work 
themselves into an anxious sweat! What are 
the relevant personality characteristics and 
cognitive reappraisal styles that are associated
with such response variation?

Present psychological evidence tends to 
support the notion that virtually all learning
phenomena that are directly experienced can
also occur vicariously through the observation 
of another person (Bandura, 1969). The 
majority of the evidence demonstrates that 
such observation affords the acquisition of 
response patterns and the therapeutic benefits 
of a coping model. In the present study 
observational or vicarious emotionality (anx- 
iety), a less desirable response pattern, was 
investigated. 

Those investigations that have in the past 
targeted emotionality have usually been 
concerned with such variables as the type of 
stress (physical danger threat or threat to 
self-esteem: Bennett & Holmes, 1975; Kendall, 
1978; Kendall, Finch, Auerbach, Hooke, & 
Mikulka, 1976; Speisman, Lazarus, Mordkoff, 
& Davison, 1964) and the nature of the stress 
experience (direct or vicarious: Alfert, 1966, 
1967; Averill, Olbrich, & Lazarus, 1972; Opton 
& Lazarus, 1967), whereas others have included
an examination of threat reduction via
defensive styles (Bennett & Holmes, 1975; 
Holmes & Houston, 1974; Lazarus, Opton, 
Nomikos, & Rankin, 1965; Speisman et al., 
1964). Uniformly, these studies have been 
conducted in the laboratory setting. 

When one organizes the studies in this 
area and examines a matrix of stress type 
(physical, ego threat) and nature of stress 
(vicarious, direct) variables, there is an 
absence of an evaluation of observational ego 
threat or vicarious stress to self-esteem. The 
intent of the present investigation was to 
conduct such a study and to examine the 
characteristics of subjects who would experi- 
ence vicarious anxiety. Three personality 
measures were selected: empathy (Hogan, 
1969), locus of control (Nowicki & Duke, 
1974), and anxiety (Spielberger, Gorsuch, & 
Lushene, 1970). 

It was hypothesized that subjects scoring 
high on empathy, indicating a high degree of 
sensitivity to the feelings and needs of others, 
would respond with increases in their personal 
state anxiety greater than those of the low- 
empathic subjects following a vicarious threat 
to self-esteem. Regarding locus of control (i.e., 
internal control reflecting a generalized expec- 
tancy that outcomes or reinforcement are a 
consequence of one’s own behavior, and 
external control suggesting an expectancy that 
outcomes are the result of luck, fate, chance, 
or powerful others), the empirical findings (see 
Phares, 1973) support the notion that ex- 
ternals are more anxious than internals (Ray 
& Katahn, 1968; Watson, 1967), and, corre- 
spondingly, it was hypothesized that external 
subjects would be more likely to respond with 
vicariously aroused state anxiety than internal 
subjects. Finally, in line with the state-trait 
theory of anxiety (Spielberger, 1972), which 
posits that high trait anxious subjects are more 
prone to experience elevated states of anxiety, 
it was predicted that high trait anxious 
subjects would respond with greater state 
anxiety than low trait anxious individuals to 
the vicarious threat to self-esteem.

To reduce the unwanted variability due to 
expectancy and role playing, an in vivo situa- 
tion was selected. Thus, the present study set 
out to directly and intentionally evaluate 
subjects reporting state anxiety reactions in a
naturally occurring stimulus condition that
includes a vicarious threat to self-esteem.


Study 1
Method


Subjects 


The subjects in this study were 30 undergraduate
psychology students enrolled in an evening introductory
class at an urban Virginia university. There were
14 males and 16 females, with a mean age of 24.5.


Measures 


Empathy. Hogan’s (1969) empathy scale is a 64-item
inventory that requires subjects to endorse true
or false items as they apply and measures an individual's 
sensitivity to the needs of others. 

Locus of control. The Adult Nowicki-Strickland 
Internal-External Scale (ANSIE; Nowicki & Duke, 
1974) is a 40-item true-false inventory that assesses 
an individual's belief in personal control over reinforcements
(internal) or that luck, fate, chance, or powerful
others are the source of reward (external). 

Anxiety. The State-Trait Anxiety Inventory 
(STAI; Spielberger et al., 1970) consists of two separate
20-item self-report scales for measuring state
anxiety (A-State) and trait anxiety (A-Trait). The 
STAI A-State scale requires people to describe how 
they feel at a particular moment in time; the STAI 
A-Trait scale asks people to describe how they generally 
feel. The A-State scale assesses anxiety at the moment 
in a given situation, whereas the A-Trait scale measures 
an individual’s global predisposition to feel anxious.


Procedure 


Subjects were distributed the empathy and locus
of control scales and were required to complete and
return them to their instructor. The STAI was
administered according to standard instructions during a
lecture period under nonstress conditions. Subjects
were merely informed that the instructor was “collecting
some research data and needed for everyone to fill them
out.” They were also informed that their scores would
not be individually analyzed but that the information
would be grouped together for comparisons.

During a later lecture period, the subjects were
informed that there would be a guest speaker¹ who
would talk on “child psychopathology” and that the
speaker was a recognized professional who had pub-
lished numerous articles in the area. In fact, the guest
speaker was a confederate who would perform the
experimental manipulation. The manipulation con-
sisted of an introduction by the instructor and a general

1 The guest speaker was in fact a recognized clinical
child psychologist who participated as a confederate
and who had prerehearsed the anxiety manipulation.


statement by the guest speaker stating the topic of
his talk and that the class would be asked to fill out a 
questionnaire used in related research as a demon- 
stration. Then a series of events followed, including 
brief and intermittent stuttering, repeating phrases, 
mixing ideas, spilling coffee, dropping papers, mis- 
placing slides, and an undelivered slide projector. The 
series of events lasted between 6 and 7 minutes. When 
the slide projector could not be located, the confederate 
guest speaker suggested that the class fill out the 
demonstration questionnaire while he and the instructor 
would try to get the slides and projector organized. 
The STAI was distributed, and subjects were in- 
structed to read the instructions and complete the 
inventory. 


Debriefing 


When all subjects had completed the STAI, the
manipulation was explained. The guest speaker then 
presented his talk normally while the instructor and 
an assistant examined the data. At the end of the guest 
talk, a brief summary of the results of the present 
experiment was discussed with the class.²


Results 


The change in subjects’ A-State from the
initial nonstress phase to the postmanipula- 
tion phase was examined in a series of analyses 
for each of the personality measures. In each 
case, the correlations of the personality measure 
with the initial nonstress A-State score and 
the A-State increase score (postmanipulation 
A-State minus initial A-State) are presented. 
The correlations are followed by separate 2 
(personality variable: median split) X 2 (trials: 
premanipulation and postmanipulation) analy- 
ses of variance. In addition, when sufficient 
subjects could be included, subjects matched 
on their initial A-State scores were subjected 
to 2 X 2 analyses of variance³ similar to those
conducted on all subjects using the median
split.
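
The analysis plan described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data values are hypothetical (the article reports no raw scores), and the function names `pearson_r`, `increase_scores`, and `median_split` are introduced here for the example. The sketch computes the A-State increase score (postmanipulation minus initial), correlates it with a personality measure, and forms high and low groups at the median, dropping subjects whose score equals the median as the article does.

```python
from math import sqrt
from statistics import mean, median


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def increase_scores(pre, post):
    """A-State increase score: postmanipulation minus initial A-State."""
    return [after - before for before, after in zip(pre, post)]


def median_split(scores):
    """Return (median, high indices, low indices); ties at the median are dropped."""
    cut = median(scores)
    high = [i for i, s in enumerate(scores) if s > cut]
    low = [i for i, s in enumerate(scores) if s < cut]
    return cut, high, low


# Hypothetical scores for eight subjects (not data from the article).
empathy = [30, 34, 36, 38, 38, 40, 42, 45]
pre_a_state = [44, 42, 40, 39, 38, 37, 36, 35]
post_a_state = [38, 37, 38, 40, 39, 42, 43, 44]

gains = increase_scores(pre_a_state, post_a_state)
r_gain = pearson_r(empathy, gains)
cut, high_idx, low_idx = median_split(empathy)
```

The two groups returned by `median_split` would then enter the 2 (personality variable) X 2 (trials) analysis of variance described in the text.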


Empathy 


The correlation between empathy scores and 
the initial A-State scores was −.26, which
only approached significance. However, em-
pathy was significantly correlated with an
increase in A-State scores (r = .39, p < .05).
This significant correlation indicates that 
empathy is directly related to reported eleva- 
tions in A-State. 

Figure 1. Mean State-Trait Anxiety Inventory State
Anxiety (STAI A-State) scores for high- and low-
empathy subjects at premanipulation and postmanipu-
lation. (The left portion includes all subjects divided
at the median; the right portion includes certain
subjects matched on initial A-State scores.)


When subjects were divided at the median (38)
into high-empathy (>38) and low-empathy
(<38) groups (both ns = 13; 4
subjects whose scores equaled 38 were elimi- 
nated), the analysis of variance indicated 
nonsignificant main effects for empathy level 
and trials, Fs(1, 24) < 1. The Trials X Em- 
pathy interaction was significant, F(1, 24) 
= 14.27, p < .005, indicating that the A-State 
scores of high-empathic subjects increased due 
to the manipulation, whereas the A-State 
scores of the low-empathic subjects decreased 
(see left-hand portion of Figure 1). The means
of the A-State scores for subjects high and low 
on empathy were pre = 37.0 and 43.69 and 
post = 42.46 and 35.15, respectively. As 
shown in Figure 1, high-empathy subjects 
were at higher A-State levels following the 
manipulation than they were at the pre- 
manipulation phase. In contrast, low-empathy 




2 Although several students reported feeling very 
badly for the guest speaker (one cleaned up the coffee 
during the talk), the class unanimously stated that 
they were unaware that it was a manipulation. Im- 
portantly, the class also volunteered that the debriefing 
which included an explanation of some of the results 
of the study was a very stimulating and innovative 
teaching method. 

3 Subjects’ scores were considered matched within a 
1-point range. A matched group analysis of variance 
was not used.


subjects were less anxious following the
manipulation. Thus, subjects who scored high 
on empathy were those who reported a 
vicarious anxiety response. 
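
For readers who want to see what the Trials X Empathy interaction tests, the sketch below uses a standard identity rather than anything stated in the article: in a 2 (between-groups) X 2 (repeated trials) design, the interaction F equals the square of an independent-samples t computed on each subject's change score, with 1 and N − 2 degrees of freedom. The function name `interaction_f` and the change scores are hypothetical and serve only as an example.

```python
from statistics import mean, variance


def interaction_f(change_a, change_b):
    """Interaction F for a 2 (group) X 2 (trials) design.

    Computed as the squared pooled-variance t on change scores
    (post minus pre) of the two groups. Returns (F, (df1, df2)).
    """
    na, nb = len(change_a), len(change_b)
    pooled_var = ((na - 1) * variance(change_a)
                  + (nb - 1) * variance(change_b)) / (na + nb - 2)
    std_error = (pooled_var * (1 / na + 1 / nb)) ** 0.5
    t = (mean(change_a) - mean(change_b)) / std_error
    return t * t, (1, na + nb - 2)


# Hypothetical change scores (not data from the article).
high_empathy_change = [4, 6, 5, 7]
low_empathy_change = [-3, -5, -4, -4]

f_value, df = interaction_f(high_empathy_change, low_empathy_change)
```

A large F here means the two groups changed in different directions or by different amounts across trials, which is the pattern the text describes for high- and low-empathy subjects.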

However, a reevaluation of the left-hand 
portion of Figure 1 reveals that the initial 
A-State scores for the two groups differ 
substantially. (Recall also the —.26 correlation 
between empathy and initial A-State.) Anxiety 
is not entirely independent of empathy, and 
legitimate inferences about the effect of 
empathy on state anxiety can be drawn only 
when empathy is varied independently of 
anxiety.⁴ To achieve independence, 16 subjects
from the high- and low-empathy groups were 
matched according to their initial A-State 
scores, and changes in anxiety were analyzed. 
The premanipulation and postmanipulation 
means for the matched high- and low-empathy 
groups were pre = 36.75 and 37.25 and
post = 43.25 and 33.37, respectively. Results
of a 2 (empathy level) X 2 (trials) analysis of
variance indicated nonsignificant main effects 
for empathy level and for trials, Fs(1, 14) < 1, 
but a significant interaction, F(1, 14) = 10.91, 
p< .001. This finding is presented in the 
right-hand portion of Figure 1. Here, the high- 
empathy subjects reported increased levels of 
state anxiety following the manipulation, and 
the low-empathy subjects reported decreased 
anxiety. The results of the matched-subjects 
analysis coincide with those reported for the 
median-split procedure and demonstrate more
clearly the effects of empathy on A-State
changes.
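
The matching step can be illustrated with a short sketch. The article states only that scores were considered matched within a 1-point range; it does not describe how pairs were formed, so the greedy pairing below is an assumption, and the function name `match_on_initial`, the subject labels, and the scores are hypothetical.

```python
def match_on_initial(high, low, tolerance=1):
    """Pair subjects from two groups whose initial A-State scores
    differ by no more than `tolerance` points.

    `high` and `low` are lists of (subject_id, initial_a_state).
    Each subject is used at most once. Greedy pairing over sorted
    scores is an assumed procedure, not the authors' documented one.
    """
    high_sorted = sorted(high, key=lambda s: s[1])
    low_sorted = sorted(low, key=lambda s: s[1])
    pairs = []
    i = j = 0
    while i < len(high_sorted) and j < len(low_sorted):
        diff = high_sorted[i][1] - low_sorted[j][1]
        if abs(diff) <= tolerance:
            pairs.append((high_sorted[i][0], low_sorted[j][0]))
            i += 1
            j += 1
        elif diff < 0:
            i += 1  # this high-group score is too low to match anyone left
        else:
            j += 1  # this low-group score is too low to match anyone left
    return pairs


# Hypothetical initial A-State scores (not data from the article).
high_group = [("h1", 30), ("h2", 36), ("h3", 41), ("h4", 50)]
low_group = [("l1", 31), ("l2", 37), ("l3", 44), ("l4", 45)]

matched = match_on_initial(high_group, low_group)
```

Only the matched pairs would enter the follow-up 2 X 2 analysis, which is why the matched analyses in the text use fewer subjects than the median-split analyses.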


Locus of Control 


The correlation between locus of control and 
the initial A-State score was .39 (p < .05), and
the correlation between locus of control and
the increase in A-State scores was .09 (p > .10).
These findings support those of Ray and
Katahn (1968) and Watson (1967), who
reported that external subjects score higher
on anxiety than internals. However, the
present evidence does not support the hy- 
pothesis that externals would be more likely 
to experience vicarious anxiety. 

When subjects were divided at the median 
(9) into internal (n = 11, scores <9) and
external (n = 9, scores >9) locus of control
groups (10 subjects whose scores were 9 were
eliminated), the results of a 2 (internal/
external) X 2 (trials) analysis of variance
indicated a significant main effect of locus of
control, F(1, 18) = 11.61, p < .005, a non-
significant main effect for trials, F(1, 18)
= 1.04, and a nonsignificant interaction,
F(1, 18) = 1.59. Examination of the means
for the internals and externals (pre = 35.09
and 44.22 and post = 29.27 and 46.22, respec-
tively) indicated that the significant internal/
external difference in A-State was reflected in
the higher A-State scores for externals. An
attempt to analyze for the effects of locus of
control on state anxiety with subjects matched
on initial A-State was aborted by the limited
number (i.e., three) of matched cases.


Anxiety 


The correlation between A-Trait and the 
initial nonstress A-State was .59 (p< .01), 
and the correlation between increases in 
A-State and A-Trait scores was −.48 (p
< .01). The significant A-Trait/A-State corre-
lation is not surprising; however, the significant
negative correlation of A-State increases and 
A-Trait scores was not expected. This un- 
expected finding suggests that it was the lower 
A-Trait scores that were related to the greater 
state anxiety increases.

When subjects were divided at the median 
of their A-Trait scores (38), high A-Trait
subjects (n = 12) had scores greater than 38,
whereas low A-Trait subjects (n = 12) had
scores less than 38 (6 subjects were eliminated
by the median split). The mean A-State scores
for high and low A-Trait groups were pre
= 45.83 and 31.50 and post = 38.83 and 34.25,
respectively. Results of a 2 (A-Trait level) X 2
(trials) analysis of variance indicated a
significant A-Trait level effect, F(1, 22) 
= 13.38, p < .005, and a nonsignificant trials 
effect, F(1, 22) = 1.04. The A-Trait Level 
X Trials interaction was significant, F(1, 22) 
= 5.49, p < .05, indicating differential changes 
in A-State for subjects differing in A-Trait 
(see Figure 2). Figure 2 indicates that the low 
A-Trait subjects were more state anxious


4 The authors wish to thank an anonymous reviewer
for stressing this important issue.


following the vicarious threat to self-esteem
than prior to it, whereas the high A-Trait 
subjects reported less A-State following the 
manipulation. It should be noted here that the 
extreme scores at the premanipulation phase 
set limits on the probable size and direction 
of any change, and an analysis of subjects 
matched on initial A-State would be more 
desirable. Given that A-Trait and A-State 
are not independent, it was not surprising 
that only 3 subjects could be matched, and 
thus no matched-subjects analysis was 
conducted. 


Discussion 


The results of the present study provide 
strong support for the hypothesis that high- 
empathic subjects will respond anxiously after 
observing an anxious speaker. Correspond- 
ingly, the present study provides validation 
for Hogan’s (1969) empathy scale. In a 
discriminant fashion, the locus of control 
results are also valuable. That is, locus of 
control appears to be unrelated to vicarious 
emotionality. However, external subjects did 
report more anxiety, as has been found in other 
studies.

The results of the trait anxiety analysis were
not entirely as expected. Although the signifi-
cant interaction of high/low trait level and
trials was predicted, the finding that the low
A-Trait subjects increased while high A-Trait
subjects decreased in state anxiety as a func-
tion of the manipulation was contradictory to
our hypothesis. Indeed, this is the first known
study to disclose a situation in which the low
A-Trait individuals were more disposed to be
state anxious, and the findings are directly
contradictory to those of numerous studies
using different stresses (e.g., Auerbach, 1973;
Hodges & Spielberger, 1966) and to the
definition of A-Trait itself. A finding as
surprising as this appeared to warrant
replication.

Figure 2. Mean State-Trait Anxiety Inventory State
Anxiety (STAI A-State) scores for all high and low
Trait Anxiety (A-Trait) subjects at premanipulation
and postmanipulation. (Legend: high A-Trait, n = 12;
low A-Trait, n = 12. Vertical axis: mean STAI
A-State scores.)


Yet another potential need for replication
concerns the multidimensionality of locus of
control. Recent evidence using the Rotter
scale (Abramowitz, 1973; Levenson, 1973) and
the Nowicki-Strickland children’s form (Ken-
dall, Finch, & Mahoney, 1976) has shown
that although overall locus of control is
predictive, factor-specific scoring of the dimensions
of locus of control can advance accurate
hypotheses. The factor analytic demonstration 
of the multidimensionality of the ANSIE 
(Kendall, Finch, & Mikulka, Note 1) pro- 
duced five meaningful factors. Among the 
factors were two representing helplessness— 
one that appeared to be related to an absence 
of alternatives or solutions and another that 
is suggestive of an inability to affect change 
in others. Another factor expressed super- 
stitious beliefs. It was felt that perhaps, even 
though overall locus of control was not related 
to vicarious anxiety, the specific factors might 
be related. Specifically, the Helplessness: An 
inability to affect change in others factor of the 
ANSIE was hypothesized to be negatively 
related to the vicarious anxiety experience. 
In the present study, subjects watched an 
individual anxiously blunder through a pro- 
fessional presentation. Those who were sensi- 
tive to the needs and feelings of others (i.e.,
high on empathy) showed anxiety arousal. 
Similarly, it might be hypothesized that 
individuals who perceive themselves as being 
capable of affecting change in others might 
also be aroused. 

Finally, an examination of the means of the 
high- and low-empathy and high and low 
A-Trait subjects revealed that although high- 
empathic and low A-Trait subjects reported 
increased state anxiety, the mean scores of 
both the low-empathic and high A-Trait 
subjects decreased after the manipulation. This
decrease suggests that perhaps certain cogni-
tive reappraisals (Lazarus, 1966) had been
used by these subjects and that these cognitive
styles (i.e., defense mechanisms) should be
examined. For this and the above reasons, a
replication and extension was conducted.


Figure 3. Mean State-Trait Anxiety Inventory State
Anxiety (STAI A-State) scores for high- and low-
empathy subjects at premanipulation and postmanipu-
lation. (The left portion contains all subjects divided
at the median; the right portion includes certain
subjects matched on initial A-State scores.)


Study 2 
Method 


The subjects in this study were 40 undergraduate 
psychology students enrolled in an early morning 
introductory psychology class at an urban Virginia 
university. There were 20 males and 20 females with 
a mean age of 19.7.


Measures 


In addition to the measures used in the
initial study, the Defense Mechanisms Inventory
(DMI; Gleser & Ihilevich, 1969) was administered.
The DMI measures the relative intensity of five major
groups of defenses. The inventory consists of 10 brief
... turning-against-self ...
(e) reversal (i.e., negation, denial, reaction formation).


Procedure


The same experimental procedure was followed as
in Study 1. Similarly, subjects were debriefed and were
presented with a brief outline of the results. 


Results and Discussion 


As in Study 1, the A-State reaction of
subjects to the vicarious anxiety manipulation
was examined in a series of analyses (i.e.,
correlation, median split, and analysis of subjects
matched on initial A-State) for each of the
personality measures.


Empathy 


The correlation between empathy and the 
initial nonstress measure of A-State was −.28,
which approached but did not reach signifi-
cance. However, empathy was significantly
related to A-State increases (r = .35). As in
Study 1, the more empathic subjects reported
elevated state anxiety in response to the
vicarious anxiety manipulation.

When subjects were again split at the
median (38) into high-empathy (>38, n = 20)
and low-empathy (<38, n = 20) groups, the
results of the analysis of variance indicated
nonsignificant main effects for high/low em-
pathy and trials, F(1, 38) < 1, and F(1, 38)
< 1, respectively. The interaction of empathy
and trials was significant, F(1, 38) = 5.29,
p < .03, indicating that the state anxiety
level of subjects high on empathy changed
differentially from that of subjects low on
empathy. Again, as in Study 1, high-empathic
subjects showed an increase in A-State due to
the manipulation. These results are pre-
sented in the left-hand portion of Figure 3.
In addition, the extent of the replication of
Study 1 should be noted. (See left-hand
portion of Figure 1.)

To examine the effects of empathy on state 
anxiety when empathy is varied independently 
of anxiety, 24 subjects from the high- and 
low-empathy groups were matched according
to their initial A-State scores. The pre-
manipulation and postmanipulation means for
the matched high- and low-empathy groups
were, respectively, pre = 36.72 and 36.72 and
post = 46.1 and 36.40. Results indicated non-
significant main effects, F(1, 22) = 1.69, and
F(1, 22) = 1.48, respectively, and a significant
Empathy Level X Trials interaction, F(1, 22) 
= 4.37, p < .05. This finding is presented in 
the right-hand portion of Figure 3. As can 
be seen, when high- and low-empathy subjects 
are matched on initial A-State, it is the high- 
empathic individuals who report experiencing 
vicariously aroused anxiety. It should be noted 
that this finding is quite similar to that found 
in the matched-subjects analysis in Study 1 
(compare to right-hand portion of Figure 1), 
with the exception that the low-empathic 
subjects in Study 2 did not show a marked 
reduction in anxiety. Nevertheless, the results 
regarding empathy in Figures 1 and 3 represent 
quite similar findings. Indeed, the results of the 
present study provide strong, replicated 
support for the finding that high-empathic 
individuals experience vicarious anxiety 
arousal, whereas less empathic people do not. 
Moreover, the aroused anxiety due to a 
vicarious threat to self-esteem in the high- 
empathic subjects also provided validation for 
the empathy scale (Hogan, 1969). The state 
anxiety increases reported by high-empathic 
subjects are considered the result of their 
placing themselves in the position of the 
speaker: “If I were up there, I would be
extremely anxious” or “I can imagine what 
he feels like.” It would be interesting to 
examine in future research the self-statements 
made by empathic and nonempathic observers 
and their ensuing emotional response patterns. 


Locus of Control 


Two types of analyses were conducted using 
the locus of control data: analyses of overall 
locus of control scores and analyses of the 
specific factors within the locus of control
scale.

Overall score analyses. The correlation be-
tween locus of control and the initial nonstress
measure of A-State was .31 (p < .05), whereas
the relationship to A-State increases was non-
significant (r = −.15). As in Study 1, ex-
ternals reported higher A-State, but locus of
control remained unrelated to the vicarious
arousal of anxiety. 

Subjects were divided at the median of the 
overall scores (9) into internal (n = 18, scores
<9) and external (n = 18, scores >9) groups
(4 subjects who scored 9 were eliminated)
with subsequent analyses of A-State scores 
across trials. Results indicated nonsignificant 
main effects and a nonsignificant interaction, 
F(1, 34) = 3.40, and F(1,34) <1, F(1,34) 
= 1.47, respectively. The means for the 
internal and external subjects, respectively, 
were pre = 35.16 and 43.16 and post = 38.33 
and 39.27. The locus of control main effect 
approached significance (p < .07) in the 
direction similar to Study 1 in which externals 
reported more state anxiety. 

To examine the effects of locus of control on 
A-State when locus of control is varied in- 
dependently of anxiety, 22 subjects from the 
internal and external groups were matched on 
initial A-State. The premanipulation and 
postmanipulation means for matched internal 
and external subjects, respectively, were 
pre = 35.54 and 35.90 and post = 37.45 and 
38.09. The results of the analysis of variance 
produced nonsignificant main effects for locus 
of control and trials and a nonsignificant 
interaction, F(1, 20) = 1.76, F(1, 20) = 1.96, 
and F(1, 20) < 1, in that order. More con- 
clusively, these findings demonstrate that 
locus of control is not related to vicarious 
anxiety arousal. 

Factor-specific analyses. Examination of the 
scores on the specific factors of locus of control 
revealed no additional meaningful relation- 
ships. That is, neither the overall locus of 
control score nor the five factor-specific scores 
were correlated with an increase in state 
anxiety (all rs < .27). Thus, the factor scores 
did not provide additional specificity. 

On the other hand, although the overall 
locus of control score did not correlate signifi- 
cantly with the empathy measure (r = .19), 
Factor II, Helplessness: an inability to affect 
change in others, did correlate meaningfully 
(—.35, p< .05). This relationship suggests 
that subjects who are empathic perceive them- 
selves as having the ability to affect change in 
others. 


Anxiety 


Trait anxiety correlated .61 (p< .005) 
with initial A-State and —.52 (p < .005) with 
increases in A-State. These results support 
those of Study 1 and, specifically, support the
surprising finding that it was the lower A-Trait
subjects who responded with increases in
A-State.


Figure 4. Mean State-Trait Anxiety Inventory State
Anxiety (STAI A-State) scores for high and low Trait
Anxiety (A-Trait) subjects at premanipulation and
postmanipulation. (The left portion contains all
subjects divided at the median; the right portion
includes certain subjects matched on initial A-State
scores.)

A 2 (A-Trait level) X 2 (trials) analysis of 
variance of subjects divided at the median 
of their A-Trait scores (38) into high (n = 20, 
A-Trait >38) and low (n = 20, A-Trait <38) 
A-Trait groups revealed a significant main 
effect for A-Trait level, F(1,38) = 6.01, 
p < .02, a nonsignificant main effect for trials, 
F(1, 38) < 1, and a significant interaction, 
F(1, 38) = 7.75, p < .005. The means for high 
and low A-Trait subjects were pre = 45.83 
and 31.50 and post = 38.83 and 34.25. Again, 
similar to Study 1, the significant Trait 
Anxiety Level X Trials interaction was ex- 
amined and was found to indicate that the low 
A-Trait subjects had reported more state 
anxiety following the manipulation than at a
nonstress period, whereas the high A-Trait 
subjects had reported less. This interaction is 
presented in the left-hand portion of Figure 4. 
It should be noted again that the initial 
A-State means were very different for the 
A-Trait groups and that these initial score 
differences set limits on the possible size and
direction of change. 

A more crucial examination of the effects 
of trait anxiety on state anxiety was conducted
using 20 subjects matched on initial A-State.
The premanipulation and postmanipulation
means were high A-Trait = 36.9 and 33.2 and
low A-Trait = 36.5 and 33.6. An analysis of
variance revealed nonsignificant differences,
F(1, 18) < 1, F(1, 18) = 1.29, and F(1, 18)
< 1, for A-Trait level, trials, and the inter-
action, in that order. Thus, the replicated
finding in Studies 1 and 2 that high and low
A-Trait subjects responded differently to the
vicarious threat to self-esteem was not sup-
ported by the analysis of matched subjects.
(See right-hand portion of Figure 4.)

Future research should consider situation-
specific anxiety traits (Endler & Okada, 1975; 
Zuckerman, 1977) and the prediction of 
arousal based on the congruence of the class 
of the anxiety trait measure and that of the 
experimental situation (Kendall, 1978). Predictions
based on the interactional assessment
of trait anxiety should be quite heuristic in 
relation to vicarious anxiety. 


Age and Sex 


An examination of the correlations between 
age, sex, and the increase in state anxiety
indicated that neither age (r = −.25) nor
sex (r = −.14) was significantly related to
A-State increases. In addition, neither sex nor
age was related to empathy (both rs < .08),
nor was sex related to trait anxiety (r = .06).
Age was related to trait anxiety (r = .33,
p < .05).


Cognitive Reappraisals/ Defense Mechanisms 


To examine the reappraisal styles of subjects
who did not become anxious following our
manipulation (some subjects actually de-
creased in reported state anxiety), the five
scores on the DMI were intercorrelated with
decreases in state anxiety (initial A-State
minus postmanipulation A-State). The results
are presented in Table 1. As can be seen, there
exist significant relationships between subjects
who characteristically use reversal and projec-
tion and their changes in state anxiety. These
relationships suggest that subjects who charac-
teristically use reversal-type defenses (i.e.,
denial, negation, reaction formation) and
projection are likely to report a nonanxious
behavior pattern when exposed to an anxious
model. Here, an observer using a reversing
reappraisal style (e.g., denial) might think,
“He’s not really anxious” or an observer might
project the emotionality onto another person:
“It looks like the teacher is getting uncom-
fortable.” Future research should examine the
content of the self-statements made during
cognitive reappraisals of threat.


Table 1

Correlations of the Defense Mechanism
Inventory Categories With Decreases in
State Anxiety and With Locus of Control

Category                A-State decrease    Locus of control
Hostility-out           .25                 .36*
Projection              .34*                .22
Principalization        .02                 ls
Turning-against-self    AS                  .08
Reversal                .42**               −.40**

* p < .05.
** p < .01.

Although four of the DMI categories did 
not correlate significantly with trait anxiety, 
there was a significant relationship between 
reversal and trait anxiety (r = .51, p < .005), 
suggesting that the high A-Trait subjects 
tended to rely on the reversal defensive styles. 
None of the defensive styles correlated with 
empathy. 

Locus of control and defense mechanisms. 
Although this is an ancillary section of the 
present study, several findings are note- 
worthy. The correlations of interest were 
included in Table 1. As can be seen, a signifi- 
cant relationship between locus of control and 
the defense styles of reversal, hostility-out,
and principalization was found. These rela- 
tionships suggest that internal subjects are 
more characteristic users of reversal and 
principalization, whereas externals tend to 
rely on hostility-out (i.e., displacement).

The results of Study 2 suggest that the 
state anxiety of subjects whose style of cogni- 
tive reappraisal emphasizes reversal and pro- 
jection was reduced by the anxious model. 
This is consistent with the findings of Houston 
(1971, 1973), Houston and Hodges (1970), and 
Lazarus and Alfert (1964), who reported the 
advantage of denial in stressful situations and 
the arousal-reducing qualities of denial and 
reaction formation (reversals). The present
findings are supportive of Bennett and Holmes
(1975), who reported a reduction in arousal
relating to projection, but are in contrast to 
Holmes and Houston (1971) and Stevens and 
Reitz (1970), who did not support the projec- 
tion/anxiety reduction hypothesis. The re- 
versal defensive mechanisms of the DMI 
include denial, reaction formation, and nega- 
tion. The present findings support the relation- 
ship of such reversing as a style of cognitive 
reappraisal to obliterate anxiety arousal. 


Comment on the Replication 


The results of the first two studies provide 
replicated evidence that state anxiety as 
self-reported by high-empathic subjects in- 
creases following a vicarious threat to self- 
esteem manipulation. Trait anxiety was con- 
sistently related to both A-State and A-State 
increases, but A-Trait was not related to 
A-State increases when subjects were matched 
on initial A-State. In addition, internal and 
external subjects do not differ in their tendency 
to experience vicarious anxiety, but externality 
is related to higher levels of state anxiety.

The contexts of replicated findings are 
noteworthy, and the present replication is no 
exception. In fact, there were actually several 
differences between Study 1 and Study 2. 
First, one was an evening class, whereas the 
other was an 8:00 a.m. class. Second, the 
mean ages were 24.5 and 19.7 years, respec- 
tively. Finally, the first guest speaker for the 
subjects in Study 1 was the experimental 
manipulation, whereas, due to other commit- 
ments, there were two guest speakers prior 
to the manipulation for subjects in Study 2. 
Nonetheless, even with the contextual variations
(notwithstanding the care taken to
replicate the procedural matter), the results 
were consistent. 


Study 3 


Before attempting to draw any conclusions 
or implications from the results of the first 
two studies, it must first be asked whether 
the state anxiety changes were the result of 
our manipulation. According to Sarason (1972), 


models, teachers, and experimenters provide two types 
of information: 1) what they do or say, and 2) how 
they do or say it. Both require intensive inquiry 
because neglect of one of these dimensions could 
nullify effects of the other. (p. 399)

Even though the results of the present
studies are considered indicative of reliable 
characteristics of the vicarious emotional 
responder (and the replication and extension 
provided by Study 2 clearly support this), a 
methodological dilemma may have confounded 
the results. That is, the absence of a control 
group that experienced the guest speaker “in 
his usual form” prevented our conclusions from 
being clear-cut. The speaker himself and the 
manipulation were confounded. Our results 
could have been due to “the nature of our 
guest speaker.” Analysis of changes in state 
anxiety from a nonstress to a post-guest- 
speaker period during which the guest speaker 
did not perform the anxiety manipulation was 
a needed control. The purpose of Study 3 was 
to examine such a control by analyzing state 
anxiety changes before and after the same guest 
speaker for subjects varying in empathy and 
trait anxiety. 


Method 
Subjects 


The subjects in this study were 14 undergraduate 
psychology students. There were 2 males and 12 
females, with a mean age of 23.2. All were students in a 
low-enrollment, summer-session undergraduate psy- 
chology class at an urban Virginia university. 


Procedure 


In this study the STAI and the empathy scale were
administered to the entire class during a nonstress 
condition. The second administration of A-State 
followed a guest talk on the same topic as in Studies 1 
and 2 and by the same guest speaker. In this study the 
speaker did not perform the anxiety manipulation. 


Results and Discussion 
Empathy 


Empathy was found to be nonsignificantly
related to both the initial A-State scores
and A-State increases (r = -.08). Even
though both Study 1 and Study 2 found empathy
to be significantly related to A-State increases,
there was no such relationship in this study
when the vicarious anxiety experience was not
provided.

When subjects were divided at the median
of Studies 1 and 2 (38) into high-empathy
(>38, n = 9) and low-empathy (<38, n = 5)
groups, it was possible to match 5 subjects on
their initial A-State scores. The means for
the high- and low-empathy subjects that had
been matched on initial A-State were pre
= 34.4 and 34.4 and post = 32.2 and 33.6,
respectively. The results of the analysis of 
variance indicated nonsignificant main effects 
for empathy and trials and a nonsignificant 
interaction, Fs(1, 12) < 1. These results indicate
that with matched subjects* there is no
difference in A-State for subjects varying in 
empathy and that the guest speaker did not 
produce anxiety arousal. Thus, without the 
anxiety manipulation the speaker did not 
arouse anxiety. 


Anxiety 


Trait anxiety was significantly related to 
initial A-State scores (r = .64, p < .05) but
was not significantly related to state anxiety 
increases (r = .16). As expected, unlike Studies 
1 and 2 but in agreement with the empathy 
data from Study 3, A-Trait was not related 
to A-State increases. 

When subjects were divided at the median
(38) of their A-Trait scores into high A-Trait
(>38, n = 7) and low A-Trait (<38, n = 7)
groups, the results of a 2 (A-Trait level) × 2
(trials) analysis of variance indicated a significant
main effect for A-Trait level, F(1, 12)
= 5.38, p < .05, a nonsignificant main effect
for trials, and a nonsignificant interaction,
Fs(1, 12) < 1. The means for high and
low A-Trait subjects were pre = 39.86 and 
32.43 and post = 38.00 and 31.86. These 
results support previous findings of higher 
A-State for high A-Trait subjects. Also, these 
results indicate that the guest speaker (without
the anxiety manipulation) did not produce
elevations in state anxiety. In addition, there 
was no differential responsiveness for high 
versus low A-Trait subjects. 

The results of the control study demonstrated
that the observed anxiety reactions
in Studies 1 and 2 were probably a function
of the anxiety manipulation and not due to
any idiosyncrasies of the speaker himself.
Thus, those findings reported in Study 1 and
Study 2 should not be considered subject to a
confound due to the speaker’s style.


* Similarly nonsignificant results were obtained when
the median split groups were analyzed.


Reference Note 


1. Kendall, P. C., Finch, A. J., Jr., & Mikulka, P. J.
Multidimensional locus of control in adults. Unpublished
manuscript, Virginia Commonwealth University,
1976.


Illusory Correlation:
A Further Exploration of Chapman’s Paradigm 


Richard M. Kurtz and Sol L. Garfield 
Washington University 


Even though the phenomenon of illusory correlation first researched by Chap- 
man has been demonstrated in a variety of studies, relatively little has been 
done to show convincingly that this bias cannot be reduced or eliminated by 
training. The present study addresses itself to this issue. There were four 
groups of 15 undergraduate subjects each in this study. The first group was a 
replication of a study by Chapman and Chapman with an equal association of 
all valid Wheeler signs and invalid signs and statements of the patients’ pur- 
ported problem. Group 2 was a replication of Chapman’s study, with valid signs 
presented 100% of the time and invalid signs presented 50% of the time. 
Group 3 provided subjects with a special pretraining against illusory correla- 
tion, with 50% presentation of valid signs; and Group 4 was a special pretrain- 
ing of the subjects against illusory correlations, with 100% presentation of the 
valid Wheeler signs. It was predicted that Groups 3 and 4 would show the least 
amount of illusory correlation. This hypothesis was not confirmed. However, 
we replicated part of the Chapmans’ findings that subjects predominantly asso- 
ciated the concept of anality with preconceived problems of homosexuality. A 
serendipitous finding was also noted in which subjects appeared to create their
own illusory correlate.


Illusory correlations are reported associations
between test responses and symptoms or
syndromes that are based on verbal associative
connections of the test-sign to the symptom
rather than on valid observations. As
defined by Chapman (1967), illusory correlation
is


the report by observers of the correlation between
two classes of events which, in reality, (a) are not
correlated, or, (b) are correlated to a lesser extent
than reported, or (c) are correlated in the opposite
direction from that which is reported (p. 151).


Although the phenomenon of illusory correlation
has been demonstrated in a variety of
studies (Chapman, 1967; Chapman & Chapman,
1967, 1969; Golding & Rorer, 1972;
Starr & Katkin, 1969), relatively little has
been done to show that this bias cannot be
reduced or eliminated by proper training. The
present study addresses itself to that issue.


Requests for reprints should be sent to Richard M.
Kurtz, Department of Psychology, Washington University,
St. Louis, Missouri 63130.

The phenomenon of illusory correlation was 
first demonstrated by Chapman (1967) when 
subjects observed a random pairing between 
elements of two arrays of words and were 
then asked to report the frequency with 
which a word from one array was paired with 
a word from another array. He found the 
reported frequency of occurrence was biased 
upwards when the word pair was character- 
ized by either strong verbal associative con- 
nections or by distinctiveness. 

In a second study (Chapman & Chapman, 
1967), subjects were shown randomly paired 
symptom statements and Draw-a-Person figures.
They were then asked to report any
drawing characteristic that they felt was 
associated with any of the symptoms. The 
relationships that the subjects reported tended 
to be similar to those that some clinicians 
reported from their clinical practices, and 
they basically seemed to rely on associative
connections between the drawing characteristics
and the symptoms.

In one of the nuclear studies conducted by
Chapman and Chapman (1969) on the Rorschach,
they found that expert diagnosticians,
on the basis of their clinical experience, and 
naive judges, on the basis of their observa- 
tions of random materials, tended to report 
the same Rorschach signs as being valid indi- 
cators of homosexuality. This was found to be 
the case even though these signs had not been 
demonstrated in the previous research litera- 
ture to relate to the purported problem of 
homosexuality, nor did they have, in the 
experimental task, a nonrandom association 
with the purported problem of homosexuality. 
The basic experimental task used by Chap- 
man and Chapman was to present a series of 
Rorschach cards in which for each Rorschach 
response or percept two statements of an 
emotional problem allegedly given by a pur- 
ported patient were paired to that particular 
part of the Rorschach blot. The task of the
subjects in the Chapman study was to look 
at these paired associates of the Rorschach 
percept and the statement of the problem and, 
at the end, to indicate whether or not they 
noticed any kind of response that was seen 
most often by patients with this particular 
problem. 

Consistent with previous studies, Chapman 
and Chapman (1969) found that subjects 
picked those percepts that had the highest 
verbal associative connection with the symp- 
tom, in this case homosexuality. In addition, 
Chapman and Chapman (1969) demonstrated 
that subjects still tended to report the popular 
invalid sign * disproportionately even when it 
was paired randomly with the symptom “ho- 
mosexuality,” and the unpopular valid sign 
was paired 100% of the time with the same 
symptom.

In an extensive replication, Golding and
Rorer (1972) came to the same conclusion.
They found in replicating Chapman and 
Chapman’s (1969) study on illusory corre- 
lation that little change occurred even when 
the nonillusory (Chapmans’ so-called “valid”) 
cues were paired 100% of the time with the
symptom of homosexuality and when the
illusory cues (Chapmans’ so-called “invalid”
signs) had a randomly paired relationship
with the symptom of homosexuality. They
also found that the different modes of feedback
to the subjects and different symptom
base rates did not produce differential effects
in posttraining estimates of the illusory correlation.
From this they concluded that training
(as experimentally defined) has very little
effect on the illusory correlation.

The present study attempts to explore
whether training can affect the illusory correlation.
One of the problems inherent in the
previous research is that attempts to modify
the illusory correlation have relied only on
the minimal device of changing
the amount or the percentage of time that
nonillusory (the so-called valid) or illusory
(the so-called invalid) signs are paired
with the purported problem of homosexuality
in the experimental task. Perhaps with more
explicit training, that is, warning the subjects
that they must be on guard against illusory
correlations and showing them examples of
such associative connections, the
subjects’ tendency to fall prey to illusory
correlations could be reduced.

There were four groups in the present study.
The first group was a replication of a part of
Chapman and Chapman’s (1969) study (Experiment
2, p. 275) with an equal association,
that is, 50% presentation of all valid and invalid
signs with all statements of the patient’s
purported problem. The valid signs were
those designated by Chapman and Chapman
(1969) based on the work of Wheeler (1949).
Group 2 was a replication of a part of Chapman
and Chapman’s (1969) study (Experiment
3, p. 277), with valid signs presented
100% of the time with the symptom of homosexuality
and invalid signs presented 50% of
the time with the symptom of homosexuality.
Group 3 was a special pretraining of the subjects
against illusory correlation with 50%
presentation of valid signs, and Group 4 was
a special pretraining of the subjects against
illusory correlation with 100% presentation
of valid signs. It was predicted that Groups 3
and 4 would show the least amount of illusory
correlation. In other words, a linear trend was
expected, with Group 1 showing the greatest
amount of illusory correlation, then Group 2,
Group 3, and Group 4 showing the least.


* For detail of the discussion of how Chapman and
Chapman (1969) empirically developed the concept
of valid and invalid signs, see their article (pp. 272-
275). As the methodology was quite complex and
space is limited, suffice it to say that a “valid sign”
was one that they found in the literature to be empirically
related to actual reported homosexuality,
and “invalid signs” were ones that clinicians believed
predicted homosexuality but in fact did not.
The valid and invalid signs were either randomly
assigned to a variety of different purported problems
or were presented either 50% or 100% of the
time with the actual problem of homosexuality. The
present study duplicates part of their total experiment,
that is, their Experiments 2 and 3.


Method 
Subjects 


Four groups of 15 volunteer subjects of both sexes²
each (N = 60) were secured from undergraduate
abnormal psychology classes at Washington Univer- 
sity. The subjects were told that the authors were 
studying training parameters that affect Rorschach 
interpretation and that they needed the help of an 
untrained population to study basic patterns of test 
interpretation. 


Test Instrument 


The same materials used by Chapman and Chap- 
man (1969) were used in the present study. These 
consisted of 30 Rorschach cards that were trans- 
ferred to transparencies. On each of the transparen- 
cies, one percept or response was paired with two 
statements of emotional problems of a purported 
patient who was alleged to have given that specific 
response. The Rorschach percepts were indicated by 
circling an area of the card and pasting a typed
statement of the patient’s verbalization. For example,
one of the 30 Rorschach responses labeled Bugs
Bunny was given to the center area (D7) of card 5
(Beck, Beck, Levitt, & Molish, 1961). In the corner
of the card appeared the statement, “The man who
said this (1) had sexual feelings toward other men
and (2) feels sad and depressed much of the time.”
The 30 percepts were chosen so that 6 fell into each
of the five following categories: (a) popular invalid
sign or a sign highly connected with associative
material, in this case, human or animal anal content;
(b) Wheeler Sign No. 7 (for more complete
discussion of the Wheeler signs, see Chapman &
Chapman, 1969); (c) Wheeler Sign No. 8; (d) geographical
features, a filler category; and (e) food,
another filler category.

The two statements of emotional problems or 
symptoms listed on the cards were drawn from a 
pool of four such statements, and these were identical
to those used by Chapman and Chapman (1969).
They were (a) “he has sexual feelings toward other
men”; (b) “he believes other men are plotting 
against him”; (c) “he feels sad and depressed much 
of the time”; and (d) “he has strong feelings of 
inferiority.” The statements of symptoms (diag- 
nostic cues) and Rorschach percepts were paired on 
the 30 cards so that each of the four statements 
appeared 15 times in Groups 1 and 3. Each symp- 
tom statement was thus paired with 3 of the 6 
percepts from each of the five categories of percepts. 
Thus, for Groups 1 and 3 there was no intrinsic 
relationship between the occurrence of any one of 
the four symptoms and any one of the five categories 
of response. In Groups 2 and 4, the valid statement 
(both Wheeler signs) occurred 100% of the time 
with the purported problem of homosexuality, and 
the invalid statements occurred 50% of the time. 
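
The balanced pairing described for Groups 1 and 3 can be illustrated with a short sketch. It is only one way to satisfy the stated constraints (30 cards, two symptom statements per card, each statement appearing 15 times and with 3 of the 6 percepts in every category); the use of all six possible statement pairs within each category is an assumption made here for illustration, not a procedure reported by the authors, and the short labels are hypothetical stand-ins for the full statements and categories.

```python
from collections import Counter
from itertools import combinations

# Hypothetical short labels for the four symptom statements and five percept categories.
SYMPTOMS = ("sexual", "paranoid", "depressed", "inferiority")
CATEGORIES = ("anal", "wheeler_7", "wheeler_8", "geography", "food")


def build_balanced_cards():
    """Return 30 (category, symptom_pair) cards for the 50% condition.

    Assumption: each category's 6 cards receive the 6 distinct pairs that can
    be drawn from the 4 symptom statements, so every statement appears on
    exactly 3 of the 6 cards in each category.
    """
    pairs = list(combinations(SYMPTOMS, 2))  # 6 distinct pairs
    return [(category, pair) for category in CATEGORIES for pair in pairs]


cards = build_balanced_cards()

# How often each statement appears overall, and within each percept category.
per_symptom = Counter(symptom for _, pair in cards for symptom in pair)
per_cell = Counter((category, symptom) for category, pair in cards for symptom in pair)

print(len(cards))
print(dict(per_symptom))
```

Under this construction no symptom statement is more strongly paired with one percept category than with another, which is the property the text relies on when it says there was no intrinsic relationship in Groups 1 and 3.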


Procedure 


The subjects were given some brief introductory
information as to the nature of the Rorschach. However,
no information was given about categories of
either content or determinants. All groups received
the following instructions:


I am going to give you a brief introduction to 
the well-known Rorschach test and then ask you 
to perform some exercises on it.


As many of you know, the Rorschach is a test of 
personality functioning. It basically consists of a 
set of inkblots to which the subject is asked to 
respond by describing what he perceived in each 
of the inkblots. It is believed that what the indi- 
vidual sees and describes will reveal important 
aspects of his personality. If he is concerned or 
fearful about certain problems or has conflicts in 
certain areas, these may be reflected in his re- 
sponses to the Rorschach inkblots. How the indi- 
vidual perceives different blots or segments of the 
blots, as well as the content of his responses, may 
thus tell us some potentially important things
about his personality.


This will suffice as a brief, general overview of 
what the test aims to appraise. 


I am going to show you a series of inkblots, one
at a time. On each inkblot you will find a typed
statement of what one patient saw on this blot
and also what his two chief emotional problems
are. Each of these 30 cards represents a different
patient. You will see what 30 different patients
said they saw on a card. Now let me tell you what
I want you to do. Please carefully study each inkblot
and the statement of what the patient said
he saw in it. Also, study the statement of the
patient’s two severe emotional problems. When
everyone has looked at all of the cards, I’m going
to give you a questionnaire in which I will ask
you about the kinds of things seen by patients
with each kind of problem. [This paragraph is
from Chapman and Chapman, 1969, p. 276].
Is that clear? All right, then, we'll go ahead.


2 Chapman and Chapman, in their 1969 study, did
not designate the sex of the subjects. In the present
study, half of the subjects were male and half were
female, and they were randomly assigned to each
of the four training conditions. No further analysis
was done by sex.


Besides the above instructions, Groups 3 and 4 also
received these additional training instructions: 


Before we proceed, however, it is worthwhile to 
also present a few important cautions. Sometimes 
interpretations are made that are not based on 
actual empirical findings about the test. Instead, 
individuals may make interpretations which are 
based on what appear to be verbal or logical 
similarities between the content of the response 
and a certain type of personality problem. Let me 
give you some examples of what I mean. For 
example, since paranoid individuals are seen as 
suspicious, responses which depict “eyes” or “some- 
one staring” may be interpreted as indications of 
paranoid personality—even though this has not 
been really demonstrated. In a similar fashion, 
psychologists have interpreted responses depicting 
explosions or destruction as signifying aggression 
on the part of the person making such responses. 
In other words, these individuals are making some 
associational tie between the response and the 
interpretation—although such interpretations have 
not been supported by subsequent research. This 
is a common error in Rorschach interpretations 
and should be guarded against. You should be on 
your guard and alerted against making such ob- 
vious kinds of associations which seem to go 
together verbally but actually are not related.
Instead, try to base your conclusions on the per- 
sonality characteristics which have been found to 
be associated with certain kinds of responses to 
the ink blots in the material to be presented to 
you. Since this is a very important aspect, let us 
go over these points again. [This paragraph was
then repeated.] 


The cards were then circulated in a prearranged
pattern so that each subject saw each of the 30 cards
for 60 sec. The order of presentation was systematically
counterbalanced so that none of the content
categories (i.e., Wheeler 8, Wheeler 7, geography, anal,
food) appeared more than once in a sequence. After
the 30 cards were presented, the subjects were given
the following questionnaire, which is identical to
that used by Chapman and Chapman (1969, p. 276).


Some of the things in inkblots were seen by men
who had the following problem:
He had sexual feelings toward other men.
Did you notice any kind of general thing that was
seen most often by men with this problem?
Yes ___ No ___
If your answer is yes, name that kind of thing
and give one example of that kind of thing.
Kind of thing ___
Example ___


The identical format was followed for the other 
three diagnostic problems: “He has strong feelings 
of inferiority”; “he feels sad and depressed much of 
the time”; “he believes other people are plotting 
against him.” After the subjects completed this 
questionnaire, they were asked to list any and all of 
the reactions and hypotheses that they formulated 
while participating in the experiment. Following 
this, they were debriefed. 


Results 


To test the major hypothesis that special
training could affect the illusory correlation,
a 4 × 4 repeated measures analysis of variance
was performed on dichotomous data
(Winer, 1962) with repeated measures on the
second factor. The mean number of illusory
correlates seen by Group 1 (50% presentation
valid-invalid signs; no training) was
7.8; the mean for Group 2 (100% valid, 50%
invalid; no training) was 8.0; the mean for Group 3
(50% presentation valid-invalid; training)
was 9.5; and the mean for Group 4 (100%
presentation valid, 50% invalid; training)
was 9.3.

The main effect for the training groups
was nonsignificant (F < 1; MS for treatments
= .21, df = 3; MS error = .31, df = 56).
A significant main effect for diagnostic cue
was found (F = 19.1; MS for cue = 3.4, df =
3; MS error = .18, df = 168). The interaction
was also nonsignificant (F < 1).
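
The degrees of freedom and F ratios reported for this analysis can be rechecked from the published values. The following is a minimal sketch, assuming the standard split-plot layout (four independent training groups of 15 subjects, four diagnostic cues as the repeated factor); the variable and function names are illustrative only, and the mean squares are the rounded values given in the text.

```python
def f_ratio(ms_effect, ms_error):
    """F is the effect mean square divided by its error mean square."""
    return ms_effect / ms_error


n_groups, n_per_group, n_cues = 4, 15, 4

# Between-subjects part of the design.
df_groups = n_groups - 1                               # 3
df_subjects_within = n_groups * (n_per_group - 1)      # 56

# Within-subjects part of the design.
df_cue = n_cues - 1                                    # 3
df_interaction = df_groups * df_cue                    # 9
df_cue_error = df_cue * df_subjects_within             # 168

f_training = f_ratio(0.21, 0.31)   # training groups, tested against subjects within groups
f_cue = f_ratio(3.4, 0.18)         # diagnostic cue, tested against cue x subjects within groups

print(df_groups, df_subjects_within, df_cue, df_cue_error)
print(round(f_training, 2), round(f_cue, 1))
```

The recomputed training F falls below 1, as reported, and the cue F comes to roughly 18.9, which differs from the published 19.1 only by an amount attributable to rounding of the mean squares.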

The mean number of illusory correlates 
seen for the cue of sex was 11.5; for inferi- 
ority, M = 10.5; for depression, M = 3.5, 
and for paranoid, M = 9.0, with the differ- 
ence between the means of depression and the 
other three accounting for the significant F 
value. 

Table 1
Percentage of Illusory Correlates Given by All Subjects to Diagnostic Cues

                          Purported patient problems (diagnostic cues)
Response                  Inferiority   Sexual   Depression   Paranoid
Anal (invalid sign)             2          55         3           0
Wheeler (valid sign)           13          20         2          23
Geography                       5           2         5          20
Food                            3           0         5           3
Other                           2           0         5           3
“Looks small”                  45           0         3          12
No response                    30          23        77          40


Examining the data in the same way that
Chapman and Chapman (1969) did, a chi-square
was computed on the sexual variable
of homosexuality and broken into four categories:
(a) number of anal responses seen;
(b) number of combined valid Wheeler signs
seen; (c) number of filler items, that is,
geography, food, other; and (d) no illusory
correlation seen. Since only one response occurred
in Category 3, this category was
dropped and the response was placed in Category
4. The obtained frequency for Category
1 was 33; for Category 2, 12; and for collapsed
Categories 3 and 4, 15. The expected
frequencies were 20 in each category. The obtained
chi-square of 12.9 (df = 2) was significant
beyond the .005 level and was consistent
with the data reported by Chapman.
Thus, in this part of the study we completely
replicated part of the findings of Chapman
and Chapman (1969, Experiments 2 and 3).
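
The chi-square value reported for the sexual-problem cue can be recomputed from the stated frequencies. This is a minimal sketch, assuming the equal expected frequency of 20 per category given in the text; the function name is illustrative, and the tail probability uses the closed form that holds for a chi-square distribution with exactly 2 degrees of freedom.

```python
import math


def chi_square_gof(observed, expected):
    """Pearson goodness-of-fit statistic: sum of (O - E)^2 / E over categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Anal content, combined Wheeler signs, and collapsed filler / no-correlate categories.
observed = [33, 12, 15]
expected = [20, 20, 20]

chi2 = chi_square_gof(observed, expected)

# For df = 2 the upper-tail probability of chi-square is exp(-x / 2).
p_value = math.exp(-chi2 / 2)

print(round(chi2, 2), p_value < 0.005)
```

The three terms are 8.45, 3.20, and 1.25, so most of the departure from equal frequencies comes from the excess of anal-content responses.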
Due to the unexpectedly high number of
illusory correlates seen for the category of
inferiority as well as paranoia, all data were
cast into Table 1, which shows the percentage
of illusory correlates for the purported problems
of patients and the content of the Rorschach
area. Since the initial category of
“other” was composed predominantly of
subjects’ statements that said that the patient
had problems of inferiority because the
area seen by the patient was small or tiny,
this category was then divided into two:
“other” and “looks small.” Thus, a post hoc
category of “things that look small”³ was
added. The remainder of the other category
for problems of inferiority accounted for only
2% of the illusory correlation, whereas the
new category of “seeing things as small” accounted
for 45%. On the other hand, the
expected anal content far outweighed that of
the Wheeler signs for the problem of homosexuality.
Fifty-five percent of the subjects
gave the response “anal content” for what
they most often saw from men with sexual
problems. None of the subjects used the
category “looks small” in association with sexual
problems. There does not appear to be any
consistent pattern with the purported problem
of paranoia, as 20% of the subjects used
Wheeler signs, 20% used geography, and 12%
used the category “looks small.” A large
majority, 77%, saw no illusory correlates in relationship
to the problem of depression.

The frequency of percepts being character- 
ized as looks small suggests that the subjects 
may have invented an illusory correlate for 
the problem of inferiority. If this were so, it 
would be significant, since this concept was 
not linguistically built into the experimental 
material. 

Even though the data are convincing for
Chapman’s position, when one examines the
sex problems versus the Wheeler signs, it
appears from the overall data that the phenomenon
of illusory correlation may be much
more complex. Besides the area of content-triggered
interpretations, it would appear that
subjects try to find some meaning in the
clinical material presented. In the case of inferiority,
the subjects seem to have selected
an aspect of the blot configuration associated
with smallness and then associated this aspect
with “inferiority feelings” that were not, in a
strict sense, linguistically triggered by the
content. Also, considering the high number
of illusory correlates (i.e., 60%) given by
subjects to the problem of paranoia, it does
appear as if there is a strong tendency among
subjects to find meaning in clinical material
even if such meaning does not exist.


3 The category “seeing things as small” consisted
of the subjects’ responses to small details of the
Rorschach blot (see Beck, Beck, Levitt, & Molish,
1961) and statements of “small in area,” “small
things,” “tiny things,” and so on. A response was
not considered belonging to this category if any
other content was indicated, that is, “small island.”


Discussion 


In spite of the fact that attempts were 
made to influence the illusory correlation by 
providing a simulated training session for the 
subjects, it was not possible to reduce the il- 
lusory correlation. Under either the 100% or 
the 50% presentation conditions, training was 
no more effective than nontraining conditions 
in reducing the effect of this phenomenon. 
This finding is consistent with that of Chap- 
man and Chapman (1969) and Golding and 
Rorer (1972). It must be recognized, how- 
ever, that the training conditions in this study 
and in the other two just mentioned repre- 
sent a limited attempt to modify a phenome- 
non that appears to be basic in the way peo- 
ple conceptualize diagnostic problems. 
Whether more intensive training with con- 
stant feedback and with numerous examples 
of faulty associational thinking would reduce 
the illusory correlation is a question for fu- 
ture research. This study supports the Chap- 
man and Chapman (1969) position that the 
illusory correlation is a robust phenomenon. 

Although the training conditions were not 
effective in reducing the illusory correlation, 
we replicated part of the Chapman and Chap- 
man (1967) findings that the concept of 
“anality” was predominantly associated with 
preconceived problems of homosexuality re- 
gardless of training conditions and frequency 
of presentation of valid signs. A serendipitous
finding was also noted in which the subjects
appeared to create their own illusory corre-
late. That is, they created a concept of see-
ing “small things” and associated it signifi-
cantly with problems of inferiority. Since
there was no linguistic content in the stimu-
lus material per se, it can be hypothesized
that this represents a structural illusory cor-
relate, that is, small segments of the blot
rather than specific content as such. In other
words, illusory correlations may span more
than just simple linguistic associations. This
is an area that needs to be explored in further
research.

At the end of the experiment, all of the 
subjects were asked for their spontaneous 
comments, ideas, and expectations as to what
was being tested. These were recorded, and, 
although they do not allow any systematic 
quantitative analysis, they reveal that the 
majority of subjects expressed strong doubts 
as to whether the Rorschach had any valid 
meaning. In spite of this skepticism on the 
part of the subjects, the majority of them 
formed illusory correlations. The subjects 
also complained about the nature of this 
task, stating that it was excessively long and
boring. For Groups 2 and 4 the creation of 
nonrandom contingencies for the Wheeler 
signs means that nonrandom contingencies 
were created for the other three categories.
This experimental format, of course, is identi-
cal to that used by Chapman and Chapman.
None of the subjects, however, acknowledged 
this either in their responses to the valid 
Wheeler signs or in spontaneous comments 
in the debriefing. 

Another possible factor is the matter of 
potential response biases that may be im-
bedded in the form of the questionnaire used
by Chapman and Chapman. For instance,
Chapman and Chapman’s (1969) question-
naire asked: “Did you notice any kind of
general thing that was seen most often by
men with this problem? If your answer is
‘yes,’ name that kind of thing and give one
example of that kind of thing.” Such instruc-
tions clearly establish a strong demand char-
acteristic for subjects not only to find some-
thing but to pick the most easily remembered,
or salient, feature. Percepts like “anus”
clearly fall into this class. However, Golding
and Rorer (1972) used a form of prediction
feedback in their study that was expected to
lead to a greater reduction in the amount of
illusory correlation, but they found that the
illusory correlation remained a robust one.

Also neglected in the research is the ques-
tion of social pressures; that is, all subjects
were tested in groups. Perhaps some felt that
since others were writing down their com-
ments, they should also be finding things,
even though they themselves did not believe
that the task was meaningful. In other words,
the questionnaire asked whether people had
seen something, but the form of the question-
naire seems to have been interpreted by the
subjects as a demand to find meaning in some-
thing that they felt was predominantly puz-
zling. Future research should investigate this
by modifying the form or by leaving the task 
response more open-ended. Perhaps provid-
ing the subjects with a range of behavioral
statements about people with so-called prob- 
lems, allowing them to use an actuarially de- 
signed cue sort, or allowing them to describe 
what they thought the predominant problems 
of these subjects were would produce a sig- 
nificantly different picture with regard to 
illusory correlates. With these modifications, 
subjects conceivably might rely less on asso-
ciationally triggered responses that are paired
with the diagnostic symptoms in a highly
artificial, experimental paradigm.

Finally, it can be noted that the experi- 
mental arrangement developed by Chapman 
and Chapman (1969) does not closely ap-
proximate the traditional diagnostic process
of trained clinicians and may limit the gen-
eralizability of these studies to clinical situa-
tions. Additional research might focus on
attempts to do more in vivo studies of clini-
cians’ actual conceptualizations and investi-
gate whether the justification for the clinical
inferences can be reduced primarily to associ-
ational variables or whether in fact they rep-
resent rather illogical or irrational learning or
teaching that clinicians have incorporated 
during their graduate education.


Avoidance-Approach: The Fifth Basic Conflict


Seymour Epstein 
University of Massachusetts at Amherst 


The basic conflicts are almost always listed as approach-approach, approach- 
avoidance, avoidance-avoidance, and double approach-avoidance conflict. The 
possibility of avoidance-approach conflict, in which a steeper gradient of ap- 
proach intersects a gradient of avoidance, is ignored because it is assumed that 
approach gradients cannot be steeper than avoidance gradients, and that even if 
avoidance-approach conflict could exist, it would be of no interest, as the indi- 
vidual would simply stay away from the conflicting goal. It is demonstrated 
that there is no good reason for assuming that approach gradients cannot be 
steeper than avoidance gradients, and there is considerable evidence that they 
often are. It is further noted that individuals can be placed in situations not of
their choosing. If an individual with an avoidance-approach conflict were placed 
on the goal side of the intersection of the gradients, the person would enthusi- 
astically approach a goal that had previously been avoided. Thus, avoidance- 
approach conflict can account for ego-alien behavior, such as when a shy, sex- 
avoidant “model” boy commits a violent crime of passion. The implications and
causes of avoidance-approach conflict in everyday life are discussed.


In almost all textbooks of introductory and 
abnormal psychology, four basic conflicts are 
listed with accompanying diagrams. These are 
approach-approach, avoidance-avoidance, ap-
proach-avoidance, and double approach-
avoidance conflict. Particular attention is
then given to approach-avoidance conflict,
because it can account for a variety of clinical
phenomena, such as that an individual in 
such a conflict tends to become entrapped at 
a midpoint from a goal and to suffer, as a 
consequence, from a state of heightened drive. 
The entrapment is explained by the assump- 
tion that the gradient of avoidance is steeper 
than the gradient of approach. No considera- 
tion is given to the possibility of a conflict 
which I shall refer to as “avoidance-ap- 
proach” conflict, in which a steeper gradient 
of approach intersects a gradient of avoid-
ance. There are probably two reasons why
avoidance-approach conflict has been ignored.
One is that it is assumed that the gradient
of avoidance is necessarily steeper than the
gradient of approach. The other is that it is
assumed that even if a case could be made for
such conflict on theoretical grounds, it would
be of no practical interest, as an individual
with a flatter gradient of avoidance than of
approach would simply stay away from the
conflicted goal. It will be demonstrated in
this article that approach gradients can be
steeper than avoidance gradients, and that
there are important clinical phenomena, such
as ego-alien behavior, that can be accounted
for once this is recognized.

Lewin (1935), who originally formulated 
the concept of approach-avoidance conflict, 
simply stated without evidence that “the 
negative vector usually increases gradually in 
strength and finally becomes stronger than
the positive” (p. 90). It is not clear whether
he meant the word “usually” to indicate that
given a negative incentive, there is usually a
goal gradient, which would appear to be self-
evident, or whether it was to indicate that
the negative vector usually overtakes the posi-
tive. In the latter case, it would indicate that
Lewin did not believe that the positive vec- 
tor is always steeper than the negative one, 
but when it is, some interesting consequences 
follow. 

Miller (1944) also initially assumed with- 
out explanation that avoidance gradients are 
steeper than approach gradients. More re- 
cently, Miller (1959) revised his position and 
noted that there is no intrinsic reason for 
avoidance gradients to be steeper than ap- 
proach gradients, and that whether they are 
steeper or less steep depends on the extent to 
which the gradients are based on inner rela- 
tive to outer cues. He observed that in many 
of the studies in which steeper gradients of 
avoidance were found, the avoidance gradients 
were based on the delivery of punishment, an 
external source of stimulation, whereas the 
approach gradients were based on hunger, an 
internal source of stimulation. It appears that 
there is no logical basis for assuming that 
approach gradients cannot be steeper than 
avoidance gradients. 

As for empirical evidence on steepness of 
approach and avoidance gradients, it has 
been demonstrated that a number of factors 
(cf. review in Heilizer, 1977), such as stimu- 
lus differentiation (e.g., Bugelski & Wood- 
ward, 1951; Hearst, 1962; Saltz, Whitman, 
& Paul, 1963), number of trials (e.g., Elder, 
Kuehne, Clarke, & Larre, 1970; Elder, 
Kuehne, & Moriarty, 1970; Schroeder & 
Gerjuoy, 1965; Weiss, 1960), runway length 
(e.g., Clifford, 1973), and mental age (Tem-
pone, 1965), are directly related to steepness
of gradients. Among these, stimulus differ- 
entiation appears to be the most important 
factor, as it can account for the others. Thus, 
the finding that runway length is an impor- 
tant factor can be accounted for by the con- 
sideration that the greater the runway length, 
the more easily the end and the beginning of 
the runway can be differentiated (Saltz et al., 
1963). The influence of number of reinforced
trials can be attributed to an increase in
stimulus discrimination that occurs over trials,
and the influence of mental age can be at-
tributed to the more accurate discrimination
of children with higher mental age. Stimulus
differentiation can also account for Miller’s
observation that avoidance gradients have
been found to be steeper than approach gradi-
ents when avoidance is contingent on external
cues and approach on internal cues, as internal 
cues may not be as easily discriminated as 
external cues. 

It follows from the above that depending on 
such factors as training in stimulus discrimi- 
nation, number of reinforced trials, and de- 
gree to which motives are based on external 
relative to internal stimuli, approach gradi- 
ents can be less steep, more steep, or no dif- 
ferent in steepness from avoidance gradients. 
This conclusion is supported by a number of 
studies that have directly compared the steep-
ness of approach and avoidance gradients.
Some have reported avoidance gradients to be 
steeper than approach gradients (e.g., Brown, 
1948; Miller & Kraeling, 1952; Miller & 
Murray, 1952; Murray & Berkun, 1955), 
some have reported approach gradients to be 
steeper than avoidance gradients (e.g., Hearst, 
1960, 1962; Smith, 1965, 1969), and some 
have reported no difference in steepness of 
approach and avoidance gradients (e.g.,
Desiderato, Foldes, & Gockley, 1966; 
Gjesme, 1974; Hearst, 1960; Rigby, 1954). 

Following an analysis of the relative steep- 
ness of approach and avoidance gradients, 
Maher (1966) noted that there is no firm 
support for the assumption that avoidance 
gradients tend to be steeper than approach 
gradients. He observed that the classic studies 
of Brown (1948), widely cited as evidence for 
the assumption, have serious methodological 
flaws. In a study by Maher and Nuttall (in 
Maher, 1966), order of testing was found to 
influence the relative steepness of gradients, 
and Maher concluded that Brown’s finding 
of a steeper gradient of avoidance than of 
approach could be attributed to fatigue ef- 
fects associated with order of testing. Even in 
studies that report no significant difference in 
steepness of avoidance and approach gradi- 
ents, it should be considered that some sub- 
jects produced steeper gradients of approach 
and others of avoidance. That is, there are 
individual differences with respect to relative 
steepness of approach and avoidance gradients. 
In a study by Gewirtz (1959) that directly 
dealt with this issue, reliable individual dif- 
ferences were found in the relative steepness 
of approach and avoidance gradients. In a
recent study in our own laboratory (Losco &
Epstein, 1977), relative steepness of avoid-
ance and approach gradients was investigated
while holding magnitude and quality of in-
centive constant. Although a weak but significant
tendency was found for the avoidance gradi-
ent to be steeper than the approach gradient,
inspection of the results for individuals re-
vealed that the phenomenon was far from
uniform among subjects, with some produc-
ing much steeper approach than avoidance
gradients. It was further noted that the
mean tendency was not very robust and was
readily canceled out by incidental factors
such as motor exertion.

Figure 1. Approach-avoidance conflict. (The indi-
vidual approaches the goal when at a distance, vacil-
lates at the intersection of the gradients, and avoids
the goal when closer to it. The result is that the
individual becomes trapped by his or her own mo-
tives, being able neither to fulfill nor to abandon the
approach motive.)

Considering all the evidence together, it can 
safely be concluded that it is possible for 
approach gradients to be steeper than avoid- 
ance gradients. As already noted, Miller has 
come to the same conclusion. He has indi- 
cated that the only reason that he did not 
pursue its implications is that he found a 
steeper gradient of approach than of avoid- 
ance to be “less perspicuous than the other 
pattern” (Miller, 1959, p. 222). 

A pattern in which a steeper gradient of 
approach intersects a gradient of avoidance 
not only has significant implications in its own
right, but it is important to recognize it in
order to avoid confusion between approach-
avoidance and avoidance-approach conflict.
It is instructive, in this respect, to contrast
the major features of the two types of con-
flict. There are three major behavioral conse-
quences for an individual with an approach-
avoidance conflict. A person with such a
conflict approaches the conflicted goal at a dis-
tance, is blocked and vacillates at an inter-
mediate point at which the gradients inter-
sect, and retreats when closer to the goal (see
Figure 1). According to Miller, the most
efficacious treatment for approach-avoidance
conflict is to reduce the avoidance drive, as
attempting to force the individual to reach
the goal can induce unmanageable levels of
anxiety, and would very likely fail, since
avoidance tendencies might well mount more
rapidly than the combined approach tenden-
cies. Avoidance-approach conflict also has
three behavioral consequences. The individual
in an avoidance-approach conflict avoids the
goal when on the far side of the intersection
of the gradients, is hesitant and blocked, at
least momentarily, when at the point of inter-
section of the gradients, and exhibits an
accelerated approach response if placed on
the goal side of the intersection point (see
Figure 2). Thus, in both approach-avoidance
and avoidance-approach conflict, there are
manifestations of strong avoidance reactions
and of blocking and hesitation, although in
the case of avoidance-approach conflict the
blocking is of short duration, as movement in
either direction breaks the deadlock. Consid-
ering that what is near and far from a goal
and what is brief and enduring blocking are
matters of judgment, it is quite possible to
confuse approach-avoidance conflict with
avoidance-approach conflict, unless sufficient
information is obtained by observing behavior
at different points along time, distance, or
cue dimensions.
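
The contrast between the two types of conflict can be made concrete with a small numerical sketch. It is offered only as an illustration: the linear form of the gradients, the `Gradient` class, and the particular heights and slopes are assumptions introduced here and are not taken from any of the studies cited. Each gradient is treated as a tendency that is strongest at the goal and weakens with distance, and the sign of the net tendency indicates whether the individual moves toward or away from the goal.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Gradient:
    """A linear goal gradient (an illustrative assumption, not an empirical law)."""
    height: float  # strength of the tendency at the goal (distance 0)
    slope: float   # loss of strength per unit of distance from the goal

    def strength(self, distance: float) -> float:
        # A tendency cannot fall below zero once it has died out.
        return max(0.0, self.height - self.slope * distance)


def net_tendency(approach: Gradient, avoidance: Gradient, distance: float) -> float:
    """Positive values mean movement toward the goal, negative values away from it."""
    return approach.strength(distance) - avoidance.strength(distance)


def intersection(approach: Gradient, avoidance: Gradient) -> Optional[float]:
    """Distance at which the gradients cross, or None if one dominates throughout."""
    if approach.slope == avoidance.slope:
        return None  # parallel gradients never cross
    d = (approach.height - avoidance.height) / (approach.slope - avoidance.slope)
    # The crossing matters only if it lies away from the goal
    # and both tendencies are still active at that point.
    if d <= 0 or approach.strength(d) <= 0:
        return None
    return d


def conflict_type(approach: Gradient, avoidance: Gradient) -> str:
    """Name the conflict by which gradient is steeper, given that they intersect."""
    if intersection(approach, avoidance) is None:
        return "no conflict"
    if approach.slope > avoidance.slope:
        return "avoidance-approach"
    return "approach-avoidance"


# Hypothetical values chosen so that both pairs of gradients cross at distance 4.
classic = (Gradient(height=6.0, slope=0.5), Gradient(height=10.0, slope=1.5))
fifth = (Gradient(height=10.0, slope=1.5), Gradient(height=6.0, slope=0.5))

for label, (approach, avoidance) in (("classic", classic), ("fifth", fifth)):
    print(label, conflict_type(approach, avoidance), intersection(approach, avoidance))
    for distance in (6.0, 4.0, 2.0):
        print("  distance", distance, "net tendency",
              net_tendency(approach, avoidance, distance))
```

Read from the far distance inward, the net tendency for the pair labeled `classic` changes from positive to negative, which corresponds to approach at a distance and retreat near the goal. For the pair labeled `fifth` the order is reversed, so that an individual placed on the goal side of the intersection moves toward the goal.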

Perhaps the most interesting difference be-
tween approach-avoidance and avoidance-
approach conflict is that only in the former
case is the individual propelled into the con-
flict area by his or her own volition and re-
mains trapped there. As a result, in the ab-
sence of outside pressure, the person experi-
ences continuous tension. No such stable
equilibrium exists in avoidance-approach con-
flict.1 If left to his or her own volition, the
individual with an avoidance-approach con-
flict would simply remain out of the realm of
influence of the conflict and would therefore
not experience enduring stress. As already
noted, this is probably one of the reasons why 
avoidance-approach conflict has not at- 
tracted any attention. Further consideration, 
however, reveals that avoidance-approach 
conflict is not as benign as the above analysis 
suggests. First, consider that the area of 
conflict to be avoided could be a highly sig- 
nificant one, such as the experience of close 
relationships with others. The individual 
with such a conflict could avoid tension only 
by imposing severe restrictions on his or her 
realm of experience. It might be argued that 
the limitations are imposed only by the avoid- 
ance gradient, and nothing is added by postu- 
lating an avoidance-approach conflict. After 
all, what function can the approach gradient 
play if an individual will not voluntarily ex- 
pose himself or herself to its influence? The 
answer is that any extrinsic motive or cir- 
cumstance beyond the individual’s control 
could propel the individual to a point be- 
yond the intersection of the two gradients. In 
such a circumstance, the person would ex- 
hibit behavior that is radically out of char- 
acter with his or her normal behavior. The 
person would show an accelerating approach 
reaction to a goal that had been assiduously 
avoided up to then. 

If the approach motivation involved anti- 
social behavior, the behavior would appear 
as an ego-alien breakthrough of a destructive 
impulse. An example of such a case is the shy, 
inhibited “model” boy who, suddenly faced 
with a temptation that he had previously 
succeeded in avoiding, commits a bizarre 
crime of passion. On the other hand, if the
approach motive were a constructive one,
such as the expression of socially acceptable
heterosexual feelings, and had been avoided
because of an excessively broad fear gradient, 
then an accidental encounter from which 
withdrawal was difficult could be therapeutic. 
Once the individual was on the near side of 
the goal, there would be no further conflict, 
and the individual would eagerly approach a 
goal that had previously been avoided. With
repetitive exposure to the same situation, and
consequent goal attainment, the conflict
would gradually be extinguished.

Figure 2. Avoidance-approach conflict. (The indi-
vidual avoids the goal when at a distance, is mo-
mentarily blocked at the point of intersection of
the gradients, but, if placed by circumstances beyond
his or her control at a closer point, will rapidly and
intensely approach the goal and exhibit ego-alien
behavior.)

Given the significant consequences of 
pathological levels of avoidance-approach 
conflict, it is important to consider the condi-
tions in real life that could give rise to it. As
noted earlier, according to Miller, a critical 
condition for determining the relative steep- 
ness of approach and avoidance gradients is 
the degree to which response tendencies are 
influenced by inner relative to external cues. 
When a high proportion of inner cues is in- 
volved in an avoidance gradient, the gradient 
will be relatively flat and conducive to the 
development of avoidance-approach conflict. 
In humans, inner cues frequently consist of 
thoughts and images. To the extent that an 
individual is trained to believe that the ex- 
pression of an impulse in any form is bad, the 
individual will have a broad avoidance gradi-
ent. Now consider that the same individual’s
approach tendencies are minimally mediated
by inner responses, so that if impulses to
approach are aroused, it is apt to be because
of the presence of external cues. Such an in-
dividual would have a steeper approach than
avoidance gradient, because the avoidance
tendencies are internalized to a greater extent
than the approach tendencies, and this could
even be the case if the approach tendencies
were activated by a combination of physio-
logical and external cues, as in the sex drive.


1 This, of course, is also true of approach-approach
conflict and of avoidance-avoidance conflict when
there are no restraints against leaving the field.

Given sufficiently broad injunctions against 
the expression of an impulse in any circum-
stances, there is obviously no need to dis-
criminate between the conditions in which
the impulse should or should not be expressed.
It will be recalled that poor stimulus dis-
crimination is a major factor in producing
broad generalization gradients. Thus, whether
one wishes to analyze the comparative steep- 
ness of gradients by considering the relative 
contribution of inner to outer cues, or by con- 
sidering the role of stimulus discrimination, 
one arrives at the same conclusion, namely, 
that strong categorical prohibitions against 
the expression of an impulse foster broad 
avoidance gradients and are conducive to the 
development of an avoidance-approach con- 
flict. 

The relative strengths, or heights, of ap- 
proach and avoidance gradients are obviously 
critical factors in determining whether an 
avoidance-approach conflict will occur. If the
approach motive is weak enough compared to 
the avoidance motive, the avoidance motive 
will, of course, dominate it at all levels, and 
the individual will simply exhibit a general- 
ized avoidance tendency to all motive-rele- 
vant stimuli. The opposite will be the case if 
the approach motive is sufficiently strong 
compared to the avoidance motive. Conflict 
can only exist within a restricted range of 
relative strengths of the gradients that allows 
them to intersect. Given such conditions, the 
stronger the approach and avoidance drives, 
the more extreme the conflict-related phe- 
nomena that will be exhibited, and thus the 
greater the potential for extreme ego-alien 
behavior. It follows that the conditions for
producing pathological levels of avoidance-
approach conflict are those that foster the
internalization of intense and broad prohibi-
tions against inherently strong approach mo-
tives.
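
The restricted range of relative strengths referred to above can be stated explicitly if, purely for illustration, both gradients are assumed to be linear in distance from the goal. The symbols below are introduced here and do not appear in the sources cited: a and v are the heights of the approach and avoidance gradients at the goal (both positive), α and β are their slopes, and d is distance from the goal. The case shown is the one in which the approach gradient is the steeper.

```latex
\begin{align*}
A(d) &= \max(0,\; a - \alpha d), \qquad
V(d) = \max(0,\; v - \beta d), \qquad \alpha > \beta > 0, \\
A(d^{*}) = V(d^{*}) &\;\Longrightarrow\; d^{*} = \frac{a - v}{\alpha - \beta}, \\
d^{*} > 0 \ \text{and}\ V(d^{*}) > 0 &\;\Longleftrightarrow\; 1 < \frac{a}{v} < \frac{\alpha}{\beta}.
\end{align*}
```

On these assumptions, an approach motive with a/v no greater than 1 never exceeds the avoidance motive at any distance, one with a/v at or above α/β is at least as strong as the avoidance motive wherever avoidance is active, and only the intermediate ratios allow the gradients to intersect and so produce an avoidance-approach conflict.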

Given the above analysis of avoidance-
approach conflict, it follows that overcon-
trolled persons are more apt to exhibit bizarre
ego-alien behavior than undercontrolled per-
sons. This conclusion is supported by studies
of criminal violence, which show that the most
extreme crimes are committed by individuals
whose hostility is characteristically overcon-
trolled (cf. Blackburn, 1968; Haven, 1973;
Megargee, 1966, 1971; White, McAdoo, &
Megargee, 1971).

There are some interesting implications for 
psychotherapy that follow from an analysis 
of avoidance-approach conflict. For one, if 
the approach impulses are nondestructive, as 
in the case of certain sexual feelings, then
persuasion, coercion, and confrontation should
serve a useful role, as once the individual is
brought to the goal side of the point of inter-
section of the gradients, the conflict will be
resolved. This can be contrasted with ap- 
proach-avoidance conflict, in which the same 
techniques are contraindicated, as pressure to 
approach what is feared is apt to induce un- 
manageable levels of anxiety and therefore 
cause the patient to withdraw from therapy 
(cf. Dollard & Miller, 1950). Thus, it is im-
portant for the therapist to discriminate be-
tween avoidance-approach and approach-
avoidance conflict.

It also follows that the therapist must dis-
tinguish an avoidance-approach conflict in
which the approach motive is socially accept-
able from one in which it is destructive. It is
obviously unwise to encourage an individual
with the latter type of conflict to proceed to
the goal side of the intersection point of the
gradients or to lower the fear gradient, as
either would result in the expression of de-
structive behavior. It is noteworthy, in this
respect, that lowering the fear gradient is the
procedure recommended by Dollard and
Miller (1950) for treating approach-avoid-
ance conflict.

The method of choice for treating an
avoidance-approach conflict when the ap-
proach motive is potentially destructive is to
increase discrimination of the avoidance di-
mension so that selective avoidance reactions
against extreme and inappropriate expression
of the approach motive can be substituted for
a blanket prohibition against its expression in
any form. 

It is interesting to consider that the dif- 
ferences in therapeutic approaches recom- 
mended for approach-avoidance and avoid- 
ance-approach conflicts with socially accept- 
able and unacceptable motives correspond to 
major emphases in different schools of psy- 
chotherapy. For approach-avoidance con- 
flict, the method of choice recommended by 
Dollard and Miller (1950) is reduction of 
fear, or lowering of the avoidance motive. 
This appears to be appropriate when the ap- 
proach motive involves socially acceptable 
tendencies. For avoidance-approach conflict, 
when the approach motive is socially accept- 
able, the method of choice is to induce the 
individual to arrive at the goal side of the 
intersection of the gradients. Although re- 
duction of fear might ultimately work, other 
procedures, such as encouragement, coercion, 
and environmental manipulation, provide a
simpler and more efficient form of treatment. 
It is noteworthy that such procedures are 
frowned on by adherents of depth psychology, 
such as psychoanalysts, and by adherents of 
accepting forms of treatment, such as client- 
centered therapists, but are routinely prac- 
ticed by counseling psychologists, rational- 
emotive psychologists, and psychodramatists. 
For both approach-avoidance and avoidance-
approach conflicts in which the inhibited
drive is destructive, the method of choice is 
to teach discrimination, both with respect to 
the stimulus dimension and the response di- 
mension, with the aim of ultimately replacing 
an all-or-none system of total avoidance or 
total impulse expression with a system that 
facilitates modulated control. Schools of 
therapy that emphasize such an approach 
among other techniques include psychoanaly- 
sis, rational-emotive therapy, and, in some 
cases, behavior therapy.


Effective Ingredients in Psychotherapy:
Prediction of Outcome From Process Variables 


Beverly Gomes-Schwartz


This study was designed to examine the impact of (a) exploration of the psy-
chodynamic roots of patients’ conflicts, (b) warmth and friendliness of the
therapist-offered relationship, and (c) positiveness of patients’ attitudes toward
working in therapy on the outcome of brief therapy with 35 college males ex-
hibiting symptoms of depression, anxiety, and social introversion. Analyses of
process ratings for audiotaped segments from four sessions throughout the 
course of therapy revealed that the activities of therapists of differing theoret- 
ical orientations and of professional versus untrained, “inherently helpful” 
therapists could be distinguished. Although patients’ attitudes toward the ther- 
apist and patient involvement in the therapy process did not differ as a function 
of the type of therapist, the process dimension that most consistently predicted 
therapy outcome was patient involvement. Exploratory processes and therapist-
offered relationship had a lesser influence on outcome.


Questions about how psychotherapy works— 
what qualities in the patient, the therapist, and 
the process of their interaction contribute to 
the amelioration of the patient’s psychic 
distress—have generated considerable debate 
among proponents of varying systems of 
psychotherapy. Some of the issues that have 
prompted the widest discussion include (a) 
the relative importance of the patient- 
therapist relationship, as opposed to specialized 
techniques of intervention, and (b) the 
relevance of patients’ attitudes toward the 
therapist and the therapy process.


Relationship Versus Technique


Rogers (1957) asserted that the consistent 
communication of genuine warmth and em- 
pathic understanding by the therapist is 
sufficient to produce constructive personality 
change. As long as the therapist is able to 
offer the patient a warm human relationship, 
even the most recalcitrant psychotic patients 
can eventually be reached (Rogers, Gendlin, 
Kiesler, & Truax, 1967). 

In contrast, psychodynamic therapists have 
emphasized the importance of exploratory 
techniques—clarification, interpretation, con- 
frontation—for producing the cognitive and 
emotional insight considered instrumental for 
change (Bibring, 1954; Glover, 1955; Langs, 
1973). Even though a number of analytically 
oriented therapists have noted the significance 
of the patient-therapist relationship or thera-
peutic alliance (Greenson, 1967; Zetzel, 1956),
there are fundamental differences between 
most dynamic therapists and the Rogerians. 
The analytic therapist is cautioned to maintain 
the role of an expert healer rather than try to 
be a friendly or equal partner in an inter- 
personal relationship. Although the trust and 
rapport engendered by a good therapeutic 
relationship may be necessary to facilitate
engagement in an exploratory process, the
relationship itself traditionally has not been 
considered the primary force for change by 
analytic therapists. (For a divergent view 
from within the analytic mainstream on the 
importance of the therapeutic relationship, 
see Kohut, 1971.) 

Empirical analyses of the relative importance 
of the therapeutic relationship and exploratory 
techniques in promoting positive change have 
yielded equivocal results. Although some researchers
(e.g., Truax, 1963; Truax & Mitchell,
1971) were initially able to demonstrate that 
high levels of the relationship variables pro- 
posed as necessary and sufficient for thera- 
peutic change by client-centered theorists (i.e., 
accurate empathy, unconditional positive re- 
gard, and congruence) were correlated with 
therapy outcome, their results have not been 
replicated in more recent studies (cf. Beutler, 
Johnson, Neville, & Workman, 1972; Garfield 
& Bergin, 1971; Mullen & Abeles, 1971; 
Sloane, Staples, Cristol, Yorkston, & Whipple, 
1975). More significantly, serious criticisms of 
the conceptualization of the scales for measuring
“therapist-offered conditions” and the
methodology of much of the research have 
been raised (Chinsky & Rappaport, 1970; 
Gomes-Schwartz, Hadley, & Strupp, 1978; 
Gormally & Hill, 1974; Blackwood, Note 1). 

Research studying other measures of the 
therapeutic relationship has also yielded in- 
consistent results. Adult outpatients in ana- 
lytically oriented therapy (Feifel & Eells, 1963; 
Strupp, Fox, & Lessler, 1969) and clients at 
a college counseling center (Saltzman, Luetgert, 
Roth, Creaser, & Howard, 1976) who felt that 
their therapists were warm, understanding 
and respectful of them were more likely to be 
satisfied with their therapy experience and to 
manifest improvement than were patients who 
saw their therapists as indifferent, bored, or 
irritated. In contrast, ratings based on Fiedler’s 
(1950) measure of the “ideal therapeutic rela- 
tionship” bore a minimal relationship to out- 
come of analytically oriented group therapy 
(Parloff, 1961) and no relationship to outcome
... ial counseling (Gonyea, 1963; Lesser, 1961).

... related to therapy outcome. The value of
Malan’s (1976) finding that the frequency of 
interpretations linking early family experiences 
with the patient-therapist relationship (i.e.,
transference/parent links) was related to the
outcome of brief psychoanalytic therapy is
limited by serious methodological deficits, including
the use of session notes rather than
recordings or transcripts of the interviews and
ratings contaminated by raters’ knowledge of
treatment outcomes. In contrast to Malan’s
results, among a sample of similar adult outpatients
in dynamic therapy, Sloane et al.
(1975) found that the frequency of interpretive
statements was negatively correlated with improvement
on target symptoms. These conflicting
results suggest that the value of interpretive
techniques may be largely determined
by the patient’s response to the interpretation.
If a patient is generally resistant to self-exploration,
even the most perceptive interpretations
may be useless.

Unfortunately, the few studies of the rela- 
tionship between therapy outcome and the 
patient’s engagement in self-exploration (as 
measured from a client-centered perspective) 
have yielded equivocal results. Even though 
Truax and Carkhuff (1967) reported data from 
several unpublished studies substantiating
their hypothesis that greater self-exploration 
was related to positive outcome, others (Kurtz 
& Grummon, 1972; Sloane et al., 1975) have 
found no relationship between self-exploration 
and outcome. 


Patients’ Attitudes and Behavior in Therapy 


Unlike Rogers (1957), who asserted that 
virtually any patient could be successfully 
treated provided the therapist offered sufficient 
warmth, empathy, and genuineness, many others
have suggested that the ways in which patients
view their problems and the therapeutic enterprise
can influence patients’ responses to therapy
and, consequently, the benefits that they
derive from treatment. Dynamic therapists
(Castelnuovo-Tedesco, 1975; Malan, 1976;
Sifneos, 1972; Strupp, 1973) have asserted that
appropriate therapy candidates must have
both the capacity and the motivation to form
an intense interpersonal relationship with the
therapist and to withstand the stresses of insight-oriented
therapy. Although there is some
empirical evidence that variables such as ego
strength and motivation are related to outcome
(see reviews by Gomes-Schwartz et al., 1978;
Luborsky, Chandler, Auerbach, Cohen, &
Bachrach, 1971; Strupp & Bergin, 1969), the
amount of outcome variance that can be predicted
from such pretherapy measures has
generally been quite small (Auerbach, Lubor- 
sky, & Johnson, 1972; Fiske, Cartwright, & 
Kirtner, 1964). 

More striking data on the importance of 
patient attitudes come from studies of behavior 
in early therapy sessions. Patients who were 
involved in the therapy process from the outset 
of treatment—acknowledging their own re- 
sponsibility for changing their behavior and 
actively examining their feelings and experi- 
ences—were most likely to improve (Kirtner 
& Cartwright, 1958; Rice & Wagstaff, 1967; 
Saltzman et al., 1976). In contrast, patients 
who viewed their problems as externally im- 
posed or who distanced themselves from the 
therapy interaction through defensive maneuvers
such as intellectualization were unlikely
to benefit from therapy (Kirtner &
Cartwright, 1958; Rice & Wagstaff, 1967). 

Frank (1973) carried the argument for the 
salience of patient attitudes even farther by 


more important determinants of therapy out- 
come than the techniques that a therapist uses. 
Studies tapping such factors in therapy have 
yielded some support for Frank’s position. 
Although there is inconsistency in the litera- 
ture (Wilkins, 1973), a number of investigators 
have found a significant relationship between 
patients’ expectations that they will benefit
from therapy and treatment outcome (e.g., 
Friedman, 1963; Goldstein & Shipman, 1961; 
Lipkin, 1954; Martin, Sterne, Moore, & 
Friedmeyer, 1976). Furthermore, results from 
several outcome studies suggest that patients
(or subjects) who believe that the “placebo”
treatments they are receiving are potent therapeutic
interventions may experience symptom
relief comparable to that of treated patients
(Frank, Gliedman, Imber, Stone, & Nash, 
1959; Paul, 1966; Smith, 1976). 

These results suggest that the content of the 
therapist’s communications (i.e., interpretation,
reflection of feelings) may be less important
than the patients’ perceptions that
they are receiving help. A logical corollary of 
this idea is that any concerned, helpful listener, 
with or without professional training as a 
psychotherapist, can ameliorate psychic dis- 
tress if he or she offers the patient a warm, 
supportive relationship and if the patient 
has confidence in the “therapist’s” abilities 
(Torrey, 1972; Truax & Mitchell, 1971). 
Empirical data on the effectiveness of non- 
professionals or “inherently helpful” listeners 
have been conflicting, however. Findings that 
untrained or minimally trained nonprofes- 
sionals were potent therapeutic agents (Cark- 
huff & Truax, 1965a; Poser, 1966) can be 
contrasted with findings that interventions by 
some nonprofessionals actually exacerbated 
symptoms (Fo & O’Donnell, 1975; Sines, 
Silver, & Lucero, 1961). 


Reflection of Theoretical Differences in the 
Therapy Process 


Just as proponents of client-centered, psy- 
chodynamic, and “nonspecific factors” theories 
of the therapeutic change process differ in their 
conceptualization of effective therapeutic in- 
gredients, therapists subscribing to varying 
theoretical systems appear to differ in their 
practice of psychotherapy. Results from 
surveys of therapists’ usual practices (Rice, 
Gurman, & Razin, 1974; Sundland & Barker, 
1962; Wallach & Strupp, 1964) and from 
several studies of therapists’ behavior in actual 
or simulated interviews (Strupp, 1955, 1958, 
1960) have indicated that analytic therapists 
emphasized exploratory responses such as 
questioning and interpretation, whereas client-centered
therapists consistently relied on “reflection
of feelings.” Although data from
these studies also indicated that analytic 
therapists advocated a more formal, pro- 
fessional relationship with their patients than 
did Rogerian therapists, other investigators 
found no differences between “nondirective” 
(or humanistic) and psychodynamic thera- 
pists on variables such as warmth, empathy, 
and genuineness (Fischer, Paveza, Kickertz, 
Hubbard, & Grayston, 1975) or insensitivity, 
punitiveness, understanding, and acceptance
(Fiedler, 1950). 

Comparisons of professional therapists with 
nonprofessional “helpers” have indicated that 
psychotherapy training has an important 
impact on therapeutic techniques. In an 
analogue of the initial therapy interview, un- 
trained college students emphasized lead- 
ing responses, particularly direct questions 
(D’Augelli, Danish, & Brock, 1976). Similarly, 
Bohn (1965) found that in responding to tape-recorded
enactments of a “hostile,” “dependent,”
or “typical” client, relatively experienced
graduate student therapists most
often used responses categorized as restatement 
of content or clarification of feelings, whereas 
naive undergraduate “counselors” relied most 
heavily on reassurance, persuasion, direct ques- 
tioning, and forcing the topic. Using a similar 
set of recordings, Parsons and Parker (1968) 
found that psychiatric residents were signifi- 
cantly less directive than either senior medical 
students or college undergraduates. 

In contrast to the obtained differences in 
style of intervention, results from two studies 
indicated that nonprofessional therapists could 
not be distinguished from professional thera- 
pists on measures of “core conditions.” Cark- 
huff and Truax (1965b) found that with 
limited training, psychiatric aides were able 
to offer levels of warmth and empathy com- 
parable to those offered by advanced graduate 
students and experienced therapists. In initial
“therapeutic” interviews, untrained college
student volunteers were as warm, genuine, and
empathic as experienced psychiatrists and
psychiatric residents (Pope, Nudler, VonKorff,
& McGee, 1974).


Hypotheses 


In summary, there are varying definitions of 
the effective ingredients in psychotherapy. 
Client-centered theory emphasizes the curative 
powers of the good human relationship, whereas 
psychodynamic theory indicates that a good 
relationship is not sufficient to induce enduring 
personality change. Both the psychodynamic 
perspective and Frank’s (1973) concept of 
nonspecific factors take into account the role 
of patients’ attitudes, whereas client-centered 
theory carries with it the assumption that all 
patients are equally amenable to therapy. In
his reformulation of the concepts of
therapeutic influence, Strupp (1973) has suggested
that all three elements—the quality of
the relationship that the therapist offers, the
reconstructive learning experiences that the
therapist mediates, and the patient’s willingness
and capacity to engage in the therapeutic
interaction—are determinants of therapeutic
change. Given that there have been no attempts
to examine the relative contributions
of all three factors in a unified research design,
the present study was designed to

1. assess the influence of theoretical orientation
and psychotherapy training on the therapeutic
interactions of analytically oriented,
experiential, and untrained, inherently helpful
counselors. Experiential therapists and nonprofessional
therapists chosen on the basis of
their interpersonal skill (identified as alternate
therapists in this study) were expected to
maintain friendlier, more personal relationships
with their patients than analytic therapists.
Analytic therapists and their patients were
expected to engage in greater exploration of
underlying psychodynamics than either experiential
or alternate dyads. Alternate therapists
were expected to be more directive than professionals
of either theoretical orientation.

2. determine the relative impact on the
outcome of therapy of (a) engagement in exploratory
processes, (b) quality of the therapist-offered
relationship, and (c) degree of
patient involvement in the therapy interaction.
Expectations concerning the relative influence
of each dimension may be viewed as tests of
competing theories. From the psychodynamic
perspective, high levels of exploration (e.g., interpretation,
clarification) and high patient
involvement (e.g., willingness to communicate,
trust in the therapist, recognition of the patient’s
own responsibility for effecting change)
should be the best predictors of outcome.
From the Rogerian perspective, high levels of
warmth and personal involvement on the part
of the therapist should be the most important
predictor of change. Finally, from Frank’s
“nonspecific factors” perspective, both the
quality of the therapist-offered relationship
and the patient’s attitude toward therapy
should be the strongest determinants of change.
If the relationship between process and outcome
is similar across the three groups, this
would lend support to the hypothesis that 
common ingredients account for therapeutic 
change regardless of the theoretical orientation 
of the treatment. If, however, the therapists in 
each treatment group do differ on the process 
dimensions hypothesized as effective ingredi- 
ents of psychotherapy and some of these 
process dimensions are relatively more potent 
predictors of outcome, analytic, experiential, 
and alternate therapists should achieve differ- 
ent outcomes. 


Method 
Subjects 


Patients. The patients were 35 unmarried, male
college students with elevated scores (T > 60) on the
Depression (2), Psychasthenia (7), and Social Introversion
(0) scales of the Minnesota Multiphasic Personality
Inventory¹ (MMPI) who had participated in a
psychotherapy outcome study. Patients had been
directly referred from a university counseling center or
had responded to a letter announcing a special counseling
program designed to deal with difficulties in interpersonal
relations, anxiety, and shyness. In previous research,
similar elevations on the Depression and
Psychasthenia scales had proved to be valid indicators
of enduring psychological difficulties (Strupp &
Bloxom, 1975).

Patients were assigned on a rotational basis to either
a professional or a nonprofessional therapist. Ten
patients were seen by analytic therapists, 10 by experiential
therapists, and 15 by alternate therapists.
Patients were offered up to 25 sessions of therapy on
a once- or twice-a-week basis. Mean durations of
therapy for cases seen by analytic, experiential, and
alternate therapists were 18.9 (SD = 6.5), 16.3 (SD
= 6.9), and 17.4 (SD = 7.2) sessions, respectively.

Therapists. The professional therapists were four
male psychiatrists (M experience = 23.5 years) and
four male psychologists (M experience = 15.0 years)
who were respected clinicians within the Nashville
community. The psychiatrists (analytic therapists) all
identified psychoanalytic theoreticians as major professional
influences. In contrast, the psychologists (experiential
therapists) cited the writings of Carl Rogers
as having a major impact on their techniques.

The alternate therapists were seven experienced
(M years since PhD = 17.0) male college professors who
had been identified by university administrators, other
faculty, and students as teachers who were frequently
approached by students for personal counseling. These
professors were affiliated with a variety of academic
departments including mathematics, English, history,
and philosophy. Alternate therapists had been enlisted
to participate in a study ostensibly to determine
whether inherently helpful people without formal psychotherapy
training could aid college students in dealing
with their problems in much the same ways that
professional therapists did. They were instructed to 
behave in the therapy sessions as they usually did when 
students consulted them for personal advice or counsel- 
ing. They were specifically advised not to make a special 
effort to read about psychotherapy. 


Instruments 


Process scales. To measure the process of psycho- 
therapy, it was necessary to use an instrument that was 
sufficiently sensitive to capture the quality of the inter- 
action, yet did not require the rating of dimensions 
so abstract as to preclude reasonable interrater agreement.
The Vanderbilt Psychotherapy Process Scale
(VPPS; Strupp, Hartley, & Blackwood, Note 2), an
84-item, Likert-type scale, adapted from earlier work
by Orlinsky and Howard (1967) to rate the therapy 
hour from the perspective of a clinical observer, seemed 
an appropriate instrument. 

In an earlier study with a smaller sample of subjects, 
eight internally consistent subscales derived from the 
instrument by a priori content analysis successfully 
discriminated among the three treatment groups— 
analytic, experiential, and alternate (Gomes-Schwartz 
& Schwartz, 1978). However, these original scales did 
not seem to precisely tap the process dimensions that 
were hypothesized as predictors of therapy outcome. To 
obtain another perspective on how the individual items 
might be related, a principal components factor analysis 
with varimax rotation² was performed on the data from
the Gomes-Schwartz and Schwartz (1978) study. 
Rotated factors with eigenvalues > 1 were defined by 
items loading > .50. 

From these factors seven scales were derived that
tapped dimensions hypothesized as predictors of outcome
and that proved to be internally consistent and
reliably rated in the present study. (a) Patient Exploration
(7 items, coefficient α = .83, interrater r = .88)
gauged the patient’s level of self-examination and exploration
of feelings and experiences. (b) Therapist
Exploration (7 items, α = .91, r = .93) gauged the
degree to which the therapist attempted to examine the
psychodynamics underlying the patient’s problems.
(c) Patient Participation (7 items, α = .86, r = 16)
tapped the degree to which the patient was actively
engaged in the therapy interaction (e.g., initiating
discussions, not inhibited, etc.). (d) Patient Hostility
(6 items, α = .84, r = .82) measured the level of negativism,
hostility, or distrust displayed by the patient.
(e) Therapist Warmth and Friendliness (10 items,
α = .83, r = .60) measured the therapist’s warmth,
caring, and emotional involvement with the patient.
(f) Negative Therapist Attitude (3 items, α = .65,
r = .85) gauged attitudes that might be expected to
threaten or intimidate the patient. (g) Therapist
Directiveness (5 items, α = .88, r = .83) measured
directive interventions such as offering advice and
modeling behavior. Although this scale was expected to
discriminate professional from nonprofessional therapists,
a relationship between Therapist Directiveness
and treatment outcome was not predicted.


¹ Scores on Scales 2, 7, and 0 were not necessarily
the only elevations or the highest elevations on the
patient’s profile. Thus not all of the patients could be
categorized as 2-7-0s.

² Although factor analysis is often of questionable
value with small samples, it was felt that this technique
might generate some meaningful combinations that
had been overlooked in the a priori content analysis.
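
The internal-consistency figures quoted for each scale are coefficient alpha values. As a minimal sketch of how such a coefficient is computed, the following uses the usual formula, alpha = k/(k - 1) × (1 - sum of item variances / variance of total scores). The function name and the small ratings matrix are hypothetical illustrations, not data or code from the study.

```python
from statistics import variance


def coefficient_alpha(item_scores):
    """Cronbach's coefficient alpha.

    item_scores: one row per rated segment, one column per scale item.
    """
    k = len(item_scores[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Sample variance of each item (column) across segments.
    item_vars = [variance(col) for col in zip(*item_scores)]
    # Sample variance of the summed scale score.
    total_var = variance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical ratings: 5 segments x 3 items (not data from the study).
ratings = [
    [4, 5, 4],
    [2, 2, 3],
    [5, 4, 5],
    [1, 2, 1],
    [3, 3, 3],
]
alpha = coefficient_alpha(ratings)
print(round(alpha, 2))
```

Items that rise and fall together across segments push the variance of the total well above the sum of the item variances, which is what drives alpha toward 1.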

Outcome measures. In previous psychotherapy research
(cf. Cartwright, Kirtner, & Fiske, 1963; Garfield,
Bergin, & Prager, 1971), correlations between
outcome indices have generally been low. Therefore,
it seemed important to assess treatment effects from
several perspectives. Three global and three individualized
measures of outcome were selected.

At the conclusion of treatment, both the therapist
and an experienced clinical interviewer rated change
(6-point Likert-type scales) in (a) severity of the
patient’s problems; (b) level of the patient’s distress;
and (c) quality of the patient’s functioning in his social,
work, and academic roles. Scores on the three items were
summed to yield, respectively, therapists’ and clinicians’
overall ratings of improvement. Global improvement
from the patient’s perspective was assessed
through residual gain scores on a Minnesota Multiphasic
Personality Inventory (MMPI) index of maladjustment
(Cooke, 1967). To obtain the MMPI maladjustment
score, each of the standard MMPI scales
plus Welsh A, Welsh R, and Barron Es were assigned
beta weights and summed.

Patients’, therapists’, and clinicians’ ratings of improvement
on the target complaints that patients had
presented at the outset of therapy were used as more
individualized outcome indices. Since patients had
been permitted to specify up to three problems that
they wished to resolve in therapy, scores were based
on the average change across all of the patient’s targets.


Procedure 


Two advanced clinical psychology graduate students
who were not familiar with any of the therapists in the
study were selected as raters to reduce the possibility
of ratings being biased by knowledge that a therapist
was not a professional or by raters’ expectations about
a therapist based on prior experiences with him.

To provide a broad representative sample of the
interaction, audiotaped segments from the third interview,
the interviews one half and three fourths of the
way through treatment, and the next-to-last interview
were selected for rating. Starting at a randomly determined
point in each recording (e.g., 11, 19, 13 minutes
from the beginning), the succeeding 10-minute segment
was recorded onto a master tape such that the segments
from all cases and all sessions were presented in random
sequence. Independent ratings were made at the conclusion
of each 10-minute segment. Since there were
no systematic differences in process scores attributable
to the time sequence of the segment, all Fs(3, 96) <
1.44, ns, or the interaction of time sequence and
treatment group, all Fs(6, 96) < 2.01, ns, overall scores
for each process scale were obtained by summing across
both the two raters and the four segments.


Table 1 

Multiple Discriminant Analysis for Process Variables 

                                          Treatment group
Process variable                  Analyticᵃ  Experientialᵃ  Alternateᵇ  Univariate Fᶜ
Patient Exploration
  M                                138.70      120.30         93.13        …
  SD                                14.83       16.33         15.93
Therapist Exploration
  M                                166.10      130.30         90.80        …
  SD                                  …           …             …
Patient Participation
  M                                196.60      192.30        203.87       <1
  SD                                24.95       24.90         23.72
Patient Hostility
  M                                   …        100.20         86.00       1.49
  SD                                  …           …             …
Therapist Warmth and Friendliness
  M                                190.40      227.50        224.80        …
  SD                                17.93       29.76         21.65
Negative Therapist Attitude
  M                                 43.50       39.60         36.93        …
  SD                                  …           …             …
Therapist Directiveness
  M                                 53.10       88.00           …          …
  SD                                12.53       13.72           …

Note. Wilks’ Λ = .08; F(14, 52) = 9.22, p < .0001.
ᵃ n = 10.
ᵇ n = 15.
ᶜ df = 2, 32.
* p < .05.
** p < .01.
*** p < .001.


Results 
Differences Among the Treatment Groups 


Process variables. To test the hypothesis 
that therapy interactions differed according to 
the theoretical orientation or professional- 
nonprofessional status of the therapist, differ- 
ences among the three groups—analytic, ex- 
periential, and alternate—on the seven process 
scales were assessed through a multiple dis- 
criminant analysis. This analysis, presented in 
Table 1, yielded a significant overall difference 
among the groups (Wilks’ Λ = .08), F(14, 52)
= 9.22, p < .0001. Examination of differences
on individual variables using Dunn’s multiple
comparison procedure (Dunn, 1961) revealed
that the hypothesized relationships among 
treatment groups were generally obtained. 

Both alternate and experiential therapists
showed greater Therapist Warmth and
Friendliness than analytic therapists (p < .01).
Analytic cases received higher scores than
either experiential or alternate cases on
Therapist Exploration (ps < .05 and .01, respectively)
and Patient Exploration (ps < .01
and .001, respectively). In addition, the experiential
group received higher ratings than
alternates on both Patient Exploration
(p < …) and Therapist Exploration (p < .01).
The hypothesis that alternate therapists
would receive higher ratings than analytic and
experiential therapists on Therapist Directiveness
received partial support. Alternate
therapists were more directive than analytic
therapists (p < .01). However, experiential
therapists did not differ from alternates and
were significantly more directive than the
analytic group (p < .01).

No differences among groups were obtained
on ratings of Patient Participation, Patient
Hostility, and Negative Therapist Attitude.
Table 2 

Multiple Discriminant Analysis for Outcome Variables 

                               Treatment group
Outcome variableᵃ       Analyticᵇ  Experientialᵇ  Alternateᶜ
Overall ratings
  Clinicians
    M                     6.20        5.15          5.40
    SD                    3.51        3.84          2.84
  Therapists
    M                     6.10        5.30          4.93
    SD                    3.77        4.65          2.23
  MMPI maladjustment
    M                    50.03       50.82         49.43
    SD                   11.41        8.18         10.05
Target complaints
  Patients
    M                     2.56        1.92          2.19
    SD                    1.27        1.65          1.68
  Therapists
    M                     1.76        1.81          2.36
    SD                    1.28        1.64          1.12
  Clinicians
    M                     1.73        1.69          1.97
    SD                    1.25        1.79          1.90

Note. Wilks’ Λ = .69; F(12, 54) = .92, ns. All univariate
Fs(2, 32) < 1.
ᵃ For all variables except Minnesota Multiphasic
Personality Inventory (MMPI) maladjustment,
higher scores indicate greater change.
ᵇ n = 10.
ᶜ n = 15.

Outcome variables. A discriminant function 
analysis comparing the treatment groups on 
the six outcome criteria yielded neither a 
significant overall difference (Wilks’ Λ = .69),
F(12, 54) = .92, ns, nor differences on indi- 
vidual outcome criteria (see Table 2). 
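
The F ratios reported alongside Wilks’ Λ can be related to Λ by a standard conversion. With exactly three groups the hypothesis has 2 degrees of freedom, and in that case F = [(1 - √Λ)/√Λ] × (df2/df1), with df1 = 2p and df2 = 2(N - g - p + 1), where p is the number of dependent variables, N the total sample size, and g the number of groups. The sketch below applies this textbook formula to the rounded Λ values given in the text. It is an illustration, not a record of how the original analysis was run, and the function name is hypothetical.

```python
import math


def wilks_to_f_three_groups(wilks_lambda, n_total, n_vars):
    """Convert Wilks' lambda to an F ratio for a three-group design."""
    groups = 3
    error_df = n_total - groups                 # 35 - 3 = 32 in this design
    df1 = 2 * n_vars
    df2 = 2 * (error_df - n_vars + 1)
    root = math.sqrt(wilks_lambda)
    f_value = ((1 - root) / root) * (df2 / df1)
    return f_value, df1, df2


# Outcome analysis: six outcome criteria, N = 35, lambda = .69.
f_outcome, out_df1, out_df2 = wilks_to_f_three_groups(0.69, 35, 6)

# Process analysis: seven process scales, N = 35, lambda = .08.
f_process, proc_df1, proc_df2 = wilks_to_f_three_groups(0.08, 35, 7)

print(round(f_outcome, 2), (out_df1, out_df2))
print(round(f_process, 2), (proc_df1, proc_df2))
```

The degrees of freedom come out as (12, 54) and (14, 52), matching those reported. For the outcome analysis the rounded Λ gives an F of about .92. For the process analysis the rounded Λ = .08 gives a value somewhat above the reported 9.22, which is consistent with Λ having been rounded to two decimals.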


Prediction of Outcome from Process Dimensions 


The relationship between psychotherapy 
process and outcome was examined in several 
ways to determine (a) which process dimen- 
sions (i.e., exploratory processes, patient in- 
volvement, therapist-offered relationship) were 
the best predictors of outcome, (b) whether 
the relationships between process dimensions
and outcome variables were primarily due to
the contribution of individual dimensions or
to the interactions among the dimensions, and
(c) whether the relationship between process
and outcome was consistent across treatment
groups.


Table 3 

Multiple Regression Predicting Outcome from Process Dimensions 

                                                                      Therapist-offered
                        Exploratory processes   Patient involvement     relationship
Outcome variable          R     R²     F          R     R²     F          R     R²     F
Overall ratings
  Clinicians             .26   .04     …          …     …      …         .28   .05    1.32
  Therapists             .45    …      …          …     …      …         .33   .08    1.97
  MMPI maladjustment     .13  −.01    <1         .29   .05    1.44       .30   .06    1.62
Target complaints
  Patients               .14  −.01    <1          …     …      …         .10  −.02    <1
  Therapists              …     …      …         .63   .38   10.80***    .51   .24    5.56
  Clinicians             .10  −.02    <1         .38   .12    2.70       .27   .05    1.31

Note. MMPI = Minnesota Multiphasic Personality Inventory. dfs for F tests = 2, 32. Reported R² is the
unbiased estimate of population R².
* p < .05.
** p < .01.
*** p < .001.

Since each of the process dimensions hy- 
pothesized as a predictor of outcome was tapped 
by two process scales (i.e., exploratory pro- 
cesses = Patient Exploration and Therapist 
Exploration; patient involvement = Patient 
Participation and Patient Hostility; therapist-offered
relationship = Therapist Warmth and
Friendliness and Negative Therapist Attitude), 
multiple regression analyses predicting each 
of the outcome variables from each pair of 
process scales were performed. Results from 
these analyses, presented in Table 3, indicated 
that patient involvement was consistently the 
best predictor of outcome. Multiple correla- 
tions of patient involvement with clinicians’ 
and therapists’ overall improvement ratings 
and therapists’ ratings of improvement in 
target complaints were significant. In addition, 
multiple correlations of involvement with im- 
provement in patients’ and clinicians’ target 
complaints approached significance (p < .10). 
In contrast, exploratory processes predicted 
only therapists’ overall improvement ratings
(p < .05), and therapist-offered relationship
predicted only improvement in therapists’ 
target complaints (p < .01). 
Not only did patient involvement bear a
significant relationship with more outcome
variables than the other process dimensions, but in each
case in which another process dimension was
significantly correlated with outcome,
patient involvement accounted for more of the
variance in outcome ratings (i.e., 30% vs. 18%
for exploratory processes and 38% vs. 24%
for therapist-offered relationship).
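
The percentages of variance quoted here are shrinkage-corrected R² values, and each multiple correlation was tested with an F ratio on 2 and 32 degrees of freedom. The sketch below shows both computations for a regression with k = 2 predictors and N = 35 cases. The article does not say which correction formula was used. The (N - 1)/(N - k) variant shown is an assumption, chosen only because it agrees to two decimals with the legible entries of Table 3 (for example, R = .63 gives about .38). The function names are hypothetical.

```python
def f_from_multiple_r(r, n, k):
    """F ratio for a multiple correlation r with k predictors and n cases."""
    r2 = r * r
    return (r2 / k) / ((1 - r2) / (n - k - 1))


def shrunken_r2(r, n, k):
    """Shrinkage-corrected R^2, using the (n - 1) / (n - k) variant.

    Assumption: the source does not name its correction formula.
    """
    return 1 - (1 - r * r) * (n - 1) / (n - k)


N_CASES = 35       # patients in the study
N_PREDICTORS = 2   # each process dimension was tapped by two scales

for multiple_r in (0.63, 0.51, 0.45):
    corrected = shrunken_r2(multiple_r, N_CASES, N_PREDICTORS)
    f_ratio = f_from_multiple_r(multiple_r, N_CASES, N_PREDICTORS)
    print(multiple_r, round(corrected, 2), round(f_ratio, 2))
```

Because the tabled R values are themselves rounded, F ratios recomputed this way land near, but not exactly on, the tabled F values.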

Since the three process dimensions cannot be
regarded as independent (e.g., the patient’s
attitude may be influenced by the therapist’s
behavior), it was necessary to determine
whether the significant multiple correlations
represented the interactive effects of more than
one process dimension or were primarily determined
by a single dimension. Thus, partial
correlations between each process dimension
and each outcome variable with the effects of
the remaining two process dimensions partialed
out were computed (see Table 4). These analyses
revealed that patient involvement exclusive
of the influences of both exploratory
processes and therapist-offered relationship
showed a consistent relationship with outcome.
Four of the six partial correlations between
involvement and outcome variables were
significant. In contrast, none of the partial
correlations between exploratory processes or
therapist-offered relationship and outcome approached
significance.
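
A second-order partial correlation of the kind described here can be built up recursively from ordinary correlations: r(xy.z) = (r(xy) - r(xz) r(yz)) / √[(1 - r(xz)²)(1 - r(yz)²)], applied once for each variable held constant. The sketch below is a minimal illustration under that standard formula. The correlation matrix is hypothetical and the function names are not from the source. With N = 35 and two variables partialed out, the test of each coefficient has 35 - 2 - 2 = 31 degrees of freedom, the value given in the note to Table 4.

```python
import math


def partial_corr(corr, i, j, controls):
    """Partial correlation of variables i and j, holding `controls` constant.

    corr is a symmetric correlation matrix given as a list of lists.
    """
    if not controls:
        return corr[i][j]
    *rest, last = controls
    r_ij = partial_corr(corr, i, j, rest)
    r_il = partial_corr(corr, i, last, rest)
    r_jl = partial_corr(corr, j, last, rest)
    return (r_ij - r_il * r_jl) / math.sqrt((1 - r_il ** 2) * (1 - r_jl ** 2))


def t_for_partial(r, n, n_controls):
    """t statistic and degrees of freedom for a partial correlation."""
    df = n - 2 - n_controls
    return r * math.sqrt(df / (1 - r * r)), df


# Hypothetical correlations among: 0 = outcome, 1 = patient involvement,
# 2 = exploratory processes, 3 = therapist-offered relationship.
corr = [
    [1.0, 0.5, 0.3, 0.0],
    [0.5, 1.0, 0.4, 0.0],
    [0.3, 0.4, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]
r_partial = partial_corr(corr, 0, 1, [2, 3])
t_value, df = t_for_partial(r_partial, 35, 2)
print(round(r_partial, 3), round(t_value, 2), df)
```

In this made-up matrix the fourth variable is unrelated to the others, so partialing it out leaves the first-order result unchanged, which gives a quick sanity check on the recursion.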

To determine whether the observed relationships
between process and outcome were similar
for the three treatment groups (analytic,
experiential, and alternate), it was necessary
to test whether the multiple regression slopes
depicting the relation of the three process dimensions
to each outcome variable differed
across treatment groups. F tests for homogeneity
of regression yielded no significant
effects, all Fs(6, 23) < 1.59, ns, indicating
that the relationship between process and outcome
was consistent regardless of the theoretical
orientation or professional status of the
therapist.


Table 4 

Partial Correlations Between Process Dimensions and Outcome Variables 

                                                                       Therapist-offered
Outcome variable         Exploratory processes   Patient involvement     relationship
Overall ratings
  Clinicians                      …                      …                    …
  Therapists                     .20                     …                    …
  MMPI maladjustment            −.03                    .29                   …
Target complaints
  Patients                       .14                     …                    …
  Therapists                     .16                     …                    …
  Clinicians                    −.02                    .25                   …

Note. N for all correlations = 35, df = 31. Scores for process dimensions are weighted combinations of process
scale scores based upon beta values from multiple regression analyses. MMPI = Minnesota Multiphasic
Personality Inventory.
* p < .05.
** p < .01.


Discussion 


The primary findings in this study were as 
follows: (a) Theoretical orientation and professional-nonprofessional
status of the therapist
had an impact on the process but not the
outcome of psychotherapy; and (b) therapy
outcome was most consistently predicted by
the patient’s willingness and ability to become 
actively involved in the therapy interaction—a 
dimension of therapy process that did not 
distinguish among the three treatment groups. 

As predicted, professionally trained thera- 
pists, particularly those with an analytic 
orientation, and their patients invested more 
effort in uncovering the psychodynamic roots 
of the patient’s problems than did the dyads 
led by nonprofessional therapists. Therapists 
with experiential training and the untrained 
counselors offered warmer, more personal rela- 
tionships with their patients than did thera- 
pists who assumed an analytic stance. These 
findings are largely consistent with the psy- 
chotherapeutic theories to which the thera- 
pists subscribed. The analyst has traditionally 
been taught to remain aloof—an expert healer 
rather than a warm friend (Langs, 1973). In 
contrast, warmth, empathy, and genuineness 
are considered to be the fundamental tools 
of the client-centered therapist (Rogers, 1957). 
Perhaps the emergence of “self” theory in 
psychoanalysis with its increased focus on the 
importance of the therapist’s capacity to 
respond empathically (see, for example, Kohut, 
1971) may eventually challenge traditional 
notions concerning the appropriate analytic 
posture and blur some of the distinctions 
between client-centered and dynamic thera- 
pists’ self-presentations in therapy. However, 
as the behavior of the analytic therapists in 
this study may illustrate, the therapist who 
has assumed the traditional passive, non- 
demonstrative role for many years may find 
it difficult to relinquish. 

Even though the therapists for each of the 
three treatment groups behaved quite differ- 
ently, these differences did not seem to in- 
fluence patients’ attitudes toward therapy or 
the therapist. Patients were as likely to become 
involved in the therapy process regardless of 
whether they saw analytic, experiential, or 
alternate therapists. The fact that the patient’s 
willingness to ally himself with the therapist 
and work at changing was not influenced by 
the theoretical orientation and professional 
status of his therapist may be of particular im- 
portance for understanding why there were 
no differences among the groups on treatment 
outcome. These patient characteristics that did 
not distinguish the three treatment groups 
consistently emerged as the best predictors of 
therapy outcome. Patients who were not hostile 
or mistrustful and who actively contributed 
to the therapy interaction achieved greater 
changes than those who were withdrawn, 
defensive, or otherwise unwilling to engage in 
the therapy process. 

How can the findings from this study best be 
accommodated with theories of therapeutic 
influence? The results are clearly congruent 
with Frank’s (1973) theory of “nonspecific 
factors” as the determinants of outcome. Given 
mildly to moderately disturbed patients in 
short-term therapy (an average of 17.4 
sessions), untrained professor/therapists gen- 
erally effected as much improvement as 
experienced psychologists and psychiatrists. 
Further, the variables that best predicted 
change were not related to therapeutic tech- 
niques but to the positiveness of the patient’s 
attitude toward his therapist and his com- 
mitment to work at changing. 

The present findings are also consonant with 
the ideas of specialists in brief psychoanalytic 
therapy that those patients who have the 
willingness and the adaptive resources to 
work with the therapist to resolve their 
problems are most likely to profit from short- 
term therapy (Castelnuovo-Tedesco, 1975; 
Malan, 1976; Sifneos, 1972). Although there is 
little evidence in this study that the use of 
exploratory technique also strongly influenced 
outcome as analytic theory would suggest, it 
is possible that technique variables might have 
been more salient if treatment were of longer 
duration (e.g., the 40-session limit proposed 
by Malan, 1976, or the maximum of 1 year 
suggested by Sifneos, 1972) or if only patients 
“appropriate” for brief dynamic therapy (i.e., 
those with sufficient motivation and ego re- 
sources) were selected for treatment. If diffi- 
culties in maintaining self-esteem, which a 
considerable portion of the patients in this 
study exhibited, can be viewed as indicators 
of narcissistic pathology, 
tion of the therapist may also aid in explaining 
why proficiency in dynamic exploration bore 
so little relation to outcome in this sample. 

If the patient’s capacity and willingness to 
participate in the therapy interaction are 
among the most important determinants of 
improvement in short-term therapy, one of the 
aims of future research should be to determine 
whether the patient’s ability 
or inability to collaborate reflects (a) relatively 
stable personality characteristics or (b) re- 
sponse to psychonoxious attitudes and be- 
haviors on the part of the therapist. 

If, as some therapists (Castelnuovo-Tedesco, 
1975; Sifneos, 1972) and researchers (Kirtner 
& Cartwright, 1958; Rice & Wagstaff, 1967) 
have suggested, the patient’s ability to actively 
work toward resolving problems is a reflection 
of an enduring character structure or life view, 
researchers may be able to detect evidence of a 
generalized sense of hostility or passive in- 
difference in the patient’s interactions with 
others (e.g., an intake interviewer). Finding 
that the degree to which patients become 
positively involved in the therapy process was 
influenced by long-standing personality char- 
acteristics would have important implications 
for therapy practice. 

One option for maximizing the effectiveness 
of psychotherapy would be to select only those 
patients who evidence a capacity to actively 
participate in a therapeutic interaction. Rather 
than offering psychotherapy, particularly un- 
covering therapy, to all patients who present 
themselves at a clinic or community mental 
health center, clinicians might consider alter- 
native interventions for applicants who did 
not appear to have the capacity to ally them- 
selves with a therapist or to assume a 
great deal of the responsibility for changing 
themselves. 

Another approach might be to alter some of 
the behaviors that prevent patients from suc- 
ceeding in psychotherapy. If patients’ abilities 
to become involved in the therapy process were 
as much a product of inappropriate expecta- 
tions about the psychotherapy enterprise as a 
lack of willingness to take responsibility for 
their own behavior, role-induction procedures 
(cf. Hoehn-Saric et al., 1964; Strupp & Bloxom, 
1973) might be useful to demonstrate to pro- 
spective patients what types of behaviors 
would be expected of them in therapy. 

Even if enduring attitudes were a major 
determinant of a patient’s capacity to become 
involved in therapy, there are very likely some 
therapist behaviors that can prompt or exacer- 
bate negativism, defensiveness, or apathy. 
Determining the therapist behaviors, attitudes, 
or character traits that interact with patient 
attitudes to influence involvement may have 
important implications for the selection and 
training of therapists and for the “matching” 
of optimal patient-therapist pairs. 

In conclusion, the results of this study have 
illustrated that meaningful measures of psy- 
chotherapy process are possible, and that out- 
come can be predicted from the process of the 
therapy interaction. However, additional ques- 
tions about the role of the patient and the 
therapist in determining the course of therapy 
have been raised. It is reasonable to hope that 
some of these questions can be answered 
through continued research efforts. 
Coping and the Self-Control of Chronic Tension Headache 


Kenneth A. Holroyd and Frank Andrasik 
Ohio University 


Thirty-nine community residents with chronic tension headache were assigned 
to one of two self-control treatment groups, a headache discussion group, or a 
symptom-monitoring control group. Participants in the two self-control treat- 
ment groups and in the headache discussion group were provided similar ra- 
tionales for treatment and were taught to monitor their cognitive responses to 
stress-eliciting situations. Participants in the two self-control treatment groups 
were also taught either cognitive or both cognitive and relaxation coping skills 
for controlling tension headache. Participants in the headache discussion group 
were not provided with specific skills for controlling their headaches but were 
led in a discussion of the historical roots of their symptoms. Both the self- 
control treatments and the headache discussion procedure produced substantial 
reductions in headache that were maintained at a 6-week follow-up. The symp- 
tom-monitoring control group showed no change in headache symptoms. These 
findings provide additional evidence of the effectiveness of cognitively oriented 
therapeutic procedures for the treatment of tension headache but raise questions 
concerning the active ingredients of these treatments. 


Headache may be the most commonly re- 
ported bodily complaint (Wolff, 1963). Sur- 
vey data indicate that between 50% and 70% 
of adults experience headaches, 40% of which 
are tension headaches (Kashiwagi, McClure, 
& Wetzel, 1972). Of the 15 classes of head- 
ache identified by the Ad Hoc Committee on 
Classification of Headache (1962) of the 
American Medical Association, tension head- 
ache, also commonly termed muscle contrac- 
tion, psychogenic, or nervous headache, is the 
most frequently occurring. Tension headache 
is typically characterized by persistent sen- 
sations of bandlike pain or tightness located 
bilaterally in the occipital and/or forehead 
regions. It is gradual in onset and may last 
for hours, weeks, or even months. 

The exact etiology of tension headache 
remains unclear (Bakal, 1975). However, 
there is a general consensus that tension 
headache (a) is an individual response to 
psychological stress (Ad Hoc Committee on 
the Classification of Headache, 1962; Wolff, 
1963) and (b) may result from the sustained 
contraction of skeletal muscles about the 
scalp, neck, and shoulders (Bakal, 1975; 
Martin, 1972). 

Behavioral approaches to the treatment of 
tension headache have focused on modifying 
the muscle contraction responses presumed 
to contribute to tension headache (Budzynski, 
Stoyva, Adler, & Mullaney, 1973; Cox, 
Freundlich, & Meyer, 1975; Haynes, Griffin, 
Mooney, & Parise, 1975; Hutchings & Rein- 
king, 1976). However, a somewhat different 
approach was taken by Holroyd, Andrasik, 
and Westbrook (1977), who found that a self- 
control treatment that focused on modifying 
cognitive responses to stress-eliciting situa- 
tions was more effective in reducing tension 
headaches than biofeedback-assisted relaxa- 
tion training when these treatments were ac- 
companied by counterdemand instructions. 
The self-control treatment used in Holroyd 
et al. contained cognitively oriented thera- 
peutic procedures (Beck, 1976; Goldfried, 
Decenteceo, & Weinberg, 1974; Meichen- 
baum, 1977) to teach individuals to identify 
their reactions to stress and to use effective 
cognitive coping skills. Similar treatment pro- 
cedures have been found effective not only 
for the treatment of clinical problems such 
as specific anxieties (Di Loreto, 1971; Gold- 
fried, Linehan, & Smith, 1978; Holroyd, 
1976; Meichenbaum, 1972; Meichenbaum, 
Gilmore, & Fedoravicius, 1971; Kanter & 
Goldfried, Note 1), depression (Rush, Beck, 
Kovacs, & Hollon, 1977), stuttering (Moleski 
& Tosi, 1976), and unassertive behavior 
(Thorpe, 1975; Wolfe & Fodor, 1977; Line- 
han & Goldfried, Note 2) but also for pro- 
viding individuals with skills for coping with 
laboratory (Meichenbaum, Turk, & Burn- 
stein, 1975) and real-life (Langer, Janis, & 
Wolfer, 1975) stressors. 

In light of growing evidence of the effec- 
tiveness of such cognitive self-control treat- 
ments, it becomes important to determine to 
what extent the various components of these 
treatments contribute to the effectiveness of 
the treatments. Although such complex social 
influence procedures are difficult to dissect, 
recent analyses (Beck, 1976; Goldfried, 
1977; Holroyd et al., 1977; Meichenbaum, 
1977) suggest that these treatment proce- 
dures at least influence clients to (a) attribute 
the source of their symptoms to relatively 
specific cognitive aberrations rather than to 
external stimuli or complex inner dispositions; 
(b) identify or self-monitor a cognitive com- 
ponent of their distress; and (c) engage in 
specific cognitive coping strategies (e.g., re- 
appraisal, self-instruction, imagery). The 
present study attempted to determine to what 
extent the specific coping strategies that are 
taught during treatment contribute to treat- 
ment outcome. 

Chronic headache sufferers were assigned 
to one of two self-control treatment groups, 
a headache discussion group, or a symptom- 
monitoring control group. Participants in the 
two self-control treatment groups and in the 
headache discussion group were provided 
similar rationales for treatment designed to 
influence them to adopt similar cognitively 
oriented explanations for their symptoms, 
and were taught to monitor their cognitive 
responses to stress-eliciting situations in a 
similar manner. Participants in the two self- 
control treatment groups were also taught 
either cognitive coping skills or both cog- 
nitive and relaxation skills for managing 
stress and controlling tension headache. Par- 
ticipants in the headache discussion group 
were provided no specific skills for managing 
stress or controlling headaches. However, 
since previous work has indicated that a treat- 
ment procedure consisting solely of the treat- 
ment rationale and instruction in the monitor- 
ing of cognitive responses to stress lacks 
credibility, an alternate therapeutic task in- 
volving a discussion of the historical sources 
of headache symptoms was provided for this 
group. Symptom-monitoring control group 
members recorded their headaches but did 
not receive treatment until the study had 
been completed. 


Method 
Subjects 


Articles in local newspapers, appearances on radio 
“talk shows,” and wall posters were used in addi- 
tion to newspaper, radio, and television advertise- 
ments to circulate announcements of a program 
teaching methods for the self-control of tension head- 
ache to communities within a 50-mile radius of 
Athens, Ohio. Telephone and small group screening 
procedures, as well as evaluations by participants’ 
physicians, were used to identify, from 123 initial 
respondents, those individuals exhibiting clear-cut 
symptoms of tension headache occurring consistently 
at least three times per week. Participants included 
35 females and 4 males with a mean age of 35.2 
years and a mean duration of headache symptoms 
of 10.1 years. 


Procedure 


Following initial telephone screening, potential 
participants were seen in small groups by one of the 
authors, who obtained informed consent and a re- 
quired $5 deposit, administered pretreatment mea- 
sures, and arranged to obtain diagnostic medical 
information. A list of 19 characteristics of tension 
headache was also used to eliminate volunteers suf- 
fering from other types of headache or reporting 
mixed headache symptoms (Wolff, 1963). Tension 
headache sufferers who reported consistently experi- 
encing a minimum of three tension headaches per 
week were assigned by a within-sample matching 
technique (Goldstein, Heller, & Sechrest, 1966) to 
one of the two self-control treatments, the headache 
discussion group, or to the symptom-monitoring con- 
trol group. 


Measures 


Symptoms. For at least 2 weeks prior to receiv- 
ing treatment, continuing through a 2-week post- 
treatment assessment, and for 2 weeks at a 6-week 
follow-up evaluation, participants maintained a head- 
ache data card on which they rated the occurrence 
and intensity of headaches on an 11-point scale (0 = 
no headache, 10 = incapacitating headache) every 
hour from 10:00 a.m. through 10:00 p.m. (Haynes 
et al., 1975). An index of overall headache activity 
was calculated by summing headache ratings each 
day and computing weekly averages.1 In addition, participants’ 
ratings of the frequency of occurrence of 18 com- 
mon psychosomatic complaints were obtained on the 
psychosomatic checklist (Cox et al., 1975) at pre- 
treatment, posttreatment, and follow-up evaluations. 
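
The headache activity index can be illustrated with a brief computation. The Python sketch below is not the authors' scoring program. It assumes that each hourly rating stands for one hour of headache at the rated intensity, so that summing the ratings is equivalent to the intensity-by-duration formula given in the footnote, and it assumes that weekly totals are averaged within each assessment block. The function names and the example data are hypothetical.

```python
def weekly_headache_activity(days):
    """Sum the hourly 0-10 ratings recorded over the days of one week.

    `days` is a list of daily records; each record is a list of hourly
    ratings (0 = no headache, 10 = incapacitating headache).
    """
    total = 0
    for ratings in days:
        for rating in ratings:
            if not 0 <= rating <= 10:
                raise ValueError("ratings must lie on the 0-10 scale")
            total += rating
    return total


def mean_weekly_activity(weeks):
    """Average the weekly totals across the weeks of an assessment block."""
    return sum(weekly_headache_activity(week) for week in weeks) / len(weeks)


# Hypothetical data: two short "weeks" of two days each.
week_1 = [[0, 4, 4, 4, 0], [0, 0, 0, 0, 0]]  # one 3-hour headache of intensity 4
week_2 = [[2, 2], [0, 0]]                    # one 2-hour headache of intensity 2
block_score = mean_weekly_activity([week_1, week_2])
```

Because the index multiplies intensity by duration, a long mild headache and a short severe one can yield the same score; the separate frequency, duration, and intensity measures are needed to distinguish them.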

Frontalis electromyographic (EMG) activity. 
Frontalis EMG was assessed during resting periods 
at pretreatment evaluation and, for participants re- 
ceiving treatment, following each treatment session. 
While participants rested in a heavily padded chair, 
in a different location from where they received 
treatment, frontalis muscle activity was monitored 
from forehead disk electrode placements (Budzynski 
et al., 1973) directed to a Cyborg BL933 electro- 
myograph. Three 1-minute integrated EMG mea- 
surements were sampled and averaged to provide an 
index of frontalis muscle activity. 

Additional measures. To assess their perceptions of 
the credibility of the treatment that they received, 
the participants evaluated the probability of their 
recommending the treatment to a friend suffering 
from tension headache and how important they felt 
it was that their treatment be made available to 
other headache sufferers (on 5-point scales) following 
the first treatment session, at posttreatment, and at 
follow-up evaluations. Participants also rated the 
warmth, empathy, and skill of their therapist at the 
posttreatment and follow-up evaluations. Since it was 
suspected that the headache discussion procedure 
might increase participants’ levels of self-esteem 
without necessarily reducing specific symptoms, self- 
esteem was assessed by the Miskimins Self-Goal- 
Other Discrepancy Scale (Ryan, Krall, & Hodges, 
1976) at all three assessments. No attempts were 
made to influence participants’ medication intake. 
However, medication intake was recorded on a daily 
basis on the headache data cards. 


Treatment 


... were ... 
five weekly 12-hour group meetings. We each con- 
ducted one group of each type so that therapists and 
... individual format ... first author had ... ex- 
perience (5 years) than the second author, who had 
conducted one group previous to this study. Counter- 
demand instructions emphasizing that no improve- 
ment could be expected until the completion of treat- 
ment (Steinmark & Borkovec, 1974) were admin- 
istered at the end of the first treatment session. 

Cognitive self-control (n=10). This treatment 
focused on altering maladaptive cognitive responses 
that were assumed to mediate the occurrence of 
tension headache. The treatment format closely fol- 
lowed that used by Holroyd et al. (1977), with ap- 
propriate modifications for the group setting. Spe- 
cific procedures were adapted from cognitively 
oriented therapy procedures (Beck, 1976; Goldfried 
et al., 1974; Meichenbaum, 1977) and were designed 
to maximize the occurrence of causal reattribution 
and the development of self-monitoring and cognitive 
coping skills described below. 

The rationale for treatment emphasized that dis- 
turbing emotional and behavioral responses are a 
direct function of specifiable maladaptive cogni- 
tions. It was emphasized that tension headache re- 
sults from psychological stress and that stress 
responses are determined by cognitions about an 
event or situation. Several concrete examples were 
provided to illustrate the variety of events that can 
be perceived as stressful by different individuals and 
the way in which cognitions can induce psychological 
stress and headache. In addition, unreasonable expec- 
tations (that one should be perfect or liked by 
everyone) were discussed, and the manner in which 
these expectations predispose individuals to experience 
stress was illustrated. 

Following presentation of the treatment rationale, 
each group member constructed a list of stressful 
situations. The therapist, working in turn with each 
group member, focused on identifying (a) the cues 
that trigger tension and anxiety, (b) how the client 
responded when anxious (withdrawal, passivity, 
etc.), (c) the clients’ thoughts prior to becoming 
aware of tension, while tense, and subsequently, and 
(d) the way in which these cognitions appeared to 
contribute to the clients’ tension and headache. 
Clients were encouraged to learn from group mem- 
bers who proved most adept at this cognitive analy- 
sis and to assist other group members in identify- 
ing cognitive components of their distress. 

As soon as clients became fluent at verbalizing 
cognitions associated with feelings of distress, they 
were instructed to deliberately interrupt the sequence 
of covert events preceding their emotional re- 
sponse at the earliest possible moment. To do this, 
clients were instructed to use signs of impending 
distress as a signal to engage in cognitive strategies 
incompatible with the further occurrence of cog- 
nitive stress responses. The therapist verbally mod- 
eled (Kazdin, 1973) strategies that were designed 
to enable clients to use each of the three main types 
of intrapsychic coping responses that have been 
identified by Lazarus and his co-workers (Lazarus, 
Averill, & Opton, 1974): cognitive reappraisal, at- 
tention deployment, and fantasy. Clients were en- 
couraged to practice these coping skills on a daily 
basis and to implement these cognitive coping skills 
at the first sign of headache following the third 
treatment session. 

1 Average weekly headache activity scores (HA) 
were computed by the following formula: $\mathrm{HA} = \sum (I \times D)$, 
where I is intensity of headache and D is 
the hours of duration of headache. This index of 
headache is considered to be the most useful mea- 
sure of headache activity, as it incorporates two 
separate dimensions of each reported headache. 

Combined cognitive and relaxation self-control 
(n=10). This treatment focused on altering both 
cognitive and muscle contraction responses to stress- 
ful situations. The treatment format was similar to 
that used in the cognitive self-control groups except 
that muscle relaxation was also taught as a self- 
control skill (Goldfried & Trier, 1974) for coping 
with psychological stress and tension headache. 

The rationale for this treatment emphasized that 
tension headache resulted from both psychological 
stress and sustained muscle contraction responses. 
Clients were encouraged to monitor both their cog- 
nitive responses to stress and their perceived level 
of muscle tension in stressful situations. In addition 
to the cognitive coping strategies provided to clients 
in the cognitive self-control group, clients were 
taught to use muscle relaxation as a coping skill. 
Relaxation training followed procedures outlined by 
Bernstein and Borkovec (1973), with particular 
attention to facial, neck, and forehead muscles, 
which are thought to be associated with tension 
headaches. Approximately equal portions of each 
session were spent on teaching cognitive and relaxa- 
tion coping skills. Clients were encouraged to prac- 
tice muscle relaxation at home on a daily basis and 
to implement these coping skills at the first sign of 
headache following the third treatment session. 

Headache discussion. This treatment focused on 
a discussion of the historical roots of symptoms 
rather than on the development of specific coping 
skills. Although clients were taught to monitor their 
cognitive responses to stress in the same manner as 
they were in the other two treatment groups, no 
strategies for coping with stress were provided. 

The rationale presented for this treatment also 
described headaches as resulting from psychological 
stress but emphasized that feelings of distress would 
improve if clients understood the underlying source 
of their problems. Group members were encouraged 
to examine the thoughts and feelings that accom- 
panied their headaches for clues that might be pro- 
vided to the underlying source of their symptoms. 
The basic procedures followed by the therapist were 
designed to increase the clients’ self-confidence and 
self-esteem and to provide a reasonable explanation 
for the clients’ distress in terms of historical events 
in their lives. The therapist encouraged clients to 
openly discuss and explore their emotional responses 
to stressful life events, emphasized similarities among 
the problems and reactions of group members, and, 
where possible, offered plausible interpretations de- 
signed to link previous life events with current 
emotional reactions and problems. (“Your anxiety 
appears to be a natural reaction to the way you 
were treated as a child.”) The therapists deliberately 
avoided suggesting specific methods for coping with 
stress and attempted to prevent other group mem- 
bers from offering this type of advice. 

Symptom-monitoring control (n = 10). Partici- 
pants assigned to this group recorded their head- 
aches in the same manner as other participants in 
the study but were informed that due to the large 
number of applicants, treatment would not be avail- 
able until a later date. Participants assigned to this 
group returned their headache data cards by mail 
on a weekly basis. To maximize compliance with 
this procedure, they were paid $10 for accurately 
recording their headaches. 

Six of the initial 39 participants were eliminated 
prior to data analysis. Two participants withdrew 
from treatment: 1 participant in the combined 
group reported that she was uncomfortable partici- 
pating with another member of her group whose 
child she taught at school; another participant in 
the headache discussion group reported that the 
self-monitoring was exacerbating his headaches. In 
addition, schedule conflicts prevented 2 participants 
in the combined group and 1 participant in the 
headache discussion group from attending a minimum 
of three sessions. One participant in the symptom- 
monitoring control group also reported that she lost 
her headache data cards. Follow-up data were un- 
available for 1 participant in the headache discussion 
group who moved out of state. 


Results 


Examination of pretreatment scores, pre- 
sented in Table 1, reveals occasional differ- 
ences among groups. Although separate analy- 
ses of variance revealed that none of these 
differences were significant, analysis of co- 
variance (with pretreatment scores as the 
covariate) was used to provide the most ac- 
curate assessment of treatment effects. 
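
The adjustment performed by an analysis of covariance can be illustrated as follows. The Python sketch below is not the analysis reported here; it assumes the textbook formula in which each group's posttreatment mean is corrected by the pooled within-group regression slope times the distance of that group's pretreatment mean from the grand pretreatment mean. The function name and the data are hypothetical.

```python
def adjusted_means(groups):
    """Return covariance-adjusted posttreatment means for each group.

    `groups` maps a group name to a list of (pre, post) score pairs.
    """
    all_pre = [pre for pairs in groups.values() for pre, _ in pairs]
    grand_pre_mean = sum(all_pre) / len(all_pre)

    sum_xy = 0.0  # pooled within-group cross-products
    sum_xx = 0.0  # pooled within-group sum of squares of the covariate
    means = {}
    for name, pairs in groups.items():
        n = len(pairs)
        pre_mean = sum(pre for pre, _ in pairs) / n
        post_mean = sum(post for _, post in pairs) / n
        means[name] = (pre_mean, post_mean)
        sum_xy += sum((pre - pre_mean) * (post - post_mean) for pre, post in pairs)
        sum_xx += sum((pre - pre_mean) ** 2 for pre, _ in pairs)

    slope = sum_xy / sum_xx  # pooled within-group regression slope
    return {
        name: post_mean - slope * (pre_mean - grand_pre_mean)
        for name, (pre_mean, post_mean) in means.items()
    }


# Hypothetical scores, NOT data from the study.
example = adjusted_means({
    "A": [(1, 2), (3, 4)],
    "B": [(3, 3), (5, 5)],
})
```

In this hypothetical example the unadjusted posttreatment means are 3 for Group A and 4 for Group B, yet Group B also began with higher pretreatment scores; after adjustment the ordering of the two groups reverses, which shows why adjusted means are reported when pretreatment scores differ.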


Headache Recording 


Average weekly headache activity scores 
are presented in Figure 1. Although both self- 
control groups and the headache discussion 
group showed substantial reductions in head- 
ache activity that were maintained at fol- 
low-up, the symptom-monitoring control 
group showed essentially no change in head- 
ache activity. 

Analysis of covariance revealed highly sig- 
nificant treatment effects at both posttreat- 
ment and follow-up assessments, F(3, 28) = 
7.1, and F(3, 27) = 11.8, respectively, both 
ps < .001. t tests for correlated means re- 

Table 1 
Means of Pretreatment, Posttreatment, and Follow-Up Evaluations for Major Dependent Variables 

                                                   Group 

                             Cognitive      Combined       Headache      Headache-monitoring 
Variable                     self-control   self-control   discussion    control 

Headache activity 
  Pre                        127.5          183.4          132.4         152.1 
  Post^a                      35.7           57.2           61.9         147.9 
  Follow-up^a                 34.1           39.4           47.8         148.1 
Headache frequency 
  Pre                                         5.5            4.6           5.6 
  Post^a                       2.2            2.8            3.0           4.9 
  Follow-up^a                  2.0            3.2            2.6           4.8 
Headache duration 
  Pre                         31.2           44.1           30.4          38.1 
  Post^a                      14.1           20.2           18.8          36.1 
  Follow-up^a                 11.4           14.9           17.8          34.2 
Headache intensity 
  Pre                          3.7            3.7            4.5           3.7 
  Post^a                       2.2            2.4            3.2           3.3 
  Follow-up^a                  2.8            2.2            1.4           3.3 
Psychosomatic symptoms^b 
  Pre                         80.2           76.4           75.9          85.1 
  Post^a                      86.5           78.5           90.9          86.1 
  Follow-up^a                 86.8           77.7           92.2          86.1 
Electromyogram (µV/min) 
  Pre                                         3.9            3.5           3.5 
  Post^a                       4.0            3.0            3.2           - 
Credibility^c 
  Session 1                    9.5            9.9            8.4           - 
  Post                         8.9            9.9            8.3           - 
  Follow-up                    9.4            9.7            9.0           - 

^a Adjusted means. 
^b Larger scores represent a lower incidence of psychosomatic symptoms. 
^c Sum of two 5-point scales. 


vealed that both self-control groups and the 
headache discussion group showed significant 
improvements at posttreatment and follow-up 
assessments (at least p < .05), whereas the 
symptom-monitoring control group showed 
no change in headache activity. Duncan’s new 
multiple-range test conducted on adjusted 
posttest and follow-up means further revealed 
that the two self-control groups and the 
headache discussion group differed signifi- 
cantly from the symptom-monitoring control 
group (p < .05) but not from one another. 
Pretreatment and adjusted posttreatment 
and follow-up means for additional compo- 
nent measures of headache (frequency, dura- 
tion, and intensity) are presented in Table 1. 
Results from separate analyses of covariance 
conducted on these measures revealed signifi- 
cant treatment effects on all three of these 
measures (at least p < .05 at follow-up), with 
the pattern of results very similar to that 
reported for the composite headache activity 
score discussed above. 

To examine possible therapist differences 
in outcome, headache activity scores for par- 
ticipants in the three treatment conditions 
were subjected to Treatment × Therapist 
analyses of covariance. Significant therapist 
effects were obtained at both posttreatment 
and follow-up assessments, F(1, 17) = 6.0, 
and F(1, 16) = 6.1, both ps < .025. At both 
assessments participants treated by the first 
author obtained somewhat lower adjusted 
headache activity scores (posttest M = 26.9, 
follow-up M=25.9) than participants 
treated by the second author (posttest M = 
79.9, follow-up M = 58.9). However, t tests 
for correlated means revealed that partici- 
pants treated by each of the therapists 
showed significant improvements in headache 
activity at both assessments (at least p < 
.01). Thus, even though both therapists pro- 
duced significant reductions in headache ac- 
tivity, reductions by the first author were of 
a somewhat larger magnitude. Similar analy- 
ses conducted on the additional component 
headache activity scores revealed significant 
therapist effects on headache duration (p < 
.05) but not on headache frequency or in- 
tensity. 


[Figure 1: line graph. Vertical axis: mean headache activity scores (10 to 200). Horizontal axis: pre, treatment, post, follow-up. Groups plotted: cognitive; cognitive + relaxation; headache discussion; symptom-monitoring control.] 
Figure 1. Mean weekly headache activity scores in 2-week blocks. (Foll = follow-up.) 


Additional Measures 


Analysis of covariance conducted on post- 
test and follow-up psychosomatic checklist 
frequency scores revealed significant treat- 
ment effects at both assessments, F(3, 28) = 
4.2, p < .02, and F(3, 27) = 4.1, p < .01, 
respectively. Although t tests for correlated 
means revealed that only the cognitive self- 
control group and the headache discussion 
group reported fewer psychosomatic symp- 
toms after treatment (p < .01), Duncan’s 
test indicated that only the headache discus- 
sion and combined cognitive and relaxation 
self-control groups differed significantly at 
posttest (p < .01). No therapist differences 
were observed on this measure. Differences 
among the four groups in self-esteem ap- 
proached but did not reach significance, 
F(3, 28) = 2.7, p < .07. No differences were 
observed among the four groups in the num- 
ber of participants taking medication for 
their tension headaches, x (3) = 28. How- 
ever, of the 28 participants taking medica- 
tion, 17 of 20 participants receiving treat- 
ment recorded reductions in the frequency 
of their medication intake at follow-up, 
whereas only 1 of 8 participants not receiv- 
ing treatment recorded reductions (Fisher’s 
exact probability test, p = .0007). Correla- 
tional analyses revealed no significant rela- 
tionships between demographic variables, 
headache history, or self-esteem scores and 
headache improvement. 
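
The Fisher's exact probability reported above can be checked from the counts given in the text (17 of 20 treated and 1 of 8 untreated participants reduced their medication). The sketch below is an illustration only: it assumes a one-tailed test with all margins fixed, and the function name is hypothetical rather than anything used by the authors.

```python
from math import comb


def fisher_exact_one_tailed(a, b, c, d):
    """One-tailed Fisher exact probability for the 2 x 2 table [[a, b], [c, d]].

    Returns the hypergeometric probability of observing `a` or more cases
    in the top-left cell when all row and column totals are held fixed.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denominator = comb(row1 + row2, col1)
    upper = min(row1, col1)
    numerator = sum(
        comb(row1, k) * comb(row2, col1 - k) for k in range(a, upper + 1)
    )
    return numerator / denominator


# Rows: treated, untreated. Columns: reduced medication, did not reduce.
p = fisher_exact_one_tailed(17, 3, 1, 7)
print(round(p, 4))  # 0.0007
```

Under these assumptions the computed probability agrees with the value reported in the text.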


Electromyographic Activity 


Examination of pretreatment means pre- 
sented in Table 1 reveals that the severe head- 
ache symptoms exhibited by participants in 
this study were not accompanied by similarly 
elevated levels of resting frontalis muscle ten- 
sion. Thus, to the extent that increased mus-
cle tension is associated with headache symp-
toms exhibited by these individuals, this 
muscle tension must be elicited by a limited 
range of, probably, stressful situations. Analy- 
ses of covariance revealed that differences in 
EMG activity among the two self-control 
groups and the headache discussion group 
only approached significance at posttest, 
F(2, 20) = 2.1, p < .10. Also, EMG reduc- 
tions were not significantly correlated with 
improvements in headache symptoms. 


Therapist and Treatment Ratings 


It can be seen in Table 1 that both self- 
control treatments and the headache discus- 
sion procedure were rated as highly credible 
treatments at all three evaluations. However, 
analyses of variance revealed significant dif- 
ferences in the ratings of these procedures 
following the first treatment session, F(2, 
29) = 4.9, p < .05, but not at posttreatment 
and follow-up assessments. Duncan’s test re- 
vealed that the headache discussion procedure 
was rated as slightly less credible than the 
combined cognitive and relaxation treatment 
(p < .05), whereas the cognitive treatment 
did not differ from either of the other pro- 
cedures. The two therapists were rated as 
equally warm, empathetic, and skillful by 
participants in their respective groups. How- 
ever, neither therapist ratings nor treatment 
ratings were significantly correlated with 
headache improvement. 


Discussion 


Results from the present study complement 
those obtained by Holroyd et al. (1977) in
showing that cognitive self-control procedures 
can provide an effective treatment for chronic 
tension headache whether they are admin- 
istered individually or in a group. The group 
administration used in the present study re- 
sulted in reductions in headache activity that 
were comparable to those obtained when the
treatment was individually administered in 
Holroyd et al. This suggests that therapist 
time might be effectively conserved by the 
group administration of these procedures. In 
addition, these results add to a growing body 
of evidence (Beck, 1976; Goldfried, 1977;
Mahoney & Arnkoff, in press; Meichenbaum,
1977) supporting the effectiveness of thera-
peutic procedures designed to alter clients’
cognitions in the treatment of anxiety and
stress-related disorders.

However, additional findings from the 
present study raise questions about the ef- 
fective procedural ingredients of such cog-
nitive therapeutic procedures. Neither the
elimination of cognitive coping strategies 
from the cognitive treatment used in the 
present study nor the addition of self-con- 
trol relaxation to this treatment altered its 
effectiveness. Thus, the positive outcomes that 
were obtained with this treatment do not 
appear to have resulted from the specific cog- 
nitive coping strategies that were provided. 
Somewhat similar results have been obtained 
by Thorpe, Amatu, Blakey, and Burns (1976) 
who found that not only was the effectiveness 
of rational emotive therapy not enhanced by 
the addition of specific  self-instructional 
coping strategies, but the combined procedure 
was less effective than rational emotive 
therapy alone on some measures. 

Participants in the two self-control treat- 
ment groups and in the headache discussion 
group were interviewed following treatment 
to obtain additional information concerning 
the methods that they used to control their
headaches. Although all participants in the
two self-control groups reported using the
self-control procedures that they were taught
during treatment, it is of note that all but one
of the participants in the headache discus-
sion group also reported devising cognitive
self-control procedures for coping with their
tension headaches. These strategies appeared
to distract the users from worrisome thoughts
and/or to enable them to reevaluate the
stressor situation. Although the reported
methods for controlling headaches were often
strikingly similar to those used by partici-
pants in the two self-control groups, several
individuals developed somewhat unusual
methods for managing their headaches. For
example, one woman began praying when
she noted cognitive symptoms of distress, and
she indicated that a brief period of prayer
enabled her to approach previously stressful
situations with some detachment. A man
imaginally engaged in karate exercises. He
reported that these imaginary exercises dis-
tracted him from his worries and allowed him 
to “get a handle” on himself. The only par- 
ticipant who did not report engaging in spe- 
cific cognitive strategies for coping with head- 
ache also showed only minimal improvement 
in headache activity. These results suggest 
that the improvements shown by participants 
in the headache discussion group may have 
resulted from their use of cognitive coping 
strategies of their own devising. 

Other studies that have compared the com-
bination of cognitive therapy and relaxation
training procedures with cognitive procedures
alone have obtained results consistent with
the present findings, indicating that the
addition of relaxation training to cognitive
interventions does not enhance their
effectiveness. Thus, even though combined
treatments have been found to be less effec-
tive than cognitive interventions (Holroyd,
1976; Meichenbaum et al., 1971; Kanter &
Goldfried, Note 1) and as effective as
cognitive interventions (Novaco, 1975; Osar-
chuk, 1977), they have not been shown to be
more effective than cognitive interventions
alone. A number of investigators have ex-
plained these findings by assuming that clients 
are unable to master both of these techniques 
in the brief treatment time typically allowed 
in these studies (Goldfried, 1977; Meichen- 
baum et al., 1971). Although frontalis EMG 
activity was not assessed during relaxation 
training, the fact that participants in the 
combined treatment group did not show sig- 
nificant reductions in EMG level following 
treatment suggests that these individuals 
may not have adequately mastered the relaxa-
tion training procedure. On the other hand, 
the posttreatment EMG levels of participants 
in this group were comparable to the post- 
treatment EMG levels achieved by partici- 
pants in other studies using relaxation train- 
ing procedures (Budzynski et al., 1973; 
Haynes et al., 1975; Hutchings & Reinking, 
1976). Thus, the failure to obtain reductions 
in frontalis muscle activity may have resulted 
from the initially low resting EMG levels
exhibited by participants in this study. 

Frontalis muscle activity was not signifi- 
cantly associated with headache symptoms 
prior to treatment, and reductions in EMG
activity were not correlated with headache
improvement following treatment. This weak 
association between resting levels of frontalis 
EMG activity and tension headache has been 
found in a number of recent studies (Cox et 
al., 1975; Epstein & Abel, 1977; Haynes et 
al., 1975; Holroyd et al., 1977). It seems 
likely that muscle contraction responses to 
specific stressful situations might contribute 
to tension headaches even though these re- 
sponses might not be elicited in the relaxed 
laboratory assessment situation. Therefore, 
research on tension headache would probably 
benefit from the development of methods for 
assessing such responses to laboratory or real- 
life stressors. 

The fact that both the self-control and 
headache discussion treatments yielded sim- 
ilar outcomes suggests that elements com- 
mon to these interventions may have ac- 
counted for their results. For example, the 
provision of a causal explanation for distress- 
ing symptoms may have served to increase 
clients’ belief in their ability to cope with 
their symptoms (Frank, 1974; Murray & 
Jacobson, in press). Although somewhat dif- 
ferent explanations of the therapeutic pro- 
cess were provided in each of the groups, they 
all emphasized that clients could master their 
symptoms. Thus, specific cognitive distor- 
tions or muscle contraction responses elicit- 
ing tension headache could be combated by 
engaging in specific coping responses or by
understanding their historical antecedents,
which were no longer present and therefore 
need not influence current responses. Such 
explanations, which increase an individual’s 
belief in their ability to cope with previously 
debilitating symptoms, might be expected to 
lead to a greater initiation and persistence 
of coping behavior (Bandura, 1977). Simi- 
larly, training in identifying a cognitive com- 
ponent of distress may have facilitated con- 
trol of tension headaches by sensitizing clients 
to early signs of psychological stress, thus 
providing cues for appropriately engaging in
coping responses (Meichenbaum, 1975). It 
may be less crucial to provide clients with 
specific coping responses than to insure that 
they monitor the insidious onset of symptoms 
and are capable of engaging in some sort of 
cognitive or behavioral response incompatible
with the further exacerbation of symptoms.
Clearly, further research will be required to 
determine the way that these factors con- 
tribute to therapeutic change. 

Although clients seen by each of the thera- 
pists showed substantial reductions in head- 
ache activity, somewhat larger reductions in 
headache activity were obtained by one of the 
therapists. Since no differences in outcome 
were observed when similar treatments were 
administered individually in a previous study
(Holroyd et al., 1977), the differences ob- 
tained in the present study may have resulted 
from therapist differences in experience with 
group therapy procedures.


Previous studies of linear and configural Minnesota Multiphasic Personality
Inventory (MMPI) diagnostic predictors have suffered from varying degrees 
of criterion contamination. We replicated and extended previous findings with 
572 subjects who had been diagnostically classified without MMPI contamina- 
tion. The Goldberg linear equations derived from MMPI group profiles achieved 
84% accuracy in classifying group profiles and a 14% increment over base-rate 
accuracy in classifying individual profiles. The Goldberg linear equation and 
several configural methods for discriminating psychotics from neurotics were 
compared. The linear equation was found to be most accurate. Conflicting re- 
sults in previous articles suggest that criterion contamination must be avoided 
in prediction studies. A possible use for the group profile classification equations 
in evaluating experimental studies is suggested. 


Several configural rules and linear indices 
have been developed to diagnostically classify 
Minnesota Multiphasic Personality Inventory 
(MMPI) profiles. Peterson (1954) specified a 
number of signs that discriminated patients 
who later become schizophrenic. These signs 
are (a) T scores on four or more clinical scales 
greater than 70; (b) F greater than 65; (c) Sc 
greater than Pt; (d) Pa or Ma greater than 70; 
(e) Pa, Sc, or Ma greater than Hs, D, and Hy; 
and (f) D greater than both Hs and Hy. 
Taulbee and Sisson (1957) developed rules to 
discriminate the profiles of schizophrenics from 
those of neurotics. Their method compares 16 
scale pairs for elevation and tallies the number 
of pairs that differ in the scored direction. 
Experimental results have shown this method 
to be more accurate than the judgment of 
clinicians. Meehl and Dahlstrom (1960) con- 
structed and cross-validated a highly complex, 
multistep, configural algorithm for discrimi- 
nating the profiles of psychotics from neurotics. 
They elucidated the central assumption of the 
configural approach, namely, that the infor-
mation required to make discriminations
resides primarily in nonlinear relationships
among MMPI scales. 
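
To make the contrast between configural and linear rules concrete, the Peterson (1954) signs listed above can be written as a simple sign count. This is a minimal sketch under stated assumptions, not Peterson's own scoring procedure: the set of ten clinical scales counted for sign (a), the reading of sign (e) as "the highest of Pa, Sc, and Ma exceeds all of Hs, D, and Hy," and the T scores in the example are all illustrative choices.

```python
# Assumed list of clinical scales counted for sign (a).
CLINICAL_SCALES = ["Hs", "D", "Hy", "Pd", "Mf", "Pa", "Pt", "Sc", "Ma", "Si"]


def peterson_sign_count(t):
    """Count how many of the six Peterson signs are present in a T-score profile."""
    signs = [
        # (a) four or more clinical scales above T = 70
        sum(t[scale] > 70 for scale in CLINICAL_SCALES) >= 4,
        # (b) F greater than 65
        t["F"] > 65,
        # (c) Sc greater than Pt
        t["Sc"] > t["Pt"],
        # (d) Pa or Ma greater than 70
        t["Pa"] > 70 or t["Ma"] > 70,
        # (e) Pa, Sc, or Ma greater than Hs, D, and Hy (assumed reading)
        max(t["Pa"], t["Sc"], t["Ma"]) > max(t["Hs"], t["D"], t["Hy"]),
        # (f) D greater than both Hs and Hy
        t["D"] > t["Hs"] and t["D"] > t["Hy"],
    ]
    return sum(signs)


# Hypothetical profile, for illustration only.
example_profile = {
    "F": 70, "Hs": 55, "D": 72, "Hy": 60, "Pd": 65, "Mf": 50,
    "Pa": 75, "Pt": 68, "Sc": 80, "Ma": 71, "Si": 55,
}
print(peterson_sign_count(example_profile))  # 6
```

Each sign compares scales with one another or with a fixed threshold, which is what distinguishes this family of rules from a weighted sum of scale scores.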

Goldberg (1965, 1969) challenged this as-
sumption in two comprehensive studies that
compared numerous techniques for discrimi-
nating psychotic from neurotic MMPI profiles.
His results indicated that a five-variable linear
composite of MMPI scales (L + Pa + Sc
− Hy − Pt) was superior to previously de-
veloped configural rules, profile typologies,
actuarial tables, several nonlinear (e.g.,
Bayesian) actuarial techniques, and clinical
judgments in making the psychotic/neurotic
discrimination.

Goldberg (1972) also applied the linear
model to group profiles. He hypothesized that
this procedure results in indicators of under-
lying processes that are more useful than
results derived from individual profiles. His
rationale was that the process of averaging
individual profiles reduces the error, or
“noise,” inherent in individual profiles and
results in a clearer “signal” of underlying
processes. Goldberg classified Lanyon’s
compilation of MMPI group profiles into the
molar diagnostic categories of psychotic,
neurotic, sociopathic, and normal. He


Inc. 0022-006X/78/4605-1046$00.75 


MMPI DIAGNOSTIC METHODS 


developed a set of three linear equations, which 
are applied sequentially to group profiles for 
diagnostic sorting. The first equation separates 
normals from psychiatric groups. The second 
separates sociopathic from psychiatric groups. 
The final equation separates the remaining 
groups into psychotic and neurotic. This 
procedure classified profiles with remarkable 
accuracy (93%-99%). In addition, the linear 
equation, earlier derived from individual 
profiles to discriminate psychotics from neu- 
rotics, was replicated for group profiles. These 
results provide support for the use of linear 
rather than configural systems for MMPI 
profile classification. 

However, the issue of configural rules versus 
linear combinations has not been conclusively 
resolved. In each of the studies cited above, 
there was some criterion contamination; the 
MMPI results were used in varying degrees to 
determine the criterion diagnoses that were 
being predicted. Goldberg (1965) was aware 
of this difficulty and reanalyzed his individual 
profile data after dividing his sample into 
groups considered to be “least contaminated” 
and “most contaminated.” The results of these 
analyses suggested that criterion contami- 
nation may have had significant effects on 
the predictive validities. In the least con- 
taminated sample, there was no difference 
between the predictive validities of the Gold- 
berg equation and the Meehl-Dahlstrom rules. 
He also found that if one assigns a rank order 
to scores by elevation to the eight psycho- 
pathology scales, the sum of the ranks of D, 
Hy, and Pt (the most elevated scale receives 
a rank score of 1; low scores are considered 
neurotic) has the highest validity coefficient in 
the least contaminated sample. Thus, although 
the preponderance of his results favored the 
linear equation, one configural method equaled
and another outperformed it. 

In a replication study, Goodson and King 
(1976) compared the predictive accuracy of 
the Peterson signs and the Goldberg equation. 
In contradiction to Goldberg’s results, they 
found the Peterson signs to be superior in two 
samples. However, the criterion diagnoses in 
one sample were based entirely on the MMPI
and, hence, were contaminated. Their findings
were unclear as to whether the MMPI con-
tributed to the diagnoses in the second sample.

The present study was undertaken to
replicate and extend certain of the results of 
the above studies using an uncontaminated 
criterion. The purposes of this study were to 
(a) attempt to replicate Goldberg’s (1972) 
results with group profiles; (b) estimate the 
diagnostic classificatory accuracy of the equa- 
tions derived from group profiles when applied 
to individual profiles; and (c) compare the 
relative classificatory accuracy of the various 
“cookbook” methods for discriminating psy- 
chotic from neurotic profiles (specifically, the 
Goldberg equation, the Meehl-Dahlstrom
rules, the sum-of-ranks rule, the Taulbee-
Sisson signs, and the Peterson signs).


Method 


Subjects 


The sample was composed of male veterans who 
were evaluated at the time of application for treatment 
at the Psychiatric Assessment Unit of the Salt Lake 
City Veterans Administration Hospital. Subject selec- 
tion was limited to those veterans who completed the 
experimental evaluation and testing within 48 hours 
of referral. In addition, all subjects whose MMPI had 
more than 30 missing items or had an F — K dissimula- 
tion index (Gough, 1950) greater than 14 were elimi- 
nated. This resulted in a sample of 572 subjects classi- 
fied into seven psychotic groups (n = 218), three 
sociopathic groups (n = 208), three neurotic groups 
(n = 88), and one “no mental illness” group (n = 58). 
The mean age for the sample was 37.7 years (SD 
= 12.4). The mean IQ estimated from the Shipley-
Hartford (Paulson & Lin, 1970) was 103.4 (SD = 12.7). 


Instrumentation 


The Current and Past Psychopathology Scales 
(CAPPS) structured recording form for evaluating 
current and past psychopathology and social function- 
ing (Endicott & Spitzer, 1972) was used as the diag- 
nostic data base form. The CAPPS consists of 41 items 
about current psychopathology and 130 items about 
past psychopathology, personality characteristics, and 
academic, occupational, and interpersonal adjustment. 
It is a rationally constructed instrument that was 
developed to provide coverage of symptoms generally 
considered to be important in the evaluation of diag- 
nosis, severity of illness, and prognosis. Endicott and 
Spitzer have shown that interjudge reliabilities for
individual items are very high (range = .68-1.00). For
purposes of this experiment, the CAPPS was used as
data input to the DIAGNO II diagnostic program
(Spitzer & Endicott, 1969). The DIAGNO II program
was rationally constructed to mimic the clinician’s
diagnostic reasoning, given ratings on the CAPPS items
as the source of input information. This program has
been shown to produce diagnoses that agree with
standard clinical diagnoses as well as experienced 
clinicians agree with each other and at the same time to 
reduce situational diagnostic bias (Spitzer & Endicott, 
1969). 

Subjects were also administered the booklet form 
of the MMPI and the Shipley-Hartford. 


Procedure 


CAPPS interviews were administered by three 
staff members (a clinical psychologist, a social worker, 
and a nurse) who had been trained using materials 
provided by Jean Endicott. The order of administration 
of the CAPPS and MMPI was dependent on the 
availability of an interviewer. In those instances in 
which the MMPI was administered prior to the CAPPS, 
the interviewer had no knowledge of the MMPI 
results. 

Although previous research with the CAPPS has 
shown that well-trained interviewers produce highly 
reliable item ratings, the interjudge diagnostic relia- 
bility of the staff members was studied. An additional 
sample of 30 male applicants for psychiatric care 
served as subjects for this study. Three interviewers 
recorded CAPPS ratings for each of the subjects. These 
ratings were input to DIAGNO II. Output consisted
of 30 sets of three diagnoses made using the Diagnostic
and Statistical Manual of Mental Disorders (American
Psychiatric Association, 1968). The kappa technique,
developed by Fleiss (1971) for measuring nominal
agreement among raters, was used to compute inter-
judge diagnostic reliability. The value of kappa was
.45 for all diagnoses for the 30 subjects. This value was
significantly greater than zero (p < .001) and some-
what greater than the mean of the mean kappas
(κ = .40) reported by Spitzer and Fleiss (1974) for 12
specific diagnoses in their review of seven diagnostic 
reliability studies. 
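
For readers unfamiliar with the kappa statistic for several raters, the computation can be sketched as follows. This is a generic illustration of the standard multi-rater kappa formula, not the authors' program; the function name and the small rating table are hypothetical and are not the study's data.

```python
def fleiss_kappa(counts):
    """Kappa for nominal agreement among a fixed number of raters.

    `counts[i][j]` is the number of raters who assigned subject i to
    category j; every row must sum to the same number of raters.
    """
    n_subjects = len(counts)
    n_raters = sum(counts[0])
    n_categories = len(counts[0])

    # Observed agreement: proportion of agreeing rater pairs per subject.
    per_subject = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    observed = sum(per_subject) / n_subjects

    # Chance agreement from the overall category proportions.
    proportions = [
        sum(row[j] for row in counts) / (n_subjects * n_raters)
        for j in range(n_categories)
    ]
    expected = sum(p * p for p in proportions)

    return (observed - expected) / (1 - expected)


# Hypothetical ratings: 4 subjects, 3 interviewers, 3 diagnostic categories.
illustrative_counts = [
    [3, 0, 0],
    [2, 1, 0],
    [0, 3, 0],
    [0, 1, 2],
]
print(round(fleiss_kappa(illustrative_counts), 2))
```

Kappa equals 1 when all raters agree on every subject and falls to 0 (or below) when agreement is no better than expected from the category base rates alone.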

A previous study on our patient population showed
that experienced senior clinicians agree with DIAGNO
II diagnoses no better than with the treatment clini-
cians’ diagnoses (Klingler, Miller, Johnson, & Williams).

To cross-validate the predictors,
a mean profile was calculated for each of the diagnostic
groups. Group profiles were then scored on the relevant
predictors derived in Goldberg’s (1972) study (normal
vs. psychiatric, Hs + 2Pd − Ma; psychiatric vs.
sociopathic, [...]; and psychotic vs. neurotic,
L + Pa + Sc − Hy − Pt). [...]
applied. Thus, for the “normal versus psychiatric”


equation, these statistics were calculated over the 
entire sample. For the “psychiatric versus sociopathic” 
equation, these calculations were made with the no 
mental illness group removed, and so forth. Second,
assuming a normal distribution of scores on the pre- 
dictors, the cutting scores for classification were set 
such that the selection ratio would equal the diagnostic 
base rates when each equation was applied. 
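
The second step can be sketched as follows: under the normality assumption, the cutting score is the point on the predictor distribution above which the proportion of cases equals the base rate of the target category. The function name, the direction of classification (target group scoring high), and the mean and standard deviation in the example are assumptions made for illustration.

```python
from statistics import NormalDist


def cutting_score(mean, sd, base_rate):
    """Score above which a proportion `base_rate` of a normal distribution falls."""
    return NormalDist(mu=mean, sigma=sd).inv_cdf(1.0 - base_rate)


# Hypothetical predictor with mean 50 and SD 10; the base rate is the
# proportion of "no mental illness" cases in the sample (58 of 572).
base_rate = 58 / 572
cut = cutting_score(50.0, 10.0, base_rate)
selected = 1.0 - NormalDist(mu=50.0, sigma=10.0).cdf(cut)
print(f"cutting score = {cut:.1f}, selection ratio = {selected:.3f}")
```

If the target category scores low rather than high on a given predictor, the same logic applies with `inv_cdf(base_rate)` and classification below the cut.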

The psychotic versus neurotic predictors were 
subjected to two analyses for comparability with 
Goldberg’s (1965) results. Because the Meehl-Dahl- 
strom rules and Taulbee-Sisson signs permit an 
indeterminate classification, the Goldberg equation and 
the sum-of-ranks rule were also permitted an in- 
determinate classification. Using Goldberg’s cutting
scores, the indeterminate ranges were designated as 
scores between 40 and 49 on the Goldberg equation 
and scores of 11 or 12 on the sum-of-ranks rule.

A second analysis was performed to force a prediction 
for each case, because an indeterminate range has not 
been designated for the Peterson signs. In this analysis 
only the Goldberg equation, the sum-of-ranks rule, 
and the Peterson signs could be compared. The cutting 
points for predicting psychosis were scores greater than 
45 for the Goldberg equation, greater than 11 for the 
sum-of-ranks rule, and more than 2 signs present for 
the Peterson signs.
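
The two scoring rules and their cutting points can be sketched as below. This is an illustrative reading of the rules as described in the text, not the scoring program used in the study. Assumptions: the indeterminate ranges are treated as inclusive, the eight psychopathology scales are taken to be Hs, D, Hy, Pd, Pa, Pt, Sc, and Ma, ties in elevation are broken by list order, and the T scores are hypothetical.

```python
# Assumed set of eight psychopathology scales ranked by elevation.
RANKED_SCALES = ["Hs", "D", "Hy", "Pd", "Pa", "Pt", "Sc", "Ma"]


def goldberg_index(t):
    """Five-variable linear composite: L + Pa + Sc - Hy - Pt."""
    return t["L"] + t["Pa"] + t["Sc"] - t["Hy"] - t["Pt"]


def sum_of_ranks(t):
    """Sum of the elevation ranks of D, Hy, and Pt (rank 1 = most elevated)."""
    ordered = sorted(RANKED_SCALES, key=lambda scale: -t[scale])
    rank = {scale: position + 1 for position, scale in enumerate(ordered)}
    return rank["D"] + rank["Hy"] + rank["Pt"]


def classify(score, cut, indeterminate=None):
    """Scores above `cut` are called psychotic; others neurotic.

    If an (low, high) indeterminate range is given, scores inside it
    (bounds included) are left unclassified.
    """
    if indeterminate is not None:
        low, high = indeterminate
        if low <= score <= high:
            return "indeterminate"
    return "psychotic" if score > cut else "neurotic"


# Two hypothetical T-score profiles.
profile_a = {"L": 50, "Hs": 60, "D": 75, "Hy": 70, "Pd": 55,
             "Pa": 58, "Pt": 72, "Sc": 62, "Ma": 45}
profile_b = {"L": 55, "Hs": 50, "D": 52, "Hy": 48, "Pd": 65,
             "Pa": 78, "Pt": 60, "Sc": 82, "Ma": 70}

for profile in (profile_a, profile_b):
    g = goldberg_index(profile)
    r = sum_of_ranks(profile)
    print(g, classify(g, 45, (40, 49)), r, classify(r, 11, (11, 12)))
```

Passing `indeterminate=None` gives the forced-choice version used in the second analysis, with the same cutting points of 45 and 11.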

One further procedure was necessary to insure 
comparability to Goldberg’s results. His analyses were 
based on a sample that approached a 50% split between 
the criterion diagnoses. The present sample contained
far more psychotics than neurotics. Thus, all 88 in- 
dividuals with neurotic diagnoses were included in the 
analysis. An equal number of psychotics was selected 
by randomly sampling from each of the seven psychotic 
diagnoses in proportion to the number of individuals 
receiving each diagnosis. 
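
Proportional sampling of this kind can be sketched as follows. The group sizes are those reported in Table 2; the largest-remainder rounding used to reach exactly 88 cases, the function names, and the random seed are assumptions, since the text does not say how fractional quotas were handled.

```python
import random


def proportional_allocation(group_sizes, target):
    """Split `target` across groups in proportion to size (largest remainder)."""
    total = sum(group_sizes.values())
    quotas = {g: target * n // total for g, n in group_sizes.items()}
    remainders = {g: target * n % total for g, n in group_sizes.items()}
    shortfall = target - sum(quotas.values())
    # Give the leftover cases to the groups with the largest remainders.
    for g in sorted(remainders, key=remainders.get, reverse=True)[:shortfall]:
        quotas[g] += 1
    return quotas


def stratified_sample(members_by_group, target, seed=0):
    """Randomly draw each group's quota from its list of members."""
    rng = random.Random(seed)
    sizes = {g: len(m) for g, m in members_by_group.items()}
    quotas = proportional_allocation(sizes, target)
    return {g: rng.sample(m, quotas[g]) for g, m in members_by_group.items()}


psychotic_group_sizes = {
    "Schizophrenia, paranoid type": 22,
    "Other major affective disorder": 19,
    "Psychotic depressive reaction": 36,
    "Schizophrenia, catatonic type": 16,
    "Schizophrenia, chronic undifferentiated type": 84,
    "Acute schizophrenic episode": 23,
    "Schizophrenia, schizoaffective type": 18,
}
quotas = proportional_allocation(psychotic_group_sizes, 88)
print(quotas)
```

With these sizes the quotas sum to 88, matching the number of neurotic cases.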


Results 


Analysis of the group profiles confirms 
Goldberg’s results. Table 1 displays the means 
and standard deviations obtained on the group 
profile predictors and those obtained by
Goldberg’s (1972) derivation sample. Despite
the small sample of groups in this study, the
separations among the groups were very large
and in the expected directions. The poorest
separation was between psychotics and neu-
rotics; yet, the means were more than two
standard deviations apart, one-tailed t(8)
= 2.95, p < .01. Since a test of significance
does not necessarily imply a strong degree of
statistical association (Hays, 1963), it was
decided to calculate the omega-square for this
contrast. The value of ω² was .44, suggesting
a strong degree of discrimination between
groups. Further evidence for the strength of 


Table 1

Comparison of the Results of the Goldberg Group-Profile Classification Formulae with a Replication Sample

[Table body not recoverable from the source. Column groups: Goldberg (1972) and present study, each reporting M and SD for the psychotic, neurotic, sociopathic, and normal groups. Row labels: Psychiatric vs. normal (Hs + 2Pd − Ma); Sociopathic vs. psychiatric. Surviving cell values, position unknown: 136, 14, 141, 140, 110, 130, 149.]


Table 2

Valid Positive Predictions for Individual
Profiles Using Group-Profile-Derived
Predictors

Category                                        Valid positives (%)      n

Schizophrenia, paranoid type                            59              22
Other major affective disorder                          58              19
Psychotic depressive reaction                           50              36
Schizophrenia, catatonic type                           50              16
Schizophrenia, chronic undifferentiated type            44              84
Acute schizophrenic episode                             39              23
Schizophrenia, schizoaffective type                     28              18
  Total psychotic                                       46             218
Drug dependence                                         66              44
Unspecified alcoholism                                  52             130
Antisocial personality                                  50              34
  Total sociopathic                                     54             208
Hysterical neurosis                                     50              22
Depressive neurosis                                     25              32
Anxiety neurosis                                        19              34
  Total neurotic                                        36              88
No mental illness                                       19              58

Total sample                                            45             572


these results was obtained by applying the 
cutting scores for the classification equations 
suggested in the derivation study to this 
sample. Only 2 of the 14 group profiles were 
misclassified: “depressive neurosis” and “no 
mental disorder.” This represents 86% ac- 
curacy of classification in a small replication 
sample. 
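
The reported omega-square can be reproduced from the t statistic. The sketch below uses the usual estimate for a two-group contrast, ω² = (t² − 1) / (t² + N − 1), where N is the total number of observations; here N = 10 is inferred from the 8 degrees of freedom and the seven psychotic and three neurotic group profiles. The function name is illustrative.

```python
def omega_squared_from_t(t, n_total):
    """Estimated omega-square for an independent two-group t test."""
    return (t ** 2 - 1) / (t ** 2 + n_total - 1)


# t(8) = 2.95 with 7 psychotic + 3 neurotic group profiles (N = 10).
omega_sq = omega_squared_from_t(2.95, 10)
print(round(omega_sq, 2))  # 0.44

# Group-profile classification accuracy: 12 of 14 profiles correct.
print(round(12 / 14, 2))  # 0.86
```

Both values agree with the figures given in the text.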

These equations were then used to sequen- 
tially classify the 572 individual profiles. After 
each equation was applied, the fourfold 
hit-miss table was analyzed using McNemar’s 
test for correlated proportions (Siegel, 1956). 
This was done to test the hypothesis that the 
distribution of predictions was not significantly 
different from the distribution of diagnoses. 
Since a hypothesis of no difference was being 
tested, the significance level was set at .10 to 
avoid a Type II error. The one-tailed chi-square
values (df = 1) for the three prediction stages 
were .01, 1.6, and 2.2, respectively (p > .10).


Table 3

Comparison of Four Methods for Discriminating Psychotics From Neurotics
Permitting an Indeterminate Classification

                                  Hits          Misses       Indeterminate
Method                          n     %        n     %         n     %

Goldberg equation              95   65.5      50   34.5       31   17.6
Sum of ranks (Hy, D, & Pt)     77   55.8      61   44.2       38   21.6
Meehl-Dahlstrom rules          63   46.7      72   53.3       41   23.3
Taulbee-Sisson signs           45   45.9      53   54.1       78   44.3

Note. Hit-miss percentages are based on the number of profiles classified.


The distributions of predicted and criterion 
diagnoses were not significantly different. 

Table 2 displays the hit rates for each 
specific diagnosis, each molar diagnostic 
category, and the total sample. The sequential 
approach was most accurate in identifying 
characterological and psychotic syndromes 
(with the exception of schizoaffective schizo- 
phrenia) and least accurate in identifying the 
no mental illness profiles. Random assignment 
of profiles to the molar diagnostic categories, 
given base rates, should produce an expected 
hit rate of 31%. For all subjects the hit rate 
was 45%, a 14% improvement over chance 
accuracy. 
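
The expected chance hit rate follows directly from the base rates: if profiles are assigned to categories at random in proportion to the base rates, the probability of a hit is the sum of the squared base rates. A minimal sketch, using the four molar group sizes reported in the Method section (the function name is illustrative):

```python
def chance_hit_rate(group_sizes):
    """Expected hit rate when cases are assigned at random in proportion to base rates."""
    total = sum(group_sizes)
    return sum((n / total) ** 2 for n in group_sizes)


# Psychotic, sociopathic, neurotic, and no mental illness groups.
molar_group_sizes = [218, 208, 88, 58]
expected = chance_hit_rate(molar_group_sizes)
observed = 0.45
print(f"chance = {expected:.2f}, increment = {observed - expected:.2f}")
# chance = 0.31, increment = 0.14
```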

Results of the comparison of the various 
methods for discriminating psychotics from 
neurotics, with an indeterminate category 
permitted, are shown in Table 3. These results 
confirm Goldberg’s findings. The Goldberg 
equation resulted in the most hits, the fewest 
misses, and left the fewest individuals un- 
classified. Calculating a phi coefficient on the 
fourfold hit-miss contingency table on in- 
dividuals for whom a prediction was made 
yielded φ = .320, χ²(1) = 14.85, p < .0005.
The sum-of-ranks rule yielded φ = .121,
which did not achieve significance, χ²(1)
= 2.01, p > .10. Surprisingly, the Meehl-
Dahlstrom rules and the Taulbee-Sisson signs 
achieved less than chance accuracy, and the 
latter left nearly 45% of the sample unclassified. 
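
For a fourfold table, the phi coefficient and the chi-square statistic are related by χ² = Nφ², where N is the number of classified cases. The sketch below checks the reported values against the Table 3 counts (145 profiles classified by the Goldberg equation, 138 by the sum-of-ranks rule). It assumes no continuity correction was applied; the function names are illustrative.

```python
from math import sqrt


def chi_square_from_phi(phi, n):
    """Chi-square (df = 1) implied by a phi coefficient on n cases."""
    return n * phi ** 2


def phi_from_chi_square(chi_square, n):
    """Phi coefficient implied by a chi-square (df = 1) on n cases."""
    return sqrt(chi_square / n)


print(round(chi_square_from_phi(0.320, 145), 2))  # 14.85
print(round(chi_square_from_phi(0.121, 138), 2))  # 2.02
```

The first value matches the reported chi-square exactly, and the second agrees with the reported 2.01 to within rounding of φ.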

Results with the indeterminate category 
excluded are shown in Table 4. In this analysis 
the Goldberg equation again was the most
accurate, followed closely by the Peterson
signs. The sum-of-ranks rule did not produce
a significant result.


Discussion 


The results obtained in discriminating 
psychotics from neurotics support Goldberg's 
previous results, which suggest the superiority 
of the linear model over configural methods. 
However, it is noted that none of the presently 
available linear, sequential-linear, or con-
figural methods works very well in predicting
diagnostic categories from individual profiles.
In each case the incremental accuracies ob- 
tained above the base rates were quite modest. 
Even though actuarial methods are accurate 
in classifying groups, they appear to have 
little validity for the practical clinical problem 
of classifying individuals. 

The present findings also serve to highlight 
the potent effects of criterion contamination. 
In Goldberg’s (1965) most contaminated 
sample, the Meehl-Dahlstrom rules and the 
Taulbee-Sisson signs achieved the creditable 
validity coefficients of .42 and .40, respectively.
These coefficients fell to .29 and .27 in the 
least contaminated sample. In the present 
study both methods resulted in accuracy below 
chance. In addition, a number of other pre-
dictors in the 1965 study showed the opposite
effect on their coefficients; they had lower
validities in the more contaminated sample.
This suggests that criterion contamination
must be scrupulously avoided. If it occurs,
even to a slight degree, results obtained from
such data can be misleading.


Table 4

Comparison of Three Methods for
Distinguishing Psychotics From Neurotics,
Excluding an Indeterminate Classification

Method               % hits     φ       χ²

Goldberg equation     61.9     .239    10.09*
Peterson signs        60.8     .216     8.35
Sum of ranks          55.7     .114     2.08

Note. df = 1 for chi-squares.
* p < .005 (n = 176).

Despite possible difficulties with Goldberg’s
(1972) group profile derivation sample, this
technique appears quite robust on cross-
validation. This replication lends greater
credence to the argument that group averaging
serves to filter out much of the measurement
error inherent in individual profiles, revealing
a composite variable closely related to the
criterion. Unfortunately, the high degree of
accuracy in classifying groups does not
generalize to classifying individuals.

The accuracy of group classification pro- 
cedures may be used to advantage in evalu-
ating research about psychopathology. Many
studies of pathological groups report MMPI
mean scores. Two procedures are commonly
used to relate the findings of these studies to
the MMPI. The mean profile of a group is
interpreted as if it were an individual profile,
and the percentages of the most frequently
appearing two-point codes in the group are
reported. Although these procedures are useful
for descriptive purposes, application of the
sequential formulae to the group mean profile
would provide strong evidence as to which
population the group under study was drawn
from: normal, sociopathic, neurotic, or
psychotic.

This may be illustrated by two examples from the recent literature.
Rader (1977) compared the MMPIs of three groups of men arrested for
either indecent exposure, assault, or rape. The mean profile of the
exposure group was within normal limits, with Pd being the most
elevated. Rader concluded that the exposure group was comprised
primarily of mild nonconformists who occasionally test societal
limits. The sequential formulae result in a sociopathic
classification for this mean profile. This indicates that the
exposure group is more deviant than visual inspection of the mean
profile would suggest. In another study, Widom (1977) tested the
efficacy of an advertisement for recruiting noninstitutionalized
psychopaths for experimental study.


Applying the Goldberg rules to the mean 
profile of 23 men recruited by this procedure 
results in a sociopathic classification.¹ This
corroborates Widom’s findings that a sample 
of psychopaths had been recruited by this 
method. Thus, Goldberg’s linear combinations 
are a valuable adjunct for interpreting group 
results. 


1 MMPI scale scores were estimated visually from a 
figure in the study. 
Longitudinal Study of Marital Success and Failure


P. M. Bentler and Michael D. Newcomb 
University of California, Los Angeles 


Personality and background questionnaires were administered to newly married 
couples. Four years later these couples were followed up to determine their 
marital status and satisfaction. Our findings indicate that (a) correlational sim- 
ilarity as well as mean differentiation between partners was higher in the still- 
married group than the divorced group; (b) accuracy of self-perception was 
marginally reflective of marital success; (c) living together before marriage had 
no apparent effect on the outcome of marriage; (d) divorced couples appeared 
to face qualitatively different problems than married couples; and (e) longi- 
tudinal prediction of marital adjustment was possible, with prediction based on 
signed, equal weights yielding R = .70. It appears that variation in marital out- 
come is most accurately predicted from personality and not demographic var- 
iables, based largely on data from women. 


This study attempts to clarify the role of 
marital partners’ personality traits on the 
success or failure of their marriage. We gath- 
ered information on background, personality, 
and peer assessment of personality variables 
at the beginning of the marriages in our sam- 
ple. Four years later we determined the out- 
come of these marriages in terms of staying 
together or divorcing and the quality of these 
marriages using a marital adjustment scale. 
We then compared the separated or divorced 
with the still-married couples in terms of the 
variables that we had assessed at the earlier 
period. Since our independent variables (personality
and background data) were assessed
at the beginning of the marriages, we
were able to construct several useful regression
equations to predict marital adjustment.

There have been only a few longitudinal 
studies addressed to the question of what
draws people together as couples and what 
keeps them there once joined. Several of 
these studies focus on factors that contribute 
most toward a couple deciding to marry or 
to separate before marriage (Burgess & Wal- 
lin, 1953; Hill, Rubin, & Peplau, 1976; Udry, 
1967). This study addresses a different seg- 
ment in the marriage process, beginning with 
the marriage itself. We wanted to know how 
and which personality traits, brought to the 
marriage by each partner, affected the subse- 
quent outcome and quality of that marriage 
4 years later. This is an old problem; Kelly 
(1939) and Terman and Oden (1947), both 
using longitudinal designs, addressed a similar 
question. Their assessment of personality was 
in the form of a general personality factor 
that could not be broken down into identifia- 
ble or logical components. They were looking 
for the overall contribution of “personality,” 
as a global concept, to the outcome of mar- 
riage. The present study has assessed a wide 
variety of heterogeneous personality traits in 
order to specify personality to a much greater 
extent. Burgess and Wallin (1953) have also 
studied engagement and marital adjustment 
longitudinally. They eliminated divorced or 
separated couples from their sample, looked 
only at intact marriages, and used the Thur- 
stone Personality Inventory, designed to assess
primarily neurotic problems, which, as
Tharp (1963) pointed out, makes the results 
less applicable to a nonpathological popula- 
tion. 

Trait research has also generated many 
cross-sectional studies of married and divorced 
groups. Even though personality traits in- 
cluded in recent research have focused more 
on those that are specifiable and nonneurotic, 
the question of causation or prediction has 
remained clouded and unanswered in cross- 
sectional designs (DeYoung & Fleischer, 
1976; Locke, 1951; Murstein & Glaudin, 
1966; Pickford, Signori, & Rempel, 1966a, 
1966b; Singh, Nigam, & Saxena, 1976). 
Whereas the causative effects of background 
characteristics can reasonably be studied 
cross-sectionally in that they are not subject 
to change by the passing of time or the pres- 
sures of marriage, this cannot necessarily be 
said for personality traits. If one studies a 
marriage failure cross-sectionally, one does 
not know whether the personality pattern 
was the cause or consequence of marital
breakup.

The current research has attempted to 
avoid these pitfalls by longitudinally studying 
a wide variety of nonpathological personality 
traits and the effects that they have on mari- 
tal outcome. Since the nature of our study 
does not permit the use of initial random as- 
signment nor the manipulation of personality 
variables, it represents something less than 
the ideal longitudinal experiment. Yet, the 
design would seem to be quite functional 
within these constraints. Our general hypothe- 
ses were that (a) those marriages that are 
still intact will have shown, at the beginning 
of their marriage, more similarity (homog- 
amy) of partners on personality traits and 
background items than the separated or di- 
vorced couples (Barton & Cattell, 1972; Cat- 
tell & Nesselroade, 1967; DeYoung & 
Fleischer, 1976; Kernodle, 1969; Murstein,
1961, 1967; Singh et al., 1976; Holz, Note
1); (b) there will be a greater peer-self
agreement on personality for the still-married 
couples than for the split marriages (Mur- 
stein & Beck, 1972; Weigel, Weigel, & Rich- 
ardson, 1973); (c) living together before 
marriage will increase the probability of 
marital success; (d) problems of a married 
couple will be qualitatively different than
problems of a divorced couple; and (e) significant
longitudinal prediction of marital
success based on traits and demographics is
possible.


Method 
Sample 


Our original sample consisted of 162 newly married
couples solicited from the rolls of the local
marriage license bureau. After follow-up 4 years
later, our working sample consisted of 77 couples: 53
were still married, and 24 had separated or divorced.
The attrition in sample size was primarily due to
the inability to contact many couples, as well as
death in two instances. At the time of follow-up, we
received responses from 89% of the deliverable
letters, which seems reasonably high. Los Angeles is
a highly mobile community, which made recontacting
the original sample extremely difficult. Only
a very small percentage of our original couples lived
at the same address 4 years after initial contact with 
them. Nonetheless, we do not feel that our sample 
reduction has been caused by any systematic bias; it 
represents a fair sample of couples under realistic 
field conditions. All studies of this type have prob- 
lems with sampling bias (e.g., volunteer bias, moti- 
vation) that we feel are unavoidable. This bias 
should not invalidate the results, but it stresses the 
importance of replication. 

All demographic data (e.g., age, education, occu- 
pation) were assessed at the beginning of the mar- 
riage. The mean age of the males in our working 
sample was 27 years old and ranged from the late 
teens to the 60s. They were predominantly Caucasian
(only three minorities), and the majority of
their religious choices were split between “none”
and Protestant. Their mean educational level achieved
was “some college,” ranging from the eighth grade
to the doctoral level. Their mean occupational level
was lower-middle class and ranged from lower
to upper-middle class (seven class levels were
considered: lower, working, low to upper middle, low
upper, and upper).

The mean age of the women in our final sample 
was 24 and ranged from the late teens to the o 
Their race and religious affiliations were similar to
the males. Their mean educational level achieved
was “some college” and ranged from the eighth
grade to the master’s level. Their mean occupational
level was middle-middle class and ranged from working
class to upper-middle class.

The average length of time for knowing each other
before marriage was 2 years. This ranged from a
minimum of 6 months to a maximum of 8 ma 
Twelve percent did not get engaged. For those couples
who did get engaged, the average length of
engagement time before marriage was 5 months and
ranged from less than 1 month to over 3 years.
Initial Data Collection


We used the Bentler Psychological Inventory 
(BPI), which assesses 28 personality traits (Comrey, 
Backer, & Glaser, 1973). The BPI consists of 680 
pairs of statements, with each pair representing two 
poles of a single dimension. The respondent is asked 
to choose the one item of each pair that most 
closely reflects himself/herself. Although the BPI
was developed with multivariate methods, the items
are fairly face valid so that the results indicate how
the person views himself/herself. Table 1 provides a
listing of the 28 trait-scale names. Brief descriptive
examples of high and low rankings on each scale are
given in columns 2 and 3. Columns 4 and 5 show
the internal consistency coefficients for a standardization
sample of the self- and peer-inventories,
respectively (Bentler, Note 2). The last column provides
the peer-self correlations based on the standardization
data, representing the cross-correlations between
BPI and the Bentler Interactive Psychological
Inventory (BIPI) scales. The BIPI is identical to
the BPI but is directed at the description of another
person. It is generally filled out by a peer or
knowledgeable observer of a subject. In the current
study, the BIPI was filled out by a friend of the
husband or wife in reference to the husband or wife,
respectively, representing peer evaluations of personality.


Table 1
Personality Trait-Scale Descriptions

                                                                                            Statistical characteristicsᵃ
                       Rank                                                                 Self      Peer      Self-
Scale name             High                                Low                              K-R 20ᵇ   K-R 20ᵇ   Peer r
Agility                Quick reflexes                      Clumsy                           .92       .93       .50
Ambition               Seek status; want to be well-known  Satisfying position is OK        .95       .95       .47
Art interest           Like museums and the fine arts      Not concerned with art exhibits  .94       .94       .69
Attractiveness         Nice face; sexy                     Homely; plain; average features  .96       .95       .39
Body weight            Thin; skinny                        Husky; fat                       .98       .97
Cheerfulness           Playful; happy                      Moody; serious                   .93       .94       .54
Clothes consciousness  Fastidious dresser; neat            Sloppy; sometimes unkempt        .95       .95       .73
Congeniality           Good-natured                        Impatient; stern                 .88       .92       .44
Deliberateness         Premeditated; careful               Impulsive; rash; make few plans  .93       .95
Diligence              Energetic                           Lazy; dawdler                    .94       .94
Extraversion           Pushy; talker                       Shy; reserved                    .95       .95
Flexibility            Give in; pliable                    Obstinate                        .87
Generosity             Spender; sharer                     Selfish; miserly                 .93
Intelligence           Intellectual                        Mentally average                 .93
Invulnerability        Thick-skinned; not easily hurt      [...] easily hurt                .94
Law abidance           Must be legal                       Rules can be bent                .89
Leadership             Domineering                         Follower                         .95
Liberalism             Like social protest and change      Conservative                     .89
Masculinity            Read sports page; mannish           Like to sew; read social page    .96       .96       .94
Objectivity            Scientific                          Superstitious
Orderliness            Organized; neat                     Messy; careless
Perceptiveness         Empathetic; aware                   [...] other’s feelings
Religious commitment   Believe in God; like church         Atheist; don’t go to church      .94       .94       .16
Self-acceptance        High self-esteem; happy with self   Feel worthless; bad self-image   .95
Stability              Relaxed; composed                   Tense; nervous                   .94
Thriftiness            Buy on sale; save coupons           Don’t shop around                .92
Travel interest        Take trips often; traveler          Stay at home; rarely take trips  .92       .95       .44
Trustfulness           Faith in people                     Often disbelieve                 .93       .93

ᵃ Based on Bentler (Note 2); n = 216.
ᵇ Kuder-Richardson 20 internal consistency coefficient.

The Sexual Behavior Inventory consists of a listing 
of 20 sexual activities, based on Bentler (1968a, 
1968b). The couple was asked to answer in a yes/no 
manner which of these behaviors they, as a couple, 
have engaged in. 

Background information was obtained for both 
individual partners, as well as on their relationship. 
For each individual information was collected on age, 
height, weight, race, religion, education, occupation, 
previous divorce (or death of spouse), previous chil- 
dren, and parental divorces. In regard to the couple, 
it was ascertained how long they had known each 
other, been engaged, and, if applicable, lived together 
before marriage. 


Follow-Up Data Collection 


Our follow-up questionnaire had two sections. The 
first section consisted of the Locke and Wallace 
(1959) Marital Adjustment Scale, which we scored 
by their methods. The second section consisted of 3- 
point ratings (no problem = 0, moderate problem = 
1, extreme problem=2) of 19 potential problem 
areas. This was scored by weighted summation across
all problem areas. One questionnaire was completed
by each still-married couple and by as many divorced
people as could be located. A score was calculated
for each section on which questionnaires
had been completed by both partners; in a divorced
marriage, we averaged the respective scores. In total,
we obtained this information on 68 couples. Although
[...] to have had each [...]

A factor analysis of the adjustment [...] substantial
positive loadings of all [...] unrotated factor, which
accounted for 75% of the variance. Thus, the scale
assesses a fairly unidimensional quality. Since the
first and second part scores were significantly
correlated, r(66) = [...], p < .001, indicating that
marital adjustment and lack of problems were positively
related, a composite score was obtained by averaging the
normalized adjustment and lack of problems score. A
probabilistic lower bound to reliability of the composite
in the population (ρ) is given by .78; that is,
p(.78 < ρ) = .99 (Woodward & Bentler, in press),
indicating that the composite score possesses adequate
internal consistency. This composite dependent
variable differentiated the married from divorced
groups quite well, t(66) = 9.54, p < .001.
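
The construction of the composite can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure actually used: "normalized" is taken to mean z-standardized, the problem ratings are summed with unit weights because the weights are not reported, and the function names and all data values are hypothetical.

```python
from statistics import mean, pstdev

def problem_score(ratings):
    """Sum 0/1/2 problem ratings across areas (unit weights are an assumption)."""
    return sum(ratings)

def zscores(values):
    """Standardize a list of scores to mean 0 and SD 1."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]

def composite_scores(adjustment, problems):
    """Average the standardized adjustment and lack-of-problems scores."""
    lack_of_problems = [-p for p in problems]  # reverse so that high = few problems
    za = zscores(adjustment)
    zl = zscores(lack_of_problems)
    return [(a + b) / 2 for a, b in zip(za, zl)]

# Hypothetical data for four couples (not taken from the study)
adjustment = [120.0, 95.0, 60.0, 110.0]
ratings = [[0, 1, 0], [1, 1, 0], [2, 2, 1], [0, 0, 1]]
problems = [problem_score(r) for r in ratings]
composite = composite_scores(adjustment, problems)
```

Because both parts are standardized before averaging, neither the adjustment scale nor the problem ratings dominates the composite merely through a larger raw-score variance.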


Results

Evidence for the first hypothesis comes
from within-group similarities between partners,
with the homogamy hypothesis predicting
greater initial similarity of partners among
couples longitudinally successfully married
than among those who become divorced or
later separate.

In regard to this hypothesis, five of the
background items (age, education, occupation,
previous divorce, and previous children) had
significantly positive husband/wife correlations
for the married couples compared with
four (age, education, parental divorce, and
previous children) for the divorced couples.
Table 2 shows these correlations in the top
portion of columns 1 and 2; a one-tailed test
in the hypothesized direction was used to
assess significance. The third column in this
table shows the Fisher r to z conversion to
test for significant differences between
correlations, also using a one-tailed test. Only the
age correlation for the married group was
significantly more positive than it was in the
divorced group. Looking at the personality
trait section of columns 1 and 2 of Table 2,
it can be seen that there were 10 significant
husband/wife correlations for the married
couples, and there were 6 for the divorced
couples, again using a one-tailed test. Four
traits (ambition, liberalism, religious
commitment, and travel interest) had significant
positive correlations for both groups. The
only substantial negative correlation was on
stability for the divorced group. This seems
to be an area in which a mismatch contributes
to marital failure. Column 3 shows that three
Table 2
Intracouple Correlations and Mean Differences

                        Husband-wife r                     t test between husband and wife
                        Married    Divorced                Married       Divorced
Item                    (n = 53)   (n = 24)    z test      (df = 104)    (df = 46)

Background
Height an an! 2:83%* 2.76** 3.05** 
sen 13 —.20 cn 10.98 6.768 
ucation .28* 37* X 10A 7.9288" 
Occupation “ggr 30 — 39 67 1,10 
Previous divorce “o4"** janes 1.42 —2.20* 36 
Previous children “57a "50%" 25 59 .30 
Parents’ divorce 02 ei ae 42 1.22 
$ pr i PLY 50 87 
Personality traits 
Agility 
Ambition ras os Re 3.82" 3.63"* 
“Art interest “53*** ‘02 1 351 192 
Attractiveness “50*** ‘07 2.197 —2.47* —1.95 
Body weight 09 an 2.34 —1.80 —,99 
poe — 01 ‘03 a een -.29 
[Clothes consciou i ž e z 18 
Congeniality N RA aay —2.14 3.18** —1.03 
Deliberateness —.18 34 te oa a 
Diligence ne oF 66 32 1.14 
Extraversion 49 = .02 46. 1,13 48 
Flexibility R —.25 1.72 2729% —.06 
enerosity “93* ae -08 —1.23 —1,57 
Pieligence ‘21 be 40 —2.24* —.62 
p vulnerability ‘00 i see eal 08 
| a abidance `22 ‘21 TF ges 623 
leadership A 2l 04 -270 6s 
a “ue o 40" 19 tsa n 
aad 22 zai —.20 aims 17.10% 
Orderliness pon o at pe 3.56"* 
poes —.06 Lie ‘23 Taie 00 
Ganon commitment ‘age “oor =1.25 1.66 1.03 
ue ‘00 a7 ~.66 47 1.47 
Pr E E eS ee 
Eyer interest 35" 43" = 36 ‘52 ‘91 
paes 10 :37* S -2a 315" 
* p < .05.
** p < .01.
*** p < .001.


correlations (art interest, attractiveness, and
[...]) were significantly more positive,
using a one-tailed test, for the married,
compared with the divorced, group. Clothes
consciousness, on the other hand, seems to be
substantially contrary to the homogamy hypothesis,
since a larger, positive correlation
was obtained for the divorced, compared with
the still-married, group. As another way of
looking at this issue, a sign test on the
husband/wife correlations, compared across the
married and divorced groups, showed that the
still marrieds had more positive correlations
than the divorced (p < .05, one-tailed test).
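
The Fisher r to z comparison of two independent correlations can be written out as follows. This is a minimal sketch of the standard textbook formula, assuming it is the version applied here; the function names are hypothetical, the correlations in the example are illustrative values, and only the group sizes (53 married and 24 divorced couples) are taken from the text.

```python
from math import atanh, erf, sqrt

def fisher_z_difference(r1, n1, r2, n2):
    """z statistic for the difference between two independent correlations."""
    standard_error = sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    return (atanh(r1) - atanh(r2)) / standard_error

def one_tailed_p(z):
    """Upper-tail probability of a standard normal deviate."""
    return 0.5 * (1.0 - erf(z / sqrt(2.0)))

# Illustrative correlations; group sizes follow the married and divorced samples
z = fisher_z_difference(0.53, 53, 0.02, 24)
p = one_tailed_p(z)
```

With a divorced group of only 24 couples the standard error is large, so a difference between correlations must be sizable before it reaches significance, which helps explain why so few of the trait-by-trait comparisons do.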
Turning next to mean differences rather
than correlational similarities, the upper
portion of column 4 in Table 2 shows that there
were four background areas that were
significantly different between husbands and wives
for the married group; these differences were
on age, height, weight, and occupation.
Two-tailed t tests were used on these mean
comparisons. A positive sign indicates the
husband had the higher score, and a negative
sign means that the wife had the higher
score. Column 5 shows that for the divorced
group three areas were significantly different;
these were age, height, and weight. Looking
next at personality traits, column 4 shows
that there were 16 significant mean differences
between husbands and wives among the
still-married group, whereas there were 6
significant mean differences for the divorced group,
as can be seen in column 5. The groups were
significantly different in regard to the number
of trait differences when the results were
considered statistically, χ²(1) = 6.06, p < .05.

We then looked at whether the magnitudes 
of the husband/wife discrepancies were statistically
different between the married and
divorced groups. Although none of the dis 


a aa 


t between married and divorced 
U i 
The averagt 

of male 
and female 
(Self—Peet! 
(af = 18) 


r between self and peer 


Male Female Self—Peer* 


Married Divorced 


Married Divorced 
(n = 14) 


(n= 15) (n = 9) 


i Male Female 
Trait (n = 6) (df = 18) (df = 22) 


Agility 43 Ad 
Ambition 44 .86* 
Art interest Bi) hoe 1M 
Attractiveness .23 .81* 36 
Body weight 53° A? aaa .88*** 86*** 24 
Cheerfulness Al 23.06 { i 
Clothes consciousness Al -69** 
Congeniality 10 79 .09 73* 
Deliberateness „58 : 
Diligence 
Extraversion 40 
Flexibility 
Generosity 
Intelligence 56° 69 
Invulnerability 
Law abidance 67%" 04 
Leadership 4 , 
ih arses f 
asculinity Al : 
Objectivity AS 3 
Orderliness 46° 67 
Perceptiveness 22 A 
zasos commitment 


65** —.09 74 
47" 51 -70 
gat ri 


—1.14 
Soi 
—.03 
—.10 
—.07 

1.55 57 

1.33 

—.09 30 -70* 123 

mn .74* 18 AT 

à 4 = 133, 

HE lO 37 2.60 ‘62 

NLS 1.64 06 

Hoe 31 94 13 


ass 
«60% 


—2.07* 95 
1.01 
1.29 

—1.87* 

: L .55 

839e .85** ee 43 

1.75 ‘80 

Ta 

FEN 

AS* 71 80 73 

—1.10 11 

cai ‘69 

—.62 


À 
Table 3 
Peer and Self-Trait-Rating Comparisons 
crepancies were found to be significantly different,
taken trait by trait, a sign test revealed
that the husband/wife differences were
significantly smaller overall for the divorced
compared to the married group (p < .05,
two-tailed). Combined with the correlational
results, it seems that mean differentiation in
personality between husbands and wives who
are successfully married coexists with relative
agreement correlationally, in self-perception.
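
A two-tailed sign test of this kind can be computed exactly from the binomial distribution. The sketch below assumes the usual form of the test, in which each trait contributes one sign according to which group shows the larger husband/wife difference and ties are dropped; the function name and the counts in the example are hypothetical, since the split across the 28 traits is not reported.

```python
from math import comb

def sign_test_two_tailed(n_pos, n_neg):
    """Exact two-tailed sign test p value under a null probability of .5."""
    n = n_pos + n_neg
    k = max(n_pos, n_neg)
    # Probability of a split at least this lopsided in one direction
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# Hypothetical split: the married group shows the larger difference on 20 of 28 traits
p_value = sign_test_two_tailed(20, 8)
```

The test uses only the direction of each trait's difference, so it can detect a consistent overall tendency even when no single trait reaches significance on its own.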

Using tests appropriate for nominal data, 
the background items “race” and “religion” 
showed no significant husband/wife variation 
or difference between married and divorced 
groups. 

Our second hypothesis, concerning peer/ 
self congruity as a measure of accuracy of 
self-perception, is mildly supported by our
findings. Referring to Table 3, columns 1 and
2 for males and 3 and 4 for females show that 
although there were more significantly non- 
zero peer/self correlations among both males 
and females of the married group than the 
divorced group, the difference in number was 
not significant for either males or females. 
Using again the Fisher r to z conversion, we
find that thriftiness, for the males, and art 
interest, for the females, showed a signifi- 
cantly higher peer/self correlation within the 
married group as compared to the divorced 
group. (Although not related to marital dif- 
ferences, women showed a significantly larger 
number of nonzero peer/self correlations than 
did the males; apparently, there is greater 

peer/self agreement regarding the wife’s per- 
sonality than the husband’s personality.) 
Turning to the mean scores, it was noted that 
the married males showed a significantly lower 
peer/self discrepancy (squared) on cheerful- 
ness and law abidance, as compared to the 
divorced males, as can be seen in column 5 
of Table 3. One trait, masculinity, showed 
analogous results for the females (column 6). 
Flexibility, for the males, turned out to be
substantially contrary to our hypothesis.
When the peer/self discrepancies (squared)
on a given trait were averaged for the couple,
cheerfulness and masculinity were significantly
less discrepant in the married than in the
divorced group (see Table 3, column 7).
Flexibility, on the other hand, showed a
substantially greater discrepancy for the married
as compared with the divorced group, which is
contrary to our hypothesis.

Our third hypothesis, concerning living 
together before marriage, was not supported 
by the data. For married couples, 24 lived
together before marriage and 29 did not. For
divorced couples, 14 lived together before
marriage and 10 did not, χ²(1) = .66, ns. In
regard to other couple background characteristics,
we found no significant differences between
the divorced and still-married groups
for the length of time they had known each
other, been engaged, or lived together before
marriage (if in fact they did). Although none
of these mean differences was significant, the
still-married couples tended to have known
each other longer and to have been engaged
longer than the divorced group. The opposite
tendency was true for the duration of living
together before marriage (if they did). The
divorced group had lived together longer
than the still marrieds. One other background
question did turn out to be a statistically 
reliable predictor of marital success. The 
couples who have remained together more 
often had one or both partners previously 
widowed than did the couples who had di- 
vorced. In regard to the Sexual Behavior In- 
ventory, there was no significant differentia- 
tion revealed between the married and di- 
vorced groups on this measure. 
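
The chi-square values reported for 2 × 2 comparisons can be checked from the counts given in the text. The sketch below assumes a Yates continuity correction; the text does not say whether one was applied, but under that assumption the counts reproduce both the cohabitation result and the earlier comparison of 16 versus 6 significant trait differences out of 28. The function name is hypothetical.

```python
def yates_chi_square(a, b, c, d):
    """Chi-square for the 2 x 2 table [[a, b], [c, d]] with Yates correction."""
    n = a + b + c + d
    # Continuity correction: shrink |ad - bc| by n / 2, never below zero
    corrected = max(0.0, abs(a * d - b * c) - n / 2)
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return n * corrected ** 2 / denominator

# Married: 24 lived together, 29 did not; divorced: 14 lived together, 10 did not
cohabitation = yates_chi_square(24, 29, 14, 10)

# Significant versus nonsignificant trait differences (of 28) in each group
trait_differences = yates_chi_square(16, 12, 6, 22)
```

With df = 1 the .05 critical value is 3.84, so the cohabitation comparison falls well short of significance while the trait-difference comparison exceeds it.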

Our fourth hypothesis, concerning types of
problems faced by divorced and married couples,
shows some clear distinctions. Table 4
shows the mean values of the married and
divorced groups [...] columns 1 and 2, [...]
shows the t values for each problem area
be[...], mate sent to jail, in-laws—which
were essentially no more of a
problem for the divorced group than the
married group. Twelve areas proved to be
significantly greater problems for the divorced,
[...] the married, group. These could
[...] example, sex relations,
lack of mutual affection, bickering,
selfishness, [...] independence. One area—
Table 4
Problem Ratings at Follow-Up

                        M problem scoreᵃ         t between married
                        Married     Divorced     and divorced
Problem                 (n = 53)    (n = 15)     (df = 66)
Attention to another <2 A) —4.1 hate 
Mutual affection T 11 ATs 
Adultery Al .9 Sie 
Sex relations A 9 —2.73 
Venereal disease 0 0 .00 
Desire for child 2 3. — 44 
Finances 3 1.0 —4,34*** 
Nonsupport xt) 5 —3.80*** 
Drunkenness :2 AS} —.54 
Drug abuse 0 2 —2.65** 
Gambling 0 Ait —.70 
Sent to jail 0 S| —.69 
Friends a tf 4.92""* 
Selfishness 3 1.0 —4.37*** 
In-laws K] 0.5 —.13 | 
Ill health 2 0 +2.41* | 
Bickering 3 6 —2.04* | 
Independence ao 1.0 —3.87*** 
Career conflicts a 5 —2,72** 
Other 3 5 —.84 
Problem score 3.5 11.3 6.00*** 


Note. All significance tests are two-tailed.
ᵃ 0 = no problem; 1 = moderate problem; 2 = extreme problem.


ill health—was found to be a significantly
greater problem for the married than for the
divorced groups. This is more of an involuntary
problem and seems to have drawn the
couples together.

The successful longitudinal prediction of
marital success requires significant differentiation
of means when comparing still-married
and divorced groups. Columns 1 and 2 of
Table 5 show the t values for these mean
comparisons for the males and females,
respectively. A positive sign indicates that the
married group had the larger value, whereas
a negative sign means the divorced group had
the greater value. We used one-tailed tests of
significance on the background items, since
we had a priori expectations from previous
research (e.g., Murstein & Glaudin, 1966),
which we will address more fully in the
Discussion section. The only significant difference
on background item means between married
and divorced males was on parental divorce.
The married males had fewer parental divorces
than the divorced males. For females
there were five significant mean differences:
The married females were older, had higher
educational and occupational levels, had more
previous children, and had fewer parental
divorces than the divorced females. Comparing
mean scores on personality traits across
the two groups, using a two-tailed test of
significance, the following picture emerged:
The married males showed significantly less
extraversion, invulnerability, and orderliness
compared with the divorced males. The married
females showed significantly greater
clothes consciousness and congeniality compared
to the divorced females.

Table 5
Predictive Correlations and Mean Differences Between Married and Divorced Groups

                        t between married     r with the composite
                        and divorced          adjustment score
                        Males      Females    Males      Females    z between male
Item                    (n = 53)   (n = 24)   (n = 68)   (n = 68)   and female r

Background
  Age                     1.36                                         .12
  Height                  -.07       1.17       -.12        .10      -1.26
  Weight                 -1.60        .23       -.05        .04       -.51
  Education                .91                   .09        .00        .51
  Occupation              -.70       1.76*       .02       -.10        .69
  Previous divorce         .52                   .13        .13        .00
  Previous children       1.16                                        -.74
  Parents divorced       -2.30*     -2.00*      -.21       -.01      -1.16
Personality traits
  Agility                -1.02        .02       -.04       -.03       -.06
  Ambition               -1.16      -1.93       -.19       -.24*       .30
  Art interest            -.35      -1.07       -.10       -.28*      1.07
  Attractiveness          -.31       -.62       -.12       -.14        .12
  Body weight            -1.37        .22       -.01       -.02        .06
  Cheerfulness             .24       1.27       -.01        .14       -.86
  Clothes consciousness    .28       2.05*      -.01                 -1.64
  Congeniality            -.10       2.52*       .06        .14       -.46
  Deliberateness           .38       1.60        .25*       .04       1.23
  Diligence               -.28       -.53        .02        .17       -.86
  Extraversion           -2.20*      -.59       -.29**      .05      -1.99*
  Flexibility              .56       -.15        .02       -.06        .46
  Generosity              -.02        .86       -.07        .18      -1.4
  Intelligence           -1.06       -.92        .01                  1.39
  Invulnerability        -2.05*       .19       -.17        .19      -2.08*
  Law abidance                        .87                             -.23
  Leadership              -.81       -.69       -.16        .00       -.92
  Liberalism             -1.74      -1.37        .00       -.14        .80
  Masculinity              .84       -.33       -.03        .09       -.69
  Objectivity             -.04        .95                   .23*
  Orderliness            -2.18*      -.62       -.12        .05      -1.03
  Perceptiveness                     1.79       -.06        .16      -1.26
  Religious commitment
  Self-acceptance
  Stability
  Thriftiness
  Travel interest
  Trustfulness

* p < .05.
** p < .01.


Next, we evaluated the longitudinal predictability
of marital success with initial personality
variables obtained at the time of
marriage, variable by variable. The correlations
between the composite dependent marital
adjustment score and prior personality
scores were obtained separately for males and
females (see columns 3 and 4 of Table 5). A
two-tailed significance test was used. Among
males, two significant correlations were found
on extraversion and deliberateness. The more
adjusted and happy the marriage, the more
deliberate and less extraverted were the males.
Among females, six significant correlations
were found for ambition, art interest, clothes
consciousness, intelligence, objectivity, and
stability. The higher the marital adjustment
score, the more the females reported themselves
to be clothes conscious, objective, and
stable while reporting relatively less ambition,
art interest, and intelligence. Again using the
r to z conversion, column 5 shows the z
Table 6
Stepwise Selection Regression Equation: Using
Initial Pool of All Variables


Variable                          Beta       Sign
Previous children, female                      1
Deliberateness, male               .23*        1
Ambition, female                  -.19        -1
Objectivity, female                            1
Clothes consciousness, female      .45***      1
Orderliness, male                             -1
Masculinity, female                            1
Intelligence, female              -.26*       -1
Thriftiness, male                  .24*        1
Flexibility, male                 -.23*       -1

Source                   SS          df    MS          F
Signed, equal weight     25,887.3     1    25,887.3    60.31***
Differential weights      1,829.4     9       203.3      .47
Total regression         27,716.7    10     2,771.7     6.45***
Residual                 23,607.2    55       429.2

* p < .05.
** p < .01.
*** p < .001.


values when comparing the correlations of 
the males and females. Two correlations were 
significantly different between the males and 
the females. The males had negative correla- 
tions on extraversion and invulnerability, 
whereas the females had significantly differ- 
ent, positive correlations. The greater number 
of significant correlations on personality traits 
among females reflects the apparent greater 
predictability of marital success among women 
than men. 
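
The r to z comparison of two correlations can be sketched as follows. This is a minimal illustration rather than the authors' computation: the function names are hypothetical, the correlations and group sizes in the usage lines are invented, and the formula treats the two correlations as coming from independent samples, which is only an approximation when the males and females being compared are spouses.

```python
import math


def fisher_z(r):
    """Fisher's r-to-z transformation (inverse hyperbolic tangent)."""
    return math.atanh(r)


def compare_correlations(r1, n1, r2, n2):
    """z statistic for the difference between two independent correlations."""
    standard_error = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    return (fisher_z(r1) - fisher_z(r2)) / standard_error


def two_tailed_p(z):
    """Two-tailed normal probability for a z statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))


if __name__ == "__main__":
    # Invented values: a negative correlation for males, a positive one for females.
    z = compare_correlations(-0.25, 68, 0.30, 68)
    print(round(z, 2), two_tailed_p(z) < 0.05)
```

A difference is treated as significant at the .05 level (two-tailed) when the absolute z value exceeds 1.96.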

When the composite dependent variable was
correlated with the background variables, by
sex, there were two significant correlations
for females and one for males. For the males,
being older is predictive of marital adjustment,
whereas for females, being older and
having previous children are predictive of
marital adjustment. There were no significant
differences between males and females for any
of the background correlations, using the r to
z conversion.

Several longitudinal prediction equations
involving linear combinations of predictor
variables have been generated from our data
using as dependent variables (a) the composite
adjustment score and (b) being married
versus divorced at follow-up. The most
concise summarization of these results can
be stated for the prediction equations using
background and personality data from all
male and female subjects, focusing on the
continuous dependent variable (the composite
score). A stepwise-regression procedure
was used, which showed significant increments
in prediction up to the point at which
the procedure was terminated, due to a
conservative judgment regarding stability of
results on potential cross-validation. With 10
predictor variables the multiple correlation
was .74, F(10, 55) = 6.45, p < .001. The beta
weights for this equation are shown in the
second column of Table 6, and the F test
associated with this multiple regression model
is given in the third row of the analysis of
variance breakdown. All but one of the
predictors contributed significantly (p < .05) to
the multiple regression equation, whereas the
weight of the remaining variable was 1.8
times its standard error.

To assess in finer detail the nature of the
regression, we used a new regression model
developed by Bentler and Woodward (Note
3; Woodward & Bentler, Note 4). This model
can be written in the form

$y = \beta^{*}Xt + X(\beta - \beta^{*}t) + e$,

where $y$ is the standardized
dependent variable, $X$ is the matrix of
standardized predictor variables, and $e$ is the
vector of residuals. The vector $\beta$ of beta
weights is the usual optimal vector of
least-squares coefficients. The vector $t$ is a
sign vector with elements ±1, chosen to optimize
prediction of $y$ from $X$, and $\beta^{*}$ is a
weight associated with this regression. The
Bentler-Woodward regression method in essence
partitions the usual sum of squares due to
regression into two additive components: one
associated with $\beta^{*}Xt$, and the second
associated with $X(\beta - \beta^{*}t)$. The first
component is determined such that the sum of
squares associated with the component is
maximized; it represents the optimal prediction
possible with weights $\beta^{*}$ that are equal
for all predictors, except for sign. The second
component assesses whether differential
weighting contributes a significant increment
to prediction beyond that possible
with the signed, equal weights.

The Bentler-Woodward regression solution
of Table 6 was obtained after the stepwise
regression method yielded the 10 predictors
shown in the table. It determined that
the optimal sign weights for the regression
were those given in the right-hand part of the
table. In this case these signs are identical to
those of the beta weights; this is not a necessary
feature of the approach. The analysis of
variance at the bottom of Table 6 shows that
the null hypothesis that the contribution of
the signed, equal weights to prediction is
zero could be rejected at a high level of
statistical significance (p < .001). On the
other hand, the analysis of variance revealed
that the contribution of differential weighting,
over and above the contribution of the
signed, equal weights, was not statistically
reliable. Thus, although the ordinary multiple
regression procedure yielded a significant
regression of criterion on predictor variables, the
more fine-grained analysis reveals that virtually
the only source of such prediction is
associated with the signed, equal weights.
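
The partition into a signed, equal-weight component and a differential-weight component can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the Bentler-Woodward algorithm itself: the function names are hypothetical, and the sign vector is found by brute-force enumeration of all ±1 patterns, which is feasible only for a small number of predictors (10 predictors give 1,024 patterns).

```python
import itertools
import numpy as np


def standardize(a):
    """Center each column and scale it to unit standard deviation."""
    a = np.asarray(a, dtype=float)
    return (a - a.mean(axis=0)) / a.std(axis=0)


def signed_equal_weight_partition(X, y):
    """Split the regression sum of squares into equal-weight and differential parts."""
    Xs, ys = standardize(X), standardize(y)

    # Ordinary least-squares beta weights and their regression sum of squares.
    beta, *_ = np.linalg.lstsq(Xs, ys, rcond=None)
    ss_regression = float(np.sum((Xs @ beta) ** 2))

    # Assumption: search every sign pattern and keep the one whose
    # unit-weighted composite predicts y best.
    best_ss, best_t, best_weight = -1.0, None, 0.0
    for signs in itertools.product((1.0, -1.0), repeat=Xs.shape[1]):
        t = np.array(signs)
        composite = Xs @ t
        denom = float(composite @ composite)
        if denom == 0.0:
            continue
        weight = float(composite @ ys) / denom  # common weight for this pattern
        ss_equal = weight ** 2 * denom
        # t and -t fit equally well; keep the version with a nonnegative weight.
        if weight >= 0.0 and ss_equal > best_ss:
            best_ss, best_t, best_weight = ss_equal, t, weight

    return {
        "signs": best_t,
        "common_weight": best_weight,
        "ss_equal": best_ss,
        "ss_differential": ss_regression - best_ss,
        "ss_regression": ss_regression,
    }
```

Because the signed composite lies in the column space of the predictors, the two components add to the ordinary regression sum of squares, and the differential component cannot be negative.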

The proportions of variance associated with
the Table 6 results mirror the statistical
results. The correlation of the signed equal-weight
composite with the criterion was .71,
yielding an $R^2_{\beta^*} = .50$. The contribution of
differential weighting was only
$R^2_{\beta} - R^2_{\beta^*} = .04$. Of course, these two
components of regression add to the total
variance accounted for by the least-squares beta
weights, $R^2_{\beta} = .54$. As a consequence of these
results, one would not expect the differential
weights to cross-validate. Another way to assess the
potential shrinkage due to using the highly
tuned beta weights is to evaluate the adjusted
$R^2_{\beta}$, which represents an unbiased estimator
of the population squared multiple correlation
(Olkin & Pratt, 1958). It is an adjustment of
the $R^2_{\beta}$ value using the number of independent
variables and the sample size. Specifically,
the equation we used was

$\text{adjusted } R^2 = 1 - \frac{N-3}{N-k-1}\,(1 - R^2_{\beta})\left[1 + \frac{2}{N-k+1}\,(1 - R^2_{\beta})\right],$

where k equals the number of predictor variables
in the equation and N equals the total
number of cases. Our adjusted $R^2 = .46$, a
value in the same range as $R^2_{\beta^*}$.
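
The adjusted value can be checked numerically. The sketch below assumes the common approximation to the Olkin-Pratt estimator, 1 - [(N - 3)/(N - k - 1)](1 - R²)[1 + 2(1 - R²)/(N - k + 1)], and takes N = 66 as inferred from the Table 6 degrees of freedom (10 + 55 + 1). The function name is hypothetical.

```python
def olkin_pratt_adjusted_r2(r2, n_cases, n_predictors):
    """Approximate Olkin-Pratt adjustment of a sample squared multiple correlation."""
    shrink = (n_cases - 3) / (n_cases - n_predictors - 1)
    correction = 1 + 2 * (1 - r2) / (n_cases - n_predictors + 1)
    return 1 - shrink * (1 - r2) * correction


if __name__ == "__main__":
    # Table 6: R^2 = .54 with 10 predictors; N = 66 is inferred, not stated.
    print(round(olkin_pratt_adjusted_r2(0.54, 66, 10), 2))  # 0.46
```

Under these assumptions the function returns about .46 for the Table 6 equation, in line with the value reported in the text.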

Turning now to the substantive meaning of 
the regression described in Table 6, the 
variables included in the regression were those 
obtained from women (presence of previous 
children, less ambition, objectivity, clothes 
consciousness, masculinity, and less intelli- 
gence) as well as from men (deliberateness, 
less orderliness, thriftiness, and less flexibility).
In spite of the variable-by-variable results
mentioned previously, the simultaneous
approach indicates that self-reported person- 
ality variables contributed the major share 
to the longitudinal predictability of marital 
success, with background variables being 
barely represented in the multiple regression. 
(Only presence of previous children, among 
females, contributed significantly.) Further- 
more, the majority of significant predictor 
variables came from the women’s data, sug- 
gesting that variation in longitudinal marital 
success is a greater consequence of women’s 
attributes than that of men. This effect was 
more dramatically observed in stepwise re- 
gression equations based on larger numbers 
of variables. For example, the 15-variable 
equation (R = .80) consisted of 10 variables 
from women and only 5 from men. Although 
the dimension of traditionality in sex role 
orientation (Ellis & Bentler, 1973) may be 
relevant to understanding these results (see 
Discussion), the differential prediction effect 
does not seem explainable. 

One goal in constructing optimally weighted
prediction equations is to find one that maximizes
the amount of variance ($R^2_{\beta}$) accounted
for by a small number of variables.
Statistically, $R^2_{\beta}$ can be decomposed into the
sum of the products of the simple correlation
coefficients and the beta weights for each
variable in the equation. We arbitrarily set a
minimum level of .03 for this product and
located 15 variables from an equation based
on the best 30 variables that met this criterion.
These variables thus showed a minimally
adequate beta weight in the set of all
variables as well as a minimally adequate
zero-order criterion correlation. Using these
15 variables in a stepwise regression manner
led to the equation of Table 7. Only 12 variables
became part of the final equation, since

Table 7

Stepwise Selection Regression Equation From
the Pool of 15 Variables that Maximized the
Beta × Simple Correlation


Variable Beta Sign 
Previous children—female .24* 1 
Objectivity—female A hes 1 
Clothes consciousness—female art 1 
Extraversion—male —.10 =i 
Masculinity—female A0*** 1 
Intelligence—female — 3998" -i 
Parents divorced—male =S: —1 
Cheerfulness—female 20% 1 
Orderliness—male —,294 -1 
Thriftiness—male aks 1 
Invulnerability—male —.19 =1 
Perceptiveness—female 16 1 
Source                      SS     df        MS         F
Signed, equal weight  24,655.5      1  24,655.5  58.70***
Differential weights   4,407.8     11     400.7    .95
Total regression      29,063.3     12   2,421.9   5.77***
Residual              22,260.6     53     420.0

*p < .05.
**p < .01.
***p < .001.


3 of the 15 did not meet our minimum entry
criterion in the stepwise inclusion (F > 1.0).
Table 7 exhibits virtually the same pattern of
results regarding the relative merits of various
sets of weights as does Table 6. In this case,
$R^2_{\beta^*} = .48$, $R^2_{\beta} - R^2_{\beta^*} = .09$, $R^2_{\beta} = .57$,
and adjusted $R^2 = .48$.
Again, the signed, equal weights carry the
lion’s share of the total prediction, as well as
the only statistically reliable effect.
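
The screening rule that produced the pool of 15 variables rests on the identity that the squared multiple correlation equals the sum, over predictors, of beta weight times zero-order criterion correlation. The sketch below illustrates that identity and a .03 cutoff on the product. It is an assumption-laden illustration with hypothetical function and variable names, not a reconstruction of the authors' program.

```python
import numpy as np


def standardized_fit(X, y):
    """Return standardized beta weights and zero-order criterion correlations."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    ys = (y - y.mean()) / y.std()
    beta, *_ = np.linalg.lstsq(Xs, ys, rcond=None)
    r = Xs.T @ ys / len(ys)  # correlation of each predictor with the criterion
    return beta, r


def product_screen(X, y, names, minimum=0.03):
    """Keep predictors whose beta-times-correlation product reaches the minimum."""
    beta, r = standardized_fit(X, y)
    products = beta * r  # these products sum to R squared
    kept = [name for name, p in zip(names, products) if p >= minimum]
    return products, kept
```

A predictor passes the screen only when both its beta weight and its simple correlation are non-negligible and of the same sign, which is the sense in which the retained variables are "minimally adequate" on both counts.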
To evaluate whether the stepwise regression 
procedure was somehow unfairly selecting 


variables, or from variables associated with a 
given sex, we ran separate stepwise regres- 
sion equations on the background variables 
and traits for the males and females. From 
each of these four equations, we selected the 
2 best predicting variables according to an F
[...] variables. This pool was considered to adequately
represent the background variables. A stepwise
regression was run on this set of variables.
The results are shown in Table 8. This
equation uses 11 variables and was chosen
since it maximized the overall F value associated
with $R^2_{\beta^*}$. Perusing the substantive content,
the sex difference is noted again, with
the males being represented by 3 of the 11
variables and the females representing the
majority by 8 of the variables. This result is
not due to an initial bias in the pool of 21
variables, since they were fairly evenly divided
by sex (10 males and 11 females).
Turning finally to the quantitative results, we
find that the analysis of variance mirrors the
results of Tables 6 and 7. We have

$R^2_{\beta^*} = .49$, $R^2_{\beta} - R^2_{\beta^*} = .08$, $R^2_{\beta} = .57$,
and adjusted $R^2 = .49$.

(It should be noted that the total degrees of
freedom associated with Table 8 is two greater
than in either Table 6 or Table 7. This is due
to not having a complete data set on every


Table 8

Stepwise Selection Regression Equation of the
Combined Pool of 21 Variables


Variable Beta Sign
Previous children—female 33%" r 
Ambition—female —.21* E
Objectivity—female 29%" 
Thriftiness—male .28"* l 
Perceptiveness—female -20 I 
Intelligence—female — 33" a 
Masculinity—female Ase ; 
Cheerfulness—female 21* i 
Parents divorced—male —AS i 
Clothes consciousness—female 26* 1 
Orderliness—male —.21* 2 
Source                      SS     df        MS         F
Signed, equal weight  25,818.1      1  25,818.1  63.75***
Differential weights   3,942.9     10     394.3    .97
Total regression      29,761.0     11   2,705.5   6.68***
Residual              22,681.1     56     405.0

*p < .05.
**p < .01.
***p < .001.

couple and does not reflect a numerical
error.)

Although we did generate several equations 
using our dichotomous dependent variable 
(still married vs. divorced), we have chosen 
not to report these for several reasons. Most 
important was the consideration that the sta- 
tistical assumptions underlying this type of 
dichotomous regression are less sound and 
proven than for a continuous dependent 
variable. This uncertainty and lack of soundness
could create difficulty in cross-validating
any dichotomously based equations. For
example, a simple shift in the proportion of
divorced couples in a given study would modify
the optimal beta weights and the overall
$R^2_{\beta}$, due to the sensitivity of the regression
to the mean of the dependent variable. This 
is why we have chosen to report only equa- 
tions based on the continuous composite de- 
pendent variable. 


Discussion 


The present investigation has shown some 
clear longitudinal evidence for the homogamy 
hypothesis. Our results indicate that homog- 
amy of personality traits between marital 
partners, assessed at the beginning of their 
marriage, is evidenced to a greater degree in 
marriages that turn out successfully than for 
marriages that terminate in separation or 
divorce. In other words, correlational similar- 
ity between marital partners, based on person- 
ality traits measured at the beginning of a 
marriage, was substantially higher for couples 
who remained together after 4 years than 
couples who decided to end their marriage 
within that period of time. This pattern was 
also found for background or demographic 
variables. Other researchers have found simi- 
lar results cross-sectionally on personality 
traits (e.g., Cattell & Nesselroade, 1967; De- 
Young & Fleischer, 1976; Pickford et al., 
1966b). Using the Guilford-Zimmerman 
Temperament Survey, Pickford et al. (1966b) 
found four significantly positive husband/wife
correlations on general activity, restraint,
friendliness, and personal relations for their
happily married group. Our significant finding
on congeniality seems similar to theirs on
friendliness for happily or still-married couples.
They found no significantly positive
correlations, but they did report one signifi- 
cantly negative correlation on emotional sta- 
bility for couples on the verge of separation; 
our divorced sample had a similarly large 
negative correlation on stability. Cattell and 
Nesselroade (1967), using the 16 Personality 
Factor Questionnaire (16 PF), found four 
factors (Affectothymia, Surgency, Protension, 
and Self-Sufficiency) that were significantly 
more positively correlated between husbands 
and wives in stable marriages compared to 
couples in unstable marriages. Both Affecto- 
thymia and Surgency seem similar to our sig- 
nificantly different correlations on extraver- 
sion. DeYoung and Fleischer (1976) found 
all positive husband/wife correlations, using 
the 16 PF, with 10 (Affectothymia, Intelli- 
gence, Dominance, Surgency, Super Ego 
Strength, Autia, Radicalism, Self-Sufficiency, 
Self-Concept Control, and Ergic Tension) 
significantly so. They found these results in 
their one sample of couples who were still 
married and did not contrast them with a 
group experiencing marital disharmony or 
divorce. 

This type of analysis does not directly 
address the need complementarity hypothesis
(e.g., Meyer & Pepper, 1977; Murstein,
1961, 1967; Rosow, 1957; Winch, 1967), 
since the validity of inferring needs from 
personality traits is not adequately known. 
Nonetheless, one can consider many of our 
trait scales to be bipolar (e.g., extraversion:
a high score means very extraverted, a low
score means very introverted), so that evidence
for a complementary trait hypothesis
can be examined. For the successfully mar- 
ried couples, no trait was found to have a 
significantly negative correlation between 
partners. If complementarity were an impor- 
tant influence for a successful marriage, some 
significant negative correlations should have 
been found in this group. The divorced group, 
on the other hand, had one substantial nega- 
tive correlation on stability. Although we do 
not claim that these results disprove a com- 
plementary trait hypothesis, certainly there 
is no evidence favoring the concept. Yet, 
when we look at our simple prediction corre- 
lations by sex (traits predicting the composite 
dependent variable), we find two traits—
extraversion and invulnerability—that have
significantly different effects for the males and
the females. The happy and/or maritally adjusted
males were relatively introverted and
vulnerable, whereas the successfully married
females were relatively extraverted and invulnerable.
When comparing these two types of
analyses—husband/wife correlations and sim- 
ple predictive correlations between individual 
traits and the composite dependent measure— 
no clear support or refutation of the comple- 
mentarity hypothesis seems evident. Thus, 
although we have found longitudinal support 
for the homogamy or correlational similarity 
hypothesis of spouse personality traits, we 
can say little regarding the complementarity 
(of traits) notion.

Some researchers (e.g., Singh et al., 1976)
have found that significant mean differentiation
of marital partners on personality traits
occurs more often in successfully married
couples than in those who divorce. These
researchers implied that this finding is incompatible
with correlational similarity between
successfully married spouses. Our results do
not provide longitudinal support for this
cross-sectional finding. We found both significant
mean differentiation and correlational
similarity on personality traits between
partners within the still-married group. In
addition, we found greater mean differences
among the successfully married than the divorced.
In other words, correlational similarity
does not necessarily imply equality of
trait levels; it only indicates that the traits
are related in a linear manner. Apparently,
correlational similarity and mean differentiation
between marital partners are both conducive
to subsequent marital happiness and
success. How this finding is reflected in
spouses’ behavior toward each other and how
it might contribute to marital satisfaction is
unclear. Further theory development and research
need to be done to integrate and
thoroughly understand this finding.
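
That correlational similarity and mean differentiation can coexist is easy to show with a toy example. The scores below are invented for illustration and the function name is hypothetical: every wife scores exactly 3 points above her husband, so the partners' scores correlate perfectly while the group means differ.

```python
import numpy as np


def similarity_and_mean_difference(husbands, wives):
    """Return the husband/wife correlation and the wife-minus-husband mean difference."""
    husbands = np.asarray(husbands, dtype=float)
    wives = np.asarray(wives, dtype=float)
    r = float(np.corrcoef(husbands, wives)[0, 1])
    return r, float(wives.mean() - husbands.mean())


if __name__ == "__main__":
    husbands = [2.0, 4.0, 5.0, 7.0, 9.0]   # invented trait scores
    wives = [h + 3.0 for h in husbands]    # same ordering, shifted level
    print(similarity_and_mean_difference(husbands, wives))
```

The correlation is unaffected by adding a constant to one partner's scores, which is why the two findings are not in conflict.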

Our hypothesis concerning the effect of
self-perception accuracy, as revealed by peer/self
rating comparisons, was only mildly supported
by our data. We speculated that the
more similar peer- and self-ratings were before
marriage, the more accurately that
spouse would self-perceive in the marriage.
He or she should then face a marriage relationship
more adaptively, resulting in greater
adjustment, happiness, and ultimately success
in marriage. Weigel, Weigel, and Richardson
(1973), addressing somewhat similar
issues, found no significant results, whereas
Murstein and Beck (1972) found some general
support for this idea. Although our findings
were not strong, many were in the direction
hypothesized, and some significantly so.
Some results found contrary to this hypothesis
have been previously noted. Perhaps if the
proportion of our sample that did have peer
assessments of their personalities had been
larger, more definitive results might have been
found. Of course, it is also possible that accuracy
in interpersonal and self-perception
may generate a more rapid separation if the
marriage is in actual trouble, or perhaps the
effect of accuracy depends on the particular
traits involved. We have not as yet developed
an adequate theory in this regard.

Our hypothesis concerning living together 
before marriage was not supported by the 
data. This is a somewhat discouraging find- 
ing in light of the currently growing move- 
ment toward trial or premarriages. The effect 
of living together would seem to merit fur- 
ther study beyond our reported results to 
determine if there are any specific variables 
in the couple that are important to making 
this experience a helpful or hindering influ- 
ence on any potential subsequent marriage. 

An analysis of problems faced by married
couples compared to divorced couples revealed
some clear distinctions. “Ill health” was the
only significantly greater problem for the
still marrieds and seemed to be an involuntary
problem that helped draw the couple
together. Several areas represented a significantly
greater problem for the divorced group
and could be roughly classified as volitional/relational
problems. It is interesting to note
that although “drug abuse” was a significantly
greater problem for the divorced group,
“drunkenness” was in absolute terms a greater
problem for both groups than drug abuse.
Thus, even though drug abuse has recently
received much public attention, alcohol appears
to be the greater problem during the
first few years of marriage for both married
and divorced couples.


In regard to the longitudinal predictability 
of marital success, we found seven back- 
ground or trait variables that significantly 
differentiated the married females from the 
divorced females. Four background or trait 
variables significantly differentiated married 
from divorced males. Several of the back- 
ground variable differences have been found 
by other investigators. The age relationship 
was found by the U.S. Bureau of the Census 
(1973), Luckey (1966), and Landis (1956, 
1963). The educational difference was also 
noted by Murstein and Glaudin (1966), 
whereas Landis (1956, 1963) found results 
opposite to ours on education and occupa- 
tional levels, probably because differing 
methods were used to define experimental 
groups. Several other studies (e.g., Pope & 
Mueller, 1976; Renne, 1971) have reported 
results similar to ours on parents’ divorce. On 
the basis of previous research, we had ex- 
pected that the married group would have had 
fewer previous divorces than the divorced 
group, but this effect was not found in our
data. This indicates, somewhat happily, that
previously divorced individuals are not at a
special risk in future marriage.

We found three significant mean differences 
on traits between the married and divorced 
groups for the males and two for the females. 
Previous cross-sectional research (e.g., Cattell 
& Nesselroade, 1967; Pickford et al., 1966a; 
Singh et al., 1976) has found personality dif- 
ferences between married and divorced 
groups. Using the Guilford-Zimmerman Tem- 
perament Survey, Pickford et al. (1966a) 
found one significant difference between hap- 
pily married males and males who were in 
marriages that were on the verge of divorce. 
This trait was personal relations, which may 
be related to our finding that males differed 
on extraversion. No significant differences 
were found between the groups for females 
in the Pickford et al. (1966a) study. Cattell 
and Nesselroade (1967), using the 16 PF, 
found five traits (Intelligence, Dominance, 
Protension, Shrewdness, and Self-Concept 
Control) that significantly differentiated
stable from unstable marriages on husband/wife
averages. Also using the 16 PF, Singh
et al. (1976) found three traits (Premsia,
Timidity, and Ergic Tension) that significantly
differentiated males in happy marriages
from males in marital disharmony.
Our significant differences for males on extra- 
version and invulnerability seem to coincide 
with their findings on Timidity and Premsia, 
respectively. These researchers found no sig- 
nificant differences between their sample 
groups for the females. We were not able to 
replicate the finding of Reevy (1963) regard- 
ing greater previous sexual experience in per- 
sons with an unfavorable marital happiness 
prediction; such sexual experience was unre- 
lated to actual marital success or failure in 
our sample. 

There were three significant simple pre- 
dictive correlations between the composite 
dependent score and background or trait 
variables for the males. In contrast, there 
were eight significant correlations for the 
females. It could be argued that rather than 
reveal a linear trend, trait levels could relate 
to marital adjustment in some curvilinear 
fashion. For example, perhaps intermediate 
levels of a particular trait are more conducive 
to marital happiness than extremes in either 
direction. Although this is a possibility, we
have chosen to look for linear relationships, 
since experience has shown little empirical 
support for curvilinear relationships on cross- 
validation (e.g., Wiggins, 1973). 

For both mean differentiation between married
and divorced groups and simple
predictive correlations, the females showed
the larger number of statistically significant
results, indicating that the woman has greater
predictive influence on the outcome of a marriage
than does the man in the marital dyad.
This sex difference was also found longitudinally
for dating couples by Hill et al. (1976)
and cross-sectionally by Murstein and
Glaudin (1966). In our study, these trait differences
between groups cannot be attributed
to interactions between spouses. Rather, they
precede marital interaction and apparently
have some bearing on the subsequent outcome
and quality of a marriage.

In her review of the marital adjustment 
literature, Laws (1971) showed that women 
adjust better to their husbands, given the tra- 
ditional definition of roles in marital relation- 
ships. Similarly, in our study, women who 
rated their own ambition, intelligence, and
interest in art as being low also tended to be
more satisfied 4 years later, and women who 
had children from a previous marriage also 
had greater solace in their new marriage at 
follow-up. Certain personality traits and some 
background items for women, then, seem to 
be quite consistent with a traditional defini- 
tion of marital adjustment in society as it is 
today. Cross-validation in a population or 
society that has different values or role defi- 
nitions may be less than fully confirmatory. 
For instance, if traditional female roles, the 
value of heterosexual relationships, and defi- 
nitions of marital happiness change or become 
altered in some manner—as the feminist 
movement might desire—our findings on spe- 
cific variables may not be replicated within 
this new cultural framework. It remains possible,
however, that our general findings concerning
homogamy, self-perception, problem
areas, and so forth, will not be severely af- 
fected by a modification in society. Further- 
more, certain results (such as on extraversion 
and invulnerability, which showed signifi- 
cantly different associations with marital ad- 
justment by sex) indicate that nontradition- 
ality may yield greater marital adjustment. 
Thus, the concept of sex role traditionality 
(e.g., Ellis & Bentler, 1973) cannot explain 
all the major results.

In choosing to look at personality trait
variables for longitudinal prediction of the
quality of marital success or failure, we had
made the unproven—albeit reasonable—assumption
that these types of variables will be
more powerful predictors than simple background
or demographic characteristics. To
test this idea, we chose the two best predicting
traits for the males and then the females.
We then regressed the composite dependent
variable on these four traits. This equation
had an R of .54, F(4, 63) = 6.58, p < .001.
We then chose the two best predicting background
variables for the males and then the
females. We entered these four variables into
a prediction equation and obtained a multiple
correlation of .42, F(4, 63) = 3.39, p < .025.
The first equation accounted for 29% of the
criterion variance, and the latter equation accounted
for 18% of the variance. It seems
clear when comparing these two equations
that the traits have a much greater longitudinal
predictive effect than do the background/demographic
variables.
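
The link between a multiple correlation, the proportion of variance it accounts for, and its F ratio can be sketched as follows. This is a minimal illustration with a hypothetical function name. Because the reported R values are rounded to two decimals, the recomputed F ratios agree with the published ones only approximately.

```python
def variance_and_f(multiple_r, n_predictors, df_residual):
    """Return (proportion of variance, F ratio) implied by a multiple correlation."""
    r2 = multiple_r ** 2
    f_ratio = (r2 / n_predictors) / ((1.0 - r2) / df_residual)
    return r2, f_ratio


if __name__ == "__main__":
    for label, r in (("traits", 0.54), ("background", 0.42)):
        r2, f_ratio = variance_and_f(r, 4, 63)
        print(label, f"{100 * r2:.0f}% of variance", f"F(4, 63) = {f_ratio:.2f}")
```

Squaring the two multiple correlations gives the 29% and 18% figures quoted in the text.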

Previous prediction equation researchers
(Karlsson, 1951; Kelly, 1939; Locke & Karlsson,
1952; Terman & Oden, 1947) have reported
multiple correlations ranging from .32 [...]
using personality and/or background variables.
Our equations seem to be a fair improvement
over these, with multiple correlations
in the .75 range. Unfortunately, our
sample size was too small to cross-validate
these equations, and thus this task must be
left for later research. It should be noted that,
as a rule of thumb, for beta weights to be
reliable estimates of the population, there
should be about 30 cases for each predictor
variable. Since our sample size does not permit
this, our betas are liable to fluctuate to
some extent on replication. To assess whether
the optimal beta weights indeed represent
some important feature of the data rather
than accidental noise, we calculated the
unbiased estimator of the population squared
multiple correlation. This indicated that
about 8%-10% of the predictive variance
was due to overfitting in the sample. In view
of our use of stepwise regression, which
capitalizes on chance to some extent, we
would expect the population R of about [...]
to be an upper-bound value rather than an
unbiased estimator that equally overshoots
and undershoots the true value in its estimates.
A completely different coefficient is
given by the sample cross-validity correlation,
which estimates the effect of using the
sample regression equation in the population.
Using the conservative Darlington (1968)
coefficient, we obtained predicted correlations
of .59, .58, and .61, somewhat below the
unbiased estimates for Tables 6, 7, and 8 but
still substantial in magnitude. We also used
a novel regression method from Bentler and
Woodward (Note 3), which determines the
optimal prediction possible by using signed,
equal weights rather than beta weights. This
procedure found that the only significant
contribution to prediction was being made by
the signed, equal weights, and that any further
contribution due to differential weighting
simply represented statistical noise.
Consequently, we would urge the use of signed,
equal weights rather than beta weights in future
replications, since these signs appear to carry
all the reliable predictive information. With
such weights, one obtains R = .70, still a substantial
improvement over previous research.
Of course, replication of the signs in new
samples is a clear task for future research.

Our emphasis on predicting marital suc- 
cess or failure from factors assessed at the 
beginning of a marriage presupposes that 
marriage may somehow alter the personality 
traits of each partner. Further study needs 
to be done regarding how such traits are 
changed, if in fact they are. Does a success- 
ful marriage alter one’s personality traits in 
a given direction—for example, in a “regres- 
sion effect” toward the mean of both part- 
ners—and does the change depend on the 
success of the marriage? Are traits affected 
differentially by various marital experiences? 

The interaction of two people in that 
uniquely intimate and intensive situation 
called marriage is an immensely complicated 
process that draws on many factors of each 
member’s background and personality, as 
well as more situational characteristics. A 
marriage is the result of what each partner 
brings to the marriage and what they make of 
it, once together. We have focused on what 
each partner has brought to the marriage in 
order to understand the variations that seem 
to limit the best that individuals can subse- 
quently make of their marriage. 


Reference Notes 


1. Holz, R. F. Homogamy and heterogamy in the 
marital dyad: The effects of role on need disposition.
Paper presented at the meeting of the
eee Sociological Association, San Francisco, 
9. 
. Bentler, P. M. On the multimethod existence of 
29 personality dimensions: Data for the situation 
versus trait controversy. Invited research address 
Presented at the meeting of the Western Psycho- 
logical Association, Portland, Oregon, April 1972. 
3. Bentler, P. M., & Woodward, J. A. Regression on 
linear composites: Statistical theory and applica- 
tion. Manuscript submitted for publication, 1977. 

4. Woodward, J. A., & Bentler, P. M. Unit-weight 
composites with maximum variance. Paper pre- 
sented at the meeting of the Psychometric Society, 
Chapel Hill, North Carolina, June 1977. 


References 


Barton, K., & Cattell, R. B. Real and perceived 
similarities in personality between spouses: Test 
of “likeness” versus “completeness” theories. Psy- 
chological Reports, 1972, 31, 15-18. 

Bentler, P. M. Heterosexual behavior assessment—I. 
Males. Behaviour Research and Therapy, 1968, 6, 
21-25. (a) 

Bentler, P. M. Heterosexual behavior assessment—II. 
Females. Behaviour Research and Therapy, 1968, 
6, 27-30. (b) 

Burgess, E. W., & Wallin, P. Engagement and mar- 
riage. Philadelphia: Lippincott, 1953. 

Cattell, R. B., & Nesselroade, J. B. Likeness and 
completeness theories examined by sixteen per- 
sonality factor measures on stably and unstably 
married couples. Journal of Personality and Social 
Psychology, 1967, 7, 351-361. 

Comrey, A. L., Backer, T. E., & Glaser, E. M. A 
sourcebook for mental health measures. Los 
Angeles: Human Interaction Research Institute, 
1973. 

Darlington, R. B. Multiple regression in psychologi- 
cal research and practice. Psychological Bulletin, 
1968, 69, 161-182. 

DeYoung, G. E., & Fleischer, B. Motivational and 
personality trait relationships in mate selection. 
Behavior Genetics, 1976, 6, 1-6. 

Ellis, L., & Bentler, P. M. Traditional sex-determined 
role standards and sex-stereotypes. Journal of Per- 
sonality and Social Psychology, 1973, 4, 75-88. 

Hill, C. T., Rubin, Z., & Peplau, A. Breakups before 
marriage: The end of 103 affairs. Journal of Social 
Issues, 1976, 32, 147-168. 

Karlsson, G. Adaptability and communication in mar- 
riage: A Swedish predictive study of marital satis- 
faction. Uppsala, Sweden: Almqvist & Wiksells, 
1951. 

Kelly, E. L. Concerning the validity of Terman’s 
weights for predicting marital happiness. Psycho- 
logical Bulletin, 1939, 36, 202-203. 

Kernodle, W. Some implications of the homogamy- 
complementary needs theories of mate selection 
for sociological research. Social Forces, 1969, 38, 
145-152. 

Landis, J. T. The pattern of divorce in three genera- 
tions. Social Forces, 1956, 34, 213-216. 

Landis, J. T. Social correlates of divorce or non- 
divorce among the unhappily married. Marriage 
and Family Living, 1963, 25, 178-180. 

Laws, J. A feminist review of marital adjustment 
literature: The rape of the Locke. Journal of Mar- 
riage and the Family, 1971, 33, 483-516. 

Locke, H. J. Predicting adjustment in marriage: A 
comparison of a divorced and a happily married 
group. New York: Holt, 1951. 

Locke, H. J., & Karlsson, G. Marital adjustment and 
prediction in Sweden. American Sociological Re- 
view, 1952, 17, 10-17. 

Locke, H. J., & Wallace, K. M. Short marital ad- 
justment and prediction tests: Their reliability and 
validity. Marriage and Family Living, 1959, 21, 
251-255. 

Luckey, E. B. Number of years married as related to 
personality perceptions and marital satisfaction. 
Journal of Marriage and the Family, 1966, 28, 
44-48. 


Meyer, J. P., & Pepper, S. Need compatibility and 
marital adjustment in young married couples. 
Journal of Personality and Social Psychology, 
1977, 35, 331-342. 

Murstein, B. J. The complementary need hypothesis 
in newlyweds and middle-aged couples. Journal of 
Abnormal and Social Psychology, 1961, 63, 194-197. 

Murstein, B. J. Empirical tests of role, complemen- 
tary needs and homogamy theories of marital 
choice. Journal of Marriage and the Family, 1967, 
29, 689-696. 

Murstein, B. J., & Beck, G. D. Person perception, 
marriage adjustment and social desirability. Journal 
of Consulting and Clinical Psychology, 1972, 39, 
396-403. 

Murstein, B. J., & Glaudin, V. The relationship of 
marital adjustment to personality: A factor analy- 
sis of the Interpersonal Checklist. Journal of Mar- 
riage and the Family, 1966, 28, 37-43. 

Olkin, I., & Pratt, J. W. Unbiased estimation of cer- 
tain correlation coefficients. Annals of Mathemati- 
cal Statistics, 1958, 29, 201-211. 

Pickford, J. H., Signori, E. J., & Rempel, H. The 
intensity of personality traits in relation to marital 
happiness. Journal of Marriage and the Family, 
1966, 28, 458-459. (a) 

Pickford, J. H., Signori, E. J., & Rempel, H. Similar 
or related personality traits as a factor in marital 
happiness. Journal of Marriage and the Family, 
1966, 28, 190-192. (b) 

Pope, H., & Mueller, C. W. The intergenerational 
transmission of marital instability: Comparisons by 
race and sex. Journal of Social Issues, 1976, 32, 
49-66. 

Reevy, W. R. Vestured genital apposition and coitus. 
In H. G. Beigel (Ed.), Advances in sex research, 
Harper & Row, 1963. 


P. M. BENTLER AND MICHAEL D. NEWCOMB 


Renne, K. S. Health and marital experience in an 
urban population. Journal of Marriage and the 
Family, 1971, 33, 338. 

Rosow, I. Issues in the concept of need comple- 
mentarity. Sociometry, 1957, 20, 216-233. 

Singh, S. B., Nigam, A., & Saxena, N. K. 16 PF: 
Study in the cases of marital disharmony. Indian 
Journal of Clinical Psychology, 1976, 3, 47-52. 

Terman, L. M., & Oden, M. H. The gifted child 
grows up: Twenty-five years’ follow-up of a su- 
perior group. Stanford, Calif.: Stanford University 
Press, 1947. 

Tharp, R. J. Psychological patterning in marriage. 
Psychological Bulletin, 1963, 60, 97-117. 

Udry, J. R. Personality match and interpersonal 
perception as predictors of marriage. Journal of 
Marriage and the Family, 1967, 29, 722-725. 

U.S. Bureau of the Census. 1970 Census of Popula- 
tion. Age at first marriage (Final Rep. PC(2)-4D). 
Washington, D.C.: U.S. Government Printing Of- 
fice, 1973. 

Weigel, R. G., Weigel, V. M., & Richardson, F. C. 
Congruence of spouses’ personal constructs and 
reported marital success: Pitfalls in instrumenta- 
tion. Psychological Reports, 1973, 33, 212-214. 

Wiggins, J. S. Personality and prediction: Principles 
of personality assessment. Reading, Mass.: Addi- 
son-Wesley, 1973. 

Winch, R. F. Another look at the theory of com- 
plementary needs in mate selection. Journal of 
Marriage and the Family, 1967, 29, 756-767. 

Woodward, J. A., & Bentler, P. M. A statistical 
lower-bound to population reliability. Psychologi- 
cal Bulletin, in press. 


Received August 9, 1977 


Journal of Consulting and Clinical Psychology 
1978, Vol. 46, No. 5, 1071-1078 

The relationship between scores on the Minnesota Multiphasic Personality In- 
ventory and both concurrent and prior aggression was examined for a sample 
of 426 19-year-olds from the general population. Aggression was measured 
through peer nominations obtained concurrently and 10 years earlier. Correla- 
tion and regression analysis indicated that the sum of T scores for Scales F, 4, 
and 9 was a valid measure of aggression. The composite was also shown to have 
a higher reliability than its component scales. Using an additional 283 subjects 
from delinquent populations, it was demonstrated that the composite was an 
excellent discriminator between delinquent and general populations of males 
and females even when intelligence and social status were controlled. 


Elevation of the T scores on both Scales 4 
and 9 of the Minnesota Multiphasic Person- 
ality Inventory (MMPI) has been thought to 
form the profile characteristic of the male 
juvenile delinquent (Dahlstrom & Welsh, 
1960; Dahlstrom, Welsh, & Dahlstrom, 1972; 
Hathaway & Monachesi, 1953). Scale 4 
(Psychopathic Deviate) by itself has been 
used to measure levels of social deviance or 
antisocial behavior (Elion & Megargee, 1975; 
Hathaway & Monachesi, 1953; Megargee & 
Mendelsohn, 1962). Dahlstrom et al. (1972) 
noted that marked elevations on Scale 4 can be 
observed in prison groups. These authors also 
noted that Scale 9 (Hypomania) appears to 
energize the pattern related to Scale 4. Scale 9 
was viewed by Hathaway and Monachesi 
(1953) as an exciter that in combination with 
Scale 4 produces rebellious and excitable 
behavior in high-delinquent children. A later 
analysis of their data (Monachesi & Hathaway, 
1969) showed that the highest rates of delin- 
quency for both boys and girls were associated 
with deviance in MMPI Scales 4, 8, and 9. 

In a study of juvenile delinquents who 
succeeded or failed in adjustment to institu- 
tionalization, Lefkowitz (1966) found that the 
mean score on Scale 9 was significantly higher 
for the failures. Similarly, in a follow-up study 
of paroled prisoners whose postrelease be- 
havior was classified as acceptable or un- 
acceptable, the MMPI 49 code type was 
heavily represented among the unacceptable 
group (Jacobson & Wirt, 1969). Butcher 
(1965) found that highly aggressive boys 
(based on peer nominations) had significant 
elevations on Scales 4 and 9. These boys 
responded in a rebellious and excitable manner 
in interpersonal situations, whereas the low- 
aggressive boys tended to internalize their 
conflicts, which were then manifested in 
hypochondriacal symptoms and withdrawal. 

There has been a proliferation of attempts 
to validate not only certain of the clinical 
scales as aggression measures (see, e.g., 
Megargee & Mendelsohn, 1962; Shipman, 
1965) but also a number of special scales 
developed from the MMPI. At this writing, 
as many as 25 scales of the MMPI have been 
proposed as measures of hostility or aggression 
(Deiker, 1974). Thus, the present study is an 
attempt to simplify this area of measurement 
by providing a reliable and valid single score 
of aggressive behavior obtained by summing 
the T scores of a few common clinical scales. 
The virtue of a single score over a profile of 
scale scores is that the former is readily usable 
as a continuous variable, whereas the latter 
is a dichotomous variable and therefore in the 
main limited to frequency analysis. 

A two-step procedure was followed in this 
study. First, using a sample of subjects from 
a general population, we attempted to identify 
a valid and reliable composite of T scores for 
measuring aggression. Then using a sample 
of subjects from a delinquent population, we 
attempted to validate the composite as a 
discriminator of delinquent adolescents. 


Study 1 
Method 


Subjects. The subjects were part of a larger longi- 
tudinal research project on aggressive behavior re- 
ported elsewhere (Eron, Huesmann, Lefkowitz, & 
Walder, 1972; Lefkowitz, Eron, Walder, & Huesmann, 
1977). In the first wave, data were gathered during 
1959-1960 from the entire 3rd-grade population of 875 
boys and girls residing in a semirural county in New 
York State. Ten years later in the second wave, termed 
the ‘13th grade,” data were collected from 426 of these 
subjects (211 boys and 215 girls who could be located 
at that time; Lefkowitz et al., 1977). In the 3rd and 
13th grades, the modal ages of this sample of 426 were 
8 and 19 years, respectively. The mean IQ of this 
sample in the 3rd grade was 107.10 ± 13.66. Based on 
fathers’ occupation, the sample can be described as 
predominantly middle class. 

Procedure. In the 3rd and 13th grades, aggression 
scores were obtained by a peer nomination questionnaire 
composed of items describing aggressive behavior. For 
example, each subject was asked to name anyone in the 
classroom “Who starts a fight over nothing?” The peer 
nomination technique also included items intended to 
measure aggression avoidance; for example, “Who will 
never fight even when picked on?” The reliability 
(r > .85) and validity of this aggression measure and 
the procedures for its administration have been dis- 
cussed elsewhere (Butcher, 1965; Eron, Walder, & 
Lefkowitz, 1971; Walder, Abelson, Eron, Banta, & 
Laulicht, 1961). In the 3rd grade the peer nomination 
technique, including 10 aggression items, was adminis- 
tered to classroom groups. In the 13th grade the 
technique, using 9 of the 10 aggression items from the 
3rd grade, was administered individually to subjects 
as part of a larger 2-hour interview procedure. At 
the time of the 3rd grade interview, IQ (Sullivan, Clark, 
& Tiegs, 1957) and father’s occupational status (U.S. 
Bureau of Census, 1960) were recorded for most 
subjects. During the 13th grade interview, subjects 
also completed self-report inventories on aggressive 
behavior and were administered the MMPI. The 
K-corrected T scores on the MMPI were used as the 
potential predictors of aggression in this study. The 
self-report measure of aggression, termed total aggressive 
environment, consisted of 43 items divided into 
five subscales: (a) respondent as a victim of aggression, 
(b) respondent as a witness of aggression, (c) re- 
spondent’s aggressive habits, (d) respondent's antisocial 
behavior, and (e) respondent’s aggressive feelings. The 
exact composition of this scale and the procedures used 
to administer all of the 13th grade measures have been 
reported elsewhere (Lefkowitz et al., 1977). 


Results 


The intercorrelations between the various 
measures of aggression are shown in Table 1 for 
all 426 subjects. Since 13th grade peer-rated 
aggression (Peer Agg 13) correlated well with 
all the other measures and was measured at 
the same time that the MMPI was administered, 
it was decided to use it as the primary 
criterion measure. Since sex was significantly 
related to aggression, all further analyses were 
done separately for males and females. 


Table 1 
Correlations Between the Criterion Measures of Aggression for All 426 Subjects 

Measure                    1        2        3        4        5 
1. Peer Agg 13             -- 
2. Peer Avoid Agg 13     -.364      -- 
3. Self Rep Agg 13        S20.    -.357      -- 
4. Peer Agg 3             .420    -.284     .200      -- 
5. Peer Avoid Agg 3      -.236     .298    -.233    -.381      -- 
6. Sex(a)                 .346    -.080     .390     -199    -.139 

Note. fw = 15. Peer = peer nominated; [...]; Agg = aggression; Avoid = avoidance of; 
Rep = reported; 3 refers to Grade 3; 13 refers to Grade 13. 
a Female = 0; male = 1. 


MMPI SCALES F, 4, AND 9 


Table 2 
Correlations and Significant Multiple Regression Coefficients for Predicting 
Concurrent Aggression from Minnesota Multiphasic Personality Inventory (MMPI) Scales 

                        Concurrent peer-rated aggression 
                  Males(a)                      Females(b) 
MMPI scale     r           Raw regression    r           Raw regression 
                           coefficient                   coefficient 
L              .064                          .087 
F              1322444     1.675*            .260***     1,30201% 
K             -.134                         -.056 
1              .052                          .049 
2              .077                          -005 
3              .001                          .090 
4              Roby bade   2.469***          ON bon!     10737 
5              .129       -1.799**           .182**      .561* 
6              .180**                        S121 
7              .131                         -.017 
8              PEA eo All 
9              Ae EP hee   1.351*            .146* 
10            -.132                         -.150*      -.674* 

a For males, n = 211, R = .471***. 
b For females, n = 215, R = .478***. 
*p < .05. **p < .01. ***p < .001. 

Table 2 shows the correlations between Peer 
Agg 13 and each of the MMPI scales as well 
as the results of a multiple regression analysis 
predicting Peer Agg 13 from the MMPI 
scales. For males, as expected, Scales 4 and 9 
were highly significant correlates of aggression, 
but Scale F was also highly correlated. Scale 8 
correlated less highly though significantly. For 
females, the two best univariate predictors 
were Scale 4 and the F scale. Scales 5, 9, and 
10 also were significantly correlated with 
aggressiveness but not as highly. Since the T 
scores for Scales 4, 7, 8, and 9 were K corrected 
and since K scores were slightly negatively 
correlated with aggressiveness, the observed 
correlations between aggressiveness and 4, 7, 8, 
and 9 may have been slightly reduced by the 
K correction. However, the reduction could 
not be significant considering the small size of 
the correlation between K and Peer Agg 13. 

The multiple regression analysis shows the 


best weighted composite of the T scores for 
measuring aggression.¹ As expected, the regres- 
sion coefficients reveal that Scale 4 was the 
most important predictor of aggression across 
sexes. Surprisingly, however, Scale F was as 
important a predictor for females and equal 
with Scales 5 and 9 in predicting aggression in 
males. Scale 5 relates negatively to aggression 
in males and positively in females, indicating 
the relation between aggression and sex role 
behaviors. Scale 9 made an important con- 
tribution for males but not for females, for 
whom Scale 10 appeared as an inverse 
predictor. 


¹ The raw regression coefficients are presented 
because the objective is to develop a predictive com- 
posite. Even though the intercorrelations between the 
MMPI scales are mostly significant, they only range 
from -.32 to .52 for the significant scales in the equa- 
tion, so multicollinearity should not be a serious problem. 


Discussion 


From these results it appears that a composite 
to be used as a general measure of aggression 
across sexes without co[...] by sexual 
stereotyping should be [...] of Scales F and 4 
and either Scale 9 or Scale 10. Of course, a 
weighted linear composite involving every scale 
would have the highest correlation with 
aggression. However, a measure consisting of the 
sum of only two [...] scales has the advantage 
of simplicity [...] of administration. 
Furthermore, as [...] (1976) and others have 
argued, the simple sum of significant predictor 
variables is [...] almost as good a predictor in 
the population as the entire multiple regression 
equation. Scale 5, although a significant 
predictor, is not a desirable variable to 
include in a [...] measure of aggression, 
because it represents that component of 
aggression associated with sexual stereotyping. 
Since it was included in the regression 
equation, however, the effects of sexual 
stereotyping have been [...] controlled in 
determining the other significant predictors. 

On a priori grounds, Scale 9 would seem to be a 
more appropriate element to include in a measure 
of aggression than Scale 10. It is intended to 
measure a propensity for mi[...] behavior, so it 
certainly has face validity as a component of 
aggression. Furthermore, as reported in the 
Introduction, Scale 9 has been found to be 
significantly higher in [...] high-aggressive, 
albeit male, populations. On the other hand, 
Scale 10 has to our knowledge never been 
proposed as a discriminator of delinquents. In 
addition, the utility of Scale 10 in this study 
could possibly be an artifact of the peer 
nomination procedure. A [...] scoring high on 
Scale 10 is scoring high on social introversion. 
Therefore, she may be less likely to be 
nominated by her peers on any scale, resulting 
in negative correlations. The available data are 
consistent with this interpretation. Even though 
Scale 10 correlated negatively with peer-rated 
aggression for females (r = -.15, p < .03), it 
did not correlate significantly with self-rated 
aggression (r = .04), though one would [...] 
expect a higher correlation between [...] 
self-ratings. Also, peer-rated popularity, which 
correlated negatively with peer-rated aggression 
(r = -.28, p < .001), had exactly the same 
correlation with Scale 10 (r = -[...]) as did 
peer-rated aggression. Thus, it seems Scale 10 
is measuring a component of peer rating more 
than aggression. Scale 9, on the other hand, 
correlated more significantly with self-rated 
aggression for girls (r = .34, p < .001) than 
with peer-rated aggression (r = .15, p < .03). 
Therefore, a composite measure of aggression was 
constructed consisting of the sum of Scales F, 
4, and 9. 

[Table 3: correlations of F + 4 + 9, F + 4, and the individual scales with each measure of aggression, by sex; the table body is not recoverable from the scan.] 

Table 3 displays the correlations between 
this composite measure of aggression and each 
of the other variables measuring aggression. 
It also contains the correlations between the 
aggression measures and the individual scales. 
One can see that for males the sum of Scales F, 
4, and 9 is a valid measure of concurrent 
aggression and is significantly related to the 
subject’s aggression 10 years earlier. The 
composite correlated more strongly with every 
measure than any of its components or F + 4. 
For females, the composite also correlated 
significantly with every measure of concurrent 
aggression, though less strongly than for 
males. Scale 4 alone and F + 4 are just as 
strongly or slightly more strongly related to 
peer ratings than the full composite, but the 
full composite did better on the other measures. 
The composite also significantly correlated 
with aggression 10 years earlier. 
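
As a minimal sketch of how the composite described above is scored, the following Python fragment sums the T scores for Scales F, 4, and 9. The function name and the example T scores are hypothetical and serve only as illustration; the source states only that the measure is the sum of the three T scores (K-corrected where applicable).

```python
from typing import Mapping


def f49_composite(t_scores: Mapping[str, float]) -> float:
    """Return the sum of the T scores for MMPI Scales F, 4, and 9.

    `t_scores` maps scale labels ("F", "4", "9", ...) to T scores.
    Other scales may be present and are ignored.
    """
    return sum(t_scores[scale] for scale in ("F", "4", "9"))


# Hypothetical respondent, not taken from the study's data.
example = {"F": 55.0, "4": 62.0, "9": 58.0, "10": 47.0}
print(f49_composite(example))
```

Because the result is a single continuous score, it can be entered directly into correlations or analyses of variance, which is the advantage over profile codes noted in the introduction.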


Reliability 


The reliability of any linear combination 
of variables can be computed from its variance 
and the variances and reliabilities of its 
components. For the current sample, however, 
MMPI item scores were not available to us, 
so reliability data from comparable samples 
had to be used. The largest study of reliability 
in a college-age population of normals appears 
to have been conducted by Mauger in 1972 
(cited in Dahlstrom et al., 1972). Test-retest 
correlations were computed for 490 subjects 
over an 8-month lag. The average correlations 
of males and females were .56 on Scale F, .57 
on Scale 4, and .62 on Scale 9. These relatively 
long-term stability coefficients can be viewed 
as lower bounds on internal consistency 
reliabilities (i.e., coefficient alpha). Dahlstrom 
et al. (1972) did not report any studies in- 
volving a substantial number of normal 
college-age subjects for which coefficient alpha 
was calculated, so these stability coefficients 
will have to serve as our conservative esti- 
mates of reliabilities. We can now show that 
even assuming such conservative estimates, the 
reliability of F + 4 + 9 is acceptable. 


Table 4 
Means for F + 4 + 9 as a Function of Sex and Delinquency 

Population     Males    Females    Total 
General        183.3    174.6      178.9 
  n            210      215 
Delinquent     217.4    237.7      227.2 
  n            147      136 
Total          197.3    199.0      198.2 

From Nunnally (1967, p. 229), we can write: 


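Let 

$$y = \mathrm{MMPI}_F + \mathrm{MMPI}_4 + \mathrm{MMPI}_9 ,$$

$$r_{yy} = 1 - \frac{(\sigma_F^2 + \sigma_4^2 + \sigma_9^2) - r_{FF}\sigma_F^2 - r_{44}\sigma_4^2 - r_{99}\sigma_9^2}{\sigma_y^2} .$$

Using our sample of 427 subjects to estimate 
the standard deviations gives 

$$\sigma_y = 26.52, \quad \sigma_F = 10.82, \quad \sigma_4 = 11.15, \quad \sigma_9 = 11.50,$$

and using Mauger’s stability coefficients as 
conservative estimates of reliability gives 

$$r_{FF} = .56, \quad r_{44} = .57, \quad \text{and} \quad r_{99} = .62 .$$

Therefore, 

$$r_{yy} = 1 - \frac{155.23}{703.31} = .78 .$$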


Thus the reliability of F + 4+ 9 is sub- 
stantially higher than the reliabilities of its 
components and sufficient for its use as a 
measure of aggression. Furthermore, using 
less conservative estimates of the reliabilities 
of the scales based on 1-week stabilities 
(Dahlstrom et al., 1972), the reliability of 
F + 4 + 9 is almost .87. 
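
The arithmetic in this section can be checked with a short sketch. It assumes the standard expression for the reliability of a sum of components (one minus the summed error variances divided by the variance of the sum) and uses only the standard deviations and stability coefficients quoted above; the function and variable names are illustrative.

```python
def composite_reliability(sds, reliabilities, sd_composite):
    """Reliability of a sum of components.

    The error variance of each component is (1 - r_ii) * sd_i**2.
    The reliability of the sum is one minus the total error variance
    divided by the variance of the composite.
    """
    error_variance = sum((1.0 - r) * sd ** 2 for sd, r in zip(sds, reliabilities))
    return 1.0 - error_variance / sd_composite ** 2


# Values quoted in the text for Scales F, 4, and 9 (in that order).
sds = [10.82, 11.15, 11.50]
stabilities = [0.56, 0.57, 0.62]
r_yy = composite_reliability(sds, stabilities, sd_composite=26.52)
print(f"{r_yy:.2f}")
```

The composite's variance exceeds the sum of the component variances because the scales are positively intercorrelated, which is why the sum is more reliable than any single component.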


Study 2 


To further validate F + 4 + 9 as a measure 
of aggression, a second analysis was undertaken 
to compare a sample from a known population 
of delinquents with the previously studied 
sample from a normal population. 
Table 5 
Analysis of Covariance for F + 4 + 9 as a Function of Sex and Delinquency 

Source                  df    SS            MS            F 
IQ                       1    157,298.44    157,298.44    218.27* 
Father’s occupation      1     13,772.81     13,772.81     19.11* 
Effects 
  Sex (A)                1     10,496.69     10,496.69     14.57* 
  Population (B)(a)      1    153,635.63    153,635.63    213.19* 
  A × B                  1     17,628.37     17,628.37     24.46* 
Residual               621    447,522.25        720.65 
Total                  626    800,354.19 

Note. The total sample size for the analysis of covariance was only 627 because IQ or father’s occupation was missing for 81 subjects. 
a Delinquent versus general. 
*p < .001. 
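
The F values in Table 5 follow from dividing each mean square by the residual mean square. The sketch below recomputes them from the sums of squares and degrees of freedom printed in the table; the dictionary keys and function name are illustrative labels rather than part of the original analysis.

```python
def f_ratio(ms_effect, ss_residual, df_residual):
    """F ratio for an effect: its mean square over the residual mean square."""
    return ms_effect / (ss_residual / df_residual)


SS_RESIDUAL = 447_522.25
DF_RESIDUAL = 621

# Mean squares as printed in Table 5 (each source has 1 df, so MS = SS).
mean_squares = {
    "IQ": 157_298.44,
    "Father's occupation": 13_772.81,
    "Sex (A)": 10_496.69,
    "Population (B)": 153_635.63,
    "A x B": 17_628.37,
}

f_values = {
    source: f_ratio(ms, SS_RESIDUAL, DF_RESIDUAL)
    for source, ms in mean_squares.items()
}
for source, f in f_values.items():
    print(f"{source}: F(1, {DF_RESIDUAL}) = {f:.2f}")
```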


Method 


Subjects. The delinquency sample consisted of 136 
females institutionalized at a facility of the New York 
State Division for Youth and 147 males from a privately 
operated institution for delinquent boys in New York.² 
All subjects had been sent to the institutions by family 
and children’s courts. The subjects ranged in age from 
12.9 to 17.1, with a median of 14.9. The mean IQ for 
girls was only 77.3 but was 100.3 for boys. The entire 
population of the institutions minus a few subjects for 
whom usable data could not be obtained constituted 
the sample. The offenses committed by the delinquents 
ran the gamut from incorrigibility, shoplifting, petty 
larceny, and prostitution to car theft, breaking and 
entering, and assault. 

Procedure. Form R of the MMPI was group 
administered to 3-5 subjects at a time. The 399 items 
were read aloud twice on a tape while the subjects, 
closely monitored, read along silently in their booklets. 
The IQs and fathers’ occupations were obtained from 
the subjects’ case histories. 


Results 


The means and analysis of covariance shown 
in Tables 4 and 5 indicate that F + 4 + 9 
is an excellent discriminator between popu- 
lations known to vary in their levels of aggres- 
sion. The analysis of covariance was performed 
in a hierarchical manner, so the F value for 
delinquent versus nondelinquent populations, 
F(1, 621) = 2441, p < .001, reflects the 
discriminative strength of F + 4 + 9 after the 
effects of IQ, father’s occupation, and sex have 
been partialed out. Since the standard devia- 
tion of the composite was about 25, the table 
of means indicates that delinquent boys 
scored about 1⅓ standard deviations higher 
than normals, and delinquent girls scored 
about 2½ standard deviations higher. This 
greater difference for females is reflected in the 
significant Sex × Population interaction, F(1, 
621) = 25.83, p < .001. Although males scored 
higher than females on F + 4 + 9 in the 
normal population, t(423) = 3.42, p < .001, 
females scored higher in the delinquent 
population, t(281) = 5.98, p < .001. 
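
The standardized differences cited above can be reproduced from the Table 4 means. This sketch assumes, as the text does, a composite standard deviation of about 25; the function and variable names are illustrative.

```python
COMPOSITE_SD = 25.0  # approximate SD of F + 4 + 9 stated in the text

# Group means from Table 4.
means = {
    "males": {"general": 183.3, "delinquent": 217.4},
    "females": {"general": 174.6, "delinquent": 237.7},
}


def standardized_difference(delinquent_mean, general_mean, sd=COMPOSITE_SD):
    """Mean difference between groups expressed in standard deviation units."""
    return (delinquent_mean - general_mean) / sd


gaps = {
    sex: standardized_difference(m["delinquent"], m["general"])
    for sex, m in means.items()
}
for sex, gap in gaps.items():
    print(f"{sex}: {gap:.2f} SD")
```

The values come out near 1.36 for males and 2.52 for females, in line with the rounded figures of about one and a third and two and a half standard deviations.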


Discussion 


These results provide construct validity for 
the sum of MMPI Scales F, 4, and 9 as a 
measure of aggression. The sum of these scales 
was an excellent discriminator between de- 
linquents high in aggression and youth of a 
normal population even when IQ and social 
class were controlled. Furthermore, as one 
would predict for a measure of aggression, 
nondelinquent males scored significantly higher 
on F + 4 + 9 than nondelinquent females. 
Within the delinquent group, however, females, 
contrary to the hypothesis, scored significantly 
higher than males. Speculatively, this un- 
predicted result of higher scores for delinquent 
females than delinquent males on this MMPI 
measure may be due to an artifact of selection. 
Traditionally, females have encountered 
fewer difficulties with law-enforcing agencies 
and/or have been treated more leniently by 
these agencies than males, particularly juve- 
niles (Monachesi & Hathaway, 1969). Thus, 
in order for a female to be considered de- 
linquent and institutionalized, the level of 
antisocial behavior may have to be excessive 
as compared to males. Consequently, the 
present group of institutionalized delinquent 
females may be unrepresentative of delinquent 
females generally, in that their level of aggres- 
siveness may be inordinately high. The 
unusually low mean IQ for the female de- 
linquents supports this interpretation. If this 
post hoc hypothesis is valid, this MMPI 
measure should be able to discriminate among 
degrees of delinquency in a population of 
delinquent females. 


² The authors wish to express their thanks for the 
cooperation of the staffs of the Hudson School for 
Girls and the Berkshire Farm Institute for Training 
and Research. 

Although the normals in the present study 
were approximately 5 years older than the 
delinquents, it is unlikely that this age differ- 
ence affected the results. Hathaway and 
Monachesi (1963) provided MMPI norms for 
4,944 boys and 5,207 girls in the ninth grade 
whose average ages, respectively, were 15.1 
and 14.9 years. The sum of the T scores on 
Scales F, 4, and 9 were 176.3 for boys and 
170.4 for girls. These values are markedly 
below those of delinquents of the same age 
(as shown in Table 4) but approximately the 
same as those of the normal subjects in the 
present study. 


Summary 


The results of the present study indicate 
that the sum of the T scores on MMPI Scales 
F, 4, and 9 serves as a reliable and valid 
unitary measure of aggression. Highly signifi- 
cant correlations were obtained between F + 4 + 9 
and both peer ratings and self-ratings of aggres- 
sion. Added strength is lent to the validity 
by the fact that scores on the measure related 
back 10 years in time to nominations on 
aggressive behavior that the subjects received 
from their peers in a classroom setting. More- 
over, as evidenced by the multiple regression 
analysis, the composite is valid for both males 
and females and is not simply a sex-typing 
measure. In general, the sum of these scales 
appeared more valid than any of its com- 
ponents. Other support for the validity of this 
MMPI measure was derived from its ability 
to distinguish between delinquent and non- 
delinquent populations known to differ in 
aggressiveness. Finally, F + 4 + 9 was shown 
to have a reliability greater than that of any 
of its components. 

Although the intent was to validate this 
measure on a noninstitutionalized population 
of normal subjects, the fact that the measure 
discriminates so well between institutionalized 
and noninstitutionalized populations suggests 
that it may also be useful in criminal justice 
settings in which the MMPI is so widely used. 
The ease with which this measure is obtained 
from the MMPI makes it a potentially useful 
tool for screening and perhaps program place- 
ment in such settings. Also, the measure 
meets the demand for a valid paper-and-pencil 
measure of aggressive behavior for research 
with normal subjects. 
A Longitudinal Study of Coping Styles 
in Self-Defining and Socially Defined Women 


Abigail J. Stewart 
Boston University 


In a longitudinal study of 51 female college graduates, self-definition as mea- 
sured by freshman-year Thematic Apperception Tests predicted several aspects 
of problem-solving and coping behavior 14 years later. Those women who 
viewed themselves, their world, and their own personal problems in ways that 
facilitated effective coping scored higher in self-definition. The women who took 
instrumental actions to solve their problems, rather than taking noninstrumental 
actions or remaining passive, also scored higher in self-definition. Theoretical 
and practical implications of these findings are discussed. 


Self-definition and social definition were 
proposed by Stewart and Winter (1974) as two 
contrasting patterns of organization of experi- 
ence that are important in understanding 
intrasex differences in women.¹ Stewart and 
Winter began by identifying two different 
styles of storytelling that characterized The- 
matic Apperception Test (TAT) stories 
of career-oriented and non-career-oriented 
female college students. The stories written 
by the career-oriented women were marked by 
a clear, causally organized plot and instru- 
mentally active and effective characters. The 
stories written by the non-career-oriented 
women were characterized by inactivity or 
ineffective activity on the part of characters, 
and by an absence of intelligible causal ordering 
of the plot.² 

Stewart and Winter argued that these two 
styles of storytelling reflected an underlying 
pervasive personality characteristic. They
presented evidence that the women who told
stories of rationally caused events created by
effective actors were (a) relatively indifferent
to sex role norms (and perhaps to norms 
associated with other roles); (b) capable of 
emotional “distance” and objectivity in the 
context of a small number of intense, personal 
relationships; (c) active and interested in 
broad social movements and issues; and (d) 
capable of vigorous instrumental activity. 

The women who told stories in which events 
were chaotic and in which characters were 
helpless (a) behaved in accord with sex role 
norms; (b) showed rather superficial emotional 
attachments to a large number of people, with 
little capacity for analytic “distance”; (c) 
were deeply preoccupied with their own 
emotional and social lives; and (d) showed a 
greater inclination toward expressive rather 
than instrumental activity. The initial research 
reported by Stewart and Winter (1974), then, 
indicated that these two patterns of thought 
(self- and social definition) were coherently 
reflected in two similar patterns of ordinary 
behavior in female college students. In later 
research, Stewart (1975) showed that self- 
definition predicted career and work activities 
as well as certain marital and family patterns 
in adult women. Winter, Stewart, and McClelland
(1977) also showed that self-definition
was associated with nontraditional marriage
patterns in a longitudinal study of adult men.


1 Later research (see Winter, Stewart, & McClelland,
1977) has indicated that it may also be relevant in
studying men.
2 Validational research and the psychometric properties
of the measure are discussed in detail in Stewart
and Winter (1974).

No research, however, has explored the 
possibility that the patterns associated with 
self- and social definitions might extend to the 
domain of adult behavior in response to life 
stress. Nevertheless, the presumably general 
and stable tendencies to organize experience 
identified in Stewart and Winter’s (1974) 
original study should have important conse- 
quences for an individual’s approach to solving 
a personal problem. A great deal of theory and 
laboratory evidence suggests that individuals’ 
attributions about the causes of events (see 
de Charms, 1968; Kelley, 1972) and their 
feelings of personal control over events (see 
Rotter, 1966; Seligman, 1975) have important 
consequences both for their acts and their 
emotional states. Self- and social definition are 
alternative patterns of organization of experi- 
ence that do not involve conscious beliefs (as 
“locus of control” does) but rather implicit
beliefs. Moreover, the two patterns simul- 
taneously reveal implicit beliefs about the 
causal organization of external reality and 
one’s own capacity for causal agency (see, 
e.g., de Charms, 1968) rather than being
restricted to one or the other. It is assumed 
here that stable patterns of thought (implicit 
belief systems) about one’s capacity to in- 
fluence events, and about the predictability 
and meaningfulness of events in general, will 
affect individuals’ styles of responding to 
personal life events. 

Thus, if a woman sees the world as a syste- 
matic environment in which events have 
identifiable causes, it seems reasonable that 
she would seek out the causes of her problems. 
If, however, a woman sees the world as an 
essentially random environment in which 
events lack intelligible causes, she will waste 
no time in fruitless searches for nonexistent 
explanations for her problems. Also, a woman
who sees the individual as a locus of effective 
voluntary action will take action to solve her 
problems, whereas a woman who sees in- 
dividual action as futile and pointless may 
simply wait for this, too, to “pass.” 

On the other hand, the link between world
view and coping style may not be so close.
A socially defined woman could “typically”
see the world as chaotic and herself as helpless,
but when faced with real problems she might
seek out the causes of the problems and take
rational action to solve them. Alternatively, a
self-defining woman who “normally” sees the
world as orderly and herself as effective might
be overwhelmed and paralyzed by a personal
difficulty. One test of an individual’s commitment
to a world view, then, may be the
extent to which that world view is retained in
the face of calamity.

The purpose of this study was to test the 
general hypothesis that self-defining and
socially defined women would indeed respond
to personal life stresses differently and in
ways that are consistent with their respective
world views. An adequate test of this hypothesis
offers the possibility of extending the
construct validity of self-definition and at the
same time determining whether alternative
coping styles are grounded in characteristic
constructions of social reality.


Hypotheses 


The first hypothesis investigated in this 
study is that self-defining women will be more
likely than socially defined women to interpret
their own personal problems as existing outside
of themselves. That is, they will tend to see a
problem (regardless of its content) as located
in the environment, rather than within their
own personality. However, the second hypothesis
is that self-defining women will be
more likely than socially defined women to
interpret the solution to their problems as
located within themselves. That is, they will
see the problem as external to themselves, but
the solution will be seen as internal; in this way,
it is presumed, they may reasonably plan to
initiate action to solve the problem. If the
problem seems internal, or if the solution to
the problem seems external, it may make little
sense to take action. Only if the problem
appears to be external, but the solution to the
problem appears to be within one’s control,
is it reasonable for an individual to take action
to solve it.

In addition, it is hypothesized that self-defining
women will be better able than
socially defined women to clearly articulate the
nature of their problems and their solutions
and to identify the causes of these problems.
That is, the stable tendency to see the world 
as rational and causally organized will be 
reflected in a tendency to see a particular 
problem as the result of a number of factors 
that are intelligible and identifiable. 

It is further hypothesized that self-defining 
women will be more likely to present their 
problems in a broad social context. They will 
be more likely to experience their problems as 
involving a wider network of social relation- 
ships (e.g., colleagues in work and voluntary 
settings, the wider society), whereas socially 
defined women will be inclined to define their 
problems as located within their personal 
domestic context (i.e., the home, marriage,
or family). 

Finally, it is hypothesized that self-defining 
women will be more likely than socially defined 
women to make instrumental responses to 
their problems, and that socially defined 
women will be more likely to make either 
noninstrumental responses or no active responses
at all to their problems. Thus, overall,
it is hypothesized that an individual’s particular
interpretation of and response to a particular
life problem may be grounded in a dispositional 
world view. 

In order to examine these substantive 
hypotheses it is, however, also necessary to 
determine whether self-definition is associated 
with the type of problem that a woman faces. 
That is, if we wish to conclude that interpretive 
differences between the two groups are 
meaningful, and that the differences are a 
function of the women’s viewpoints, we must 
show that the differences are not a function of 
the actual problems faced by the two groups. 


Method 


Subjects and Procedure 


The subjects in this study were 60 women, randomly 
selected from a larger longitudinal study of 122 women 
educated at an elite New England women’s college. A 
six-picture TAT had been administered to the women 
under neutral conditions in the fall of their freshman 
year of college, as part of a larger study of college 
students. Ten years after their graduation from college 
(when they were 31 years old), they were mailed a 
questionnaire, inquiring about their activities in the 
years since college. In addition, all of the women were 
asked whether they would be willing to be interviewed. 
Because only 30% of the sample lived within a distance
permitting a personal interview, a random sample of
60 of the 122 women was selected to be interviewed 
over the phone. Of the 60 women, 57 were successfully 
contacted by telephone and interviewed for 20-30 
minutes. The interviews were taped, with subjects’ 
knowledge and permission, and were later transcribed 
for coding, with any identifying indications deleted. 
Complete, codable transcriptions were made for 51 
of the 57 interviewees. The remaining 6 interviews 
were at least partially inaudible or unintelligible due 
to various mechanical failures. 

Female interviewers3 conducted the interviewing
according to the interview schedule outlined below, 
mainly attempting to ensure that in discussing the 
material, subjects covered all topics in the schedule. 
Subjects were surprisingly unruffled by the prospect 
of being interviewed over the telephone, and the 
resulting data appear quite open and complete. It is 
possible that the relative anonymity and imperson- 
ality of a telephone conversation encourages free and 
open conversation, since there is little opportunity for 
inadvertent expression of disapproval or disagreement. 
The hypothetical disadvantages of telephone interviews 
(lack of personal contact, lower rapport, etc.) did not 
seem to be significant factors in this study. 


The Interview Schedule 


Interviewers read the following paragraph to the 
subjects: 


We are interested in studying stress in women’s lives.
We would like you to describe the period in your life
that you think of as the most unhappy or upsetting
time that you’ve lived through. Any period is all
right, but choose a time when—for an extended 
period, not just briefly—things really seemed to go 
badly. The time needn’t be catastrophic—just a 
time when things were not going well. 


If subjects responded with a question about how long
was “an extended period,” interviewers were instructed
to indicate that “the period should be about 6 months
or more.” If subjects indicated that nothing really
terrible had happened, interviewers were instructed
to respond, “just pick a time when you were less happy 
and more frustrated or distressed than usual.” If 
subjects indicated that more than one period would 
qualify, interviewers asked them to select the period 
they felt was “most stressful for the longest period.” 

In addition, interviewers used the following probes 
if the subjects’ responses did not answer the questions 
involved: “What led up to this situation?” “What were 
the causes of this situation?” “Why did you find these 
particular things so upsetting?” “What did you do?” 
“Did you take any steps to deal with your situation?”
“How did this all end?”


3 The author is grateful to Betsy Harrington and
Kathleen Finn for their assistance as interviewers.


At the end of the interview, the interviewer concluded:

We feel that women’s lives are especially stressful,
but we hope to understand these stresses better, so
we can help do something about them. Would you
like a summary of our findings?


The interviewer then filled out a card with the subject’s 
name and address, and feedback was later sent to the 
subject. 


Coding the Data 


I scored the original TATs, administered in 1960, 
blindly for self-definition before the interviews were 
conducted. I have demonstrated interrater reliability 
with other scorers on this variable (rho = .95). In 
addition, I have demonstrated score-rescore and 
intercoder reliabilities (rhos) above .94 on a subsample 
of the data scored for this study. 

Briefly, the scoring categories in the self-definition 
scoring system4 can be summarized as follows:


Self-definition categories (scored +1). Causality: 
Use of explicit causal language. Reason-action sequence: 
The plot is organized in such a way that the final 
element of the story is an action or plan for action which 
follows logically from the preceding elements. 

Social definition categories (scored −1). No causality:
Events occur that appear to have no known
causes or that are bizarre and unintelligible within the 
context of the story. Mental state ending: The plot is 
organized in such a way that the final element of the 
story is a character's feeling (“she was sad”), expressive 
behavior (“she cried”), or state of being (“then she 
was alone”). Ineffective actor: The final element of 
the story is phrased in passive voice or impersonal 
construction. It is explicitly stated that characters are
helpless or that action is futile and pointless. 


The scores for each picture (theoretical range = −3
to +2) are summed to create a single total score
(theoretical range = −18 to +12). In this sample,
the mean self-definition score was −.12 and the
standard deviation was 3.40.
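
The arithmetic of the scoring system summarized above can be illustrated with a short sketch. This is not the scoring manual itself: the class and function names are hypothetical, and the sketch assumes that each category is scored at most once per story, which is what the stated per-picture range of −3 to +2 implies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoryCodes:
    """Presence or absence of each scoring category in one TAT story.

    Field names are illustrative labels for the five categories
    summarized in the text, not identifiers from the scoring manual.
    """
    causality: bool = False               # self-definition, +1
    reason_action_sequence: bool = False  # self-definition, +1
    no_causality: bool = False            # social definition, -1
    mental_state_ending: bool = False     # social definition, -1
    ineffective_actor: bool = False       # social definition, -1


def story_score(codes: StoryCodes) -> int:
    """Score one story: +1 per self-definition category, -1 per social one."""
    plus = int(codes.causality) + int(codes.reason_action_sequence)
    minus = (int(codes.no_causality)
             + int(codes.mental_state_ending)
             + int(codes.ineffective_actor))
    return plus - minus  # always between -3 and +2


def total_score(stories: list) -> int:
    """Sum the story scores over a six-picture protocol (range -18 to +12)."""
    if len(stories) != 6:
        raise ValueError("a six-picture TAT protocol is assumed")
    return sum(story_score(s) for s in stories)
```

A protocol in which every story shows both self-definition categories and none of the social definition categories reaches the theoretical maximum of +12; the reverse pattern gives −18.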

The interview transcripts were coded by a student 
assistant trained by me on pretest transcripts.5 This 
coder was blind to the identities and scores of the 
interviewees and was unaware of the nature and 
hypotheses of the study. The coder and I achieved 
intercoder reliability (category agreement) above .90 
for all categories scored. The categories for coding the
interviews can be summarized as follows:6

Overall ratings of the interview. The coder was
instructed to perform an overall rating of the total
interview for the following variables: (a) Clarity of
formulation of the problem: Code the problem from
0 (quite unclear) to 2 (quite clear). (b) Clarity of
explanation of causes of the problem: Again, rate the
clarity of the explanation of the causes of the problem
on a 3-point scale from “unclear” to “clear.” (c) Locus
of the problem: Indicate whether the problem is
seen by the person as being located either within
herself or outside herself. (d) Locus of the solution to the
problem: Indicate whether the person sees the solution
to the problem as located within her own control or
outside her own control. (e) Content of problem:
Indicate which of the following areas best describes the
content of the problem described: (i) work or school
(including work at home, if issue is work), (ii) interpersonal
relationships, (iii) emotional and psychological
problems (depression, alcoholism, etc.), or (iv) death
and illness.


Table 1
Self-Definition and the Content of the Problem

Content of problem       n      M
Work or school a        14      [illegible]
Relationships           15      −.13 b
Psychological            8      −.88
Death or illness        14      −.36

Note. F = .51, ns.
a Including housework, and so forth.
b Since some of the self-definition categories are
scored −1, negative total or average scores are
possible.

Type of response to problems. The coder was instructed
to code the problem solutions as either […]
No action: The subject indicated that she did nothing
to solve the problem at all (e.g., “I did nothing,” “[…]
waited,” “Nothing could be done,” etc.) or waited for
the problem to be solved by some outside agent (e.g.,
“My husband got transferred so it got better”).
Noninstrumental action: The subject indicated that
the solution to the problem was “distraction.” […]
the subject’s solution to the problem was to […]
herself in some other unrelated activity that helped her
ignore the problem (e.g., “I was afraid of what would
happen if I told him how I felt, so I joined […]
clubs to keep myself busy,” “I took up bridge”).
Instrumental action: The subject indicated that she
solved the problem by initiating some action or
actions (e.g., “I went back to school,” “I told him how
I felt,” “I got a job,” etc.) that was relevant to the
problem at hand.


Results 


First, as can be seen in Table 1, there were
no differences in self-definition scores of
women reporting problems in each of the
content areas (work or school, other relationships,
etc.). Any differences in the interpretations
of self-defining and socially defined
women’s problems cannot reasonably be
attributed to differences in the types of
problems that they faced.


4 A detailed scoring manual, with illustrative examples
and sets of practice stories with […]
scoring, can be obtained by writing to the author.

5 The author is grateful to Gwen Arthur for […]
assistance.

6 Detailed instructions for coding can be obtained
by writing to the author.


The first substantive hypotheses tested
involved subjects’ interpretation of their
problems. As can be seen in Table 2, these
hypotheses were supported. Because subjects
whose problem involved death or illness might
be less likely to hold alternative interpretations
of their problem, these subjects were excluded 
from these analyses. However, results that 
included these subjects were substantially the 
same as those reported here. In both analyses, 
women who perceived their problem as located 
outside themselves and perceived the solution 
to their problem as located within themselves 
were significantly higher in self-definition. 
Similarly, women who cited more causes of 
their problem scored higher (though not 
significantly) in self-definition. Scores were 
also significantly correlated with the clarity 
of articulation of the problem (r = .64,
p < .001) and clarity of explanation of the
causes of the problem (r = .73, p < .001).
Finally, women who presented their problems
as involving a wider social context than the
nuclear family were significantly more self-defining
than women who did not.


Table 2
Self-Definition and Interpretations of the
Problem

Item                           n      M      t
Locus of the problem a
  Within self                 18      [illegible]
  Outside self                19      [illegible]
Locus of the solution a
  Within self                 [illegible]
  Outside self                [illegible]
No. causes of problem
  More than 3                 25      [illegible]
  Fewer than 3                26      [illegible]
Context of the problem
  Broad (outside home)        32      1.02
  Narrow (only home)          [illegible]

Note. All p values are one-tailed.
a Problems classified as “death and illness” were
excluded in these cases, since by definition they involved
events both outside the self and outside the
control of the self. Problems open to alternative
interpretations (work, relationships, psychological)
were included in the analysis.


Table 3
Self-Definition and Alternative Responses to
Problems

Response                     n      M        SD
Instrumental action         16     1.56 a    3.71
Noninstrumental action      10    −2.10      3.4
No action                   25     −.44      2.86

Note. F(2, 48) = 4.25, p < .05.
a This mean is significantly different from the means
of those taking noninstrumental actions (t = 2.59,
p < .01, one-tailed) and of those taking no action
(t = 1.89, p < .05, one-tailed).

The final hypothesis tested here was that 
self-definition would be associated with instru- 
mental rather than noninstrumental or passive 
responses. As can be seen in Table 3, this 
hypothesis was also supported. Women who 
made instrumental responses were significantly 
higher in self-definition than women who made 
noninstrumental responses or no responses 
at all. 
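
As a rough check on the analysis of variance reported with Table 3, the F ratio can be approximated from the group sizes, means, and standard deviations printed in that table. The sketch below is an illustration only: the function name is hypothetical, it assumes an ordinary one-way, fixed-effects analysis with a pooled within-group variance, and because the tabled values are rounded it should be expected to land near, not exactly on, the reported F(2, 48) = 4.25.

```python
def anova_from_summary(groups):
    """One-way ANOVA F ratio from per-group (n, mean, sd) summaries.

    Returns (F, df_between, df_within). Assumes sd is the sample
    standard deviation (n - 1 in the denominator) for each group.
    """
    total_n = sum(n for n, _, _ in groups)
    grand_mean = sum(n * m for n, m, _ in groups) / total_n

    # Between-groups sum of squares: weighted squared deviations of group means.
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m, _ in groups)
    # Within-groups sum of squares: recovered from each group's variance.
    ss_within = sum((n - 1) * sd ** 2 for n, _, sd in groups)

    df_between = len(groups) - 1
    df_within = total_n - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# (n, M, SD) as printed in Table 3.
TABLE_3 = [
    (16, 1.56, 3.71),    # instrumental action
    (10, -2.10, 3.4),    # noninstrumental action
    (25, -0.44, 2.86),   # no action
]

if __name__ == "__main__":
    f_ratio, df_b, df_w = anova_from_summary(TABLE_3)
    print(f"F({df_b}, {df_w}) = {f_ratio:.2f}")
```

The degrees of freedom obtained this way agree with those in the table note, which also confirms that the three groups account for all 51 coded interviews.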

Discussion 

It is clear that self-defining and socially 
defined women do indeed interpret the prob- 
lems that they face differently. As predicted, 
self-definition in women is associated with 
seeing one’s problems as external but the 
solutions as within one’s grasp (or internal). 
It is also related to seeing the causes of 
problems as multiple, and to explaining 
those causes clearly. Finally, self-definition is 
high among those who see their problems as 
part of a relatively wide social context, rather
than in a narrowly domestic one. The world 
view expressed in these women’s TAT stories 
is, then, reflected in the way they construe 
their own problems, 14 years later. Indeed, 
it can be said that the categories scored in the 
interviews are closely related to those scored 
in the TAT stories, and that therefore the 
relationships found here might better be 
interpreted as reliability estimates for self- 
definition rather than as genuine relationships 
between independent variables. Even if this 
argument is accepted, the consequent estimate 
of the stability of self-definition (.60-.70) over 
14 years is itself remarkable enough. However,
the original measure was based on fantasy
stories told about neutral pictures; the current 
interview coding was based on descriptions of 
actual problem situations and the methods 
used to handle those situations. Thus, on 
grounds of temporal separation, different base 
data, and different scoring systems, the two 
measures seem to be sufficiently different to 
justify the inference that self-definition (a 
dispositional world view) predisposes cer- 
tain kinds of interpretations of problematic 
situations. 

Finally, it seems reasonable to suspect that 
self-definition is also related to the actual
problem-solving responses that the subjects 
reported having made. Though there is no in- 
dependent evidence that these responses were 
actually made by the subjects, I have no 
reason to believe that the subjects distorted 
their responses, particularly since so few 
reported instrumental responses (31%), which 
are presumably the more socially desirable 
ones. 

In summary, it is clear that self-definition 
and social definition are two implicit belief 
systems about the nature of external reality 
and one’s own position within it with implica- 
tions for an individual’s interpretation of 
specific personal problem situations and for be- 
havioral responses to those situations. 

The implications of these findings should 
be clear. First, theoretically, the confirmation 
of a close link between individuals’ general 
implicit belief system and their specific 
interpretations of and responses to particular 
problems indicates that coping responses, 
however “irrational,” when viewed by an 
outside observer, may “make sense,” given the 
world view of the subject. Moreover, this 
study indicates that it may be useful to 
examine both general stable belief systems
(not only specific, situationally based attributions)
and belief systems that are not necessarily
conscious but are instead implicit in
one’s habits of thinking and organizing
experience. Second, and more practically, if we
wish to change ineffective coping strategies,
we may find that it is sometimes useful to
intervene at the level of the global […]
view rather than at the particular […]
level. In addition, it is clear that as a construct,
the predictive and theoretical utility of self-definition
extends beyond the domain of
career orientation and career activity. […]
one implication of this study is that individuals’
characteristic world views may often […]
their particular judgments in stress situations.
Further research examining the role of self-definition,
and related dispositional variables,
in determining alternative appraisals of […]
in the laboratory (see, e.g., Lazarus, 19…) and
alternative cognitive “sets” in problem-solving
situations (see Scott & Howard, 1970) seems
appropriate, given the strong associations
found in this study between self-definition and
individuals’ interpretations of their […]
personal problems.


Self Versus Ideal Self:
A Comparison of Five Adjective Check List Indices


Harrison G. Gough
Institute of Personality Assessment and Research 
University of California, Berkeley 


Renato Lazzari and Mario Fioravanti 
University of Rome 


Five self versus ideal-self measures were defined on the 300-item Adjective
Check List. Overall congruence was indexed by the phi coefficient for items
and by the sum of the absolute differences on standard scores for the 24 scales
(D-T). The absolute differences on the 24 scales were also correlated and factored
in two samples: 100 American Air Force officers and 95 Italian young
men applying for a national precollege military training school. Three factors
common to both samples were identified. D-1, D-2, and D-3 measures of dissimilarity
were obtained by summing the absolute differences on just those
scales assignable to each of the three factors. Analyses of observers’ ratings in
the sample of American officers revealed phi, D-T, and D-1 to be indicative of
superior personal and social adjustment and D-2 to suggest goal-oriented efficiency
and diligence. D-3, in contrast, had rather unfavorable connotations. It
is concluded that internal components of self-ideal congruence have differential
implications that overall measures will obscure or even fail to detect.


A compelling aspect of the phenomenology 
of human experience is the sense of individu- 
ality, locus, wishes, fears, capacities, and 
continuity that behavioral scientists refer to 
as the self. By means of introspection and the 
observation of what others say and do, each 
person gradually evolves a notion of who he is 
and how he resembles and differs from other 
people. Various writers have attended to the 
way in which this self-concept or sense of 
identity may reflect social consensus (e.g., the 
“looking-glass self” of Cooley, 1902), the 
consequences of role-taking activity in child- 
hood and later (Mead, 1934), and a growing 
awareness of the differences between what one 
wants and what the world is prepared to give 
(James, 1890). 

Requests for reprints should be sent to Harrison G.
Gough, Institute of Personality Assessment and
Research, University of California, Berkeley,
California 94720.


The notion of an ideal self (what one would
like or feels constrained to be) can also be
posited. McDougall (1932), for example,
described a “self-regarding sentiment” and
stressed the active process of comparison that 
goes on between the actual and the ideal self. 
Socialization, according to McDougall, is in 
part the resultant of a continuously more 
effective and encompassing reconciliation of 
these two selves. Allport (1961), in his concept 
of the proprium, sought to bring together 
seven facets of selfhood, including the bodily 
self, continuity, self-esteem, extension, imagi- 
nation, rational coping, and goal-directed or 
propriate striving. The degree to which these 
facets of the self are harmoniously integrated 
will determine the degree to which an in- 
dividual becomes what he or she is capable 
of becoming. 

In this article we shall be dealing primarily 
with contrasts between the real and ideal 
selves, or aspects of the self, and with ways 
in which their congruence or discrepancies can 
be measured. There have, of course, been 
many attempts to assess these differences
(Wylie, 1974), making use of methods such 
as the Q sort (Butler, 1968; Butler & Haigh, 
1954), the semantic differential (Pervin &
Lilly, 1967), the Interpersonal Check List
(Leary, 1957), and the Minnesota Multiphasic 
Personality Inventory (Rosen, 1956). For the 
most part, these analyses have used a single 
score or index, based either on the responses 
to individual items under two conditions of 
administration, or on differences between 
scores on the scales of the instrument. The 
possibility that internal components or dimen- 
sions of similarity may be identified should 
also be considered. An overall index of con- 
gruence may mask or obscure important 
internal facets of similarity having different 
diagnostic implications. 


Method 
Subjects 


Two samples were studied. The first consisted of 95 
Italian males (M age = 15.5) who were applying for a 
national educational program sponsored by the Depart- 
ment of Defense. In educational background these 
applicants have the equivalent of 10 or 11 years of 
schooling in the American system. Successful applicants
receive free education, food, and lodging, and incidental
financial aid, much as do cadets or midshipmen in
American military academies. A student in the Italian
program may, on graduation, go on to one of the
Italian military or naval academies, and the school,
in fact, is viewed as a preparatory center for such
training. The 95 applicants included in this study had 
come from all parts of the country to a testing center in 
Naples. 

The other sample was composed of 100 male Air 
Force officers seen at the Institute of Personality 
Assessment and Research in Berkeley (MacKinnon, 
1958). All held the rank of captain. Their mean age
was 33.6, mean years of education 13.7, and mean years
of service 11. Each officer spent 3 days at the Institute,
participating in interviews, leaderless group discussions, 
testing sessions, laboratory exercises, and informal 
activity in which his behavior was observed and rated. 


Tests
[…] aggression, and deference. Testing time varies from 5
to 10 minutes. Italian norms for the ACL (Gough, 
1968) were used for standardizing the scores of the 95 
Italian males in this study. 

The ACL was administered twice, once with normal 
instructions and the second time with a request to 
describe one’s own ideal, defined as “the kind of person 
you would like to be—your personal ideal.” 

For the American sample, each officer was described 
on the ACL by 10 staff observers from the Institute.
Tallies were made of the number of times each word 
had been checked, and these sums were used as descrip- 
tive scores. For example, if all 10 observers described 
an officer as “boastful,” then his score on this item 
would be 10. If only 2 observers described him as 
“cooperative,” then his score on this item would be 2. 
In this way 300 descriptive scores were obtained for 
each of the men. 
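
The tallying of observer descriptions can be sketched in a few lines. The function name and the data layout (one collection of checked adjectives per observer) are assumptions made for illustration; the original tallies were of course not produced with this code.

```python
from collections import Counter


def descriptive_scores(observer_checklists, adjectives):
    """Count, for each adjective, how many observers checked it.

    observer_checklists: one iterable of checked adjectives per observer.
    adjectives: the full item list (300 items on the ACL).
    Returns a dict mapping every adjective to its tally (0 if never checked).
    """
    counts = Counter()
    for checked in observer_checklists:
        # set() guards against an adjective being listed twice by one observer.
        counts.update(set(checked))
    return {adjective: counts[adjective] for adjective in adjectives}
```

With 10 observers, an adjective checked by all of them receives a score of 10 and one checked by only 2 receives a score of 2, as in the example given in the text.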


Five ACL indices 


To indicate the correspondence between self and 
ideal-self descriptions, five measures were used. The 
first of these was the phi coefficient for the adjectives 
checked and not checked in the two conditions. This 
coefficient reflects the degree of overall agreement 
between the descriptions of self and ideal self. The 
second index, called D-T, was also an indicator of 
overall agreement. It was defined as the sum of the 
absolute differences between the 24 standard scores 
on the ACL profiles for self and ideal self. 
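
The two overall indices can be written out as a minimal sketch. The function names are hypothetical; the phi coefficient is computed with the standard fourfold-table formula, and D-T follows the definition just given (sum of absolute differences between the 24 standard scores). Nothing here reproduces the authors' actual computing procedure.

```python
import math


def phi_coefficient(self_checked, ideal_checked):
    """Phi coefficient between two checked/not-checked item vectors.

    Each argument is a sequence of booleans, one per adjective
    (300 items on the ACL), in the same item order.
    """
    if len(self_checked) != len(ideal_checked):
        raise ValueError("both descriptions must cover the same items")
    pairs = list(zip(self_checked, ideal_checked))
    a = sum(1 for s, i in pairs if s and i)          # checked in both
    b = sum(1 for s, i in pairs if s and not i)      # self only
    c = sum(1 for s, i in pairs if not s and i)      # ideal only
    d = sum(1 for s, i in pairs if not s and not i)  # checked in neither
    denominator = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denominator == 0:
        raise ValueError("phi is undefined when a margin of the table is zero")
    return (a * d - b * c) / denominator


def d_total(self_scores, ideal_scores):
    """D-T: sum of absolute differences between paired standard scores."""
    if len(self_scores) != len(ideal_scores):
        raise ValueError("both profiles must cover the same scales")
    return sum(abs(s - i) for s, i in zip(self_scores, ideal_scores))
```

Note that the two indices run in opposite directions: a higher phi means closer agreement between self and ideal self, whereas a higher D-T means greater discrepancy.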

To develop indices of internal components, several 
methods were tried. One was to factor analyze the 
two 24 X 24 matrices of standard scores for the ideal 
self and then to compute difference scores separately 
for the scales included in each ideal-self factor. In 
each sample there were five ideal-self factors with 
eigenvalues greater than 1. Another approach was by 
way of the canonical correlations between the self and
ideal-self scales in the two samples; the self-descriptive
weights could then be used to compute estimates of
the ideal vectors. These analyses identified two
canonical variates in each sample. A third method was
to factor analyze the 24 difference scores, based on the
absolute discrepancies between the self and ideal-self
scores for each of the scales in the checklist. Being
based directly on the similarity of scores under the two
conditions of testing, this third method appeared to be
conceptually closest to the phenomenological notion of
congruence between self and ideal self, and it was the
one selected for subsequent analyses.

In the factor analyses for both Italian and American
samples, there were four factors with eigenvalues equal
to or greater than 1. After varimax rotation, these
factors were compared for equivalence. The American
factors 1, 2, and 3 were correlated .47, .58, and - 
with the Italian factors 2, 1, and 3. The fourth factors 
from each analysis, however, were ambiguously related
and were therefore dropped from further consideration.

On the basis of these factor analyses, three different
scores were computed, each based on the absolute
sum of the standard score discrepancies for the scales
assigned to each factor. Specifically, D-1 was based


Table 1

Comparison of Real Versus Ideal Self Adjective Check List Protocols for the American and Italian Samples

                        Air Force officersᵃ                     Italian studentsᵇ
                        Real                Ideal               Real                Ideal
ACL scale               M         SD        M         SD        M         SD        M         SD
Number Checked          46.94     8.43      48.49     7.91      [illeg.]  [illeg.]  [illeg.]  [illeg.]
Defensiveness           57.49     9.41      60.65**   7.35      [illeg.]  [illeg.]  [illeg.]  [illeg.]
Number Favorable        52.98     9.24      59.85**   8.52      55.01     7.46      57.88*    9.69
Number Unfavorable      47.20**   9.09      41.04     2.04      44.72**   [illeg.]  [illeg.]  [illeg.]
Self-confidence         52.73     10.66     59.16**   7.92      [illeg.]  [illeg.]  [illeg.]  [illeg.]
Self-control            54.69     9.75      60.28**   5.75      50.95     8.30      55.88"    7.10
Lability                46.06     10.63     49.89**   7.83      51.39     10.43     52.14     9.01
Personal Adjustment     54.26     9.51      58.87**   6.33      53.67     [illeg.]  [illeg.]  [illeg.]
Achievement             58.09     10.03     62.70**   7.10      54.82     8.13      58.58**   7.81
Dominance               56.08     9.47      59.92**   6.30      56.54     7.46      60.72**   7.90
Endurance               58.48     9.79      63.90**   5.65      54.00     9.26      56.85**   6.40
Order                   57.28     9.06      60.28**   5.96      52.81     9.87      55.63**   6.44
Intraception            52.79     10.19     57.78**   8.62      52.46     7.21      53.16     8.33
Nurturance              53.41     9.33      55.84**   6.34      52.38     6.24      52.27     8.46
Affiliation             54.94     8.24      56.58     7.53      54.78     7.45      52.56     9.40
Heterosexuality         47.32     8.58      52.62**   7.64      50.11     8.63      53.32**   10.10
Exhibition              48.07     9.76      49.91     5.50      50.78     7.62      52.28     6.80
Autonomy                46.83*    9.36      44.46     5.40      49.20     6.80      48.03     7.89
Aggression              [illeg.]  [illeg.]  44.76     6.06      47.44     6.62      47.33     7.33
Change                  44.67     8.08      43.77     6.88      49.78"    9.26      45.88     7.32
Succorance              43.88**   7.78      39.42     4.37      43.60**   7.65      38.59     6.85
Abasement               45.55"    7.96      43.44     4.59      45.19     7.34      41.37     6.93
Deference               50.36     10.04     49.11     6.87      48.42*    6.70      46.02     8.58
Counseling Readiness    46.79**   8.53      39.23     6.36      44.42**   8.22      41.63     7.51

ᵃ n = 100.
ᵇ n = 95.
* p < .05.
** p < .01.

on the scales from the American first factor and Italian
second; it was defined as the sum of the absolute
discrepancies on the scales for Number of Items
Checked, Number of Favorable Items Checked,
Personal Adjustment, Intraception, Nurturance, Affilia-
tion, and Aggression. D-2 included the scales from the
American second factor and Italian first; it was defined
as the sum of the absolute discrepancies on the scales
for Defensiveness, Self-confidence, Achievement, Domi-
nance, and Endurance. D-3 was based on the two third
factors, summing the absolute discrepancies on the
scales for Lability, Exhibition, Autonomy, Change,
Abasement, and Deference.


Results


Table 1 presents standard score means and
standard deviations for the self and ideal-self protocols of
the American and Italian samples. As would
be expected, scores were generally higher
under ideal-self instructions for scales measur-
ing favorable or positive variables such as
self-confidence and achievement and lower for
scales measuring less desirable attributes such
as the number of unfavorable words and
abasement.

There were 12 differences that were in the
same direction and that also yielded statisti-
cally significant (p < .05) t tests in both
samples. If attention is paid only to the
direction of difference, agreement was found
in 22 of the 24 comparisons. A binomial test
for this consistency gave a z value of 3.60,
significant well beyond the .01 level of confi-
dence. It can be concluded that there is
appreciable correspondence in the two samples
in regard to the effect of the “ideal self”
instructional set.


Table 2

Means, Standard Deviations, and Intercorrelations of the Five Indices of Self Versus Ideal-Self Congruence in the American and Italian Samples

            Intercorrelationᵃ                                   Americanᵇ           Italianᶜ
Index       1         2         3         4         5          M         SD        M         SD
1. Phi      -         −.79      −.71      −.64      −.43       .50       .21       .42       .25
2. D-T      −.71      -         .82       .80       .65        190.45    87.47     183.80    77.97
3. D-1      −.68      [illeg.]  -         .50       .33        54.78     30.63     52.39     30.57
4. D-2      −.40      .18       [illeg.]  -         .33        41.16     28.05     42.48     24.29
5. D-3      −.51      .70       .28       [illeg.]  -          47.96     25.35     43.54     24.84

Note. D-T = the sum of the absolute differences between the 24 standard scores on the Adjective Check
List profiles for self and ideal self; D-1 = the sum of the absolute differences on seven scales; D-2 = the
sum of the absolute differences on five scales; and D-3 = the sum of the absolute differences on six scales.
See text for specification of these scales.
ᵃ Coefficients for the American sample are above the diagonal; coefficients for the Italian sample are below
the diagonal.
ᵇ n = 100.
ᶜ n = 95.

The five indices of agreement or disagree- 
ment between self and ideal-self description 
were next computed for the men in each 
sample. The means and standard deviations 
for each index are given in Table 2, along with 
the intercorrelations among the five measures 
of congruence. 

The two overall indicators of similarity (phi 
and D-T) correlated —.79 in the American 
and —.71 in the Italian sample. These coeffi- 
cients are in the expected direction, as phi 
reflects agreement and D-T reflects disagree- 
ment between the two conditions of testing. 
Although both coefficients are high, they are
less than perfect and suggest that the relative
degree of correspondence between the self and
ideal-self descriptions of an individual will
depend in part on the measure used to assess
that correspondence.

The three remaining indices are each based
on nonoverlapping subsets of scales from the
ACL. For the American sample [...] the means of the
samples on the five
indices were all statistically insignificant,
suggesting that these values may be taken as
reliable benchmarks.

To examine the differences in diagnostic 
implications among the five measures, data 
were taken from the study of Air Force officers. 
As was mentioned earlier, each officer was 
described on the ACL by 10 staff observers. 
The number of times that each item was 
checked by these panels was taken as the 
descriptive score on the adjective; these 300
scores were then correlated with the five
indices. To simplify the problem of interpreting
findings, only the six adjectives with highest
positive correlations and the six with largest
negative correlations were selected for each
of the five indices.
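
The selection step just described (correlate each adjective's descriptive scores with an index across officers, then keep the six strongest positive and six strongest negative correlates) might be sketched as follows. The function names and toy data are hypothetical, and treating a constant adjective as having zero correlation is a simplifying assumption of this sketch rather than anything stated in the text.

```python
import math


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # assumption: treat a constant variable as uncorrelated
    return sxy / math.sqrt(sxx * syy)


def extreme_correlates(scores_by_adjective, index_values, k=6):
    """Return the k most positive and k most negative adjective correlates."""
    r = {adj: pearson(vals, index_values)
         for adj, vals in scores_by_adjective.items()}
    ranked = sorted(r.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:k], ranked[-k:][::-1]


# Hypothetical toy data: observer tallies for 4 officers on 3 adjectives.
toy_scores = {
    "cooperative": [1, 2, 3, 4],
    "slow": [4, 3, 2, 1],
    "tall": [2, 2, 2, 2],
}
toy_phi = [0.1, 0.2, 0.3, 0.4]
top, bottom = extreme_correlates(toy_scores, toy_phi, k=1)
print(top, bottom)
```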

For the phi coefficient of similarity between
self and ideal self, the six descriptions with
highest positive correlations were cooperative
(.28), adaptable (.26), planful (.26), outgoing
(.25), efficient (.24), and thorough (.24).
These coefficients are all significant beyond the
.05 level of probability, and taken together
they suggest a quite favorable picture of the
individual whose self and ideal-self descriptions
are similar. The six terms with largest negative
correlations were slow (−.25), foolish (−.24),
awkward (−.22), confused (−.22), unrealistic
(−.21), and unfriendly (−.18).

For the D-T index of dissimilarity, we shall
report first those adjectives having negative
correlations, as they are related to closer
correspondence between self and ideal-self
descriptions. The six with largest coefficients
were thorough (−.24), planful (−.20), adapt-
able (−.19), cooperative (−.19), loyal (−.18),
and reasonable (−.18). The cluster is dis-
tinctly favorable, but the degree of relation-
ship is weaker than that for the phi coefficient.
The six terms with largest positive values were
headstrong (.26), opinionated (.25), boastful
(.24), tactless (.24), arrogant (.23), and
individualistic (.22).

For the D-1 index of dissimilarity, based
on 7 of the 24 scales in the ACL, the six terms
with largest negative correlations (and there-
fore associated with greater congruence of
self and ideal-self descriptions) were thorough
(−.26), cooperative (−.25), loyal (−.25),
reasonable (−.24), considerate (−.23), and
stable (−.22). The six terms with the largest
positive correlations were boastful (.36), head-
strong (.35), tactless (.33), opinionated (.32),
arrogant (.31), and self-centered (.30).

Although there are differences in the
particular words cited, the general tenor of
the characterizations of officers showing greater
self-ideal congruence on these three measures
is favorable. Officers whose self-descriptions
more closely approximate their ideal selves are
viewed as more cooperative and less headstrong.


If the diagnostic implications of the three
indices are more or less the same, which of the
three measures can be recommended? One
answer to this query can be based on the time
needed for computation. For hand calculation,
D-1 is easier to use than either D-T or phi,
assuming that the two ACL protocols have
been scored and profiled. Another answer can
come from the certainty of the diagnostic
implications. The median coefficient among
the 12 given for each index can provide a
crude measure of this certainty. For the phi
index the median correlation was .245, for
D-T it was .235, and for D-1 it was .280. The
differences are slight, but they favor D-1.

Correspondence between self and ideal self
has typically been interpreted as a measure
of personal adjustment and stability. In the
studies of psychotherapeutic outcome reported
by Butler and Haigh (1954), an increase in
the correlation between self and ideal-self Q
sorts was accepted as a sign of improvement.
Wylie (1974), after reviewing a large number
of studies, concluded that congruence between 
self and ideal-self description was generally 
taken as an indicator of self-acceptance and 
personal adjustment. The descriptive corre- 
lation found in our Air Force sample for phi, 
D-T, and D-1 gives some support to these 
inferences. A cautionary note should be 
sounded in regard to “oversatisfaction with 
self” (Block & Thomas, 1955). It is possible 
that too much congruence between self and 
ideal-self description could reflect insensitivity 
to personal problems, defensiveness, narcis- 
sism, and other undesirable attributes. A phi 
coefficient of .99 between self and ideal-self 
ACL protocols would not be viewed very 
favorably even by the most ardent advocate 
of the index. In our two samples, it should be 
remarked, exceedingly high phi coefficients 
were not encountered. For the Air Force 
sample, the highest coefficient was .85, and in 
the Italian sample the highest phi was .84. 

In the Air Force sample, several ratings of 
personal soundness and social adjustment were 
available, contributed by the staff members 
who had interviewed and studied the officer. 
The rating for personal soundness correlated 
43 with the phi index of correspondence 
between self and ideal self, —.11 with D-T, 
and —.23 with D-1. If an index of self versus 
ideal self is intended to carry a diagnostic 
implication of personal soundness or stability, 
it would appear from these three coefficients 
that the implication is strongest for the D-1 
index. Another indication of adjustment came 
from a preliminary form of the Block Q sort 
(Block, 1961). An item in that form stated 
“Gets along in the world as it is; is socially 
appropriate in his behavior; keeps out of 
trouble.” The mean staff placement of this 
item correlated .21 with phi, —.14 with D-T, 
and —.26 with D-1. Once again the advantage 
lies with the D-1 index. A final example can
be taken from the Block item “Is socially per-
ceptive, responsive to interpersonal nuances.”
Correlations with mean staff placement of this 
item were .23 for phi, —.20 for D-T, and —.28 
for D-1. 

The second internal measure was D-2, based 
on the scales for Defensiveness, Self-confidence, 
Achievement, Dominance, and Endurance. 
When D-2 was correlated with the descriptions,
the six with the largest negative coefficients
(i.e., checked more often about officers
whose two protocols were more similar) were 
planful (—.25), efficient (—.22), adapt- 
able (—.21), ambitious (—.21), sharp-witted 
(—.21), and thorough (—.21). The six descrip- 
tions with the largest positive correlations 
were slow (.28), shy (.26), awkward (.24), 
commonplace (.22), unaffected (.22), and 
slipshod (.21). Once again, officers whose two 
reports were more congruent were described 
more favorably, and those whose reports were 
less congruent were described less favorably. 

In this regard D-2 resembles phi, D-T, and 
D-1. What, if any, are the differences? One 
difference is that D-2 carried no implications 
for the Q-sort item “gets along well in the 
world as it is,” for which the coefficient was 
zero. For the rating of personal soundness, 
the coefficient for D-2 was —.01. Thus, 
although D-2 has implications for planfulness
and efficiency, it does not have the implications
that D-1 revealed for personal soundness and 
everyday social adjustment. On the Q-sort 
item “lacks confidence in own ability,” the 
correlations for phi, D-T, D-1, and D-2 were 
—.10, .02, —.10, and .19, respectively. The 
sign of the coefficient for D-2 indicates that 
officers whose two descriptions were more 
discrepant were characterized by this lack. 
D-2, it appears, is reflective of a kind of 
problem-solving effectiveness but not of per- 
sonal adjustment or soundness. 

The third internal measure of congruence, 
D-3, may now be considered. The six adjec- 
tival descriptions correlating most strongly 
with similarity on D-3 were greedy (−.20),
unselfish (−.19), fearful (−.18), self-punishing
(−.17), ingenious (−.16), and changeable
(−.15). The correlations for these six terms
are low, and all save one are less than the 
coefficient of .195 necessary for significance 
at the .05 level of probability. It should be 
observed, nevertheless, that three of the six 
are clearly unfavorable in implication and that 
only two (unselfish and ingenious) are clearly 
favorable. D-3 therefore represents a distinct
departure from the favorable implications of
similarity between self and ideal found for phi,
D-T, D-1, and D-2.

The six descriptions correlating most
strongly with dissimilarity of the two protocols
on D-3 were daring (.25), tough (.24), hard-
hearted (.23), attractive (.19), opinionated
(.19), and strong (.19). Although these six
relationships are of borderline statistical
significance, they do cohere in a meaningful
manner and suggest an individual possessing
vivid and to some extent invasive social
characteristics.

There were two ratings for which the
correlations with D-3 were larger than for the
other four indices. One of these was that for
“ability to obtain sexual gratification,” ob-
tained from the psychiatric life history
interview. The coefficients were .05 for phi,
.10 for D-T, .08 for D-1, .08 for D-2, and .18
for D-3. The other was for the staff rating of
apparent health and vitality, for which the
coefficients were .07 for phi, .12 for D-T, .09
for D-1, .07 for D-2, and .16 for D-3. These
are very low correlations, but they are of
interest in that they show officers who are
more discrepant on the D-3 index to possess
more zest and vigor.


Discussion 


One goal of our analysis was to determine
whether or not internal components within
the self versus ideal-self context might have
implications different from each other and
different from a global or overall index of
congruence. We also wished to compare two
methods of assessing overall congruence, one
based on the scales of the ACL and the other
taken directly from its items. A third aim was
to enhance the generalizability and reliability
of the measures by drawing on cross-cultural
data in their derivation. In order to identify
internal themes or components, factor analyses
of difference scores for the 24 scales of the
ACL on self and ideal-self protocols were
conducted in American and Italian samples.
Three of the four factors extracted in the two
samples were compatible and were used to
define three internal themes.

The two overall indices defined by the phi
coefficient and a discrepancy score (D-T)
based on all 24 scales were highly correlated
with each other and gave rise to similar
patterns of descriptive implications when
correlated with adjectival descriptions and
ratings in the American sample of 100 Air
Force officers. The first internal component
(D-1) also correlated significantly with these
two overall indices and produced a similar
pattern of diagnostic implications. In general, 
the pattern for all three indices was indicative 
of positive personal adjustment, stability, and 
facility in coping with the world as it is. The 
level of these diagnostic implications was 
generally highest for the D-1 index. On the 
ACL, it follows, D-1 would be the best of the 
three measures of congruence between self 
and ideal self, if congruence is to be used as 
an indicator of personal adjustment and 
self-acceptance. 

The other two internal indices differed 
somewhat from D-1 in their diagnostic impli- 
cations. D-2 put more stress on personal 
efficiency, ambition, and work-related at- 
tributes and less on personal soundness and 
everyday social adjustment. D-3 revealed 
officers whose two protocols were more 
congruent to be more fearful, changeable, and 
self-punishing, whereas those whose protocols 
were more discrepant were more attractive, 
more daring, and more capable of seeking and 
attaining their goals in the sexual sphere. 

Because of these differences in the diagnostic
implications of the three internal indices, it is 
apparent that an overall index such as D-T 
must mask or conceal these variations. For 
example, the relatively unfavorable implica- 
tions of congruence on the six scales included 
in the D-3 index are masked when a total 
discrepancy score based on all 24 scales is used. 
Differentiated measurement of similarity be- 
tween self and ideal-self description is therefore 
to be preferred. The three ACL indices pre- 
sented in this article may serve as examples of 
such an analytic perspective, even if future 
study finds other internal measures of con-
gruence to be preferable. An obvious cautionary
note, in this regard, is that the three com-
ponents of congruence reported above were
derived from samples of males only. Analyses
of samples of females must be carried out to
see whether these or other indices of
congruence will be most valid. Other ways of
defining the ideal self could also be considered,
for example, “the person my parents want me
to be,” “the person I would like to be in 10
years,” and “myself, when I am at my best.”
The essential findings in our analysis, we con-
clude, are (a) that there are facets or aspects of
the congruence between self and ideal and
(b) that these facets have different diag- 
nostic implications. It follows that psy- 
chometric methods for assessing self-ideal 
congruence should include measurement of 
internal components as well as of overall or 
general correspondence. 


This study evaluated a short-term skill-training intervention that taught male
alcoholics generation of appropriate behaviors in problematic situations. Forty 
alcoholics engaged in inpatient treatment were divided into three groups—a 
skill-training group, a discussion group, and a no-additional-treatment control 
group. A verbal role-playing measure of responses to situations associated with 
drinking behavior and relapse showed significant performance improvement of 
the training group as compared to the control groups. A 1-year posttreatment 
follow-up indicated that skill training decreased the duration and severity of 
relapse episodes. Behavior on the situational role-playing task predicted post- 
treatment adjustment. While pointing out limitations of skill training as imple- 
mented, results suggest its utility as one component of a multimodal behavioral 
approach to relapse in problem drinking and other problem areas such as drug 
addiction, smoking, obesity, and crime. 


A social-learning approach to problem
drinking (Bandura, 1969) suggests that in
addition to the psychophysiological effects of
alcohol, other factors such as cultural and
subgroup mores, learning experiences within
the family, peer modeling, instrumental func-
tions, and expectancies are relevant to problem
drinking. Thus, each individual’s drinking
behavior is likely to have multiple determi-
nants and would require treatment incorpo-
rating a variety of techniques: a broad-spec-
trum approach (Hamburg, 1975; Nathan,
1976).

One of the treatment strategies of broad-
spectrum approaches is to provide the problem
drinker with the means of achieving reinforce-
ment other than through the consumption of
alcohol. This strategy involves identifying
both the discriminative stimuli for drinking
and the reinforcing consequences, and then
making the occurrence of other more satisfying
behaviors more likely through a variety of
training techniques. Sobell and Sobell (1973),
for instance, focused on the drinking behavior
of their subjects. Subjects in the experimental
group with a controlled-drinking goal were
taught skills that theoretically would allow
them to not drink or to drink in an appropriate
social manner. Sobell and Sobell suggest that
subjects who functioned well after discharge
had learned to recognize discriminative stimuli
for drinking and to generate alternative
responses.

There is little systematic evaluation of
training problem drinkers to generate alter-
native behaviors. Several investigators have
outlined training approaches but have pre- 
sented little evaluative data (Foy, Miller, 
Eisler, & O’Toole, 1976; Toomey, 1972). Skill- 
training techniques only recently have begun 
to receive systematic evaluation in other 
patient populations (Arnkoff & Stewart, 1975; 
Curran, 1977; Finch & Wallace, 1977; Gold- 
smith & McFall, 1975). Skill training is 
developing as a combination of methods of 
teaching the performance of new behaviors 
and of generating appropriate content of those 
behaviors. Instruction, modeling, coaching, 
and role playing, or behavioral rehearsal, 
appear to be additive components of training 
(Rich & Schroeder, 1976). Evidence also 
suggests that training tends to be situation 
specific (Eisler, Hersen, Miller, & Blanchard, 
1975). Thus generalization of training is an 
important issue, and skill training should 
include systematic teaching of problem-solving 
skills so that subjects can analyze novel 
problematic situations and generate and 
evaluate adaptive responses (D’Zurilla & 
Goldfried, 1971). 

Problem drinkers report that they use 
alcohol in a wide variety of stressful situations. 
Studies of drinking behavior in experimental 
situations help to confirm that heavy drinkers
increase their consumption of alcohol when
subjected to interpersonal stress (Allman,
Taylor, & Nathan, 1972; Higgins & Marlatt,
1975; Miller, Hersen, Eisler, & Hilsman, 1974).
To determine whether self-report and experi-
mental findings are related to continued
problem drinking, the characteristics of relapse
situations need to be examined. Unfortunately,
few investigators have reported situational
characteristics of relapses as they happen
following treatment. 

The present study follows Marlatt’s (1978)
categorization of relapse situations of male
alcoholics following aversion-conditioning ther-
apy. It was found that relapse situations
could be assigned reliably to the following
types: (a) frustration and inability to express
anger (29%); (b) inability to resist social
pressure to drink (23%); (c) intrapersonal
negative emotional state (10%); (d) inability
to resist intrapersonal temptation to drink
(21%); and (e) other, or no response (17%).
Training situations used in the present study
were balanced among the first four categories.

The present study addressed two questions:
Does skill training increase problem drinkers’ 
effectiveness in responding to stressful situa- 
tions? and Does an improvement in psycho- 
social problem solving have an effect on the 
subsequent occurrence of problem drinking 
behavior? The population of problem drinkers 
considered in this experiment consisted of 
volunteers in intensive inpatient alcoholism 
treatment. Subjects were assigned randomly 
to one of three groups: experimental skill 
training, discussion control, or no-additional- 
treatment control. The experimental group 
used modeling, role-playing, and coaching 
techniques to work through the problem- 
solving steps of problem definition and formu- 
lation, generation of alternatives, and decision 
making. Optimal alternatives were rehearsed.
The discussion control group talked about the 
same situations that the experimental group 
practiced solving, but they did not use role 
playing, modeling, or coaching. A second 
control group received no additional training 
beyond the regular treatment regimen. Train- 
ing effects were assessed on a verbal role- 
playing measure of responses to situations 
associated with drinking behavior and relapse 
(the Situational Competency Test; SCT). 
Subjects also were given a structured follow-up 
interview at 1-month, 3-month, 6-month, and 
12-month intervals following discharge. Specific 
hypotheses tested were: (a) The experimental 
treatment group would show a pre-post 
increase in competency in handling problematic 
situations as measured by the SCT, compared 
with the competency level of the two control 
groups, which would not change; and (b) the 
experimental group would show superior 
posthospitalization adjustment as compared 
with the control groups. 


Method 


Subjects and Treatment Setting 


Patients in the Seattle Veteran’s Administration 
Alcoholic Treatment Program (ATP) formed the 
potential subject pool. All male patients residing in 
the program 19-26 days and agreeing to stay for the full 
program were asked to volunteer. Participation was 
voluntary, and refusal to take part did not jeopardize
treatment. All subjects had a primary diagnosis of 
alcoholism, and none were actively psychotic or 
organically impaired to the extent of requiring extended 
domiciliary care in the opinion of the treatment staff.
All subjects continued to participate in the regular 
treatment program. 

Patients admitted to the ATP initially entered an 
orientation phase in which detoxification was completed 
and acute medical problems were resolved. After a 
minimum of 5 days, the patient progressed to the 
evaluation phase, a 2-week period during which the 
patient attended six 60-minute small group therapy 
sessions, videotaped replays of the group sessions, 
daily therapeutic community groups, and several 
education-oriented lectures and movies. The program 
was abstinence oriented, and group therapy, the 
primary therapeutic modality, examined ongoing 
relationships and feelings in the patient community 
with the assumption that understanding and working 
out relationships with other group members would carry 
over to important outside relationships, obviating the 
need for alcohol. 

At the end of the 2-week evaluation period, the ATP 
required patients to make a commitment for an 
additional 2 weeks of inpatient treatment, a 4-week 
day program, and 1 year of a weekly aftercare group. 
Patients who could not make this commitment were 
referred elsewhere. Treatment activities continued 
through the day program as described for the evalua- 
tion period. Single men were required to live in a 
supervised halfway house during the day program and 
for an additional 2 months thereafter. Married men 
returned home but participated in a weekly couples 
group for 3 months. 

Of the 70 persons asked to participate, 56 consented. 
Of this group, 6 dropped out of the research before 
being assigned to a group. An additional 10 patients 
dropped out during the training phase: 4 were skill- 
training subjects, 3 were discussion subjects, and 3 
were controls. This left 40 subjects who were retested
and followed: 15 in the experimental group, 13 dis-
cussion controls, and 12 no-additional-treatment con-
trols. Within the limitation of their sequential entry
into the program, subjects were assigned to the three
groups on a random basis.
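
The subject flow reported above can be checked with simple arithmetic. The following sketch uses only the counts given in the text; the variable names are hypothetical, and the per-group sizes at assignment are derived here rather than reported in the source.

```python
asked = 70
consented = 56
dropped_before_assignment = 6
dropped_during_training = {"skill_training": 4, "discussion": 3, "control": 3}
completed = {"skill_training": 15, "discussion": 13, "control": 12}

# Subjects actually assigned to a group, then those remaining after training.
assigned = consented - dropped_before_assignment
remaining = assigned - sum(dropped_during_training.values())

# Derived (not reported) group sizes at the time of assignment.
initial_group_sizes = {
    group: completed[group] + dropped_during_training[group]
    for group in completed
}

print(assigned, remaining, initial_group_sizes)
```

The totals are internally consistent: the completers in the three groups sum to the 40 subjects who were retested and followed.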

Subjects included 2 black men and 1 Native Ameri-
can. Subjects’ mean age was 45.6 (SD = 9.32). Seven-
teen men were married; 3 were single; 20 were separated
or divorced. The mean number of years of education
was 12.3 (SD = 2.46). Average total monthly income
prior to hospitalization was $335 (SD = $470). Mean 
months employed in current job was 18.6 (SD = 50.1). 
Modal social class was IV (i.e., lower middle) computed 
from major vocational skill and educational attainment, 
using the Hollingshead Two-Factor Social Position 


Index (Hollingshead, Note 1). Seven subjects were
court referred; the remainder voluntarily [...]
themselves for treatment. The average number of
years of self-acknowledged problem drinking for the
sample was 17.0 (SD = 10.18). Subjects reported an
average of 3,250 drinks (SD = 2,450) consumed during
the 6 months prior to admission. Drinks were defined
as 1 ounce (29.57 cm³) of 86-proof liquor or its equiv-
alent in alcohol content. Subjects drank an average
of 122 days (SD = 67.1) during the 6-month period.
Eleven subjects reported binge-drinking patterns; 25,
steady drinking; and 4, a mixed pattern. The average
number of previous treatment attempts was 1.3
(SD = 1.27). Average number of months elapsed since
first alcohol treatment attempt was 61 (SD = 76.1).

With the exception of the reported number of
problems due to drinking, F(2, 37) = 8.8, p < .001,
there were no significant differences among the three
subject groups on pretreatment demographic and
drinking history measures. The drinking problems
measure consisted of a checklist with eight items such
as job loss, marital difficulties, and illness. The skill-
training group reported more problems prior to treat-
ment than the discussion group (M = 6.5, SD = .9 vs.
M = 4.4, SD = 1.5, p < .05, using Duncan’s new
multiple-range test). The control group’s mean of 5.4
(SD = 1.5) was not different from the other groups.


Development of Skill Assessment and Training 
Procedures 


Several sources were used to generate situations 
likely to be problematic for the present population of
excessive drinkers. These included (a) descriptions of 
relapse situations by Marlatt (1978) gathered in in- 
dividual interviews with his patients during follow-up; 
(b) suggestions by treatment personnel of two treat- 
ment facilities (Seattle Veteran’s Administration 
Hospital and Mendota State Hospital, Wisconsin) who 
work with male alcoholics; (c) interviews with alcoholics 
on the Seattle ATP; and (d) modifications of situations 
from several inventories designed to assess assertive 
behavior (Eisler, Miller, & Hersen, 1973; Lawrence, 
1970; McFall & Marston, 1970). Eighty situations 
were assembled and worded to be specific enough to 
elicit a small number of appropriate courses of action 
but general enough to be useful as standardized situations
for this population. This 80-item inventory
(the Situational Difficulty Questionnaire¹) was presented
to 40 patients on the ATP who did not take
part in any other phase of the study. They were instructed
to rank each situation with regard to the
difficulty that it would present if encountered in the
natural environment. Situations were divided into
four categories: (a) frustration and anger; (b) interpersonal
temptation; (c) negative emotional state;
and (d) intrapersonal temptation. The eight situations
judged most difficult within each of the categories were
retained for use in the study.

In frustration and anger situations, the person
experiences the blocking of a goal-directed activity
and/or hostility toward some person or external event.
For example: 


Before you entered the alcoholism treatment program,
your employer, who knew about your drinking
problem, said that you could have your job back
when you got out of the hospital. When you leave
the program, you find that the company has hired
someone to take your place.


¹ Copies of the Situational Difficulty Questionnaire,
Situational Competency Test, training manual, and
follow-up questionnaires for subject and collateral
reports are available from the first author.


In interpersonal temptation situations, the person 
experiences explicit or implicit pressure by other people 
to drink. For example:


You are eating at a good restaurant on a special 
occasion with some friends. The waitress comes over 
and says, “Drink before dinner?” Everyone else 
orders one. All eyes seem to be on you. 


In negative emotional state situations, the person 
experiences feelings such as loneliness, depression, 
boredom, futility, malaise, or nervousness in the 
absence of clear-cut environmental or interpersonal 
stimuli. For example: 


You get up Saturday morning and realize that you 
don’t have anything planned to do during the day. 
You sit around for a while, but you begin to feel 
bored and restless. 


In intrapersonal temptation situations, the person 
experiences a desire or compulsion to drink in the 
absence of specifically identified external or internal 
factors. For example: 


You have been out of the hospital a couple of months 
now and haven’t taken a single drink. However, 
you’ve been wondering how well the treatment 
really worked, and you get to feeling like taking a 
drink to test it out. 


Assignment of situations into the four categories 
was assessed independently by two judges (a clinical 
psychologist and a clinical psychology graduate
student) with 94% agreement. Of the 32 situations, 16
were used for training purposes, leaving 16 available
for pretreatment and posttreatment testing. No two 
training and assessment situations were exactly alike. 


Pretreatment Measures 


Testing situations were assembled into the SCT 
(see Footnote 1). Situation descriptions were tape 
recorded, ending with the phrase “What would you do 
or say?” The subject was instructed to imagine that 
the situation was actually occurring and to say the 
words or describe the action that he would use to
respond to the situation. Subjects’ responses were tape
recorded. For the pretest a sample situation was
provided to check the subject’s understanding of the
task. A first version of the SCT was administered to 10
subjects who did not participate in other phases of
the research to check that the situation descriptions
were clear and that they elicited varied responses.
Four scoring measures, latency, duration, compliance,
and specification of new behavior, were chosen. Latency
of response was defined as the elapsed time from the
termination of the recorded situation to the semantic
beginning of the subject’s response, excluding disclaimers
and hedging. Response duration was the
number of words in the response, excluding side
comments to the experimenter. Compliance was a
dichotomous score indicating whether or not the
subject gave in to the situation without attempting 
to exert control that would change or influence the 
course of the situation. Drinking, giving in to a demand, 
not expressing feelings, and tacitly agreeing to criticism 
were scored as compliant. Specification of new behavior 
was also a dichotomous score indicating whether the 
description of the problem-solving behavior to be 
performed was given in sufficient detail so that someone 
else could use the description as a guide to perform 
the behavior. In other words, a response must have 
specified one alternative rather than a class of alter- 
natives: “I would come up to the ward” versus “I 
would get help.” This measure was applicable to both 
interpersonal and intrapersonal situations. 

Subjects took the SCT prior to treatment, immedi- 
ately after, and 3 months following discharge from the 
hospital. Measures were summed across situations for 
each subject. All scoring for the four measures across 
the three administrations was done at the same time 
by the same rater (a clinical psychology graduate 
student) who was blind to subject identity and test 
order. Reliability of the compliance and specification 
measures was assessed by Pearson correlations of the 
primary rating of 20 randomly selected protocols of 
16 situations, with ratings by an independent rater 
(also a clinical psychology graduate student; compliance 
r = .85, specification of behavior r = .82). 
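
The reliability check described above amounts to correlating two raters' summed scores across protocols. The following is a minimal sketch under that reading, not the authors' procedure: the function name and the sample scores are hypothetical and are not the study's data.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of cross-products and squared deviations about the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical summed scores (0-16 situations) for five protocols, two raters.
primary_rater = [3, 5, 2, 8, 6]
second_rater = [4, 5, 1, 7, 6]
print(round(pearson_r(primary_rater, second_rater), 2))
```

Because compliance and specification are dichotomous per situation, the correlation is only meaningful on scores summed over the 16 situations of a protocol, which is the assumption made here.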

During the evaluation phase of the ATP, the patient 
received psychological testing including the Shipley 
Institute of Living Scale (Shipley, 1940), a test of 
cognitive ability and intellectual impairment. Impairment,
expressed by the Conceptual Quotient (CQ),
indicates the extent to which the individual’s abstract 
thinking falls short of his or her vocabulary. 

After agreeing to participate in the study, all subjects 
were interviewed using a shortened version of the 
Drinking Profile (Marlatt, 1976). This structured 
interview instrument systematically assesses psycho- 
social functioning and drinking behavior prior to 
treatment. The Drinking Profile was administered by 
one of three graduate psychology students. Each had
previous clinical experience and was trained by giving
supervised administrations of the Drinking
Profile. The interviewers also gave the SCT
and follow-up interviews.
During the period of the study, they had no involvement 
in the subjects’ treatment, routine or experimental. 


Training of Therapists 


A training manual (see Footnote 1) was written
for the skill-training and discussion treatment procedures.
The manual provided an orientation for the
therapists, specified introductory statements for the
two types of groups, and gave procedural guidelines.
It also gave suggestions for working with each of the 
four situational categories, and for skill training it 
summarized the problem that each situation presented 
and gave instructions for evaluating the situational 
responses. Six therapists were trained to conduct the 
two groups. The two primary therapists were females: 
a vocational rehabilitation technician who was experienced
in conducting behavioral group therapy and a
clinical psychology graduate student who had completed
her internship. Two male clinical psychology
graduate students engaged in their internship served 
as cotherapists during the first half of the study. When 
they finished their internship, they were replaced by 
two male recovering alcoholic counselors. The coun- 
selors had prior experience leading groups and were 
both working toward degrees in social work. Therapist
training consisted of reading the training manual and 
background material and role playing the therapy 
procedures for three 1-hour sessions. 

Therapists worked in male-female pairs, and each 
therapist was involved in an equal number of skill- 
training and discussion groups. All of the skill-training 
and discussion sessions were observed by one of the
investigators. The observer and therapists met after
each group for feedback to minimize drift in technique
over time.


Treatment Procedures 


Training for the experimental and discussion groups 
consisted of eight semiweekly 90-minute sessions. 
Cohorts of 3-5 subjects met with two therapists. 
Starting dates of skill-training and discussion groups 
alternated so that if any changes took place in the ATP 
while the study was underway, the skill-training and 
discussion groups would be affected equally. As dictated 
by random assignment, control subjects were dis- 
tributed individually throughout the duration of the 
study.

Skill-training group. Skill-training procedures were 
based primarily on a series of experiments by McFall 
(McFall & Lillesand, 1971; McFall & Marston, 1970; 
McFall & Twentyman, 1973), who investigated effective 
components of assertion training. The content of 
training was based on D’Zurilla and Goldfried’s (1971) 
stepwise analysis of problem solving. The steps are 
(a) orientation, (b) definition, (c) generation of alter- 
natives, (d) decision making, and (e) verification. 

After giving a general orientation to problem-solving 
procedures, therapists read a description of a problem- 
atic situation. Subjects discussed how they viewed 
the situation and generated possible ways of responding 
to it. The therapists pointed out when group members 
defined the situation differently and what the conse- 
quences of different definitions were for problem solving. 
The probable consequences of the different alternatives 
proposed by members were discussed, and, if necessary, 
the therapists proposed alternatives. For interpersonal 
situations, one therapist chose an alternative, explained 

the basis of this choice in terms of probable conse- 
quences, and then modeled a response, with the co- 
therapist playing the other person in the situation. 
For intrapersonal situations one of the therapists 
engaged in a monologue, explicitly defining the problem, 
generating alternative solutions, deciding which one 
would maximize long- and short-term gains and could 
be performed, and outlining steps to implement the 
solution. After this initial phase, each group member
decided on a particular response and rehearsed it,
receiving feedback from the group on the probable
consequences of his response. If the therapists and
group felt that the response was not likely to solve the
problem that the situation presented, the subject was
required to repeat his performance. After each subject 
had rehearsed, a member summarized the method for 
generating and evaluating an adequate response to that 
situation. Two prepared situations were introduced 
during each session, covering four situations in each 
of the four categories during training. Subjects also 
rehearsed one or two situations of their own devising 
that they felt might be problematic after discharge 
from the hospital. 

In summary, skill-training groups incorporated 
instruction, modeling, behavioral rehearsal, and coach- 
ing, both of actual response behavior and of the cogni- 
tive process for generating the response. Subjects were 
taught how to define the problem that a situation 
presented by specifying the elements and to generate 
alternatives and think about the long- and short-term 
consequences. Finally, the behavior rehearsal phase of 
training provided practice in carrying out adaptive 
responses and served as a role-playing form of verifi- 
cation, assessing the adequacy of the problem-solving 
process. 

Discussion group. Discussion group procedures were 
based on the rationale that problem drinkers do possess 
the necessary skills to analyze and cope with high-risk 
situations, but, because of feelings of anger, anxiety, 
dependency, or depression, they do not effectively use 
those skills. Discussion procedures focused on eliciting 
the feelings that may be present in problematic situations
and examining subjects’ reactions and motivations
relevant to similar situations of the past with the logic
that self-understanding and more effective coping 
behavior should result. Thus, in the discussion groups, 
after giving the orientation, the therapists introduced 
the same problematic situations as were used in the 
skill-training group. Then, in a nondirective manner, 
they encouraged expression and discussion of feelings 
relevant to the situations.

Control group. A third group of subjects received 
all assessment measures at the same intervals as other 
subjects but participated only in regular ATP treat- 
ment activities. 


Posttreatment Assessment and Follow-Up


Immediately following the eight sessions of treatment, 
subjects were retested on the SCT. Subjects typically 
were discharged from the day treatment phase of the 
program and began weekly aftercare groups the week
following completion of the experimental treatment
program.

Follow-up procedures were dated from the time the
subject left the day treatment phase of the ATP. At
1-month, 3-month, 6-month, and 12-month intervals
subjects were interviewed using a standardized follow-up
questionnaire (see Footnote 1), which was designed
to provide information compatible with the Drinking
Profile. The form assesses occupational stability, living
situation, and use of therapeutic supports and provides
drinking disposition data by asking about periods and
rates of drinking and periods of hospitalization or
incarceration. The questionnaire also permits an
examination of relapse situation characteristics by
recording place, people present, activity, environment,
inner thoughts and feelings, and reasons for starting
to drink. Subjects had initially consented to having
their self-reports verified, and this was done at each
follow-up interval through contact with at least
one relative, friend, employer, or treatment agency.
At the 3-month interval, the SCT was readministered
in person or by telephone in the few cases in which the
subject had moved out of the area. In the latter cases,
the self-report interview instrument also was administered
by telephone. Subjects who had dropped out of
treatment and/or moved out of the area often were 
difficult to locate. Court records, Veteran’s Administration
files, employers, relatives, friends, and other
patients provided leads. In a few cases it was necessary
to wait out a period of heavy, sustained drinking
before the subject could be interviewed. Throughout
follow-up, the subjects’ confidentiality was protected.


Results


Situational Competency Test


Heterogeneity. The four variables on which 
the SCT was scored were intercorrelated for the 
preadministration, exit, and 3-month postadministration
to determine to what extent
the derived measures were independent of each
other. SCT analyses, which included exit and
3-month posttest data, were based on an n
of 37, because 1 discussion group subject’s exit
data were lost through instrument failure and
2 skill-training subjects’ 3-month posttest


Table 1
Situational Competency Test Intercorrelations

Measure            Latency        Duration       Noncompliance

Duration
  Pre              .03
  Exit             .14
  3-month post     .34*
Noncompliance
  Pre              [illegible]    .09
  Exit             -.07           .26
  3-month post     -.34*          .08
Specification
  Pre              .20            [illegible]    [illegible]
  Exit             .01            [illegible]    [illegible]
  3-month post     -.05           .20            .36*

Note. n = 37.
*p < .05.
**p < .01.
***p < .001.


Table 2
Effects of Training on the Situational Competency Test: Repeated Measures Analysis of Variance

Measure          Source       df    MS          F

Latency          Group (A)    2     5.6         1.6
                 Time (B)     1     .00         <1
                 A × B        2     1.8         1.8
Duration         A            2     1,599.39    [illegible]
                 B            1     61          <1
                 A × B        2     326.07      3.4*
Noncompliance    A            2     .05         2.9
                 B            1     .00         <1
                 A × B        2     .01         <1
Specification    A            2     .18         8.02
                 B            1     .00         <1
                 A × B        2     .01         <1

Note. The analysis is of exit and 3-month posttest scores residualized on prescores.
*p < .05.
**p < .01.

data could not be gathered at the required 
interval. One of the latter subjects was on a 
binge, which later resulted in his death due 
to medical complications. The other subject 
had moved and later was located out of the 
area. 

Table 1 shows that the four SCT measures 
were independent, with two notable exceptions. 
At all three test administrations, noncom- 
pliance was related positively to specification 
of new behavior. At exit, duration of response 
was positively related to specification of new 
behavior, indicating that longer responses 
were likely to specify concrete behavior for 
most subjects only at the end of the inpatient 
phase of the treatment program. 

Treatment effects on the SCT. There were 
no significant pretreatment group differences 
on the four dependent measures derived from 
the SCT. Treatment effects were analyzed by 
using the prescores to compute residuals of the 
exit and 3-month posttest scores, which then 
were subjected to a repeated measures analysis 
of variance (Huck & McLean, 1975). As Table 
2 shows, for duration of response, the group 
main effect was significant. Planned com- 
parisons indicate that at immediate retest, 
the skill-training group had a significantly 
longer duration of response than the discussion 
group, t(34) = 1.69, p < .05, and the control
group, t(34) = 2.44, p < .01. As Figure 1
shows, by the 3-month posttest, these differences
had diminished.

Figure 1. Response duration measured by the Situational Competency Test pre, post, and 3 months after
training. (X = skill training; Y = discussion; Z = no treatment control. Vertical axis: mean duration in words.)

Specification of new behavior also showed 
a training effect, with the group main effect 
significant. Planned comparisons show that the 
skill-training group performed better on exit 
testing than both the discussion group, t(34)
= 2.44, p < .01, and the control group,
t(34) = 1.69, p < .05. By the 3-month posttest
(see Figure 2), the skill-training group
was still significantly different from the discussion
group only, t(34) = 1.69, p < .05. No
significant differences due to training were
found on the latency or noncompliance
measures.
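
The residualization step used in the treatment-effects analysis can be illustrated with ordinary least squares: each posttest score is replaced by its deviation from the value predicted from the prescore. This is a minimal sketch under the assumption of a single linear regression pooled over all subjects; the text does not state how the regression was fitted, and the function name and scores below are hypothetical.

```python
def residualize(pre, post):
    """Return post scores with the linear effect of pre scores removed.

    Fits post = intercept + slope * pre by least squares over all subjects
    and returns each subject's residual (observed minus predicted).
    """
    n = len(pre)
    if n != len(post) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_pre = sum(pre) / n
    mean_post = sum(post) / n
    sxx = sum((x - mean_pre) ** 2 for x in pre)
    sxy = sum((x - mean_pre) * (y - mean_post) for x, y in zip(pre, post))
    slope = sxy / sxx
    intercept = mean_post - slope * mean_pre
    return [y - (intercept + slope * x) for x, y in zip(pre, post)]


# Hypothetical response durations (words) before and after treatment.
pre_scores = [12, 15, 9, 20, 14]
exit_scores = [18, 17, 11, 27, 15]
print([round(r, 2) for r in residualize(pre_scores, exit_scores)])
```

Residuals computed this way sum to zero and are uncorrelated with the prescores, which is why group differences in them can be read as change not predictable from pretreatment standing.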


Drinking Behavior and Social Adjustment 


Posttreatment drinking behavior. Drinking
behavior was assessed primarily by drinking
disposition measures (Sobell & Sobell, 1973).
Days during the follow-up period were categorized
as (a) voluntary abstinence; (b) incarcerated
abstinence (i.e., jailed or hospitalized); (c)
controlled drinking, defined as any day
during which 6 or fewer drinks were consumed
or any isolated 1- or 2-day sequence when
between 7 and 9 drinks were consumed; and
(d) drunk days, defined as isolated days
during which 10 or more drinks were consumed
or as any day that was part of a sequence of
more than 2 days during which 7-9 drinks
were consumed. Total amount drunk and
average drinking period length for those
subjects who drank were computed for the year.
Results exclude the skill-training subject who
died.

Of the drinking disposition measures, too
few subjects (nine in all) reported controlled
drinking days for this category to be a useful
outcome indicator. Preliminary analyses of the
other drinking-related measures indicated that
within-cell variances were heterogeneous. The
Bartlett-Box F (Winer, 1971) was significant
for all measures except days voluntarily
abstinent. Therefore, statistical analyses were
performed on logarithmically transformed
scores (Winer, 1971, p. 400).

Figure 2. Specification of behavior measured by the Situational Competency Test pre, post, and 3 months
after training. (X = skill training; Y = discussion; Z = no treatment control.)
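
The day-by-day drinking disposition categories defined above can be expressed as a small classifier. This is a sketch under stated assumptions rather than the scoring rules actually used: a "sequence" is taken to mean a run of consecutive days with 7-9 drinks, such runs are judged on their own length even when they sit next to a 10-plus day, and the function name, label strings, and sample data are hypothetical.

```python
def classify_days(drinks, confined=None):
    """Label each follow-up day from a list of daily drink counts.

    drinks:   number of drinks consumed on each day.
    confined: optional list of booleans, True if jailed or hospitalized.
    """
    if confined is None:
        confined = [False] * len(drinks)
    labels = [None] * len(drinks)

    # First pass: days whose label does not depend on neighbouring days.
    for i, (count, locked_up) in enumerate(zip(drinks, confined)):
        if count == 0:
            labels[i] = ("incarcerated_abstinence" if locked_up
                         else "voluntary_abstinence")
        elif count <= 6:
            labels[i] = "controlled"
        elif count >= 10:
            labels[i] = "drunk"

    # Second pass: runs of consecutive 7-9 drink days.
    # Runs of 1 or 2 days count as controlled, longer runs as drunk.
    i = 0
    while i < len(drinks):
        if 7 <= drinks[i] <= 9:
            j = i
            while j < len(drinks) and 7 <= drinks[j] <= 9:
                j += 1
            run_label = "drunk" if (j - i) > 2 else "controlled"
            for k in range(i, j):
                labels[k] = run_label
            i = j
        else:
            i += 1
    return labels


# Hypothetical ten-day record; the subject is hospitalized on the last day.
daily_drinks = [0, 3, 8, 8, 0, 7, 8, 9, 12, 0]
in_hospital = [False] * 9 + [True]
for day, label in enumerate(classify_days(daily_drinks, in_hospital), start=1):
    print(day, label)
```

Counting the labels over the follow-up year would give the days abstinent, days of controlled drinking, and drunk days that enter the outcome analyses.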

First, the two control groups were compared
on the drinking disposition measures using t
tests. No significant differences were found,
just as no differences between the two groups
had been observed on the SCT. Thus these
groups were pooled for comparison with the
skill-training group. Since multiple dependent
variables were involved, a multivariate analogue
of the t test, Hotelling’s T² (Winer, 1971),
was applied to the following group of measures:
days abstinent, days hospitalized, drunk days,
total amount drunk, and average drinking
period length; T²(5, 33) = 2.73, p < .05.
Having found the vectors of group means to be
significantly different, individual t tests were
used to determine which measures accounted
for the difference.
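
The degrees of freedom reported for Hotelling's T² are consistent with the standard conversion of a two-sample T² to an F statistic, given five dependent measures and the group sizes of 14 and 25 listed in the note to Table 3. The sketch below shows only that conversion; the function name and the T² value passed in are illustrative, and the text does not say whether the reported 2.73 is the raw T² or its F equivalent.

```python
def hotelling_t2_to_f(t2, n1, n2, p):
    """Convert a two-sample Hotelling T-squared to its F equivalent.

    t2:     Hotelling's T-squared statistic.
    n1, n2: the two group sizes.
    p:      number of dependent variables.
    Returns (F, numerator df, denominator df).
    """
    df1 = p
    df2 = n1 + n2 - p - 1
    if df2 <= 0:
        raise ValueError("too few subjects for this many variables")
    f_value = t2 * df2 / (p * (n1 + n2 - 2))
    return f_value, df1, df2


# Illustrative T-squared value; group sizes follow the note to Table 3.
f_value, df1, df2 = hotelling_t2_to_f(10.0, n1=14, n2=25, p=5)
print(round(f_value, 3), df1, df2)
```

With 14 and 25 subjects and five measures, the denominator degrees of freedom come out as 14 + 25 - 5 - 1 = 33, matching the (5, 33) given in the text.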

The skill-training and control groups were 
found to be significantly different on three of 
the five measures: days drunk, t(37) = -2.21,
p < .05; total number of drinks, t(37)
= -2.01, p < .05; and average drinking
period length, t(37) = -2.32, p < .05. Table
3 gives untransformed means and standard 
deviations for these and the other outcome 
measures for the 1-year follow-up period. As 
the table shows, for the follow-up year, the 
skill-training group had an average number of 
days drunk one-sixth that of the pooled control 
group, drank one-fourth as much, and had an 
average drinking period length less than one- 
eighth as long. 

To determine whether the improvement in
drinking behavior generalized to areas of
functioning that had not been specifically
targeted by the intervention, two other self-report
measures of outcome were examined:
job performance (retired and fully pensioned
subjects excepted) and continued engagement
in the ATP outpatient treatment for which the
subject had contracted. These measures did
not show significant group differences, suggesting
that the effects of the intervention are
relatively specific to drinking behavior.


Table 3
Posttreatment Adjustment: 12-Month Outcome

Measure                           Groupᵃ         M              SD             df    t

Days of controlled drinking       [illegible]    [illegible]    [illegible]
[illegible]                       [illegible]    [illegible]    [illegible]          [illegible]
[illegible]                       [illegible]    [illegible]    [illegible]
Average drinking period length    [illegible]    [illegible]    [illegible]    37    -2.32*
Days hospitalized                 [illegible]    [illegible]    [illegible]    37    .14
Days abstinent                    [illegible]    [illegible]    [illegible]    37    .94
[illegible]                       [illegible]    [illegible]    [illegible]
Weekly aftercare meetings         1              29.8           18.8           37    .93
                                  2              24.2           17.8

ᵃ Group 1 (n = 14) received skill training; Group 2 (n = 25) subjects were in the discussion or
no-additional-treatment groups. One Group 1 subject died after 5 months of an alcohol-aggravated
illness following a 2-month period of sustained drinking.
ᵇ Five subjects were retired or pensioned (2 in Group 1 and 3 in Group 2).
*p < .05.


Relapse Situation Category Analysis 


A relapse was defined as the initiation of a 
drinking period lasting 1 or more drunk days. 
Successive relapses for the same individual 
were considered to be distinct when the 
subject reported different setting events for 
drinking periods, which were separated by at 
least 2 abstinent or controlled drinking days. 
For the year, 25 subjects (10 in the skill- 
training group, 6 discussion, and 9 control) 
reported a total of 55 relapses (M = 2.2, 
SD = 1.29, range = 1-6). Number of re- 
lapses for the subject groups did not differ 
significantly. 
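
The relapse definition above can be sketched as a count of drinking periods in a sequence of daily labels. This is a simplified illustration, not the study's coding scheme: it applies only the separation rule (at least 2 abstinent or controlled drinking days between periods) and ignores the additional requirement that the subject report different setting events. The function name and label strings are hypothetical.

```python
def count_relapses(day_labels, min_gap=2):
    """Count distinct relapses in a list of daily labels.

    A relapse starts on a 'drunk' day that follows at least `min_gap`
    consecutive non-drunk days (or the start of follow-up).
    """
    relapses = 0
    gap = min_gap  # the start of follow-up counts as a sufficient gap
    for label in day_labels:
        if label == "drunk":
            if gap >= min_gap:
                relapses += 1
            gap = 0
        else:
            gap += 1
    return relapses


# Hypothetical week: two drunk days, one controlled day, another drunk day
# (same period, gap too short), two abstinent days, then a new relapse.
week = ["drunk", "drunk", "controlled", "drunk",
        "abstinent", "abstinent", "drunk"]
print(count_relapses(week))
```

Under these assumptions the first four days form a single drinking period, because the one controlled day does not meet the 2-day separation, and the final drunk day opens a second relapse.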

Relapse situations were categorized inde- 
pendently by two clinical psychology graduate 
students into the four categories used throughout
the study, with 81% agreement. Disagreements
were resolved by consensus. The largest
number of relapses were of the negative
emotional state type (43%). The interpersonal
temptation category was represented by [percentage illegible]
of the relapses, whereas frustration and anger
situations and intrapersonal temptation
comprised 15.5%. Of the relapses, 9% were
not classifiable either because of lack of
sufficient information or failure to fit any of
the four category descriptions.


Prediction of Outcome 


To evaluate the ability of the SCT to predict
posttreatment drinking behavior and social
adjustment, a multiple regression approach
incorporating drinking history and demographic
measures was used. Due to the small
number of subjects, the number of potential
predictors was first reduced by correlating
pretreatment variables with the 1-year outcome
indices. Controlled drinking days were
excluded since so few were reported. The
three measures most highly related to each
outcome index were entered into a stepwise
regression analysis along with the three SCT
exit measures: latency, duration, and specification
of behavior. Noncompliance was not
included because of its high collinearity with
the specification measure.


Table 4

Outcome variable                  Predictor              r              F                            R              R²             Overall F

Days employed                     Pretreatmentᵃ          .35            4.42*                        .35            .12            [illegible]
                                  Post SCT latency       -.32           4.79*                        .50            [illegible]    [illegible]
Days hospitalized                 Post SCT latency       .50            11.81***                     [illegible]    [illegible]    [illegible]
                                  Drinking patternᵇ      .30            14.38 [marker illegible]     .68            .47            [illegible]
                                  Previous treatments    [illegible]    [illegible]                  [illegible]    [illegible]    [illegible]
Weeks in aftercare                Post SCT latency       -.39           6.62**                       [illegible]    [illegible]    [illegible]
                                  Marital statusᶜ        .38            [illegible]                  .53            [illegible]    [illegible]
Days abstinent                    Post SCT latency       -.73           40.25****                    [illegible]    [illegible]    [illegible]
                                  Marital statusᶜ        .27            4.65*                        .76            [illegible]    [illegible]
Days drunk                        Post SCT latency       .51            12.74 [marker illegible]     [illegible]    [illegible]    [illegible]
                                  Problem durationᵈ      .34            11.70**                      .67            [illegible]    [illegible]
Total amount drunk                Post SCT latency       .52            13.25****                    [illegible]    [illegible]    [illegible]
                                  Problem durationᵈ      .34            11.71***                     [illegible]    [illegible]    [illegible]
Average drinking period length    Drinks consumedᵉ       [illegible]    [illegible]                  [illegible]    [illegible]    [illegible]
                                  Problem durationᵈ      .33            3.90                         .47            .22            [illegible]
                                  Post SCT latency       .35            4.08*                        [illegible]    [illegible]    [illegible]

Note. SPSS Stepwise Multiple Regression Program (Nie, Hull, Jenkins, Steinbrenner,
& Bent, 1975). SCT = Situational Competency Test.
ᵃ Months employed on current job.
ᵇ Dichotomous variable: Periodic/mixed or steady.
ᶜ Dichotomous variable: Married or not.
ᵈ Months elapsed since first alcoholism treatment.
ᵉ During the 6 months prior to treatment.
* p < .05.
** p < .01.
*** p < .005.
**** p < .001.


Table 4 summarizes the significant findings,
giving simple and multiple correlation coefficients
and F values for individual and combined
predictors. For all outcome indices, the
predictive ability of the SCT latency measure
was equal or superior to that of the most
highly related demographic and drinking
history measures. For instance, for days
abstinent, latency of response to problematic
situations accounted for 53% of the variance.
For days employed and average drinking
period length, for which pretreatment measures
entered the regression equations first, the
latency measure still contributed significantly.
This establishes a relationship between
latency to respond to problematic drinking-related
situations on a verbal role-playing
instrument and actual drinking-related behavioral
functioning following treatment.


Discussion


The skill-training method successfully evaluated
here views the client as an active organism
who can learn to cope with future problems.
Approaches that only use verbal persuasion 
or substitute medication or supportive social 
groups for the treated addiction do not assist 
the client in reevaluating expectations of 
personal efficacy. Recent reviews (Bandura, 
1977) have suggested that a person’s expec- 
tations of success or failure in coping with 
situations are important determinants of 
behavior. Efficacy expectations are modifiable 
through corrective experience and training. 
This study indicates that problem drinkers’ 
responses to situations that present a high 
risk of relapse can be improved through 
training. Since the focus was on training 
problem-solving techniques, rather than a 
repertoire of specific responses, the testing 
situations were not the same as the situations 
used in training. The skill-training intervention
produced longer and more specifically appropriate
verbal behavior in response to these
novel problem situations. These effects would
not have been found unless generalization
took place.

The measure of noncompliance demonstrated
a ceiling effect. Both before and after
treatment, subjects usually indicated that 
they would exercise control over problematic 
situations. The noncompliance of problem 
drinkers’ verbal behavior in situations po- 
tentially leading to drinking probably reflects 
social desirability and demand characteristics 
of the assessment procedures rather than 
behavior in vivo. The sensitivity of the SCT 
to demand characteristics might have been 
accentuated by using the instructional set 
“what would you do” rather than “what are 
you going to do.” On the other hand, pretesting 
indicated that such instructions were necessary 
to minimize frequent denial: “That will never
happen to me!”

The apparent decline of training effects on 
the 3-month posttest may reflect in part that 
alcoholics typically appear for treatment after 
a prolonged period of heavy drinking. Conse- 
quently, they demonstrate cognitive impair- 
ment as evidenced by their low average 
Shipley CQ. The overall mean of 82 (SD 
=15.8) is in the “moderately suspicious” range 
(Shipley, 1940). Research on the same popula- 
tion using Halstead-Reitan measures of cogni- 
tive functioning confirms this finding (Schau 
& O'Leary, 1977). The permanence of alco- 
holics’ cognitive impairment and its relation 
to their drinking behavior is unknown (Rankin, 
1975). Therapies incorporating concept learn- 
ing (such as skill training) when used with 
recently sober problem drinkers whose cerebral 
functioning is at least temporarily impaired 
may require booster sessions administered on 
an outpatient basis for maximum enduring 
effect. The continuing beneficial impact on 
drinking outcome that was found may be due 
to the environmental support from significant 
others occasioned by successful coping with 
problematic situations encountered early in 

the follow-up period. 

Evaluation of the finding that skill training
had a beneficial effect over and above that of
the regular treatment program on posthospitalization
adjustment is dependent on
the validity of the self-report outcome measures.
This study corroborated self-report
data with information collected from collaterals.
Comparison of subject and collateral
reports suggests that certain measures of
drinking behavior are more reliable and less
likely to elicit socially desirable answers than
others. For instance, if a person goes on a
binge and blacks out, the amount drunk during
the binge will not be remembered accurately.
In this respect, drinking disposition is an
improvement over other measures.

Follow-up data indicated that during the 
posttreatment year the drinking behavior of 
the skill-training group and the control groups 
significantly diverged as measured by (a) 
days drunk, (b) amount drunk, and (c) length 
of drinking periods. Differences were found 
even though the regular treatment program was 
intensive. In the most successful behavioral 
study to date (Sobell & Sobell, 1973), much 
of the differential improvement in the function- 
ing of experimental subjects as opposed to 
controls seemed to be accounted for by different 
amounts of controlled drinking (Lloyd & 
Salzberg, 1975). The fact that very few 
controlled drinking days were reported in the
present study may be related to the treatment 
goal of abstinence. The skill-training sessions 
were conducted in accordance with the 
abstinence-oriented philosophy of the treat- 
ment program. The focus was on how not to 
drink rather than how to moderate drinking. 
However, these and other results (Armor, 
Polich, & Stambul, 1976) suggest that a
majority of problem drinkers will resume
drinking to some extent following treatment,
no matter what that treatment consists of.
It seems sensible not to encourage drinking
but to stress prevention by preparing patients
to cope with drinking situations when they
inevitably arise.

A finding that supports the clinical utility
of the SCT and the social competence approach
to problem drinking is that the men in this
sample who had shorter response latencies to
role-played problematic drinking
situations drank less, were employed more,
and had more regular aftercare attendance
following treatment. Of all variables, this
indicator of problem-solving skill was most
highly related to outcome. Avoiding problem
drinking completely may be even more
dependent on the ability to quickly generate
an alternate response to drinking than on the
precise content of the response, as modified
by the skill-training procedure used here.
(Schwartz & Gottman, 1976, make a similar
point regarding college students’ responses to 
problematic situations.) Although latency is a 
difficult response characteristic to modify 
(Eisler, Hersen, & Miller, 1973; Hersen,
Eisler, & Miller, 1974), future skill training
might do well to incorporate explicit procedures
to test whether this relationship is of a causal
nature.

In conclusion, we wish to stress three points.
(a) Skill training (even including booster
sessions) alone, although a potentially effective
component of multimodal treatment, probably
would not be sufficient treatment for a rather
socioeconomically and cognitively impaired
population such as that studied here. (b) The
treatment strategy evaluated here included
not only the training of problem-solving skills
but also behavioral rehearsal of specific
responses (the verification phase). Deter-
mining which of the techniques used was most
responsible for change requires further investi-
gation. Future studies might draw on Bandura’s
(1977) self-efficacy conceptualization of be-
havior change when assessing individual
elements of skill training and incorporating
additional treatment components. (c) Further
research on the characteristics and prevention
of relapse in a variety of problem areas such as
drug addiction, smoking, obesity, and crime is
necessary and warranted. This study, in
combination with Marlatt and Gordon’s
recent results (in press), suggests that different
client populations find different types of
relapse situations most problematic. A reliable
taxonomy of relapse situations related to the
characteristics of different client populations,
together with further refinement of skill-
training procedures, would help in the attempt
to match the treatment to the person.


Reference Note 


1. Hollingshead, A. B. Two-factor index of social
position, 1957. Unpublished manuscript, Yale University,


Evaluating Alcoholism Treatment Programs:
An Integrated Approach 


Ruth C. Cronkite and Rudolf H. Moos 
Department of Psychiatry and Behavioral Sciences
Social Ecology Laboratory, Stanford University School of Medicine 
and Palo Alto Veterans Administration Hospital, Palo Alto, California 


This article examines the interrelationships among five major sets of variables 
(social background, intake symptoms, program type, treatment experiences, and 
perceptions of the environment) that are related to posttreatment functioning
of alcoholic patients (alcohol consumption, rating of drinking problem, physical 
impairment, and occupational functioning). The sample consisted of 429 pa- 
tients selected from five different treatment programs. The relative importance 
of each set of variables as predictors of outcome was estimated by constructing 
block variables, using path analyses, and partitioning the explained variance. 
The results showed that (a) the combined explanatory power of the program- 
related variables is considerably more than would be expected from previous 
research; (b) the importance of patient background relative to intake symp- 
toms varies with the outcome criterion being used; (c) both the treatment ex- 
periences and the patient’s perceptions of the treatment environment are strong 
predictors of outcome; and (d) a substantial proportion of the explained var-
iance is shared between patient-related and program-related variables, suggest-
ing important patient-program selection and congruence effects.


One of the major issues in longitudinal
studies of alcoholic patients is assessing the
relative importance of patient background
and treatment programs in determining out-
come. Although contradictory findings have
been reported, previous research has generally
suggested that patient characteristics at in-
take are most strongly related to outcome
and that treatment programs have little ef-
fect once sociodemographic and functioning
characteristics at intake are taken into ac-
count (Armor, Polich, & Stambul, 1976;
Craft, Sheehan, Driggers, & DuBois, 1975;
Gerard & Saenger, 1966; Pokorny, Miller, &
Cleveland, 1968; Ruggels, Armor, Polich,
Mothershead, & Stephen, 1975).

A related issue involves the relative con- 
tributions of different types of patient charac- 
teristics—in particular, the extent to which 
outcome is related to social background on the 
one hand and drinking symptoms at intake 
on the other. No clear-cut pattern of findings 
has been reported on the relative importance 
of social background variables (such as 
socioeconomic or marital status) compared 
to intake symptoms (such as alcohol con- 
sumption or behavioral or psychological im- 
pairment at intake) in predicting outcome 
(Armor et al., 1976; Craft et al., 1975;
Ruggels et al., 1975). 

Inferences pertaining to both of these is- 
sues have been primarily based on examina- 
tion of the increments in explained variance 
(Armor et al., 1976; Bromet, Moos, Bliss, & 
Wuthmann, 1977; Craft et al., 1975; Rug-
gels et al., 1975). This method is asymmetric
in that it attributes variance shared by two
or more sets of variables to the set that is 
entered first in the regression analysis (such 
as patient background variables), and thus 
may overestimate the contribution of the 
first set by crediting it with both its unique 
and shared variance. In contrast, the incre- 
ment in explained variance attributed to the 
set of variables added last (such as program- 
related variables) represents only the ex- 
plained variance that is unique to that set of 
variables, not the variance that is shared 
with variables that have been entered earlier. 
Consequently, previous inferences about the 
relative effects of program-related variables, 
social background characteristics, and in-
take symptoms on outcome may be mislead-
ing.1 Such inferences have important policy
implications for alcoholism treatment. For 
example, if the unique variance attributed to 
program-related variables is small, and the 
variance that is shared with patient back- 
ground variables is attributed only to back- 
ground characteristics, then researchers may 
conclude that treatment effects are negligible 
and thus recommend less expensive and more 
uniform treatment programs.
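
The asymmetry described above can be illustrated with a minimal sketch for the simplest case of two correlated predictors. The function names and the correlation values (.30, .30, and .50) are hypothetical and are not taken from the study; the sketch only shows that whichever variable is entered first is credited with the shared variance.

```python
def r2_two(r_y1, r_y2, r_12):
    """Squared multiple correlation for two predictors, from simple correlations."""
    return (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)


def increment_orders(r_y_background, r_y_program, r_bp):
    """Explained variance credited to each predictor under both entry orders.

    Each tuple is (variance credited to the first-entered predictor,
    increment credited to the predictor entered last).
    """
    full = r2_two(r_y_background, r_y_program, r_bp)
    return {
        "full_r2": full,
        "background_first": (r_y_background ** 2, full - r_y_background ** 2),
        "program_first": (r_y_program ** 2, full - r_y_program ** 2),
    }


# Hypothetical correlations: both predictors relate equally to outcome.
example = increment_orders(0.30, 0.30, 0.50)
for key, value in example.items():
    print(key, value)
```

With these illustrative values the two predictors are equally related to outcome, yet each appears three times as important when it is entered first as when it is entered last.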

Another unresolved question raised by 
longitudinal studies of alcoholics is the role 
of different types of program-related vari-
ables in predicting outcome. The relation-
ship of treatment variations to outcome has
been approached in several ways. Armor et 
al. (1976) and Kissin, Platz, and Su (1970) 
focused on the type and amount of treat- 
ment both within and across programs. 
Bromet et al. (1977) examined the effect 
of level of participation on outcome in several 
different treatment programs. In addition to 
studying the effects of a variety of treatment 
experiences and treatment programs, Bromet, 
Moos, and Bliss (1976) have focused on an- 
other dimension of program-related variables, 
a patient’s perceptions of the treatment en-
vironment. This approach is based on re-
search which suggests that the social environ-
ments of psychiatric and correctional pro-
grams may be important factors in influencing
outcome (Ellsworth, Maroney, Klett, Gordon,
& Gunn, 1971; Moos, 1974b). Each of these
aspects of treatment variations has been
studied separately, but their effects relative
to each other have not been examined within
a single study.
The purpose of this article is to develop
an integrated approach to studying alcohol-
ism treatment programs that will facilitate
clarification of the following issues: (a) What
are the interrelationships among patient
social background variables, intake symp-
toms, treatment programs, treatment experi-
ences, perceptions of the treatment environ-
ment, and outcome? (b) What is the relative
importance of program-related variables com-
pared to patient background characteristics
in predicting outcome? (c) What is the rela-
tive importance of a patient’s social back-
ground compared to intake symptoms in pre-
dicting outcome? (d) What are the roles of
different aspects of treatment variations in
predicting outcome?


A Model of Treatment Outcome 


In most longitudinal studies of alcoholic
patients, the dependent variables are one or
more outcome criteria related to posttreat-
ment functioning, such as rehospitalization,
alcohol consumption, and occupational, phys-
ical, and psychosocial functioning. The in-
dependent variables vary across studies, de-
pending on their focus. From a review of
previous research, the independent variables
can be divided into “blocks,” labeled as
1, 2, 3, 4, and 5. Block 1 consists of a set of
sociodemographic variables known to be re-
lated to drinking patterns, such as age, sex,
ethnicity, marital status, and socioeconomic
status. Block 2 includes a patient’s drinking
symptoms at intake, more specifically, the
type and severity of alcoholism-related char-
acteristics, such as alcohol consumption,
drinking patterns, physical impairment, and
behavioral impairment. Blocks 3, 4, and 5
refer to program-related variables. Block 3
includes the type of program; Block 4 in-
cludes the amount of various treatment ex-
periences, such as therapy sessions, Alcoholics
Anonymous (AA) meetings, antabuse, and
so forth, in which the patient participates;
and Block 5 includes the patient’s perceptions
of the program environment.

1 See Newton and Spurrell (1967a, 1967b) and
Mood (1971) for a complete technical discussion of
partitioning the sum of squares in regression analysis
and Coleman (1975) for a discussion of this method
when applied to school effects.

As mentioned earlier, researchers have 
focused on comparing Blocks 1 and 2 with 
Blocks 3 and 4 (and to a lesser extent Block 
5) in an attempt to assess the effect on out- 
come of program-related variables compared 
to patient characteristics at intake (Armor 
et al., 1976; Bromet et al., 1977; Ruggels et 
al., 1975). Some of these investigators have
also compared Block 1 variables with Block 2 
variables to focus on the relative contribu- 
tions of social background and intake symp- 
toms (Armor et al., 1976; Ruggels et al., 
1975) in explaining outcome. 

The use of block variables not only serves 
to group conceptually similar variables, but it 
also allows one to formulate a model that 
summarizes the hypothesized causal inter- 
relationships among the block variables, as 
shown in Figure 1. Blocks 1 and 2 represent
characteristics of a patient at the time of
entering a program. Although some of the
Block 1 variables can be regarded as ante-
cedents of the onset of alcoholism (Armor et
al., 1976), they also reflect a patient’s social
characteristics at the time of intake to treat-
ment and can thus be specified as correlated
with intake symptoms (represented by the
bidirectional line, $r_{12}$). Not only are both
social background (Block 1) and intake
symptoms (Block 2) related to a patient’s
outcome after treatment, as shown by the
paths $p_{61}$ and $p_{62}$, but they are also impor-
tant determinants of the type of program
that a patient enters (e.g., patients who
enter private treatment programs tend to be
older, married, and to have higher income,
educational, and occupational levels; see
Bromet et al., 1977), as specified by $p_{31}$ and
$p_{32}$.

The treatment experiences that a patient
receives are almost entirely determined by
the program that a patient enters, which is
specified by the path $p_{43}$. Bromet et al. (1976)
have shown that patient characteristics such
as social background and intake symptoms
are unrelated to the treatment experiences re-
ceived by patients. Consequently, the paths
from Blocks 1 and 2 to Block 4 are not in-
cluded in the model (i.e., $p_{41}$ and $p_{42}$ were
constrained to be zero).2

Similarly, since the environments of treat- 
ment programs vary considerably in the de- 
gree of support, autonomy, clarity, staff con- 
trol, and so on, the program is expected to 
be an important determinant of a patient’s 
perceptions of the treatment environment, 
represented by $p_{53}$ (Bromet et al., 1976). In
addition, it is possible that patients’ percep-
tions of the treatment program may be af-
fected by their characteristics at intake, as
specified by the path $p_{51}$. All of the program-
related variables, Blocks 3, 4, and 5, are ex-
pected to directly affect outcome, as indicated
by paths $p_{63}$, $p_{64}$, and $p_{65}$.

This model summarizes the hypothesized
causal interrelationships among the sets of 
variables related to outcome. It also displays 
the causal reasoning behind the practice of 
entering blocks of variables sequentially into 
the regression analyses in which social back- 
ground (Block 1) and intake symptoms 
(Block 2) precede the type of treatment pro- 
gram, and, in turn, the treatment program 
(Block 3) precedes the other program-related 
variables, treatment experiences (Block 4), 
and perceptions of the environment (Block 
5). Furthermore, this model allows one to 
establish both the direct and indirect paths 
through which each block of variables can 
affect and be affected by other blocks of vari- 
ables. For example, social background not 
only has a direct effect on outcome that is 
independent of the other variables in the 
model, via $p_{61}$, but it can also indirectly affect
outcome via intervening variables, such as the
program ($p_{63}p_{31}$). That is, the compound
path $p_{63}p_{31}$ represents the indirect effect of
social background that is mediated by the pro- 
gram (or the effect of social background that 
is shared with the program effects). Other in-
direct paths include the compound paths in
the model that link two variables via one or
more intervening variables (e.g., $p_{65}p_{51}$,
$p_{65}p_{53}p_{31}$, and $p_{64}p_{43}$). When all of the direct
and indirect effects are taken together with in-
formation on the unique and joint explained
variance, it is possible to assess more accu-
rately the extent to which each set of block
variables contributes to explaining treatment
outcome.

2 Analyses that were undertaken with a fully re-
cursive model revealed that the omitted paths in
Figure 1 ($p_{41}$, $p_{42}$, $p_{52}$, and $p_{54}$) were all close to
zero, thus providing empirical support for the more
constrained model.

Figure 1. A model of treatment outcome.


Method 
Sample, Programs, and Data 


The sample consisted of 429 patients from five
residential alcoholism programs. These facilities,
which were selected because they are representative
of the major types of treatment settings for alco-
holics, include (a) a Salvation Army program offer-
ing milieu therapy and vocational rehabilitation to
skid-row alcoholics; (b) a public hospital-based
facility offering milieu and group treatment and anti-
anxiety medicines and sedatives to low-income pa-
tients; (c) a county-funded halfway house operat[...]
community with individual and [...] recreational
activities; (d) a private aversion-conditioning
program treating middle-[...] patients; and (e) a
private [...] facility emphasizing group and family
therapy (as well as medication) for middle-class and
upper-middle-class patients. (For more detailed
information, see Bromet et al., 1976.)

Patients in the study included approximately 80%
of all admissions during a 10-month average period.
Most of the nonparticipants were those who dropped
out of the program within a few days. Unlike in
other studies, the sample is representative of patients
from a wide range of socioeconomic characteristics
such as education, age, income, employment status,
and marital status.
The data consisted of (a) a background informa-
tion form administered to patients shortly after ad-
mission to a program; (b) a treatment experiences
form completed by a staff member after a patient
left the program; (c) the Community-Oriented Pro-
grams Environment Scale (COPES) administered to
patients about 2 weeks after admission to measure
their perceptions of the program environment; and
(d) a follow-up information form completed by the
patient approximately 6 months after discharge.
Background information form. The following data
on a patient’s social background and drinking symp-
toms at intake came from the background informa-
tion form:
1. Social background (five items): age, sex, marital
status (married, not married), ethnic group (white,
nonwhite), and highest grade of school completed.
2. Drinking symptoms at intake: (a) alcohol con-
sumption, the quantity of alcohol in ounces of ethanol
from beer, wine, and hard liquor consumed on a
typical drinking day (Armor et al., 1976); (b) physi-
cal impairment, a subscale derived from the mean
of 10 items rated on 5-point scales (from never to
often), referring to how often patients experience
delirium tremens or shakes, memory lapses or black-
outs, dry heaves or cold sweat, upset stomach, dizzy
spells, and so forth; (c) subjective rating of drink-
ing problem, ranging from 1 for no problem to 5
for quite often a problem; and (d) occupational
functioning, which refers to whether the patient had
been unemployed during the last 6 months prior to
admission (yes/no).3

3 Although the validity of data based on self-reports
of alcoholics is controversial, Sobell and Sobell
(1975) have shown that it may be relatively reliable
and valid.

Treatment experiences. Together the programs of- 
fered a wide range of treatment experiences. Two 
programs (the hospital-based program and the pri- 
vate milieu-oriented program) offered ataractic medi- 
cation, psychotherapy, and educational interventions. 
Two others (the Salvation Army program and the 
public hospital-based program) emphasized primarily 
psychotherapeutic and educational interventions, 
whereas the aversion-conditioning program relied 
almost entirely on its conditioning schedule (using 
emetic and ataractic drugs) and minimized other 
types of treatment experiences. All programs offered 
AA sessions, even though they varied in the extent 
to which they were emphasized. A total of 12 types 
of treatment experiences were offered across the five 
programs: antianxiety medications, sedatives, vita-
mins, antabuse, psychotherapy sessions, house meet-
ings, AA sessions, educational lectures and films,
recreational activities, other informal group activ- 
ities, and (at the Salvation Army program only) 
Sunday worship and spot jobs. The amount of 
treatment, level of participation, and length of stay 
varied considerably among the patients within each
program.

Perceptions of the environment. The COPES was
used to measure a patient’s perceptions of the pro-
gram environment. The COPES is a scale designed
to measure 10 dimensions of the treatment environ-
ment by having patients respond to 100 true-false
items, which fall into 10 subscales. Three of the
subscales measure personal relationship dimensions
(Involvement, Support, and Spontaneity). Four sub-
scales assess personal development or treatment pro-
gram dimensions, such as the extent to which patients
are encouraged to be self-sufficient and independent
(Autonomy), to prepare for the future (Practical
Orientation), and to have insight into their prob-
lems (Personal Problem Orientation and Anger and
Aggression). The last three subscales (Order and
Organization, Program Clarity, and Staff Control)
assess system maintenance dimensions (see Moos,
1974a, 1974b).

Follow-up information form. The follow-up eval-
uation, conducted approximately 6-8 months after
discharge, was obtained using a questionnaire iden-
tical in content to the background information form
completed at intake. The follow-up form was com-
pleted by 429 patients (87% of the 494 patients
available for follow-up). Four outcome criteria were
used to assess major dimensions of posttreatment
functioning: alcohol consumption in ounces of eth-
anol, subjective rating of drinking problem, physical
impairment, and occupational functioning. These four
dimensions correspond to the four intake symptoms
drawn from the background information form.


Procedure 


There were three major steps in carrying out the 
analyses: (a) construction of “block” variables; (b) 
the use of path analysis to compute the direct, in- 
direct, and total effects among the block variables 
in the model; and (c) the partitioning of the sum 
of squares of the regression analysis into the unique 
variance attributed to each block variable and the 
joint variance shared by combinations of block 
variables. 

Construction of block variables. Coleman (1975) 
has suggested a procedure for constructing a single 
composite or block variable from a set of variables 
that are similar conceptually (e.g., variables such as
age, sex, marital status, education, and ethnicity can 
jointly define a composite variable called “social 
background”). A composite variable is computed as 
a weighted sum of the variables that compose a 
block. The weights are the regression coefficients 
obtained when the set of variables are the only in- 
dependent variables used to predict the dependent 
variable. For example, a single variable representing 
social background would be the weighted sum of the 
various social background variables, the weights 
being the regression coefficients obtained when a par- 
ticular outcome criterion (such as alcohol consump- 
tion) is the dependent variable. Such a composite 
variable no longer has a natural metric, but it still 
ranges from low to high levels of social background, 
where a high level refers to those combinations of 
social background characteristics that are associated 
with better outcome. 
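
The construction just described can be sketched in a few lines. The sketch below is illustrative only: the helper names (`solve`, `ols_slopes`, `block_variable`) and the six-case data set are hypothetical, and ordinary least squares is assumed as the way the regression weights are obtained. The block variable is the weighted sum of the variables in the block, with the regression slopes as weights.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[pivot] = M[pivot], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x


def ols_slopes(X, y):
    """Least-squares slopes of y on the columns of X (intercept removed by centering)."""
    n, m = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(m)]
    y_mean = sum(y) / n
    Xc = [[row[j] - means[j] for j in range(m)] for row in X]
    yc = [v - y_mean for v in y]
    xtx = [[sum(Xc[i][a] * Xc[i][c] for i in range(n)) for c in range(m)] for a in range(m)]
    xty = [sum(Xc[i][a] * yc[i] for i in range(n)) for a in range(m)]
    return solve(xtx, xty)


def block_variable(X, y):
    """Composite score per case: weighted sum of block members, weights = slopes."""
    weights = ols_slopes(X, y)
    return [sum(w * x for w, x in zip(weights, row)) for row in X]


# Hypothetical block with two members (age, years of schooling) and one outcome.
X_block = [[30, 12], [45, 10], [50, 16], [38, 8], [60, 12], [27, 14]]
outcome = [4, 6, 2, 7, 5, 3]
composite = block_variable(X_block, outcome)
```

A useful property of a composite built this way is that regressing the same outcome on the composite alone yields a slope of 1, so the composite carries exactly the predictive information of its block for that criterion. This is also why a separate composite is needed for each outcome criterion.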

Four composite variables representing social back- 
ground (Block 1), program type (Block 3), treat- 
ment experiences (Block 4), and perceptions of the 
program environment (Block 5) were constructed 
separately for each of the four outcome criteria. That
is, by using a particular outcome criterion, such as 
alcohol consumption, regression coefficients were ob- 
tained separately for each set of variables that com- 
posed a block variable and were used as weights to 
construct that block variable. Four sets of four block 
variables were computed from regression coefficients 
obtained for each of the four outcome criteria.4

Instead of constructing a composite intake symp-
tom variable, the intake symptom corresponding to
the outcome criterion in each model was selected.
Using a single intake symptom in this case serves to
clarify the relationship between the same type of
alcoholism-related characteristics at two points in
time.5

4 To compare the regression results using the
newly constructed block variables with a full regres-
sion model, the coefficients of the block variables
were multiplied by each of the coefficients actually
obtained from a full regression model. This compari-
son confirmed that the general pattern of results is
similar even though the composite variable model is
more restricted (in the way that it allows the vari-
ables within and across models to be cross-corre-
lated).

Calculation of direct, indirect, and total effects. 
By using path analysis with recursive models, it is 
possible to calculate the direct, indirect, and total 
effects of each block variable (and the single intake 
symptom) on the other variables in the model. The 
general method involves estimating the structural 
equations that correspond to a recursive model, such 
as the one presented in Figure 1. Alwin and Hauser 
(1975) suggested a general method for decomposing 
total effects into their direct and indirect effects 
through the estimation of successive reduced-form 
equations.

The model illustrated in Figure 1 can be repre-
sented by the following set of equations:

$$X_3 = p_{31}X_1 + p_{32}X_2 + e, \tag{1}$$
$$X_4 = p_{43}X_3 + u, \tag{2}$$
$$X_5 = p_{51}X_1 + p_{53}X_3 + v, \tag{3}$$
$$X_6 = p_{61}X_1 + p_{62}X_2 + p_{63}X_3 + p_{64}X_4 + p_{65}X_5 + w, \tag{4}$$

where $X_1$ = the block variable representing social
background, $X_2$ = the selected intake symptom, $X_3$ =
the block variable representing “program type,” $X_4$
= the block variable representing “treatment experi-
ences,” $X_5$ = the block variable representing “percep-
tions of the program environment,” $X_6$ = the selected
outcome criterion, and $e$, $u$, $v$, and $w$ are random
error terms.

Consistent with conventional notation in path anal-
ysis, direct effects are represented by $p$s (e.g., $p_{61}$
is the direct effect of social background on outcome).
Total effects are represented by $q$s (e.g., $q_{61}$ is the
total effect of social background on outcome). A
total effect is the sum of the direct effect (e.g., $p_{61}$)
and any indirect effects via intervening variables
(e.g., $q_{61} = p_{61} + p_{63}p_{31} + p_{64}p_{43}p_{31} + p_{65}p_{51} + p_{65}p_{53}p_{31}$).
The total effect is equal to the direct effect if there
are no indirect effects (e.g., $q_{65} = p_{65}$), and it is equal
to an indirect effect if there is no direct path (e.g.,
$q_{41} = p_{43}p_{31}$).

All of the direct, indirect, and total effects corre-
sponding to the model presented in Figure 1 and
specified by Equations 1-4 were estimated for the
four outcome criteria. This decomposition of the
total effects allows for a detailed examination of the
interrelationships among the variables related to
outcome.
Path coefficients (standardized regression coeffi-
cients), rather than [...] coefficients, [...] units.

Partitioning the sum of squares. As noted above,
previous research on alcoholism treatment programs,
which compares the amount of explained variance
accounted for by different types of variables, has 
been asymmetric because of the sequential order in 
which variables are entered into the regression anal- 
ysis. That is, the variance associated with the vari- 
able entered first includes not only the explained 
variance that is unique to that variable but also the 
variance that it shares with other variables that are 
entered later in the analysis. Consequently, the in- 
crement in explained variance reflects only the unique 
variance associated with the added variable, since 
any variance that it shares with preceding variables 
has already been accounted for. 

Newton and Spurrell (1967a, 1967b) and Mood 
(1971) have outlined a procedure for making more 
symmetric comparisons by calculating the unique 
and shared variance accounted for by each variable 
or combination of variables in a regression equation, 
This analysis was carried out for each of the four 
outcome criteria by varying the sequential order in 
which the four block variables and the single intake 
variable were included in the regression. The ex-
plained variance was then partitioned into the vari-
ance attributed uniquely to each block variable (or
the single intake variable) and the variance shared 
by the various combinations of variables. 
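
The partitioning can be sketched as follows, assuming that the explained variance ($R^2$) is available for the regression on every subset of predictors. The function name `commonality`, the three predictor labels, and all $R^2$ values are hypothetical; the inclusion-exclusion rule used here is the standard commonality formula, offered as an illustration rather than as the exact computation carried out in the study.

```python
from itertools import combinations


def commonality(r2, predictors):
    """Partition the full-model R^2 into unique and shared components.

    r2 maps a frozenset of predictor names to the R^2 obtained when only
    those predictors are in the regression.
    """
    full = frozenset(predictors)

    def explained(subset):
        return r2[frozenset(subset)] if subset else 0.0

    parts = {}
    for size in range(1, len(predictors) + 1):
        for group in combinations(predictors, size):
            rest = full - set(group)
            component = 0.0
            # Inclusion-exclusion over sub-collections of the group.
            for k in range(size + 1):
                for sub in combinations(group, k):
                    component += (-1) ** (k + 1) * explained(rest | set(sub))
            parts[group] = component
    return parts


# Hypothetical R^2 values for every subset of three predictor sets.
names = ["background", "intake", "program"]
r2_example = {
    frozenset(["background"]): 0.10,
    frozenset(["intake"]): 0.08,
    frozenset(["program"]): 0.12,
    frozenset(["background", "intake"]): 0.15,
    frozenset(["background", "program"]): 0.16,
    frozenset(["intake", "program"]): 0.17,
    frozenset(["background", "intake", "program"]): 0.20,
}
parts = commonality(r2_example, names)
```

Components keyed by a single name are unique variances; components keyed by two or more names are variances shared by that combination. The components always sum to the $R^2$ of the full regression, which provides a simple check on the arithmetic.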


Results 


Three sets of analyses are presented: (a) 
the interrelationships among the patient back-
ground and program-related variables; (b) a
comparison of total, direct, and indirect effects
of patient background and program-related
variables on outcome; and (c) an examination
of the unique and shared explained variance 
attributed to the variables used to predict 
outcome. 

The path diagrams of the four selected out-
come criteria are shown in Figures 2-5. The
path coefficients corresponding to the effect of
the patient background variables (social back-
ground and the intake symptom) on the pro-
gram-related variables (program type, treat-
ment experiences, and perceptions of the treat-
ment environment) are listed in Table 1, along
with estimates of their total and indirect ef-
fects. The rest of the path coefficients shown
in Figures 2-5, as well as the corresponding
total and indirect effects, are displayed in
Table 2. These estimates reflect the effect of
the patient background variables (social back-
ground and intake symptom) and the pro-
gram-related variables (program type, treat-
ment experiences, and perceptions of the treat-
ment environment) on outcome.

5 Some analyses were conducted using composite
intake symptom variables. The results were very
similar to those that used only a single intake symp-
tom, with a tendency for the composite variable to
have a slightly stronger effect on outcome.

6 Since any analysis may be subject to particular
biases and instabilities, both path analysis and par-
titioning of the explained variance are used. This
allows one to have more confidence in the results and
minimizes the likelihood that the results are due to
possible instabilities or biases in the regression
weights.


Relationship of Patient Characteristics to 
Program-Related Variables 


Determinants of program selection. The
estimates of $p_{31}$ shown in Figures 2-5 (also
listed in Table 1) for the four outcome cri-
teria show that the composite background
variable is an important determinant of the
type of program that a patient enters. (All
are statistically significant at the .05 level.)
The path coefficients across three of the
models are very close, ranging from .463 in
the physical concomitants model to .559 in
the alcohol consumption model. The slightly
lower coefficient in the occupational function-
ing model may result from the nature of this
particular outcome criterion [i.e., the effect of
the intake characteristic, occupational func-
tioning ($p_{32}$), probably has accounted for
some of the effect of social background, be-
cause of its similarity to the other social back-
ground characteristics].

The estimates of the effects of the other
intake functioning characteristics ($p_{32}$) indi-
cate that these three alcoholism-related symp-
toms are not consistently strong determinants
of the type of program that a patient enters,
as shown by coefficients ranging from −.150
to .015. The two intake variables that had
relatively strong effects (occupational func-
tioning and physical concomitants) are those
that were most highly correlated with social
background.

Determinants of treatment experiences.
The only variable hypothesized to have a di-
rect effect on treatment experiences is the pro-
gram type, and, in fact, the estimates of the
effect of the program ($p_{43}$) are relatively large
(and statistically significant), ranging from
.460 to .689 across all four models (see Table
1 and Figures 2-5). This finding is consistent
with expectations, since the programs almost
entirely determine the type of treatment ex-
periences that a patient has.

Although neither social background nor the
intake symptoms were hypothesized to have a
direct effect on treatment experiences, their
indirect effects via the program type were
estimated and are listed in Table 1. The in-
direct effects of the intake symptoms via the
program type (p43p32) are relatively small, as
shown by estimates ranging from -.089 to
.007. In contrast, social background seems to
have a stronger indirect effect, with estimates
of p43p31 between .124 and .360. This implies that
patients with a “higher” level of social back-
ground tend to enter and/or more actively
participate in programs offering those treat-
ment experiences that are associated with
better outcome.
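
As a worked illustration of how such indirect effects arise, an indirect effect in a path model is the product of the coefficients along the path. The short Python sketch below multiplies the social background to program coefficient (p31) by the program to treatment experiences coefficient (p43), using the Table 1 estimates as they can be read for three of the models. The dictionary layout and the function name are illustrative assumptions, not part of the original analysis.

```python
# Indirect effect of social background (1) on treatment experiences (4)
# via program type (3): the product p43 * p31.
# Coefficients are Table 1 estimates; all names here are illustrative only.
P31 = {
    "alcohol consumption": 0.559,
    "rating of drinking problem": 0.522,
    "physical concomitants": 0.463,
}
P43 = {
    "alcohol consumption": 0.460,
    "rating of drinking problem": 0.689,
    "physical concomitants": 0.592,
}


def indirect_effect(first_leg, second_leg):
    """Multiply the two path coefficients that make up a two-step path."""
    return first_leg * second_leg


indirect = {m: round(indirect_effect(P31[m], P43[m]), 3) for m in P31}
for model, value in indirect.items():
    print(f"{model}: p43*p31 = {value:.3f}")
```

Under these assumptions the products agree, to three decimals, with the indirect effects of .257, .360, and .274 listed for those models.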

Determinants of perceptions of the environ-
ment. The estimates of p51 in Table 1 show
that social background has a positive effect
on a patient’s perceptions of the program en-
vironment (with the exception of the model
of occupational functioning), suggesting that
patients with higher levels of social back-
ground characteristics tend to perceive the
environment slightly more positively. The de-
composition of the total effect of social back-
ground on perceptions of the environment
(q51) indicates that most of the total effect
is due to the indirect positive effect of social
background via the program type (p53p31),
implying that those patients with higher levels
of social background characteristics enter pro-
grams that provide more positive treatment
environments. Similar to the effects of the
program type on treatment experiences, the
direct effects of the program type on percep-
tions of the environment (p53) are relatively
close and large (ranging from .434 to .552),
indicating that the major determinant of a
patient’s perceptions is the program that the
patient is in.
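
The decomposition described here can be sketched numerically: a total effect is the direct coefficient plus the product of coefficients along each indirect path. The Python fragment below applies this to the alcohol consumption model, using p51 = .185, p53 = .434, and p31 = .559 as read from Table 1. The helper function and variable names are hypothetical and serve only to illustrate the arithmetic.

```python
# Total effect of social background on perceptions of the environment:
# q51 = p51 (direct) + p53 * p31 (indirect via program type).
# Values are Table 1 estimates for the alcohol consumption model.
p31 = 0.559  # social background -> program
p53 = 0.434  # program -> perceptions of the environment
p51 = 0.185  # social background -> perceptions (direct)


def total_effect(direct, indirect_paths):
    """Add the product of coefficients along each indirect path to the direct effect."""
    total = direct
    for path in indirect_paths:
        product = 1.0
        for coefficient in path:
            product *= coefficient
        total += product
    return total


q51 = total_effect(p51, [(p53, p31)])
indirect_share = (q51 - p51) / q51
print(f"q51 = {q51:.3f}; indirect share of total = {indirect_share:.0%}")
```

With these inputs the indirect path accounts for more than half of the total effect, which is the pattern the text describes.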


Relationship of Patient Background and 
Program-Related Variables to Outcome 


Table 2 displays the direct, indirect, and total effects of each of
the five variables hypothesized to affect outcome. The results show
some clear-cut consistencies across the models. The estimates of p61
(three of which are statistically significant) suggest that the higher
the level of social background, the less severe the alcoholism-related
symptoms are at follow-up. Similarly, there is a relatively strong
association between a patient’s intake symptoms and the corresponding
outcome criterion at follow-up, indicated by the estimates of p62
(except for a weak effect in the alcohol consumption model). In fact,
for three of the outcome criteria, the intake symptoms had slightly
stronger direct effects than social background. The indirect effects
of social background that are mediated by the program-related
variables together represent between 11% and 73% of their total
effect, whereas the indirect effects of the intake symptom together
represent between 7% and 23% of their total effect. These results
indicate that in three of the models, a substantial proportion of the
total effect of social background is via indirect effects that are
shared with the program-related variables. In contrast, most of the
total effect of the intake symptoms is direct (77%-93%) and thus not
shared with the program-related variables.

There is a strong total effect of program type on outcome (q63). When
the treatment experiences and perceptions of the environment are taken
into account, however, the program variable has little direct effect
(p63) (i.e., almost all of the total effect of the program is mediated
by the two program-related variables, shown by the indirect effects,
p64p43 and p65p53, in Table 2).

Treatment experiences (p64) and perceptions of the environment (p65)
are both strongly associated with outcome. For all of the outcome
criteria except occupational functioning, the direct effect of the
perceptions of the environment (p65) is surprisingly strong,
suggesting that perceptions of the treatment program may be important
predictors of outcome. However, there is substantial interdependence
among the program-related variables; only 0%-27% of the direct effect
of treatment experiences and 1%-33% of the direct effect of
perceptions of the environment is independent of all prior variables
(social background, intake symptoms, and program type).


Figure 2. Estimation of path coefficients for alcohol consumption.
(FIF = follow-up information form; BIF = background information form;
B = background; I = intake; P = program; T = treatment experiences;
E = treatment environment.)

Figure 3. Estimation of path coefficients for rating of drinking
problem. (FIF = follow-up information form; BIF = background
information form; B = background; I = intake; P = program;
T = treatment experiences; E = treatment environment.)

Figure 4. Estimation of path coefficients for physical concomitants.
(FIF = follow-up information form; BIF = background information form;
B = background; I = intake; P = program; T = treatment experiences;
E = treatment environment.)

Figure 5. Estimation of path coefficients for occupational
functioning. (FIF = follow-up information form; BIF = background
information form; B = background; I = intake; P = program;
T = treatment experiences; E = treatment environment.)


Table 1
Direct, Indirect, and Total Effects of Patient-Related Block Variables
on Program-Related Block Variables

                                          Type of intake and outcome criterion used
                                                     Rating of
Dependent variable,                       Alcohol    drinking  Physical     Occupational
independent variable, and effect          consumption problem  concomitants functioning

Program
  Social background
    Direct & total effect: p31 (q31)
    (no indirect effects)                   .559*     .522*     .463*        .219*
  Corresponding intake symptom
    Direct & total effect: p32 (q32)
    (no indirect effects)                   .015     -.080     -.150*       -.150*

Treatment experiences
  Social background
    Indirect effect via program: p43p31
    (also total effect, q41;
    no direct effect)                       .257      .360      .274         .124
  Intake symptom
    Indirect effect via program: p43p32
    (also total effect, q42;
    no direct effect)                       .007     -.055     -.089        -.086
  Program
    Direct effect: p43                      .460*     .689*     .592*        .570*

Perceptions of environment
  Social background
    Direct effect: p51                      .185*     .104      .122*       -.025
    Indirect effect via program: p53p31     .243      .269      .256         .112
    Total effect: q51 (p51 + p53p31)        .428      .373      .378         .087
  Intake symptom
    Indirect effect via program: p53p32
    (also total effect, q52;
    no direct effect)                       .007     -.041     -.083        -.017
  Program
    Direct effect: p53                      .434*     .515*     .552*        .511*

* p < .05 (calculated for direct effects only).


Table 2
Direct, Indirect, and Total Effects of Patient-Related Variables and
Program-Related Block Variables on Four Outcome Criteria

                                                     Rating of
                                          Alcohol    drinking  Physical     Occupational
Effects                                   consumption problem  concomitants functioning

Direct
  Social background: p61 (B)               -.167*  [illegible] [illegible]  -.171*
  Corresponding intake symptom: p62 (I)  [illegible] [illegible] [illegible] [illegible]
  Program: p63 (P)                          .068     -.014   [illegible]  [illegible]
  Treatment experiences: p64 (T)           -.295*    -.155*    -.165*     [illegible]
  Perceptions of environment: p65 (E)      -.190*    -.283*    -.264*     [illegible]

Indirect
  B via
    P: p63p31                               .038     -.007      .040      [illegible]
    E: p65p51                              -.035     -.029     -.032         .001
    P & T: p64p43p31                       -.076     -.056     -.045        -.021
    P & E: p65p53p31                       -.046     -.076     -.067        -.005
  I via
    P: p63p32                               .001      .001     -.013      [illegible]
    P & T: p64p43p32                       -.002      .009      .015      [illegible]
    P & E: p65p53p32                       -.001      .012      .022         .003
  P via
    T: p64p43                              -.136     -.107     -.098        -.095
    E: p65p53                              -.082     -.146     -.146        -.023

Total (a)
  B: q61                                   -.286     -.229     -.213        -.193
  I: q62                                    .047      .193      .259         .284
  P: q63                                   -.150     -.267     -.158        -.103

(a) In the context of the causal model used here (see Figure 1), the
total effects of treatment experiences and perceptions of the
environment, q64 and q65, respectively, are the same as their direct
effects, p64 and p65. Consequently, they are only listed under the
direct effects. The total effects listed here were calculated as the
sum of the direct and indirect effects.
* p < .05 (calculated for direct effects only).


Unique and Joint Contributions of the
Explained Variance


Table 3 presents the results of partitioning
the explained variance for the four outcome
criteria. The first five rows display the
unique and shared variance attributed to the
patient background variables, and the next
rows show the unique and shared vari-
ance attributed to the program-related vari-
ables. The rest of the table displays the
explained variance that is shared among com-
binations of the patient-related and program-
related variables.
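
A minimal sketch of the partitioning logic, reduced to two blocks, may help: the unique contribution of a block is what it adds to R² beyond the other block, and the shared part is what remains. The function below is an assumed two-block simplification, not the full procedure used for Table 3. The single-block R² inputs are not reported in the article; they are implied here by adding the unique and shared entries in the alcohol consumption column of Table 3 (.022 + .056 and .104 + .056).

```python
# Two-block commonality partition of explained variance.
# r2_a, r2_b: R^2 from each block alone; r2_ab: R^2 from both blocks together.
def partition(r2_a, r2_b, r2_ab):
    unique_a = r2_ab - r2_b       # what block A adds beyond block B
    unique_b = r2_ab - r2_a       # what block B adds beyond block A
    shared = r2_a + r2_b - r2_ab  # variance that either block could claim
    return {"unique_a": unique_a, "unique_b": unique_b, "shared": shared}


# Illustrative inputs implied by the alcohol consumption column of Table 3
# (block A = patient-related, block B = program-related); assumed, not reported.
TOTAL_R2 = 0.182
parts = partition(r2_a=0.078, r2_b=0.160, r2_ab=TOTAL_R2)
for name, value in parts.items():
    print(f"{name}: {value:.3f} ({value / TOTAL_R2:.0%} of R^2)")
```

The three components always sum to the total R², which is why crediting the shared part to one block alone overstates that block's contribution.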

The pattern of results is similar to those 
obtained from the path analysis. In general, 
both social background and the intake symp- 
toms contribute substantially to the explained 
variance, with the combined contribution of 
their unique and shared variance ranging from 
12% to 61% of the explained variance. In 
three of the models, the unique contributions 


1 These percentages were calculated by subtracting
out all other paths that are causally prior to the
direct path of interest. (See Coleman, 1975, for a
more detailed discussion of these procedures.)


Table 3
Partitioning of Explained Variance for Four Outcome Criteria

                                                     Rating of
                                          Alcohol    drinking  Physical
Variable                                  consumption problem  concomitants

Patient related
  Unique variance
    Background                              .019ᵃ     .003      .019ᵃ
    Intake                                  .001      .028ᵃ     .047ᵃ
  Shared variance
    Background and intake                   .002      .003      .019ᵃ
  Subtotal                                  .022ᵃ     .034ᵃ     .085ᵃ
  Proportion of R²                          .12       .16       .32

Program related
  Unique variance
    Program                                 .002      .000      .003
    Treatment experiences                   .067ᵃ     .012ᵃ     .017ᵃ
    Perceptions of treatment environment    .025ᵃ     .048ᵃ     .043
  Shared variance
    Sum of all combinationsᵇ                .010ᵃ     .047ᵃ     .013ᵃ
  Subtotal                                  .104ᵃ     .107ᵃ     .076ᵃ
  Proportion of R²                          .57       .49       .28

Shared variance among background,
intake, & program-related variables
    Sum of all combinationsᶜ                .056ᵃ     .075ᵃ     .106ᵃ
    Proportion of R²                        .30       .35       .40

Total R²                                    .182      .217      .268

Note. B = background; I = intake; P = program; T = treatment
experiences; and E = perceptions of treatment environment.
ᵃ The explained variance is greater than 1%.
ᵇ These combinations are PT, PE, TE, PTE.
ᶜ These combinations are BP, BT, BE, BPT, BTE, BPE, BPTE, IP, IT, IE,
IPT, ITE, IPE, IPTE, BIP, BIT, BIE, BIPT, BIPE, BITE, and BIPTE.


of the intake symptoms are stronger than the 
unique explained variance attributed to social 
background. When taken together with the 
path analyses, the findings suggest that the 
intake symptom has a relatively stronger re-
lationship to outcome than does social back-
ground.

The total of the unique and shared variance
attributed to the program-related variables
ranges from 16% to 57% of the explained
variance. The unique variance accounted for
by the program type is very small, whereas
that accounted for by the treatment experi-
ences and perceptions of the environment is
larger. This implies that most of the unique
variance accounted for by the program-related
variables is due to either treatment experi-
ences or perceptions of the environment. For
two outcome criteria (rating of drinking
problem and physical concomitants), percep-
tions of the environment account [...]
explained variance, whereas treatment experi-
ences have more explanatory power [...]
patient background variables (in three [...]
or any of the other variables in the [...]
two models).

Since each program determines the treat-
ment experiences and the environment it
creates, some of the explained variance is
shared among combinations of the program-
related variables. This corresponds to the path
analysis, which shows that most of the total
effect of the program type is due to indirect
effects that are shared with the other pro-
gram-related variables.
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In comparing the contribution of the pro- 
gram-related variables with that of the pa- 
tient background variables, there is no group 
of block variables that clearly has the most 
explanatory power across all models. Patient- 
related variables account for more explained 
variance than program-related variables when 
predicting occupational functioning, whereas 
their relative contributions are approximately
equal when predicting physical concomitants. 
In contrast, the explained variance attributed 
to program-related variables is greater in the 
models of rating of drinking problem and al- 
cohol consumption. 


Discussion 


We have used an integrated approach to ex-
amine the interrelationships among five major
sets of variables that are related to treatment
outcome. This was done by (a) formulating
a model that explicitly specified the hypothe-
sized causal ordering among these sets of
variables; (b) constructing block variables
to summarize the effects of sets of concep-
tually similar variables; (c) decomposing the
total effects of each block variable into its
direct and indirect effects; and (d) partition-
ing the explained variance into the unique
contributions of each block variable and the
shared contributions among combinations of
block variables. The findings clarify some im-
portant issues concerning the way in which
patient-related and program-related variables
contribute to explaining outcome.
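
Step (c) can be illustrated with a small path-tracing sketch: in a recursive model, the total effect of one variable on another is the sum, over every directed path between them, of the product of the coefficients along that path. The Python code below is an assumed, simplified rendering of that idea, not the authors' computation. It uses the alcohol consumption coefficients as they can be read from Tables 1 and 2; the direct intake-to-outcome path (p62) is left out, so only the effects of social background and program type are traced. The node numbering and function names are hypothetical.

```python
# Total effects in a recursive path model by tracing every directed path.
# Nodes: 1 = social background, 2 = intake symptom, 3 = program,
# 4 = treatment experiences, 5 = perceptions of environment, 6 = outcome.
# PATHS[(to, from)] is the direct coefficient p_to,from (alcohol consumption
# model, as read from Tables 1 and 2; p62 is omitted from this sketch).
PATHS = {
    (3, 1): 0.559, (3, 2): 0.015,
    (4, 3): 0.460,
    (5, 1): 0.185, (5, 3): 0.434,
    (6, 1): -0.167, (6, 3): 0.068, (6, 4): -0.295, (6, 5): -0.190,
}


def total_effect(source, target, paths=PATHS):
    """Sum of coefficient products over all directed paths source -> target."""
    if source == target:
        return 1.0
    return sum(
        coef * total_effect(to, target, paths)
        for (to, frm), coef in paths.items()
        if frm == source
    )


def decompose(source, target, paths=PATHS):
    """Split a total effect into its direct and indirect parts."""
    direct = paths.get((target, source), 0.0)
    total = total_effect(source, target, paths)
    return {"direct": direct, "indirect": total - direct, "total": total}


for label, node in [("social background", 1), ("program", 3)]:
    parts = decompose(node, 6)
    print(label, {k: round(v, 3) for k, v in parts.items()})
```

Under these assumptions the traced totals round to the q61 and q63 values listed for the alcohol consumption model in Table 2, and the program's total effect is almost entirely indirect.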

Consistent with previous research (Armor
et al., 1976; Bromet et al., 1977; Craft et al.,
1975; Ruggels et al., 1975), the results show
that social background and intake symptoms
are relatively strong predictors of outcome.
In addition, although most of the total effect
of the intake symptoms is direct, social back-
ground has substantial indirect effects that
are mediated by the program-related vari-
ables. These effects represent shared variance
between sociodemographic characteristics and
program-related variables and cannot be at-
tributed solely to either set of variables alone.

Although the relative importance of social
background and intake symptoms is not con-
sistent across all four models, the intake
symptoms have stronger direct effects and
account for a larger proportion of the ex-
plained variance for three of the four out-
come criteria. Even though no clear-cut pat-
tern has been reported in previous research,
these findings suggest that the intake symp-
toms may be slightly stronger predictors of
some outcome criteria than social background.
More research on different types of patient
functioning characteristics is needed to clarify
this issue.

Previous research has reported that treat- 
ment does contribute to improved functioning 
from intake to follow-up; that is, recovery 
rates of treated alcoholic patients vary di- 
rectly with the amount of treatment, length 
of stay, and level of participation in pro- 
gram activities (Armor et al., 1976; Bromet 
et al., 1977; Craft et al., 1975). Although
the total effect of program type on outcome
is relatively strong in the present analyses,
it is primarily indirect and accounted for
through the effects of treatment experiences
and perceptions of the environment. The
strong relationship between these program- 
related variables and outcome may reflect 
not only treatment effects but also the com- 
bined effects of the patient’s motivation to 
recover, a more positive attitude toward the 
program, better functioning within the pro- 
gram, and a greater probability of partici- 
pating in aftercare services (Craft et al., 
1975; Pisani, 1969; Pratt, Linn, Carmichael, 
& Webb, 1977). 

In comparing the importance of patient
characteristics relative to program-related
variables as predictors of outcome, previous
studies have reported relative uniformity in
outcome among programs when patient-
related variables are taken into account
(Armor et al., 1976; Ruggels et al., 1975).
Contrary to previous findings, the combined
unique effects of the program-related vari-
ables observed here are substantial. In fact,
except for occupational functioning, the
proportion of explained variance uniquely
accounted for by program-related variables
is almost equal to or greater than that
uniquely accounted for by patient-related
variables. These findings indicate that pro-
gram-related effects may be more important
than would be expected from previous re-
search.

The results also show that 23%-40% of
the total explained variance is shared be- 
tween patient-related and program-related 
variables (see Table 3). From a different 
perspective, between 28% and 72% of the 
total effects of the patient characteristics are 
shared with the program-related variables. 
In previous research, the patient-related vari- 
ables have been credited with these shared 
effects, leading researchers to underestimate 
the explanatory power of program-related 
variables. Given that a substantial portion 
of the variance is shared between these two 
groups of variables, researchers may have 
been too limited in their analyses by looking 
primarily for either patient or program 
variance. The effects of certain combinations
of patient and program-related variables, such 
as patient-program selection and congruence 
effects, may be particularly important for 
understanding patterns of patient improve-
ment (Pattison, 1976).

Although there are some consistencies 
across all four models, there are also some 
important differences. For example, although
the intake symptom is one of the two most 
important predictors for three outcome cri- 
teria, it is least important in the alcohol con- 
sumption model. Apparently, the level of 
alcohol consumption is more strongly influ- 
enced by treatment experiences, perceptions 
of the treatment environment, and social 
background. In contrast, occupational func- 
tioning at follow-up is most strongly affected 
by occupational functioning at intake and 
much less influenced by the program-related 
variables. These variations indicate that the
importance of program-related variables rela- 
tive to background variables depends on the 
outcome criterion used. 


Consistent with [...] pro-
gram selection, our findings show that social
background is an [...]
the type of program that a patient enters
(Armor et al., [...]
In addition, the [...] concomi-
tants), the intake symptoms are also a sig-
nificant determinant of program selection.
These two intake symptoms are the most
highly correlated with social background, and
their effects may thus reflect some [...]
effects of social background.

The program is by far the most important
determinant of a patient’s treatment experi-
ences and perceptions of the environment.
Although the overall effect of social back-
ground on perceptions of the environment is
positive here, other findings (Moos [...],
1978) suggest that the effect is [...]
small, and that the effects of different back-
ground variables may vary (e.g., [...]
perceive the environment more positively,
whereas better educated patients may per-
ceive it more negatively). Social background
also has substantial indirect effects, via the
program, on both treatment experiences and
perceptions of the environment. These in-
direct effects suggest that patients with
higher social background levels either enter
and/or participate more actively in programs
offering environmental and treatment ex-
periences associated with better outcome. In
addition, the indirect effects may reflect the
fact that patients with certain background
characteristics are (a) more motivated to be-
come involved in program activities and to
receive more treatment and (b) function bet-
ter within the program and thus perceive the
environment as more positive.

One of the most distinctive features of the
present analysis is the use of block variables.
The method of determining weights [...]
used is one of several approaches that can
be used. Alternative methods include giving
the variables equal weights by averaging
them, giving one or more variables a weight
of one and others a weight of zero by drop-
ping variables from the set, and using vari-
ous types of factor analyses. The important
point is that by grouping together sets of
conceptually similar variables that are
treated as one, subsequent analyses provide
a more integrated and comprehensive picture
of a complicated pattern of effects.

Between 18% and 27% of the [...]
variance in treatment outcome is explained by
all of the block variables taken together.
Although this is a somewhat larger propor-
tion of the overall variance than is ac-
counted for in most similar studies, the ma-
jority of the variance in outcome is
unaccounted for by either patient-related or
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program-related variables. Further research 
is needed to identify other important factors, 
such as the environmental resources available 
to patients in community settings (Bromet 
& Moos, 1977), that may contribute to the 
remaining variance. Nevertheless, the results 
presented here indicate that the proportion 
of variance in the outcome criteria uniquely 
associated with program-related variables is 
as great as or greater than that uniquely as-
sociated with patient-related variables. Per-
haps the most important conclusion to be de-
rived from these results is that alcoholism
treatment programs may have more substan- 
tial differential effects on outcome than 
previous literature has suggested. Since some 
of these effects are probably related to 
patient-program selection and congruence, it 
may be unwise to implement the policy of 
placing all patients in uniform low-cost 
treatment programs. 
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Knowledge of the effect of therapist training and experience on the outcome of 
treatment of schizophrenic patients is scanty. This article presents data system-
atically collected in the course of a controlled comparison of the effects of five
different treatment methods in schizophrenia. Among the 23 outcome variables
studied, there was not a single instance in which the effect of therapist experi-
ence and general clinical ability was significantly related to outcome. There ap-
peared to be, however, differences among therapists’ results that were not re-
lated to experience and general clinical ability, particularly in relation to the
length of time that they kept their patients in hospital. Drug treatment tended
to override but perhaps not entirely eliminate these effects.


Do Experienced Therapists Get
Better Results?


Research evidence on this question is scanty.
The vast gap in knowledge is typically filled in
with a mixture of myth, speculation, anecdote,
and weak research evidence, mostly relating
to psychotherapy (Betz & Whitehorn, 1956;
Karon & VandenBos, 1972, 1975; Tuma &
May, 1975). The notion that experience and
training might influence the results of treat-
ments other than psychotherapy has been
largely ignored or depreciated.

This article presents data on this issue that
were systematically collected in the course of
a controlled comparison of the outcome of five
different methods of treatment in schizophrenia.

Background and Literature Review 


The literature virtually ignores all methods
of treatment except psychotherapy. Even in
this area the studies are generally anecdotal
or nonexperimental, with small numbers of
cases and equivocal or contradictory conclu-
sions. Strupp stated that there is some anec-
dotal evidence (Glover, cited in Kubie, 19[...])
that beginners may achieve success in psycho-
therapy that they are unable to equal when
they have had formal training; but on the
other hand, he continued, highly experienced
therapists might be presumed to achieve
better results, perhaps because they are more
circumspect in selecting their patients (Strupp,
1958a, 1958b, 1958c, 1960). Experienced thera-
pists are probably better aware of their own
strengths and weaknesses and are not [...]
to undertake tasks that they believe they
cannot tackle. Unfortunately, we have little 
data on well-trained, highly experienced thera- 
pists and the reasons for their presumed 
superior accomplishments. 

An early series of studies by Fiedler (1950a, 
1950b, 1951) of experienced and inexperienced 
therapists’ description of their own attitudes 
and behaviors in psychotherapy is often cited 
in this context. However, Fiedler did not in- 
vestigate whether the therapists’ self-descrip- 
tion has any bearing on actual outcome. Nor 
were his observations based on the treatment 
of psychotic patients. Thus, although Fiedler’s
studies focused attention on what therapists
considered important in psychotherapy regard-
less of theoretical orientation, his data did not
shed any light on the influence of therapists’
level of experience on the outcome of treatment.


Carkhuff and Truax (1965) and Banks, 
Berenson, and Carkhuff (1967) studied changes 
in such characteristics of therapists as empathy, 
warmth, genuineness, depth of intrapersonal 
exploration, and positive regard in response to 
training and experience. Again, although the 
studies provided evidence that therapists’ be- 
havior during psychotherapy may be modi- 
fiable, they say nothing about therapists’ 
characteristics relative to treatment outcome. 

There is a common belief that inexperienced
therapists, being relatively simple and en-
thusiastic, may get better results, particularly
with schizophrenic patients, than those who
have been disillusioned by experience. Poser
(1966) compared trained and untrained thera-
pists conducting group therapy with hospital-
ized chronic schizophrenic patients. The un-
trained achieved slightly better results than
the trained, and these results persisted during
a 3-year follow-up. However, the trained and
untrained therapists did not treat their
patients at the same time, and there was a
high proportion of dropouts (Rosenbaum,
1966). Barrett-Lennard (1962) compared ex-
pert counselors with nonexperts. The experts
kept their clients in treatment longer and ob-
tained more improvement at the .10 level of
significance. Improvement was, however, mea-
sured by the therapists’ own ratings, not in-
dependently. This could reflect the effect of
self-deprecatory attitudes by the nonexperts,
enhanced self-esteem among the experts, or
even the longer treatment time.

In general, the relatively few systematic 
though naturalistic studies of this matter 
(Goldberg, Schooler, Davidson, & Kayce, 
1966; Cole, Note 1) point to the conclusion
that psychiatrists who have completed their 
residency training and residents who are still 
at various stages of their training all obtain 
similar results. 

With regard to drug therapy, it has been 
suggested that only inexperienced therapists 
get better results, but the findings are some- 
what contradictory. In a study by Rickels 
et al. (1966), five board-certified or board-
eligible psychiatrists treated anxious neurotic 
patients with psychotherapy plus placebo or 
with psychotherapy plus meprobamate. The 
combination with drug was significantly su- 
perior by a variety of criteria. Uhlenhuth, Covi, 
Rickels, Lipman, and Park (1972), however, 
reported that experienced doctors got better 
results with placebo, whereas the inexperienced 
got better results with meprobamate. 

In schizophrenia, reports are also conflicting. 
Karon and O’Grady (1969) and Karon and 
VandenBos (1970, 1972) reported that patients 
treated by 2 experienced therapists improved 
more and spent less time in hospital over a 
2-year follow-up than those treated by 10 in-
experienced therapists. This applied whether
or not drugs were used. It should not be in-
ferred that experienced therapists got better
results without drugs than with them. Indeed,
well-controlled studies by Grinspoon, Ewalt, 
and Shader (1967a, 1967b, 1968, 1972) re- 
ported that experienced therapists obtained 
better results in schizophrenia with drugs plus 
psychotherapy than with psychotherapy alone. 

Karon and co-workers also reported that 
when drugs were used, the inexperienced 
therapists kept their patients in the hospital 
for less time at the beginning but with less im- 
provement in thought disorder and greater 
long-term hospital stay. This is, however, mis- 
leading. In fact, the patients treated without 
drugs spent more time (not less) in the hospital 
if one considers the entire period from ad- 
mission to follow-up (see May & Tuma, 1970). 

O’Brien et al. (1972) compared the results 
obtained by medical students, social workers, 
and psychiatrists using individual and group 
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psychotherapy combined with drugs for schizo- 
phrenic outpatients. There was a tendency for 
the medical students to get the best results, 
but the differences between the three groups 
were not statistically significant. 

Clearly, there is a lack of relevant and un- 
ambiguous data in this area and a distinct 
need for careful studies. 


Method 


The data were collected in the context of a larger 
study in which the primary aim was to evaluate the 
differential effectiveness of five methods currently used 
in the treatment of schizophrenia. All methods are still 
in use today, with little, if any, difference in procedure 
and technique from those used then. The experimental 
design and procedures are described elsewhere in con- 
siderable detail (Dixon & May, 1968; May, 1968); 
accordingly, only certain details that are specific to the
topic of this study will be presented. 

Two hundred twenty-eight male and female first- 
admission schizophrenic patients without significant 
prior treatment were assigned by a stratified random 
method to five treatment groups: (a) individual psycho- 
therapy alone; (b) ataractic drug alone; (c) individual 
psychotherapy plus drug; (d) electroconvulsive therapy;
and (e) milieu, a group that received none of the above 
“specific” treatments but only the same level of basic 
milieu care given to all the other groups. The number of 
patients in each group was as follows: psychotherapy, 
42 patients; drug, 48; psychotherapy plus drug, 44; 
electroconvulsive therapy, 47; and milieu, 43. 

The therapists were 33 male and 5 female psychiatric [...]
Many of their behaviors, [...]
decisions, which may in turn influence [...]


Measures of Outcome


The [...] measures chosen for this analysis are [...];
details can be found elsewhere (May, 1968).


Table 1
Measures of Outcome

Variable                                          Rater

Menninger Health-Sickness Scale, posttreatment
Camarillo Dynamic Assessment Scaleᵃ               2 independent raters (psychoanalysts)
MMPI, posttreatment: F, Sc, Pa, psychiatric triad Patient
Testability, posttreatment                        Psychologist
MACC Communication, Total score                   Nurses
Hospital stay
  Only patients successfully discharged
  All patients
Release rateᵇ
Clyde Mood Scale, posttreatment
  (clear thinking scale)                          Therapist
Symptom rating sheet, posttreatment,
  total score                                     Therapist

Note. MMPI = Minnesota Multiphasic Personality Inventory.
ᵃ Consists of eight scales: Affective Contact; Anxiety Level;
Ego Strength; Extent to which Environment Suffers; Insight;
Motivation; Object Relations; Sense of Personal Identity.
ᵇ Counted when patient is successful in staying out
of the hospital 31 days or more.


Therapist Variables 


Length of experience. A number of indices of ex-
perience were considered. These included age, number
of years since internship, number of years in psychiatric
residency training, and number of years of clinical ex-
perience in treating psychiatric patients.

Since treatment lasted a relatively long period of
time (M = 169.4 days), and since the treatment phase
of the study lasted several years, the indices of therapist
experience were computed using the midpoint of the
patients’ treatment. This allows for the fact that
therapists may be more or less experienced when they
treat particular patients. Data describing the [...]
sample on all four indices of experience were [...],
and there were no significant differences among
treatment groups with respect to any item.

The number of years of clinical experience in treating
psychiatric patients was chosen as the most appropriate
index of “experience” for the present analysis, al-
though the other three indices are also potentially
relevant.¹ The length of this experience ranged from .16
to 11.65 years (M = 3.77, SD = 2.87). There was no
significant difference² among treatment groups (p =
.1664 for all patients; and p = .3540 for those with
only one therapist). A two-way analysis of variance
for the milieu, psychotherapy alone, psychotherapy 
plus drug, and drug-alone groups examined the differ- 
ences in therapist psychiatric experience between the 
two groups that received psychotherapy and those that 
did not receive psychotherapy. This resulted in a p 
value of .0281 for all patients, and p = .0334 for those 
with only one therapist. This reflects a deliberate design
bias in favor of psychotherapy, in that, as pointed out
in an earlier publication (May, 1968), therapists with
less than 6 months of experience were not assigned to
treat patients in psychotherapy.

Therapists’ general clinical ability. Every 6 months 
a rating was made by a senior psychiatrist of the 
therapist’s general clinical ability relative to others of 
a comparable level of training. The rating made on the 
date closest to the midpoint of each patient’s treat- 
ment was used in the present analysis. No significant 
differences were found between the treatment groups 
in terms of ratings of therapists’ general clinical ability 
(p = .3006 for all patients, and p = .2303 for those 
with only one therapist). 


Patient Variables 


Seven patient covariates were used to adjust for
initial pretreatment differences among patients: presence
or absence of precipitating stress; intensity of
such stress relative to onset of illness; duration of
psychotic disorder; nurses’ pretreatment ratings of
cooperation on the Movement, Affect, Communication,
and Cooperation (MACC) scale (Ellsworth, 1957;
Ellsworth & Clayton, 1959); psychoanalysts’ pretreatment
ratings of affective contact, best level ever
attained (May & Dixon, 1969); age at onset of any
symptoms; and pretreatment level on the outcome
measure under study. Ratings on the Menninger
Health-Sickness Scale (Luborsky, 1962) were used to
adjust hospital stay and release rate, since no initial
level on these measures is obviously available or
applicable.


Statistical Analysis


General. An analysis of covariance was performed
in which the criterion score of any particular patient
was taken as comprising portions of variation associated
with: (a) therapist’s years of experience, (b) therapist’s
general clinical ability relative to peers, (c) drug treatment,
(d) pretreatment patient variables, and (e) other
unknown parameters (error).

Covariance runs. The covariates (i.e., patient,
therapist, and drug covariates) were examined separately,
first for all patients and then for patients with
one therapist (the majority of cases). Wherever there
were two therapists, the data relate to the patient’s
major therapist, that is, the one who treated the
patient for the longest period of time.
Interpretation. The analyses examined whether
there were differences in outcome among therapists and
to what degree these differences could be attributed to
the specified covariates.

1. Are the differences in outcome attributable to a 
particular covariate or group of covariates; that is, is 
the regression slope significantly different from zero? 

2. Are there differences in outcome among therapists 
before and after adjusting out the covariate; that is, 
are the therapist cell means significantly different? 

The probability of residual differences among therapists
after adjusting out the covariate is compared with
the probability before covariance adjustment.


Results 


Was Therapist Experience Significantly 
Related to Treatment Outcome? 


Among the 23 outcome variables, there was
not a single instance in which the regression
slopes for the effects of therapist experience
and general clinical ability were significantly
different from zero, whether for all cases or
for those with only one therapist.³ Indeed, the
relationship reached borderline significance for
only one outcome variable, length of stay for
patients successfully released only, not including
the failures (p = .0875). This relationship
was nonsignificant (p = .6123) when examined
for cases with only one therapist. Our
inclination is to discount this lone marginal
finding as a chance occurrence in the midst of
a great number of nonsignificant p values.


Were There Significant Differences in Outcome 
Attributable to Therapist Factors Other Than 
Experience and General Clinical Ability? 


Length of stay. After adjusting for the 
effects of therapist experience and general 
clinical ability, there were significant differences 
among therapists in the length of time that 
they kept their patients in hospital. The ad- 
justed therapist means were significantly 
different for length of hospital stay for patients 
who were successfully released only, not including
the failures (p = .0131 for all cases,
and p = .0239 for cases with only one therapist).
For length of stay for all patients, including
the failures, p = .0927 for all cases and
p = .0362 for cases with only one therapist.
These differences in patients’ hospital stay
could not be attributed to their therapist’s experience
in the treatment of mental illness,
to ratings of their general clinical ability, or to
the seven patient covariates examined, since
the differences remained significant even after
adjusting for patient differences on these covariates.
(After adjustment for length of stay
to successful release, p = .0134 for all cases,
and p = .0234 for cases with only one therapist.
For length of stay including the failures,
p = .0733 for all cases, and p = .0369 for
those with only one therapist.)


¹ Data on all indices can be made available to investigators on request.
² In this article the term significant is used if p < .05 and borderline significance if .10 > p > .0501.
³ To save space, no detailed tables will be presented here; only p values will be presented as appropriate.

There was some evidence that these thera- 
pist-related differences in length of hospital 
stay were partly, but not entirely, related to 
whether or not the patient was given drug 
treatment, since the significance levels were 
sharply reduced when further adjustment was 
made for drug treatment. (For only the
patients who were released, the significance
level changed from .0134 to .0837 for all cases;
p changed from .0234 to .1662 for those with
only one therapist. For all cases including the
failures, the p levels changed from .0927 to
.2263 and from .0363 to .1534, respectively.)

Other outcome measures. After adjusting for
the effect of therapist experience and clinical
ability, there were significant or borderline
differences among therapists’ results for only
1 of the other 21 outcome measures, the
Minnesota Multiphasic Personality Inventory
Pa scale (Dahlstrom & Welsh, 1960; Hathaway
& McKinley, 1943) (p = .0467 for all cases,
and p = .0349 for those with only one therapist).
This should probably be dismissed as
a chance occurrence among so many analyses.

Again, the significance of the difference is
reduced if further adjustment is made for
whether or not drug treatment was given
(p = .1546 for all cases, and p = .1458 for
those with only one therapist).


Discussion 


them as definitive findings from an experiment 


TUMA, MAY, YALE, AND FORSYTHE 


in which both therapists and patients were
randomly assigned to a given treatment,
which they were not; as applying to all types
of mental disorder besides schizophrenia, which
they do not; and as applying to the (small and
select number of) highly experienced persons
who specialize in the treatment of schizophrenia,
which they do not. They do deserve
some credence, however, as an epidemiologic
report from a carefully controlled study with
a moderate but carefully defined and measured
range of therapist experience, in which
patients were randomly assigned to treatment
and therapists were assigned from a rotating
roster, with no self-selection of patients.

Within these limitations, some of the findings
encourage further work on the relationship
between therapist characteristics and the
outcome of their treatment of schizophrenic
patients. The findings suggest that there are
possible differences among therapists in the
results they obtain. Such differences may, however,
bear little relationship to therapist experience
or general clinical ability (as defined
and within the range sampled in this study) or
to the usual patient prognostic factors. They
may turn out to be related more to other
therapist characteristics, perhaps of personality,
sophistication in the use of various treatment
strategies and activities, and ability to
handle paranoid or hostile attitudes.

The point must be made, however, that these
findings do not necessarily mean that training
and experience make no difference at all.
Obviously an inexperienced and unsupervised
or inept therapist could abuse or misuse any
form of treatment, whereas a talented, versatile,
and highly experienced therapist could
optimize its value. A more likely interpretation
of our data is that the kind of intensive
supervision that our therapists received in
addition to the usual teamwork approach to
treatment in general may have been sufficient
to compensate for any differences in results
that might otherwise have occurred due to
therapists’ factors.

In future work, the effect of drug treatment
must be carefully taken into account, since
this powerful form of treatment seems to
override or compensate for differences in outcome
that might otherwise occur among
therapists.
It is concluded that future research in this
area should focus on identifying specific
therapist characteristics other than general
clinical experience and general clinical ability
(as defined in this study) that relate to specific
outcome characteristics. In this way progress
might be made toward eventually matching
therapist and patients when a specific outcome
characteristic is desired. Perhaps, therefore,
the research questions should be rephrased
in more sophisticated form: At what
point for each particular type of clinical
problem and for each particular type of treatment
do training and experience begin to influence
treatment results, and is there a point
of diminishing returns? What is the shape of
the experience/effectiveness curve? Is there an
optimum cost-effective point in training and
experience?

It may be that the future will bring rigorous 
experiments designed to answer such ques- 
tions. At this time, however, we are forced to 
rely on less definitive material. 


Reference Note 


1. Cole, J. O. Personal communication, December 12,
An MMPI Scale to Separate Brain-Damaged
From Functional Psychiatric Patients
in Neuropsychiatric Settings


Charles G. Watson and Duane Plemel
Veterans Administration Hospital, St. Cloud, Minnesota


An empirical Minnesota Multiphasic Personality Inventory scale was developed
to separate brain-damaged from functional psychiatric patients. It consisted of
56 items, which significantly differentiated organic and functional groups in a
psychiatric hospital and was named the Psychiatric-Organic (P-O) scale. Upon
cross-validation it was found capable of separating organics from process and
reactive schizophrenics, alcoholics, and neurotics, as well as patients with
character disorders and affective psychoses. Additionally it was found that by using
the scale in combination with a traditional brain-damage test (the Benton
Visual Retention Test), better discrimination could be achieved than was possible
with either measure alone. The comparative probabilities of functional and
organic diagnoses for various P-O scale ranges are presented.


The literature on the separation of brain-damaged
from nonorganic psychiatric patients
via measures of ability is generally unencouraging.
Attempts to differentiate organics
from schizophrenics with such instruments as
the Halstead battery (Watson, Thomas,
Andersen, & Felling, 1968); the Trail-Making
Test (Brown, Casey, Fisch, & Neuringer);
the Critical Flicker Fusion Test
(Watson, Thomas, Felling, & Andersen, 1969);
the Bender-Gestalt; and the Graham-Kendall
Memory-for-Designs Test (Watson, 1968)
have often met with failure. Somewhat more
success has been achieved by investigators
using personality measures to separate the
two groups (Russell, 1975; Watson, 1971;
Watson & Thomas, 1968). The Minnesota
Multiphasic Personality Inventory (MMPI)
Schizophrenia-Organicity (Sc-O) scale, which
was developed for that purpose (Watson, 1971),
has proven to be of value and has been
cross-validated successfully in at least six settings
(Ayers, Templer, & Ruff, 1975; Holland, 
Lowenfeld, & Wadsworth, 1975; Watson, 1971; 
Andersen, Felling, & Seitz, Note 1; Gilberstadt, 
Note 2). In addition, personality and ability 
measures have been combined to enhance 
separation of organic and schizophrenic pa- 
tients (Watson, 1973). 

There is also reason to believe that the 
separation of organics from other functional 
disorders is problematic as well. Watson, Davis, 
and Gasser (1978) found that such commonly 
used instruments as the Halstead Category; 
Benton Visual Retention; Smith Digit Modal- 
ity; and Wechsler Adult Intelligence Scale 
(WAIS) Digit Span, Block Design, and Object 
Assembly tests are of little value in the differ- 
entiation of psychiatric hospital organics from 
depressives. These findings suggest that re- 
search designed to develop a personality scale 
capable of separating organics from individuals 
suffering from all types of functional disorders 
is needed. Although one might have hoped that 
the Sc-O would separate nonschizophrenic 
functional groups from organics, Watson’s 
(1973) article indicates that it does not, and 
that a new scale is needed. Accordingly, the 
research program described here was formulated
to develop an MMPI scale for that
purpose.


Table 1
Mean Age, Education, and Estimated Full
Scale IQ in Scale-Construction Groups

Variable        Brain damaged   Psychiatric   t
Age             48.3            40.0          3.71*
Education       10.1            11.3          2.66*
Estimated IQ    96.0            109.0         4.75*

* p < .01.

Method 


Scale Construction 


Patients referred to the Psychology Service at the 
St. Cloud (Minnesota) Veterans Administration 
Hospital specifically for evaluation of possible brain 
damage served as subjects for item selection. The brain- 
damaged group consisted of 40 individuals who were 
eventually given diagnoses of organic brain syndrome, 
and whose physician, ward nurse, and psychologist 
all agreed on the basis of history, laboratory, and/or 
clinical findings that brain damage was present. (The 
raters were instructed to make their assessments 
independent of psychological test data.) The controls 
were 60 patients who had received functional diagnoses 
and whose ward physician, psychologist, and nurse 
agreed that clinically detectable brain damage was 
not present. Only patients under 60 years of age were 
included. When more than one testing was available, 
the MMPI scores taken at the time point closest to 
referral were used. Only potential subjects with MMPIs
taken within 1 month of referral were included.

As might have been expected, the groups differed 
significantly on mean age, education, and Henmon- 
Nelson-based estimate (Watson & Klett, 1975) of 
WAIS Full Scale IQ. These means are reported in 
Table 1. According to the patients’ clinical files, the 
causes of brain damage in the organic group were 
alcohol (26), trauma (8), cerebrovascular accident 
and multiple sclerosis (2 each), and presenile dementia 
and brain surgery (1 each). The diagnoses of the 
controls were alcohol addiction (35); schizophrenia 
(8); depressive neurosis (7); schizoid personality and 
habitual excessive drinking (2 each); and drug intoxi- 
cation, adjustment reaction to adult life, anxiety 
neurosis, episodic excessive drinking, passive-dependent
personality, and inadequate personality (1 each).


Results 


Chi-square tests were then run on each
MMPI item to determine which items significantly
differentiated the two groups. Fifty-six
were significant at the .05 level and were,
collectively, labeled the Psychiatric-Organic
(P-O) scale.
The (Form R) items selected, and the
directions consistent with an organic diagnosis,
were 3, 95, 96, 132, 133, 137, 170, 198, w,
264, 287, 310, 329, 392, 450, 498, 510, 52)
(True); 9, 21, 38, 41, 51, 61, 86, 90, 106, 12
156, 158, 168, 179, 187, 191, 192, 195, 212, 217,
224, 225, 226, 266, 277, 284, 305, 307, 308, 311,
317, 325, 366, 372, 425, 468, 511, and 54
(False). Eighteen items were keyed “true”
and 38 “false.”
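
The item-screening step described above can be illustrated with a minimal sketch. It assumes an uncorrected Pearson chi-square on a 2 x 2 table (group by True/False response) for each item; the article does not say whether a continuity correction was applied. The item names and endorsement counts are invented for illustration, and only the group sizes (40 organics, 60 controls) come from the text.

```python
# Hypothetical sketch of chi-square item screening. The counts are invented;
# only the group sizes (40 organics, 60 controls) are taken from the article.

CHI2_CRIT_DF1_05 = 3.841  # critical chi-square value for df = 1, alpha = .05


def chi_square_2x2(true_org, false_org, true_ctl, false_ctl):
    """Pearson chi-square (no continuity correction) for a 2 x 2 table."""
    n = true_org + false_org + true_ctl + false_ctl
    row_org = true_org + false_org
    row_ctl = true_ctl + false_ctl
    col_true = true_org + true_ctl
    col_false = false_org + false_ctl
    denom = row_org * row_ctl * col_true * col_false
    if denom == 0:
        return 0.0  # a degenerate table cannot discriminate the groups
    cross = true_org * false_ctl - false_org * true_ctl
    return n * cross ** 2 / denom


def select_items(tables, crit=CHI2_CRIT_DF1_05):
    """Return the items whose chi-square exceeds the critical value."""
    return [item for item, cells in tables.items()
            if chi_square_2x2(*cells) > crit]


# (True among organics, False among organics, True among controls, False among controls)
hypothetical_tables = {
    "item_A": (28, 12, 24, 36),  # endorsed by 70% of organics, 40% of controls
    "item_B": (20, 20, 30, 30),  # endorsed by 50% of both groups
}

print(select_items(hypothetical_tables))
```

Under these invented counts only the first item would be retained. Because each item is tested separately at the .05 level, some retained items can be expected by chance, which is one reason cross-validation on new samples matters.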


Scale Characteristics 


The internal consistency of the scale was
assessed via Ebel’s (1951) intraclass correlation
technique. It yielded coefficients of .90 for the
organics and .68 for the controls. This disparity
is puzzling. It may reflect the confusion
commonplace in the test-taking performance
of psychiatric patients or the heterogeneity of
the controls.

A moderate amount of item overlap exists
between P-O and the other scales. The
percentages of the items on the various clinical
scales that also appear on P-O are detailed in
Table 2. Since 10% of all the MMPI items
are included on P-O, meaningful overlap is
represented by the extent to which the percentages
exceed 10%. The table reveals that
none of the overlaps were high, the largest
being only 20%. The greatest overlap appeared
between P-O and the Lie (L), Psychasthenia
(Pt), and Schizophrenia (Sc) scales.
The overlap with Social Introversion (Si) was
strikingly low.

Correlations were run between the P-O
scale and each of the MMPI validity and
clinical scales for the scale construction samples.
These correlations, as well as those of the
P-O scale with the Sc-O scale developed by
Watson (1971) to separate organics from
schizophrenics, did not differ significantly
from group to group. Therefore, the samples
were pooled; the correlations for the pooled
groups are also presented in Table 2. The
largest correlations appeared on the validity
scales and on Pt and Sc. Lesser but significant
correlations appeared between P-O and all
other MMPI scales studied except Hypochondriasis
(Hs) and Hysteria (Hy). Despite
the existence of a slightly higher than chance
quantity of overlap between P-O and both Hs
Table 2
Percentages of Clinical/Validity Scales Consisting of P-O Items, and Their Correlations With P-O


Overlapping items 


| MMPI Same Opposite 
f scale direction direction Total Se a % 
0 r 

| L 
A Ñ p 3 15 20 55* 

we ? k 2 64 3 —.61? 
A 3 i 4 30 13 62" 
4 $ t 4 33 12 08 
a f 3 8 60 13 —.36* 

| % 4 4 8 60 13 =.01 
f 0 ; 9 50 18 —.49* 
Me x 8 60 13 —.52* 
R 0 4 4 40 10 —:.57* 
P 0 9 9 48 19 —.61* 
A 4 11 13 78 17 —.57* 
k í 6 6 46 13 —.35* 
Si 1 1 70 1 —.47* 
dre o ee ee ee 


and Hy, the correlations between P-O and
those two scales were low. These low correlations
may reflect the fact that somatic symptoms
are moderately common in both organics
and psychiatric patients. The size and breadth
of the correlations with those MMPI scales
are encouraging, since they suggest that the
P-O scale reflects functional pathology of
many sorts. Also encouraging was the presence
of a correlation between the P-O
and Sc-O scales, both of which were designed
to separate organics from functional, although
somewhat different, psychiatric samples.
P-O was also moderately correlated with
age (r = .30); this coefficient can be interpreted
as indicating that the scale capitalizes
on a tendency for brain-damaged patients
to be older than functional patients and makes
use of the contribution of age-related personality
factors to the discrimination of the two
groups from one another.


Cross-Validation


The scale was cross-validated twice with
samples drawn from the St. Cloud Veterans
Administration Hospital. In the first, the
tom fhe administered to 100 patients, none 
from the scale construction samples, referred
for evaluation of possible brain damage. After

8 had been completed, the patient’s 


physician, nurse, and ward psychologist were 
consulted. Only those patients in whose cases 
all three agreed on whether the patient was 
brain damaged or not were used in the study. 
P-O scores were not available to the raters. 
Scores of 40 organics and 60 controls were 
thus collected. As described in their clinical 
files, the causes of brain damage were alcohol 
(26 cases); trauma and alcohol (4); trauma, 
infection, and cerebrovascular accident (2 
each); and Wilson’s disease, cerebral arterio- 
sclerosis, Sydenham’s chorea, and unknown 
cause (1 each). Preadministered MMPIs were 
then collected from the subjects’ Psychology 
Service files and compared. The mean P-O 
score of the organics was 31.6 (SD = 7.7), 
whereas that for the controls was 25.4 (SD
= 9.5). The difference was significant, t(98)
= 3.43, p < .005. This finding was interpreted
as being very encouraging, since the samples 
drawn consisted of subjects in whose case the 
organic diagnosis was sufficiently problematic 
to require special evaluation. 
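
As a check on the arithmetic, the reported test can be approximately reproduced from the summary statistics alone. The sketch below assumes a conventional pooled-variance independent-samples t test, which is consistent with the reported 98 degrees of freedom but is not stated explicitly in the article; the function name is illustrative.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Pooled-variance t statistic and degrees of freedom from summary data."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se, df


# First cross-validation: 40 organics (M = 31.6, SD = 7.7)
# versus 60 controls (M = 25.4, SD = 9.5).
t_value, df = pooled_t(31.6, 7.7, 40, 25.4, 9.5, 60)
print(f"t({df}) = {t_value:.2f}")
```

With the rounded means and standard deviations given in the text, this yields a value of about 3.44, within rounding error of the reported t(98) = 3.43.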

As a further test, the scale was scored on a 
sample of recently admitted/readmitted males 
under 60 at the St. Cloud Veterans Adminis- 
tration Hospital. According to their clinical 
file diagnoses, these subjects consisted of 30 
organics, 55 neurotics, 98 alcoholics (men
with diagnoses of alcohol addiction or habitual
excessive drinking), 56 character disorders
Table 3

Mean P-O Scores, Standard Deviations, and
ts for Differences Between Organic and
Functional Groups


Sample M SD t
Organic 33.2 6.5
Process 26.4 9.3 3.89** 
Reactive 27.2 8.2 3.59** 
Neurotic 27.2 9.1 3.07 
Alcoholic 27.2 8.8 4.04** 
Character disorder 26.8 9.1 He bad 
Affective psychosis 26.8 10.2 2.29* 


* p < .05.
** p < .001.


(men with diagnoses of personality disorder 
or sexual deviancy), 17 affective psychotics, 
and 105 schizophrenics. The schizophrenics 
were split into process and reactive groups by 
the Ullmann and Giovannoni (1964) Process- 
Reactive Scale. Those 54 who produced scores 
of 13 or above were characterized as process, 
whereas those 51 with scores of 12 or less were 
classified as reactives. As described in their 
clinical files, the causes of brain damage in the 
organic sample were alcohol (10); trauma (8); 
both trauma and alcohol (3); and Huntington’s
chorea, atrophy, arteriosclerosis, Wilson’s dis- 
ease, abscess, presenile dementia, cerebro- 
vascular accident, multiple sclerosis, and tumor 
(1 each). These diagnoses were also made 
without the benefit of P-O scores. An F test 
run between the means of the seven groups 
was significant, F (6, 354) = 2.36, p < .05, and 
t tests were run between the organics and each 
of the functional groups. These are displayed 
in Table 3. The reader will note that the mean
for the organics was 33.2, whereas those for
the six functional groups ranged from 26.4
to 27.2. All ts were significant at at least the
.05 level, and five of the six were significant
at the .01 level.


Table 4

Percentages of Organic and Functional Subjects at Various P-O Score Ranges
and Odds for Organic/Functional Diagnoses


P-O Score Interpretation 


Although the ts are encouraging, the relatively
high standard deviations suggest that
P-O scale scores between the means of the
organic and functional groups are probably
of dubious value in the separation of organic
from functional patients. Therefore,
cutting scores are not presented here. Instead,
distributions of the P-O scores for the 70
organics and 391 functional patients in the
two studies were plotted for the purpose of
identifying the relative probabilities of organic
and psychiatric diagnoses among scorers at
various P-O scale levels. These percentages,
which are presented in Table 4, allow the
diagnostician to make a “better odds” interpretation
of P-O scores. For example, given equal
base rates, scores between 15 and 20 wert 
times more common among functional than
organic patients. Interpretation of these scores,
of course, should be tempered by a consideration
of local base rate. With even base rates,
scores below 27 are more typical of psychiatric
than organic patients, whereas those above 32
are more common among brain-damaged
patients. Scores between 27 and 32 contribute
little to diagnostic differentiation.
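
The "better odds" reading and the base-rate caution can be made concrete with a short sketch. It treats the ratio of the two percentages in a score range as a likelihood ratio and combines it with a local base rate by Bayes' rule in odds form. The 30% and 21% figures are one legible row of Table 4; the 15% base rate is purely hypothetical, and the function names are illustrative rather than anything used by the authors.

```python
def likelihood_ratio(pct_organic, pct_functional):
    """Odds of an organic versus functional diagnosis at equal base rates."""
    return pct_organic / pct_functional


def prob_organic(lr, base_rate):
    """Posterior probability of organicity given a local base rate (0 < rate < 1)."""
    prior_odds = base_rate / (1 - base_rate)
    posterior_odds = prior_odds * lr
    return posterior_odds / (1 + posterior_odds)


# One Table 4 row: 30% of organics and 21% of functional patients
# scored in that range, giving odds of about 1.43 to 1.
lr = likelihood_ratio(30, 21)
print(round(lr, 2))

# Equal base rates versus a hypothetical setting in which
# only 15% of referrals are brain damaged.
print(round(prob_organic(lr, 0.50), 3))
print(round(prob_organic(lr, 0.15), 3))
```

In this sketch, odds that modestly favor an organic diagnosis at equal base rates still leave an organic diagnosis the less likely one when organics are uncommon among referrals, which is the point of the caution about local base rates.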




Combined Use of P-O and Ability Measures 


In at least two studies (Watson, 1973;
Watson et al., 1978), the use of personality


Percentages Odds 
P-O range Organic Functional Organic Functional 
30-44 7 i 
9 2.78 

Se 30 21 1.43 1 

SE 24 21 1.14 1 
A 15 23 1.53 
15-20 3 a ! F 
014 1 10 i 10.00 
and ability measures in concert substantially
improved on the discriminative power made 
available by either alone. Following the 
technique used in those studies, the scores of 
the 70 organics and 391 controls in the two 
cross-validation studies on the P-O scale and 
on the Benton Visual Retention Test (BVRT) 
error scores were graphed on a scatterplot. 
(The BVRT was used because earlier research 
[Watson, 1965; Watson, 1968; Watson et al., 
1968, 1969] has suggested that it is better 
able to separate organic from schizophrenic 
patients than most other visual/motor tests 
for brain damage.) A straight line was then 
drawn in such a fashion as to maximize 
discrimination between the two groups (see 
Figure 1). This discrimination line yielded hit
rates of 69% and 75% for the organic and 
the control samples, respectively, and an (un- 
weighted) average hit rate of 72% over the 
two samples. These hit rates are somewhat 
better than the best that could be obtained 
using either P-O (organics, 77%; controls, 
52%; M = 64.5%; cutoff = 27.5) or BVRT 
(organics, 60%; controls, 77%; M = 68.5%; 
cutoff = 9.5) scores alone. 
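
The unweighted average hit rates quoted above are simple means of the two group hit rates, which gives the small organic sample the same weight as the much larger control sample. A minimal sketch of that arithmetic, using only the percentages reported in the text (the dictionary keys are descriptive labels, not the authors' terms):

```python
def unweighted_hit_rate(organic_pct, control_pct):
    """Mean of the two group hit rates, ignoring group sizes."""
    return (organic_pct + control_pct) / 2


# (organic hit rate, control hit rate) as reported in the text
reported = {
    "P-O plus BVRT cutting line": (69, 75),
    "P-O alone (cutoff 27.5)": (77, 52),
    "BVRT alone (cutoff 9.5)": (60, 77),
}

for label, (organic_pct, control_pct) in reported.items():
    print(label, unweighted_hit_rate(organic_pct, control_pct))
```

The three results match the 72%, 64.5%, and 68.5% figures given in the text. A rate weighted by group size would differ, because controls outnumber organics by more than five to one.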


Discussion 


The results of the cross-validations are 
generally encouraging. They indicate that the 
P-O scale can be used with moderate accuracy 
to separate organics from psychiatric patients 
and that it can improve the prediction avail- 
able from at least one ability-oriented test. 
Additionally, the item overlaps with the 13
MMPI validity and clinical scales are low
enough to indicate that it makes a contribution
to assessment that is not available from any
one of those scales.

Additional cross-validations from other
laboratories are needed. In particular, the
utility of the scale with female samples needs
assessment; earlier research (Watson, 1971)
indicated that the Sc-O scale was of no value
in separating organic and schizophrenic females.
It appears that the personality correlates of
brain damage may vary with sex, and the P-O
scale’s validity as a differentiator of female
organic and functional patients should not be
assumed.

It is particularly important that readers not
Figure 1. Optimal cutting line for separation of organic
and functional patients. (Axis label: MMPI P-O Scale.)


view the P-O scale as a technique for identify- 
ing brain damage in nonpsychiatric settings. 
There is no evidence that it will separate 
organics from psychiatric normals or medical 
patients, and earlier research has indicated 
that our Sc-O scale does not separate organics 
from chronic pain or spinal-cord injury patients 
(Sand, 1973). 
“Biosyntonic” Therapy:
Modification of an Operant Conditioning Approach to Pedophilia 


J. Dennis Nolan and Curt Sandman 
Ohio State University 


A modified operant conditioning approach for treating sexual behavior prob- 
lems, termed “biosyntonic” therapy, is introduced. Following physiological di- 
agnosis, a child molester was successfully taught to alter his inappropriate 
physiological arousal to prepubescent females without disrupting his arousal to 
adult females. In addition to eliminating his child molesting, the procedures 
also improved the correspondence between his physiological and verbal response 
patterns and decreased his anxiety. The treatment remained successful at a 


6-month follow-up. 


Deviant sexual behaviors have been much
more resistant to traditional treatments than
other behavioral problems (Bieber, 1962).
Even aversive conditioning approaches,
which seem to be the treatment of choice for
some deviant sexual behaviors (O'Leary &
Wilson, 1975), are not as effective for sexual
behavior problems as for other behavior problems.
In fact, radical surgical techniques such
as castration have been proposed seriously
as the only adequately reliable intervention
for such psychopathic sexual behaviors as
pedophilia (child molesting). In this study
we report the successful application of an
approach we have labeled “biosyntonic” to a
formerly intractable sexual behavior problem.
This approach offers a promising alternative
that does not preclude all sexual behavior.
It attempts to bridge the hiatus between
so-called mental and physical aspects
of human behavior generated by traditional
approaches to understanding and modifying
behavior.

Most forms of psychotherapy rely exclusively
on the client’s verbal behavior (in most
cases the real concern is with some internal
state or “mental” activity) and, except in
very rare situations, completely ignore physiological
processes. The behaviorally oriented
therapist often relies on the observation of 
behavioral changes in the client, but again, 
without regard to the physiological state of 
the organism. Conversely, although scattered 
reports of the behavioral consequences of 
conditioning physiological responses exist, for 
the most part they deal with behaviors of 
only indirect practical relevance (Beatty, 
Greenberg, Deibler, & O'Hanlon, 1975; 
Blanchard & Young, 1973; McCanne & 
Sandman, 1974). Biofeedback approaches 
have been partially successful in the treat- 
ment of some physical disorders (e.g., functional
auricular arrhythmias; Weiss & Engel,
1971), especially when the modified response 
system is part of the symptom at issue. At- 
tempts to characterize the subtle yet per- 
vasive physiological processes underlying cer- 
tain behaviors and diagnostic classes have 
also been reported (Greenfield, Katz, Alex- 
ander, & Roessler, 1963; Lacey, 1959; Lacey 
& Lacey, 1970; McCarron, 1973; Sandman, 
1975; Shagass & Schwartz, 1962), but no attempts
have involved all of these perspectives
in the treatment of serious human behavior
problems.


¹ A superior court judge in San Diego, California,
recently authorized castration (as an alternative to
life imprisonment) of a child molester. The judgment
was based on a psychiatrist's report that the
molester was aware of what he was doing and could
not be cured of his perversion.


Rationale for Modification 


In most aversive conditioning approaches to 
modifying undesirable behavior, the condition- 
ing process involves pairing an aversive stimu- 
lus with the target stimulus (e.g., O’Leary & 
Wilson, 1975). For example, in the case of 
a pedophiliac, electric shock might be sys- 
tematically paired with slides of children or 
perhaps with the patient’s own imagery in- 
volving a child. In more sophisticated uses of 
such procedures, delivery of the aversive stim- 
ulus may be on a variable rather than a fixed 
schedule (e.g., Marks & Gelder, 1967), but 
the delivery of the aversive stimulus is in- 
variably contingent on the presentation of 
the target stimulus, not on the patient’s physi- 
ological response to that stimulus. Hence, in 
the usual aversive conditioning paradigm, 
shock is delivered regardless of whether (or 
when) the patient’s response to the target 
stimulus is inappropriate. In the approach 
described in this study, the reinforcing stimu- 
lus is delivered only when the patient’s physi- 
ological response to a target stimulus suggests 
that he is aroused. This approach insures 
that conditioning is problem specific yet does 
not depend directly on the patient’s verbal 
report of his arousal. The reinforcing stim- 
ulus can be delivered immediately and accu- 
rately at the very onset of the physiological 
response without relying on either the accuracy or the speed of the
patient’s verbal report. Even if the verbal report were considered
accurate, it probably would not occur as quickly or as reliably as the
physiological response. Any resulting delay or unreliability of the
delivery of the reinforcing stimulus could weaken the effectiveness of
the conditioning procedure. The “biosyntonic” approach avoids both of
these potential problems. An additional advantage of this approach is
that the reinforcing stimulus can be initiated or terminated during
the presentation of a target stimulus whenever changes in the
patient’s physiological response warrant such action.


We have introduced the term “biosyntonic” to refer jointly to the
synchrony of physiological systems and to the correspondence between
thoughts or attitudes and the physical state of the organism. That is,
we are concerned with the syntony among verbal (mental), physiological,
and gross behavioral aspects of the human experience. An assumption of
this approach is that attitudes, emotions, and behavior are intimately
related to the physiological state of the organism; physiology is not
considered a mere passive reflection of the psychological state but an
active (though probably incomplete) determinant of attitudes, emotions,
motives, and beliefs. Thus, it is our assumption that behaviors and all
their attendant psychological constructs are represented by distinct
physiological patterns, each pattern relating to a different
psychological state. To further complicate the matter, a sizable body
of research (Hein, 1969; Lacey & Lacey, 1958; Me[...] & Sandman, 1975;
Sandman, 1975) suggests that such patterns are highly idiosyncratic to
both persons and situations. From such a “biosyntonic” perspective,
then, the work of the therapist is to carefully identify (diagnose) the
physiological state of the individual in specific situations before
attempting to intervene. Once the relationship between mental and
physical activity has been observed in nonproblem areas, then the
therapist can approach the problem areas searching for evidence of
disharmony in this relationship. The goal is to target physiological
systems in the problem area that appear to be disparate with the
patterns of responses in the more normal areas of the client’s life.
Once a physiological system has been observed to be abnormal in the
sense described herein, the “biosyntonic” position implies that by
modifying the physiological pattern, the client can be helped to
experience a change in attitudes, emotions, or both and, concomitantly,
to experience a change in overt behavior. Although several studies
conducted in our laboratory with normal subjects support this position
(Baker, Sandman, & Pepinsky, 1975; A[...] & Sandman, 1975; McCanne &
Sandman, 1974), the present case study presents the first successful
clinical test of our thesis.


Case Report 


Mr. J., a 32-year-old blue-collar employee,
had a history of numerous sexual experiences
with children (nearly all females) dating from
his teens. Fearful of the personal and legal 
consequences of the discovery of his experi- 
ences with a female child, he consented to the 
procedures described below as a last resort. 
Traditional approaches had been tried with- 
out success. 


Assessment 


Preliminary diagnostic information indicated extreme sexual attraction
for prepubescent females, although Mr. J. maintained an adequate sexual
relationship with his wife. An extensive pretreatment physiological
assessment was conducted by attaching electrodes for the measurement of
heart rate, peripheral vasomotor activity, respiration, and skin
potential. Mr. J. was reclined in a comfortable chair and viewed a
standard sequence (Sandman, 1975) of pictures presented on a screen in
front of him. The series included sexually arousing, neutral, and
highly distressing (e.g., mutilated corpses) slides. He rated each of
them on a 9-point pleasure-stress scale. Physiological recording was
done with a Grass Model 7B polygraph equipped with appropriate
preamplifiers and driver amplifiers that were housed in the adjacent
control room.

Mr. J. exhibited differential physiological responses to pleasurable,
neutral, and unpleasurable stimuli only in heart rate. His vasomotor
and electrodermal responses were not differentially responsive to
stimuli of different pleasure ratings. Hence, we hypothesized that the
cardiovascular response system could be a significant component of his
total response to pleasurable stimuli. We therefore focused on heart
rate in a discriminative conditioning paradigm used throughout 16
sessions; the conditioning was designed to alter his physiological
responses to female children without disrupting his responses to adult
females.

Initially, four sets (male and female adults and male and female
children) of six stimuli each were chosen. Mr. J. was asked to bring in
pictures of preadolescent girls that he found arousing to varying
degrees. Very prevalent among the most arousing stimuli were pictures
taken from mail-order catalogues of girls modeling underwear. Upon
request he also brought comparable pictures of preadolescent boys,
though he claimed none of these were highly arousing. He selected adult
male and female stimuli from a group of slides available at the
treatment center. Mr. J. rated each of the stimuli in each of the four
sets in terms of his perceived sexual arousal to them. Ratings were
based on the 9-point pleasure-stress scale described earlier.

The most striking feature of his physiological response pattern to
these four sets of stimuli was the extremely elevated heart rate
response to the pictures of the semiclothed female children. However,
unlike his response to nude adult females, his verbal report of arousal
to the slides of children was not consistent with his physiological
response to them² (Figure 1A). Neither his verbal nor his physiological
responses suggested arousal to male children or male adults. Therapy
was therefore designed to accomplish both the reduction of the
inappropriate arousal to female children and improvement of the
congruity between the patient’s verbal and physiological responses to
such children.
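
The congruity between verbal and physiological responses that is plotted
in Figure 1 is a correlation between slide ratings and heart rate. A
minimal sketch of that computation follows, assuming a Pearson
product-moment coefficient was used (the report does not name the
coefficient). The function name `pearson_r` and all data values are
hypothetical and serve only to illustrate the arithmetic.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


# Hypothetical example: ratings of six slides on the 9-point scale and
# the heart rate (beats/minute) recorded at one beat of each presentation.
ratings = [2, 3, 5, 6, 8, 9]
heart_rate_bpm = [68, 70, 73, 75, 80, 82]
congruity = pearson_r(ratings, heart_rate_bpm)
```

Repeating the calculation for each of the first eight heartbeats would
give one coefficient per beat, which is the form in which Figure 1
presents the data.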


Results of Treatments 


In Phase 1 of the treatment, heart rate increases during the
presentation of slides of young female children were punished by the
administration of electric shock (4-6 mA) to Mr. J.’s index finger.
Heart rate was detected by level detectors (BRS Digibit Logic Modules)
with preset criteria. The detectors were connected to a stimulator
(Chicago-Nuclear) that automatically delivered electric shock when
Mr. J.’s heart rate exceeded the criterion. The criterion for heart
rate was the 90th percentile of resting heart rate range. Periodic
adjustments were made for basal heart rate shifts. As illustrated in
Figure 2a, throughout the first session there was a substantial
difference between the heart rate during presentation of slides of
female children and the rate during slides of adult females (p < .05
for beats 1 and 6; p < .01 for beats 7 and 8).³ By the second session,
however, the abnormal response to the children had been eliminated
(Figure 2b; no significant differences). Responses to the two male
stimulus classes remained unchanged, and we therefore discontinued
presentation of slides of males at this point. To guard against the
possible effects of habituation, during the third session new slides of
semiclothed female children were introduced without compromising the
effects of therapy. At the end of this session, Mr. J. reported the
first incident in which he experienced a sharp pain in his index finger
(where the shock had been delivered) whenever he felt the beginning of
what he termed his “automatic arousal” response to observing
preadolescent girls in provocative positions during his daily
activities. The effect of this conditioned pain response was the
abortion of the automatic arousal response and the complete elimination
of the overt child molesting that had previously occurred under such
conditions. He also reported that he could no longer find pictures of
young girls that were sexually arousing in any of the catalogues or
magazines he had used previously. This report was considered a very
positive sign, but continued treatment was nevertheless considered
essential. Therefore, we introduced a new and potentially more
provocative set of slides of a female child in poses ranging from fully
clothed to nude. Mr. J.’s dramatic heart rate response to the very
provocative stimuli is illustrated in Figure 2c. Our Phase 1 procedures
were again effective in reducing Mr. J.’s responses to those powerful
stimuli (Figure 2d), but they did not eliminate the responses (p < .05
for beats 6 and 7; p < .01 for beats 1, 2, 3, 4, 5, and 8). A more
powerful conditioning technique was needed.


² An analytic interpretation of this finding could be that the patient
is employing a psychological defense, denial or repression, and thereby
is not “able” to consciously report his true feeling about the stimuli.
Lazarus (1968) has demonstrated that the process of denial can even
affect the physiological responses to stimuli.


Figure 1. Correlation coefficients between verbal ratings of arousal
and heart rate during presentation of slides of mature and immature
females (1A) before and (1B) after “biosyntonic” therapy. (Data are for
the first eight heartbeats of each stimulus presentation.)
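
The heart rate criterion described for Phase 1 can be illustrated with a
short sketch. It assumes that "the 90th percentile of resting heart
rate range" means the 90th percentile of a sample of resting rates; the
report does not say how the percentile was computed, and the original
apparatus was hardware logic rather than software. The function names
and all values below are hypothetical.

```python
def percentile(values, q):
    """Linear-interpolation percentile (q in 0..100) of a non-empty sample."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("empty sample")
    pos = (len(ordered) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    frac = pos - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * frac


def beats_over_criterion(trial_bpm, criterion):
    """Return 0-based indices of beats whose rate exceeds the criterion."""
    return [i for i, bpm in enumerate(trial_bpm) if bpm > criterion]


# Hypothetical resting sample (beats/minute) used to set the criterion.
resting_bpm = [66, 68, 69, 70, 70, 71, 72, 73, 74, 76, 78]
criterion = percentile(resting_bpm, 90)

# Hypothetical rates for the first eight beats of one slide presentation.
trial_bpm = [72, 74, 77, 81, 83, 80, 75, 73]
flagged = beats_over_criterion(trial_bpm, criterion)
```

Recomputing `criterion` from a fresh resting sample corresponds to the
periodic adjustment for basal heart rate shifts mentioned in the text.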

In Phase 2 we attempted to enhance discrimination by providing positive
reinforcement (monetary reward based upon the amount of time Mr. J.
kept his heart rate above criterion) for appropriate heart rate
responses (i.e., responses that were consistent with the responses of
the diagnostic sessions) to the adult females while continuing
punishment for heart rate increases to pictures of the child. In
Figure 2e it is evident that the pervasive effect of the provocative
child had diminished and the acceleratory response had changed by 9
beats. However, even though this effect was dramatic and substantial,
we were still somewhat dissatisfied with the discrimination (p < .05
for beat 6; p < .01 for beats 7, 8). It seemed as if the aversive
conditioning procedures, though initially essential in establishing the
reinforcement contingencies, also might have contaminated both the
heart rate response and the discrimination. The usual response to
shock, for example, is heart rate acceleration. In order to eliminate
this possible contamination, we introduced Phase 3.

In Phase 3, positive reinforcement remained contingent upon heart rate
acceleration to adult females, but positive reinforcement was also
presented for inhibition of the acceleratory heart rate response to
slides of the young girl. This procedure proved highly effective within
two sessions, as illustrated in Figure 2f (no significant differences).
We continued the Phase 3 procedure for two additional sessions with no
incidents and terminated treatment after arranging for follow-up
evaluations.


Figure 2. Heart rate patterns during presentation of provocative
pictures of mature and immature females through the course of
treatment. (Data are for the first eight heartbeats of each stimulus
presentation.)

Figure 3. Heart rate response to prepubescent females before and after
“biosyntonic” therapy. (Data are for the first eight heartbeats of each
stimulus presentation.)
[Plot: average heart rate (BPM) against heart beats, before and after
treatment.]

Figure 4. State anxiety scores for each session before and after
“biosyntonic” therapy.


³ One-tailed t tests were performed on beat-by-beat differences in
heart rate during slides of adult females versus slides of young girls.
The p values are listed for all statistically significant results.
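
The beat-by-beat comparisons reported for each phase rest on a t
statistic computed separately at each heartbeat. The sketch below shows
one such comparison, assuming an independent-samples test with pooled
variance; the report states only that one-tailed t tests were used, so
the design, the function name `pooled_t`, and the data are all
hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(sample_a, sample_b):
    """Independent-samples t statistic (pooled variance) and its df."""
    na, nb = len(sample_a), len(sample_b)
    df = na + nb - 2
    pooled_var = (
        (na - 1) * variance(sample_a) + (nb - 1) * variance(sample_b)
    ) / df
    t = (mean(sample_a) - mean(sample_b)) / sqrt(pooled_var * (1 / na + 1 / nb))
    return t, df


# Hypothetical heart rates (beats/minute) at beat 7 across six slides
# of each stimulus class within one session.
child_beat7 = [84, 86, 83, 88, 85, 87]
adult_beat7 = [76, 78, 75, 79, 77, 74]
t_stat, df = pooled_t(child_beat7, adult_beat7)
```

For a one-tailed test, `t_stat` would be compared with the tabled
critical value for `df` degrees of freedom at the chosen alpha level,
and the comparison repeated for each of the eight beats.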

At the end of the therapy the patient’s extreme heart rate
accelerations to pictures of young female children had been eliminated
(Figure 3),⁴ and the correspondence between his verbal and
physiological response patterns was normalized (Figure 1B). He had not
assaulted a child for several months (compared with a history of almost
weekly incidents).

In addition to the physiological data and the patient’s verbal reports,
state anxiety scores (Spielberger, Gorsuch, & Lushene, 1969)⁵ were
collected for each session, both before and after treatment. There was
a consistent decrease in anxiety over the 16 sessions, with a marked
drop at Session 12, apparently reflecting the elimination of the shock
contingency (Figure 4).


⁴ To equate starting heart rate, 7 beats/minute were subtracted from
the before-treatment rate for each heart beat. The t tests on the
adjusted comparison were statistically significant for beats 3, 4
(p < .05), and 5 (p < .01).

⁵ [...], based on norms given by Spielberger, Gorsuch, and Lushene
(1970), [...] was statistically significant, t(15) = 2.78, p [...].


Follow-Up


Six months after treatment was terminated, Mr. J. reported that he had
not experienced any incidents of child molesting. He remarked that when
he was in a situation formerly considered provocative, he still
experienced some pain in his index finger. Of more importance, he has
consciously avoided compromising situations, indicating that he is able
to control his behavior much more efficiently than before “biosyntonic”
therapy. It was of special interest to learn that Mr. J. attends
X-rated movies at a far higher rate than he did before therapy. He had
been to only one such movie before therapy, but in the first 6 months
following therapy, he attended about nine. Whether this is the direct
consequence of positively reinforcing his physiological responses to
nude adult women in our procedure, or a compatible symptom
substitution, is uncertain. In any case, the result is that a socially
condoned activity has apparently replaced his psychopathic and deviant
behavior.


Discussion 


The new procedures reported here suggest a dramatically successful
alternative to the treatment of an intractable psychopathic sexual
behavior problem. The important procedural distinction between the
“biosyntonic” therapy approach and other conditioning approaches is
that in “biosyntonic” therapy the delivery of reinforcement is
contingent on the appropriateness of the patient’s physiological
response to a stimulus and not merely on the presentation of the
stimulus. In the present case, “biosyntonic” therapy insured that the
conditioning was problem specific, even though it was not directly
dependent on the patient’s verbal report of his arousal. Apparently, it
affected the underlying state in such a way that only the psychopathic
component of his sexual drive was altered.

We are convinced that the success reported here is largely dependent on
the careful diagnostic evaluations that allowed the identification of a
specific physiological system (in this case, heart rate) as central to
the psychopathic behavior problem in terms of both its differential
responsiveness to stimuli and the disparity between the physiological
and verbal indices of arousal. If correspondingly careful diagnostic
work indicates comparable physiological substrates to other behavioral
problems, a “biosyntonic” approach to therapy may offer a viable
treatment approach to other problems.


Brief Reports


Female Criminal Violence and 
Differential MMPI Characteristics 


Patricia B. Sutker, Albert N. Allain, and Scott Geyer 
Department of Psychiatry and Behavioral Sciences 
Medical University of South Carolina 


A cross-validational approach was used to compare Minnesota Multiphasic Personal- 
ity Inventory (MMPI) scale elevations and profile patterns produced by female 
murderers and nonviolent offenders in two geographic regions. Murderers from both 
prison sources produced subdued group mean profiles, whereas nonviolent offenders 
were characterized by elevations on Scale 4. Discriminant function classification was 
highly dependent on scores on Scales 4, 5, K, and A and correctly identified 82% of 
violent and 78% of nonviolent offenders. A principal-components analysis yielded 
five components or profile types, but only the component defined by high positive 
loadings for Scale 4 differentiated between the groups. 


Representing perhaps the antithesis of traditional womanhood, female
aggressivity would seem to hold great interest for investigators. Yet
few studies have described psychosocial factors that differentially
characterize violent and nonviolent women (Climent, Rollins, Ervin, &
Plutchik, 1973; Cole, Fisher, & Cole, 1968). More often, empirical
studies of the psychological parameters of violent behavior among men
have been undertaken, with the work of Megargee and associates as the
most comprehensive example (Megargee, Cook, & Mendelsohn, 1967;
Megargee & Hokanson, 1970). Assuming that certain personality or
cognitive characteristics may be predictive of female criminal
violence, the present study used a cross-validational approach to
identify personality dimensions or patterns measured by the Minnesota
Multiphasic Personality Inventory (MMPI) that might be related to
extreme violent behavior among female felons.

Subjects were women convicted for murder and nonviolent offenses in the
Louisiana Correctional Institute for Women (LCIW) and the Women’s
Correctional Center (WCC) in South Carolina. Selection in both prisons
involved the following steps: random identification of roughly one
third of the names from prison rolls (150 in LCIW; 180 in WCC),
systematic reduction and replacement of women unable to read
sufficiently well to complete the instruments (10%-15% of those
initially selected) or unwilling to participate (2%-5%), and posttest
exclusion of women convicted for violent crimes less extreme than
murder or manslaughter for the violent group and of women convicted for
nonviolent offenses with a record of violent crimes for the nonviolent
group. Both samples also excluded newly inducted inmates (imprisoned
less than 5 months). The final breakdown of LCIW women (n = 32)
included 12 violent and 20 nonviolent offenders, and of the 30 women
selected at WCC, 10 had been convicted of murder or manslaughter and 20
of drug or property offenses. Distribution by race was equal for
violent and nonviolent categories in both prison samples, and groups
did not differ on personal or history variables including age,
education, age at current offense, months served on current offense,
total time incarcerated, and intellectual level. Data collection
instruments were the Shipley Institute of Living Scale, Raven’s
Progressive Matrices, the MMPI, and a structured private interview.
Sixteen MMPI scales were scored and converted to T-score values: 10
clinical, 3 validity, Welsh’s A and R, and Barron’s Ego Strength
scales. The revised Overcontrolled Hostility (O-H) scale (Megargee
et al., 1967) was also scored, but no T-score transformation was
applied.


Requests for reprints and for an extended report of this study should
be sent to Patricia B. Sutker, Department of Psychiatry and Behavioral
Sciences, Medical University of South Carolina, 171 Ashley Avenue,
Charleston, South Carolina 29403.

Violent criminal offenders responded to the MMPI in a less deviant
fashion than nonviolent felons, and group mean profiles fell completely
within the “normal” range. LCIW group comparisons showed significant
differences on Scale 4 alone, F(1, 30) = 4.29, p < .05, and differences
for WCC were found on Scales 4, F(1, 28) = 6.10, p < .05, and 2,
F(1, 28) = 5.29, p < .05. Between-group comparisons for prison groups
revealed only one difference: WCC murderers produced more elevated
scores on Scale L, F(1, 20) = 6.76, p < .05, than LCIW murderers. The
two prison samples were combined for subsequent analyses. For combined
samples, murderers scored lower on Scale F, F(1, 60) = 4.45, p < .05,
and Scale 4, F(1, 60) = 10.09, p < .01, and higher on Scale K,
F(1, 60) = 6.07, p < .05, and Scale 5, F(1, 60) = 5.09, p < .05.
Comparisons on the O-H scale showed no significant differences between
violent and nonviolent groups, F(1, 60) = .95, p > .05. Prediction of
group identity using stepwise discriminant function analysis was highly
accurate (82% of violent and 78% of nonviolent offenders were correctly
classified), with categorization heavily dependent on scores on
Scales 4, 5, K, and A.
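
The error degrees of freedom in the F ratios above can be checked
against the group sizes given in the sample description. The sketch
assumes each test is a two-group, one-way comparison, so that the error
term has N - 2 degrees of freedom; the helper name and dictionary labels
are hypothetical.

```python
def two_group_error_df(n_total):
    """Error degrees of freedom for a one-way comparison of two groups."""
    return n_total - 2


# Sample sizes taken from the text: LCIW 12 + 20, WCC 10 + 20,
# combined 62, and the murderers-only comparison 12 + 10.
samples = {
    "LCIW": 12 + 20,
    "WCC": 10 + 20,
    "combined": 12 + 20 + 10 + 20,
    "murderers only": 12 + 10,
}

# Error df as printed in the reported F ratios.
reported_df = {"LCIW": 30, "WCC": 28, "combined": 60, "murderers only": 20}

consistent = {
    label: two_group_error_df(n) == reported_df[label]
    for label, n in samples.items()
}
```

Under that assumption every reported value matches the stated sample
sizes.
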
Principal-components analysis with varimax 
rotation identified five independent components 
with eigenvalues greater than 1.00, which ac- 
counted for 78% of the total variance. Compo- 
nents represented anxiety maladjustment, activ- 
ity, conversion reaction, masculinity-femininity, 
and social deviance dimensions. Group compari- 
sons of mean factor scores showed that only the 
factor suggesting social deviance, defined by high 
positive loadings on Scale 4 and moderate nega- 
tive loadings on Scale L, was differentially re- 
lated to offender classification, F(1, 60) = 9.44, 
p < .01. Using the Meehl system of profile classi- 
fication, 50% of the murderer profiles were cate- 
gorized as normal, as compared to 25% of non- 
violent offenders. Within the nonviolent offender 
group, 42.5% of profiles were classified as con- 
duct disorder, 30% as psychotic, and 2.5% as 
neurotic, whereas only 14% of murderer profiles 
were labeled conduct disorder; 36%, psychotic; 
and 0%, neurotic. The most prevalent code type 
among murderers was 4-5/5-4 (23%), with 
1-6/6-1, 2-8/8-2, and 6-8/8-6 each accounting 
for 10% of the profiles. In contrast, the 4-5/5-4 
type was characteristic of 5% of the nonviolent 
offenders. There was only one 4-9/9-4 profile 
among murderers, but this type accounted for 
30% of nonviolent offender profiles. 
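
The profile percentages above can be converted back to approximate
counts as a consistency check. The sketch assumes the percentages refer
to the combined samples (22 murderers and 40 nonviolent offenders, from
the group sizes given earlier); the counts it produces are inferred by
rounding and are not reported in the text.

```python
def implied_count(percent, n):
    """Nearest whole number of profiles implied by a reported percentage."""
    return round(percent / 100 * n)


N_MURDERERS = 12 + 10      # LCIW + WCC violent groups
N_NONVIOLENT = 20 + 20     # LCIW + WCC nonviolent groups

# Reported Meehl classification percentages for each group.
murderer_pct = {"normal": 50, "conduct disorder": 14, "psychotic": 36, "neurotic": 0}
nonviolent_pct = {"normal": 25, "conduct disorder": 42.5, "psychotic": 30, "neurotic": 2.5}

murderer_counts = {k: implied_count(p, N_MURDERERS) for k, p in murderer_pct.items()}
nonviolent_counts = {k: implied_count(p, N_NONVIOLENT) for k, p in nonviolent_pct.items()}
```

In both groups the implied counts sum to the group size, so the
reported percentages are internally consistent under this assumption.
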
Results suggest that there are significant and reliable relationships
between female criminal violence and aspects of MMPI performance. Women
convicted of criminal homicide tended to respond to MMPI items in a
manner reflective of minimal involvement in a socially deviant pattern,
with reluctance to admit unusual psychological symptoms, limited
personal insight, moderate levels of interpersonal anger, and less
identification with culturally defined femininity than nonviolent
offenders. Thus, women who murdered could be said to be more defensive,
less in touch with impulses to action, more socially conforming, and
more removed from a stereotyped definition of femininity.

These data also suggest a negative relationship between membership in a
violent offender group and sociopathic psychopathology, a finding
compatible with the notion that so-called aggressive psychopaths are
found infrequently to be extremely assaultive and rarely to kill
(Gibbens, Pond, & Stafford-Clarke, 1959). Findings also describe a
personality constellation in agreement with the Megargee et al. (1967)
hypothesis that overcontrolled persons may more often commit more
violent crimes than those who are undercontrolled, although no
differences were observed on the O-H scale developed by Megargee et al.,
using male samples to measure overcontrolled hostility.

That the masculinity-femininity dimension may be implicated as
associated with female criminal violence is interesting, in that
homicide is a crime most easily identified with masculine aggressive
behaviors. Though Scale 5 may be a poor descriptor of social posture
regarded as feminine, masculine, or both, its significance in this
study underscores the need for further research focusing on female
attitudes toward [...] men, and society and the relationship of these
attitudes toward assertive behavior. Finally, results point to a
striking similarity in MMPI profile configurations across samples, with
particular focus on differences on Scale 4, a preponderance of normal
profiles among murderers, and concentration of conduct disorder
classifications among nonviolent offenders. These findings neither
generalize to less deviant populations nor [...] violent behavior in
general. However, they target areas for future research and suggest the
need for studies to identify the various personality or cognitive
factors potentially predictive of inappropriately aggressive behavior
among women in less deviant groups. Such research would also have
direct implications for the myriad of female assertiveness training
classes springing up across the nation.


Turning On or Turning Off:
Sensation Seeking or Tension Reduction as 
Motivational Determinants of Alcohol Use 


Raymond M. Schwarz, Barry R. Burkhart, and Samuel B. Green 


Auburn University 


To assess the relative influences of sensation-seeking or tension-reduction motives on 
drinking behavior, 242 college students completed the Sensation Seeking Scale, the 
S-R Inventory of General Trait Anxiousness, and a self-report index of drinking 
behavior. Using correlational and multiple regression procedures, the data consistently 
indicated a strong positive relationship between sensation seeking and alcohol use, 
whereas the relationship between anxiety and alcohol use was nonsignificant. The 
importance of sensation-seeking motives to a comprehensive motivational theory of
alcohol is discussed.


Physiologically, alcohol functions to depress 
central nervous system activity. This decrease in 
cortical arousal as a primary consequence of 
alcohol consumption has served as a starting 
point for much of the psychologically oriented 
theory about drinking behavior, because it has 
been assumed that the psychological motivation 
for alcohol use must complement or reflect these 
physiological effects. The concept of tension reduction
seemed to be the best psychological analogue
to the decrease in cortical arousal occurring
as a function of drinking; that is, drinking
leads to decreased cortical arousal, which results 
in a decrease in tension anxiety. 

Recent research addressed to the alleged re- 
lationship between anxiety and alcohol use has 
returned mixed findings. Alcoholic populations 
generally have higher levels of self-reported 
anxiety than normals; however, other studies 
have found no relationship between self-reported 
anxiety and alcohol use with college students. 
To further complicate matters, several studies 
have demonstrated that consumption of alcohol 
is associated with increased mood disturbances. 

In an attempt to explain these discrepant findings, it has been
suggested that researchers distinguish between the tension-reducing
effects of alcohol and the fact that organisms may or may not drink
alcohol for its tension-reducing effects. It may be that even if
alcohol is a pharmacological sedative, people drink for other reasons.

One such reason can be derived from the concept of optimal level of
stimulation. Given that alcohol use is not clearly or consistently
related to tension-reduction needs, it may be that a comprehensive
theory of alcohol use requires attention to stimulus-seeking needs.
Studies of premorbid personality characteristics of problem drinkers
have shown many of them to be impulsive, restless, and nonconforming;
this description bears a marked resemblance to the characteristics of
high stimulus seekers. Additionally, some studies have found
significant correlations between drinking behavior and measures of
stimulus seeking.

The present study was designed to examine the relationship between
tension reduction and sensation-seeking needs and alcohol use. If
drinking behavior is a consequence of one or both of these motives,
then the amount and frequency of alcohol use should be correlated with
measures of the motivational predisposition.

The subjects for this study were 242 undergraduates, both male
(n = 130) and female (n = 112). Each student was asked to complete the
Sensation Seeking Scale (SSS); the S-R Inventory of General Trait
Anxiousness (S-R GTA); and an index of the individual’s drinking
behavior, which was based on both frequency and amount of alcohol use.
The SSS is composed of five interpretable factor scales: Thrill and
Adventure Seeking (TAS), Disinhibition (Dis), Experience Seeking (ES),
Boredom Susceptibility (BS), and a General scale of sensation seeking.
The S-R GTA is a self-report inventory designed to measure anxiety
across four general stimulus situations: interpersonal, physical
danger, ambiguous, and routine situations.


Copyright 1978 by the American Psychological Association, Inc. 0022-006X/78/4605-1144$00.75


The zero-order correlations between the SSS 
and the drinking index are presented in Table 1. 
Significant positive correlations were obtained
between all five scales and the drinking index. 
The zero-order correlations between the drink- 
ing index and anxiety scales were nonsignificant, 
with the single exception of the correlation be- 
tween the Physical Danger anxiety scale and the 
drinking index. However, this relationship was 
weak (r = —.15, p < .05) and, more noteworthy, 
in the negative direction. 
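
The significance of the weak negative correlation reported above can be
checked with the usual t transformation of a correlation coefficient,
t = r * sqrt(N - 2) / sqrt(1 - r^2). The sketch assumes a two-tailed
test at the .05 level; the critical value is an approximate tabled
figure for 240 degrees of freedom, and the names are hypothetical.

```python
from math import sqrt


def t_for_r(r, n):
    """t statistic (df = n - 2) for testing a Pearson r against zero."""
    return r * sqrt(n - 2) / sqrt(1 - r * r)


# Values from the text: r = -.15 with N = 242 subjects.
t_value = t_for_r(-0.15, 242)

# Approximate two-tailed .05 critical value for df = 240 (assumption:
# taken from standard t tables, not from the report).
CRITICAL_T_APPROX = 1.97
significant = abs(t_value) > CRITICAL_T_APPROX
```

With a sample of this size even a correlation that explains about 2% of
the variance clears the .05 threshold, which fits the authors'
description of the relationship as significant but weak.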

To determine the relative contributions of
sensation seeking and anxiety-reduction motives
to drinking behavior, several multiple regression
analyses were computed. The drinking index was
predicted significantly by the SSS, F(5, 236) =
23.68, p < .001; by the S-R GTA, F(4, 237) =
2.64, p < .05; and by both combined, F(9, 232)
= 13.58, p < .001. In examining the relative
contribution of both scales, it was found that when
the SSS scores were stepped into the regression
equation first, they accounted for 33% of the
total variance of the drinking index scores. The
anxiety scales contributed only an additional
1% to the total variance accounted for by the
SSS scales. When the anxiety scores were entered
into the regression analysis first, they accounted
for only 4% of the total variance in predicting
the drinking index scores. The addition of the
SSS raised the variance accounted for to 34%.
In both analyses, the Disinhibition scale added
more to the total R² than the other scales
combined.
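
The variance percentages in this paragraph can be cross-checked against the reported F tests. The sketch below is an illustration only, not the authors' procedure: it assumes each reported F is the overall test of its regression model, so that R² = F·df1 / (F·df1 + df2). The function name and variable names are hypothetical. Under that assumption the recovered values come out at roughly 33%, 4%, and 34 to 35%, which agrees with the text to within rounding.

```python
def r_squared_from_f(f_value: float, df_model: int, df_error: int) -> float:
    """Recover R^2 from an overall regression F test.

    Solves F = (R^2 / df_model) / ((1 - R^2) / df_error) for R^2.
    """
    explained = f_value * df_model
    return explained / (explained + df_error)


# F statistics and degrees of freedom as reported in the text.
r2_sss = r_squared_from_f(23.68, 5, 236)     # sensation seeking scales alone
r2_anxiety = r_squared_from_f(2.64, 4, 237)  # anxiety scales alone
r2_both = r_squared_from_f(13.58, 9, 232)    # both sets combined

# Variance added by each set when it is entered second.
gain_from_anxiety = r2_both - r2_sss
gain_from_sss = r2_both - r2_anxiety

print(f"SSS alone:             {r2_sss:.3f}")
print(f"Anxiety alone:         {r2_anxiety:.3f}")
print(f"Both sets:             {r2_both:.3f}")
print(f"Anxiety entered after: {gain_from_anxiety:.3f}")
print(f"SSS entered after:     {gain_from_sss:.3f}")
```

The order-of-entry comparison is the point of the paragraph: the anxiety scales add about one percentage point once the SSS is in the equation, whereas the SSS adds about thirty points once the anxiety scales are in.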

The results of the present study can be easily
summarized. Alcohol use was not correlated
positively with the level of self-reported anxiety even
when anxiety was considered as a multidimensional
construct. On the other hand, alcohol consumption
was strongly related to stimulus-seeking
needs, particularly needs to engage in disinhibited
forms of sensation seeking. These results, in
conjunction with other recent findings, do not
support a strict tension-reduction hypothesis and
suggest that a new analysis of the motivational
determinants of alcohol use is in order.

It should be noted that these data are specific
to the young adult population represented by our
sample. Moreover, studies with older, alcoholic
populations often offer support for an association
between anxiety and alcohol use. In combination,
these results suggest that there may be a
particular developmental course in the relationship
between sensation seeking, anxiety, and
alcohol use. For young adults, drinking may serve
primarily as an outlet for sensation-seeking needs,
whereas, later in life, especially for vulnerable
individuals, drinking may come to serve as a
coping mechanism for feelings of anxiety and
stress.

Table 1
Correlations Between Sensation Seeking Scales and Drinking Index Scores

Scale                            Drinking index
General                          [illegible]
Thrill and Adventure Seeking     [illegible]
Experience Seeking               [illegible]
Disinhibition                    [illegible]
Boredom Susceptibility           [illegible]

Note. N = 242.
*p < .05.
**p < .01.
***p < .001.

Among the various dimensions of sensation 
seeking, the most powerful predictor of drinking 
behavior was the Disinhibition scale, which is 
best described as measuring a need for extra- 
verted, hedonistic social involvement. The ques- 
tion that obviously presents itself is, “Why does 
a pharmacological sedative serve as a releasing 
mechanism for socially extraverted, sensation- 
seeking behaviors?” 

Our hypothesis is that alcohol disinhibits be- 
havior, because of the well-established cultural 
expectancy that drinking leads inevitably to an 
inability to exercise moral or social restraint, not 
just because of its pharmacological effects. Re- 
cent research, which found that alcohol disin- 
hibits behavior only if the subjects were told 
that they were drinking alcohol, offers some sup- 
port for this hypothesis. Drinking, in effect, pro- 
vides a culturally sanctioned “time-out” from 
social control, during which exhibitionistic, 
hedonistic behavior may be expressed with im- 
punity. It follows, therefore, that individuals 
with a strong need to engage in disinhibited be- 
havior will be more likely to turn to drinking 
because of this culturally established expectancy. 

In summary, alcohol may be described as 
serving as a “releaser” for normally restrained 
social behaviors. If so, then theories of alcohol 
use will need to take into account the reinforce- 
ment potential provided by access to such dis- 
inhibited states, especially in individuals with 
strong motivational needs for sensation seeking. 
Therapist and Client Perceptions of
Alternative Roles for the Facilitative Conditions 


Robert B. Slaney 
University of Akron 


The present study examined therapist and client perceptions of one of two transcripts
of psychotherapy: one using the facilitative conditions as a treatment and one using
them as intermediate variables leading to the suggestion of assertive training. Data
were obtained from 100 therapists-in-training and 50 clients. For the therapist group,
assertive training was estimated as more effective and the behavioral therapist was
seen as more expert and appealing. No differences were found on therapist
understanding. For the client group, no significant differences were found. The findings
are discussed with their implications for future research.


The presence of the facilitative conditions
(Carkhuff, 1969) in current research, writing, and
professional and paraprofessional training programs
supports Bergin’s statement that “their
presence and influence is ubiquitous” (Bergin &
Suinn, 1975, p. 521). However, while some theorists,
such as Carkhuff, seem to place emphasis
on the conditions as a primary mode of treatment,
others (e.g., Lazarus, 1971) stress the
intermediate relationship-enhancing aspects of
these variables.

The present study used two groups, therapists
and clients, to examine perceptions of the role
of the facilitative conditions in psychotherapy.
For the therapist group the participants were 50
male and 50 female graduate students in rehabilitation
counseling, counseling psychology, and
clinical psychology who were involved in training
programs in mental health facilities in the
Syracuse, New York, area. Data were gathered
over a 2-year period. The distribution of students
was rehabilitation, 38; counseling, 23; and clinical,
39. All participants had had experience in
the treatment of clients. They were randomly
assigned to treatment conditions by sex and area
of specialization. The mean age was 26.36 years.
The mean educational level was 17.53 years.
There were no significant differences based on
age or years of education.

A written transcript (Slaney, 1977) presented
an excerpt from a therapy session. The
transcript consisted of nine client-therapist
interactions. The client statements portrayed a
man who was having difficulties at work, at home,
and in social situations because he was anxious,
unsure of himself, and unassertive. The therapist
responses were designed to represent facilitative
responses using Carkhuff’s (1969) Empathy
Scale. Two independent raters, experienced in the
use of Carkhuff’s scale, rated all but one response
as meeting or exceeding the criterion of 3.0. The
Pearson product-moment correlation of the
ratings was .86.

A second transcript was devised, which
differed from the first only in the last therapist
response. Instead of a facilitative response, a
suggestion of assertive training was made. This
response was rated below 3.0 on Carkhuff’s
(1969) scale. A group of seven raters, all
experienced therapists with PhDs, were asked to
judge the appropriateness of the use of assertive
training with this client. A 7-point rating scale
ranging from 1 (very inappropriate) to 7 (very
appropriate) was used. The mean rating was
[value missing in source].

The transcripts were randomly distributed.
After reading them, subjects completed a therapist
rating scale, which included an estimate of the
effectiveness of the eventual outcome of the
treatment and three therapist characteristics:
expertness, understanding, and appeal. The scales
were 8-point Likert-type scales ranging, for
example, from extremely inexpert to extremely
expert.
The mean ratings were slightly to moderately
positive. None was negative. A 2 × 2 (Treatment
× Sex) analysis of variance was performed for
each of the therapist rating scale items. The
results indicate that for perceptions of expertness,
there were significant differences as a result of
treatment, F(1, 96) = 5.54, p < .05, with the
assertive training therapist perceived as more
expert. There were no significant differences on
understanding. For appeal there was a significant
treatment effect, F(1, 96) = 6.07, p < .05;
the assertive training therapist was rated as more
appealing. The estimates of the effectiveness
of the treatment also yielded a significant treatment
effect, F(1, 96) = 5.68, p < .05, with the
effectiveness of assertive training rated as higher
than that of the facilitative conditions. There were no
significant differences as a result of sex or the
Treatment × Sex interaction.

To examine the perceptions of clients, the
study contained 61 male clients of the Veterans
Administration Hospital who were either outpatients
at a mental hygiene clinic or patients
who were being discharged from an acute inpatient
treatment facility. Diagnoses were primarily
personality disorders or mixed neuroses.
The criteria for inclusion were a willingness to
participate, being under 35 years of age, involvement
in individual therapy, being considered
ready for discharge, and being judged as free of
observable symptoms by the therapist involved
and two other staff persons making independent
judgments. Five clients declined to participate,
and six were rated by one or more of the
judges as not being free of observable symptoms.
The final sample, gathered over an 18-month
period, was composed of 50 clients. The mean
age was 28.6 years, with a range of 20-35; and
the mean educational level was 12.93 years, with
a range of 10-18 years. There were no significant
differences between the groups on the basis of
age or educational level. The transcripts and
procedures were the same as for the therapists.
The mean ratings were slightly to moderately
positive. A one-way analysis of variance revealed
no significant treatment differences on any of the
rating scale items.

Overall, the results for the therapist group
indicate a preference for the treatment that
combined the facilitative conditions with the
suggestion of assertive training. The lack of
significant differences on understanding would
appear to be the result of the higher ratings that
the therapists in the facilitative conditions treatment
gave to this variable relative to the others.

The lack of significant differences in the ratings
of the client group can be seen as countering
the expectation that this group would be likely
to prefer the suggestion of a specific treatment,
analogous to the stereotypical physician-patient
interaction. It may be that the
client group, having experienced therapy, had a
greater appreciation of the facilitative responses.
A related possibility is that clients in therapy
may have been taught not to expect specific
suggestions from their therapists.

An alternative explanation may be that the 
client group simply lacked the background in 
experience and theory that the therapist group 
had and used in rating the transcripts. Similarly, 
it may be that the particular efforts made to 
provide valid facilitative responses and an ap- 
propriate treatment had a greater effect on the 
therapists than on the clients. In any event it 
seems reasonable that the expectations of clients 
about appropriate treatments were less clear than 
those of therapists. That the client group rated 
both treatments positively can be seen as an 
indication of an overall receptiveness to psycho- 
therapy. 

The previous study that used the same tran- 
scripts (Slaney, 1977) contained students in in- 
troductory psychology courses. The results for 
this group were similar to the therapist results. 
Although the reasons for this similarity are not 
clear, the differences between the student group 
and the client group raise questions about ex- 
tending results of studies containing college 
students to clients, particularly clients in inpatient 
or outpatient treatment settings. 

The lack of women clients is, of course, an 
important limitation of this study. Another im- 
portant qualification is that the transcripts pre- 
sented only one particular client and two possible 
approaches to treatment. How or whether the 
perceptions of treatment for counselors and 
clients vary as a function of the problem that is 
presented or the treatment suggested is a ques- 
tion that will require further research. 
Positive Versus Negative Self-Monitoring
in the Self-Control of Smoking


David A. Kantorowitz, Joyce Walters, and Kathy Pezdek
California State College, San Bernardino 


This experiment compared smoking treatment programs using negative (recording 
number of cigarettes smoked) versus positive (recording number of urges resisted) 
self-monitoring. Subjects participated in one of two similar broad-spectrum treat- 
ment programs, within which they either used positive or negative self-monitoring. 
Over treatment, subjects in both self-monitoring groups demonstrated similarly sig- 
nificant reductions in smoking frequency as compared with a no-treatment control 
group. These findings were generally maintained at follow-up. Clinical findings call
into question the accuracy of the “positive” and “negative” labels used to designate
the two self-monitoring modes.


Clinical programs for smoking reduction are 
generally assessed by subject self-monitoring 
(Thoresen & Mahoney, 1974). Although several 
points prior to, during, and after the act of 
smoking would appear monitorable, clinical re- 
searchers have generally asked subjects to re- 
cord their (a) successfully resisted urges to smoke 
(positive self-monitoring) or (b) number of 
cigarettes smoked (negative self-monitoring) 
(McFall & Hammen, 1971).

Thoresen and Mahoney (1974) have suggested
that monitoring resisted urges may prove
to be more facilitative to smoking reduction than
monitoring the number of smoked cigarettes.
Presumably, the act of self-monitoring successfully
resisted urges becomes secondarily reinforcing
and transforms the urges into S^Ds for
self-control. By focusing on number of cigarettes
smoked, S^Ds for self-punishment may be created,
but since these cues appear after the behavior
to be controlled, they may be less helpful in
controlling the preceding smoking behavior.

Previous research has attempted to assess the
effects of self-monitoring free of confounding
treatment techniques. McFall (1970) reported
that negative monitoring of “unmotivated” students
increased cigarette consumption over a
13-[unit illegible] period, whereas positive monitoring decreased
consumption. Working with “motivated” students,
however, McFall and Hammen (1971) found
similarly significant decreases in smoking among
positive and negative monitoring groups. In addition,
positive and negative monitoring led to
significant changes in study time (Johnson &
White, 1971) and in nail-biting (McNamara,
1972), with no significant differences in frequency
of smoking between the two modes.

A methodological problem with the aforementioned
studies is the confounding of self-monitoring
methods with subsequent, uncontrolled
usage by motivated subjects of self-evaluation
and covert self-praise or punishment
(Thoresen & Mahoney, 1974). Additionally, there
exists no research comparing different monitoring
methods as the assessment vehicle for clinically
relevant broad-spectrum self-control programs.
This would appear to be an important
area in view of the transitory nature of treatment
effects produced by self-monitoring alone
(Kazdin, 1974). The present study was thus designed
to compare positive versus negative self-monitoring
as each interacted with similar broad-spectrum
behavioral self-control programs for
the reduction of cigarette smoking.

Nine volunteer subjects (M age = 36.7) were
assigned to self-control with positive monitoring,
self-control with negative monitoring, and waiting
list control groups. Treatment subjects were
randomly assigned to either the positive or negative
monitoring groups; subjects who could not
attend either of the treatment groups due to
scheduling restraints were assigned to the control
group. The baseline smoking rates of the positive
monitoring, negative monitoring, and control
groups were 19.7, 26.8, and 15.6, respectively.


Both self-control treatments consisted of eight
[?]0-minute group meetings distributed twice a
week over 4 weeks. To insure that both experimental
groups were taught similar techniques of
self-control, each of the treatment sessions was
prearranged and standardized. Both groups were
instructed to (a) identify, avoid, or isolate cues
for smoking, (b) make use of incompatible responses
when feeling urges to smoke (such as
relaxing, chewing, sucking cloves, etc.), (c) use
self-talk and imagery as vehicles for self-reward
and punishment, and (d) write contingency contracts
for themselves and with others. Subjects
in both groups were instructed to reduce their
cigarette consumption at their own rate but to
attempt to reach abstinence by the eighth session.
Subjects in the negative self-monitoring group
were instructed to advance their counters each
time they yielded to an urge and decided to
smoke. Subjects in the positive self-monitoring
group advanced their counters each time they
resisted an urge to smoke. Both groups transcribed
their daily totals onto written charts,
which they brought to treatment sessions.

The control group decreased over treatment by
a mean of 1.1 cigarettes a day; one subject
reached abstinence. The positive and negative
monitoring groups decreased by 14.7 (74.6% of
baseline) and 16.8 (62.8%) cigarettes a day,
respectively. The number of subjects reaching
abstinence were four (positive monitoring) and
[number illegible] (negative monitoring). The follow-up rates
for the negative and positive monitoring groups
were, respectively, 16.0 and 10.1 below the
baseline of each group; three positive and three
negative monitoring subjects remained abstinent.

Because the variance among groups on the
[...] baseline measure was nonhomogeneous,
F max(3, 17) = 7.28, p < .05, Tukey’s
[...] comparisons among means were made
for the three treatment groups separately.
There was no significant change in smoking frequency over
time for the control group. The negative monitoring
group, however, significantly reduced its
smoking frequency from baseline to end of treatment
(q = 6.48, p < .01) and from baseline to
follow-up (q = 6.17, p < .01). The positive
monitoring group significantly reduced its smoking
frequency from baseline to end of treatment
([statistic illegible]); its smoking frequency was
[...], but not significantly, different from
baseline at follow-up (q = 3.89). The nonsignificant
finding was due to the highly deviant smoking
rate ([...] 5.62) of one subject. The remaining
subjects in the positive monitoring group
maintained a significant reduction in smoking at
follow-up (q = 5.35, p < .05).
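
The percentage reductions quoted for the two monitoring groups follow from the baseline rates and mean decreases given in the text. The sketch below is a simple arithmetic check, not part of the original analysis; the function and variable names are hypothetical. It assumes each percentage is the mean decrease divided by the group's baseline rate. On that assumption the positive monitoring figure is reproduced, while the negative monitoring figure comes out about a tenth of a point below the reported 62.8%, a gap that may reflect rounding of the published means.

```python
def percent_reduction(baseline: float, decrease: float) -> float:
    """Return a decrease expressed as a percentage of the baseline rate."""
    return 100.0 * decrease / baseline


# (baseline cigarettes per day, mean decrease over treatment), from the text.
groups = {
    "positive monitoring": (19.7, 14.7),
    "negative monitoring": (26.8, 16.8),
}

reductions = {
    name: percent_reduction(baseline, decrease)
    for name, (baseline, decrease) in groups.items()
}

for name, pct in reductions.items():
    print(f"{name}: {pct:.1f}% of baseline")
```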

t tests were performed on the relative differ- 
ences between baseline and end of treatment, 
and baseline and follow-up, for the positive 
versus the negative self-monitoring groups. Con- 
trary to prediction, no significant difference was 
found between the positive monitoring group 
and the negative monitoring group at end of treat- 
ment or at follow-up. A chi-square analysis of 
the number of subjects in each group who 
reached smoking abstinence indicated no dif- 
ference between positive and negative monitoring 
at end of treatment or at follow-up. There was 
a significant difference, however, between the 
combined self-control groups as compared with 
the control group at end of treatment and at 
follow-up, χ²(1) = 5.39, p < .025, at both times.

Since both monitoring groups were paired with 
broad-spectrum self-control programs, it is not 
possible to conclude that the results support a 
facilitative effect of self-monitoring alone. The 
data do indicate, however, that contrary to the 
speculations of Thoresen and Mahoney (1974), 
there was no significant difference in outcome 
between positive and negative self-monitoring 
when used as the assessment vehicle for a broad-spectrum
self-control program.

Interestingly, only one subject in the negative
self-monitoring group reported that the self-monitoring
treatment was “negative” or self-punishing
on a follow-up questionnaire. Most
subjects tended to view negative monitoring in
the overall context of their treatment goals,
frequently citing the self-reinforcing aspects of
watching unresisted urges decline. Unexpectedly,
five of eight positive monitoring subjects who
completed the follow-up questionnaire indicated
dislike of and frustration with this method even
though their frequency of resisted urges did increase.
Generally cited reasons included “preferring
to see cigarette consumption decline” or
“lacking feelings of accomplishment.” Rather
than stressing self-reinforcing effects pursuant to
resisting urges, positive monitoring subjects
stressed heightened frustration due to increased
awareness of resisted urges to smoke. These results
suggest that “positively” and “negatively”
designated self-monitoring modes may be misnomers.
Self-presentation of praise or punishment
may solely be a function of the desirability
of the self-monitored feedback that is received
and independent of the method by which this
feedback is procured.

Additional research would be required to investigate
whether these speculations match sub[...]
Aspects of the Self-Concept Related to Level of Self-Esteem


Kenneth W. Christian
University of California at Davis


Aspects of the phenomenal self-concept of 30 male subjects varying in self-esteem,
using a numerical self-report approach, were studied. Numerical ratings of importance
and salience of self-enumerated positive and negative characteristics were used
to generate a series of scores. Significant differences (p < .005) were found on an
overall self-esteem score, which correlated .59 (p < .001) with an independent measure.
No significant differences were found on ratings of positive characteristics.
Striking differences were noted for negative characteristics. Results suggest that it is
how individuals experience negative rather than positive characteristics that plays
a determining role in self-esteem.


[This study] was conducted while the author was
at the University of California at Davis.
Requests for reprints and for an extended report
of this study should be sent to Kenneth W. Christian,
[who is now at] the Lafayette Therapy Center, 936
[street name illegible], Suite G, Lafayette, California 94549.


[Difficulties] involved in directly assessing the
[...] self-concept (Wylie, 1974) have
[focused] recent attention on indirect assessment
[...] and alternative conceptualizations of
self-[esteem] (Coopersmith, 1967; Gergen, 1971;
Ziller, Hagey, Smith, & Long, 1969). However,
[since] data regarding regularities in the
experi[enced] self-concept would amplify our understanding
of self-esteem, the present study attempts to
[...] aspects of the phenomenal self-concept
of subjects varying in self-esteem by using a
numerical self-report (NSR) approach.

The NSR approach (Christian, 1973) involves
[assigning] subjective meanings to the two extremes
of a 100-point scale and requiring subjects
to respond numerically regarding this dimension
of experience. In the present study, 30
[male] volunteers, 10 from each of three self-esteem
levels (i.e., high, medium, and low) on
Coopersmith’s Self-Esteem Inventory (SEI),
were randomly assigned to interview times with
[an] interviewer. After an introductory period,
[each subject] was asked to enumerate his positive
characteristics. NSRs of the importance of each
characteristic and ratings for the salience
(rela[tive ...]nce) of each were requested. Negative
characteristics were then elicited, and the same
[...]


[... sig]nificant at the .005 level, F(2, 27) = 7.898, and
means were in the predicted direction. Furthermore,
subjects’ scores on overall self-esteem
correlated .59 (p < .001) with scores on the SEI.
These results appear to indicate that as predicted, 
NSRs concerning various aspects of the self- 
concept can be combined and transformed into 
an overall score that discriminates between sub- 
jects with different self-esteem levels. 
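
The significance level attached to the .59 correlation can be checked by converting r to a t statistic. The sketch below is an illustration under stated assumptions rather than the author's computation: it assumes all 30 subjects contributed to the correlation (so df = 28) and a two-tailed test. The critical value is a standard t-table figure supplied here for comparison; it does not appear in the article, and the names used are hypothetical.

```python
import math


def t_from_r(r: float, n: int) -> float:
    """Convert a Pearson correlation to a t statistic with n - 2 df."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


r = 0.59  # correlation between overall self-esteem score and SEI (from the text)
n = 30    # number of subjects (from the text)
t_value = t_from_r(r, n)

# Two-tailed critical t for alpha = .001 with 28 df, from a standard t table
# (an outside reference value, not a figure reported in the article).
T_CRIT_001_DF28 = 3.674

print(f"t({n - 2}) = {t_value:.2f}")
print("exceeds the .001 critical value:", t_value > T_CRIT_001_DF28)
```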

Even though none of the analyses of the various
subscores for positive aspects of the self
were significant, significant differences were
found on the negative self-regard score, F(2, 27)
= 10.244, p < .001, on saliency ratings for negative
characteristics, F(2, 27) = 6.652, p < .01,
and on the number of negative characteristics
mentioned, F(2, 27) = 3.806, p < .05.

The high-self-esteem subjects’ low negative 
ratings seem to indicate that they do not experi- 
ence their negative characteristics in the same 
way that others do, and that the important dif- 
ference between individuals differing in self- 
esteem is the amount of negative, rather than 
positive, self-regard that they experience. 

The pattern of results for the medium group
is noteworthy, since they frequently scored highest
on both negative and positive measures. These
findings may indicate ambivalence and/or
uncertainty.
Changes in Preferences for Male and Female Counselors


Elaine F. Walker and Jayne E. Stake
University of Missouri—St. Louis


Past studies have indicated a preference for male therapists among both client and
nonclient samples. To test for possible changes in preferences since the time of these
studies, 53 male and 76 female applicants for counseling and 140 male and 150 female
nonclient undergraduates completed a university counseling service application form
that included a question regarding preference for sex of therapist. Although more
clients than nonclients expressed preferences, results from both groups indicated a
decrease in male counselor preferences.


Requests for reprints should be sent to Jayne E.
Stake, Department of Psychology, University of
Missouri—St. Louis, St. Louis, Missouri 63121.


Past studies of preferences for counselors have
indicated that more subjects preferred a male
therapist than a female therapist. This finding has
[held] for both male and female clients (Boulware
& Holmes, 1970; Fuller, 1964) and male and
female nonclients (Fuller, 1964). One exception
to this rule has been the tendency of female nonclient
subjects to indicate a preference for a female
counselor for some counseling problems
(Fuller, 1964).

Boulware and Holmes (1970) reported that
subjects expected male therapists to be
more empathic, more knowledgeable, more experienced,
and better adjusted; and, presumably,
these expectations were the basis for male preferences.
However, recent sex role research suggests
that previously held notions about the
superior competence of male professionals are
changing (Chobot, Goldberg, Abramson, &
Abramson, 1974; Levenson, Burford, Bonno, &
Davis, 1975). If this is the case, then a decrease
in preferences for male counselors may be expected.
The purpose of this study was to explore
possible changes in client and nonclient preferences
for male and female counselors.

Subjects were 290 undergraduates enrolled
in [...] social science courses (140 males
and 150 females) and 129 applicants to a university
counseling service (53 males and 76
females). [Subjects] were predominantly from a
[...] urban-suburban background.
[Clients were] asked to state their counselor preference
as a part of their applications to a
[...]. As in Fuller’s study (Fuller,
1964), subjects could indicate a male preference,
a female preference, or no preference. Nonclient
subjects were tested in groups of 15-35 by either
a male or female experimenter. They were asked
to fill out portions of the application form completed
by the clients as though they were applying
for counseling.

The percentages of client and nonclient sub- 
jects preferring a male or female therapist and 
the percentages having no preferences are pre- 
sented in Table 1. 

Two questions regarding these sex preferences 
were considered. First, were subjects more likely 
to have a preference or not? Second, of those 
subjects who did state a preference, were they 
more likely to state a preference for a male or 
for a female counselor? 

The number of male nonclients stating a preference
was significantly smaller than the number
stating no preference, χ²(1) = 28.35, p < .001;
the number stating a preference for a male counselor
was significantly smaller than the number
stating a female preference, χ²(1) = 7.61, p <
.01. In the case of female nonclients, the number
stating a preference was significantly smaller
than the number stating no preference, χ²(1) =
21.66, p < .001; among those who did state a
preference, an approximately equal number preferred
each sex. In summary, over 70% of the
nonclients gave no preference, and a preference
for a male counselor was given by only 12% of
the total group.

To determine if the presence of female experimenters
in this study accounted for differences
between the results of this and earlier
studies (which included male experimenters
only), the relationship between sex of experimenter
and preference (male, female, or none)
was tested by a chi-square test of independence.
This relationship was not significant; hence, the
decrease in preference for males cannot be explained
by the presence of female experimenters.

Table 1
Percentages of Nonclients and Clients Expressing Preferences for Counselors

                    Counselor preference
Subjects         Male     Female     None       n
Nonclient
  Male           7.14     20.00     72.86     140
  Female        16.00     14.67     69.33     150
Client
  Male          18.87     16.98     64.15      53
  Female         3.95     46.05     50.00      76

Although male clients tended not to give a
preference, the difference in number giving and
not giving a preference was not significant;
among males with a preference, an approximately
equal number stated a preference for each sex.
Among female clients, half stated a preference,
and more of these subjects preferred a female
counselor, χ²(1) = 25.29, p < .001. These findings
also indicate a decrease in preference for
males; in the total client sample, only 10% preferred
a male counselor.
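
The one-degree-of-freedom chi-square values reported for the preference comparisons can be approximated from Table 1. The sketch below is an illustration under assumptions, not the authors' stated procedure: cell counts are recovered by multiplying each Table 1 percentage by its group n, each test is treated as a goodness-of-fit test against an even split, and Yates' continuity correction is applied. The article does not say that a correction was used; the reported values happen to be reproduced when it is, which suggests but does not confirm it. All function and variable names are hypothetical.

```python
def yates_chi_square(count_a: int, count_b: int) -> float:
    """One-df goodness-of-fit chi-square against an even split,
    with Yates' continuity correction."""
    expected = (count_a + count_b) / 2.0
    return sum((abs(observed - expected) - 0.5) ** 2 / expected
               for observed in (count_a, count_b))


def count_from_percent(percent: float, group_size: int) -> int:
    """Recover a cell count from a Table 1 percentage and group size."""
    return round(percent * group_size / 100.0)


# (percent male, percent female, percent none, n), copied from Table 1.
TABLE_1 = {
    "nonclient male": (7.14, 20.00, 72.86, 140),
    "nonclient female": (16.00, 14.67, 69.33, 150),
    "client female": (3.95, 46.05, 50.00, 76),
}

# Derived counts (an assumption: the article reports percentages only).
counts = {
    group: tuple(count_from_percent(p, n) for p in (male, female, none))
    for group, (male, female, none, n) in TABLE_1.items()
}

m_male, m_female, m_none = counts["nonclient male"]
f_male, f_female, f_none = counts["nonclient female"]
c_male, c_female, c_none = counts["client female"]

results = {
    "male nonclients, any preference vs. none":
        yates_chi_square(m_male + m_female, m_none),
    "male nonclients, male vs. female counselor":
        yates_chi_square(m_male, m_female),
    "female nonclients, any preference vs. none":
        yates_chi_square(f_male + f_female, f_none),
    "female clients, male vs. female counselor":
        yates_chi_square(c_male, c_female),
}

for label, value in results.items():
    print(f"{label}: chi-square(1) = {value:.2f}")
```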


Differences in the preferences of client and
nonclient groups were tested in 2 × 2 chi-square
analyses. The client group was more likely to
state a sex preference than was the nonclient
group, χ²(1) = 29.26, p < .001; among those
stating a preference, clients were more likely
than nonclients to state a preference for a same-sex
counselor, χ²(1) = 21.89, p < .001. Hence,
the client group did tend to express more concern
for the sex of their counselors than did the
[...] they may reflect a difference in the type of
undergraduates included in the client and nonclient
groups.

Although the preferences of the nonclient
sample do not coincide closely with the preferences
of the prospective clients, the nonclients
in the present study can be compared to non- 
clients in earlier studies (Boulware & Holmes, 
1970; Fuller, 1964). Fewer nonclients in the 
present study indicated a preference for male 
counselors. This finding suggests that most 
present-day undergraduates do not view male 
counselors as superior to female counselors. 

The client group is more clearly representative
of applicants for counseling. On the basis of
the client sample, it appears that a substantial
proportion of prospective clients do have preferences;
however, in contrast to earlier findings,
most of the female clients who indicated a preference
wished to see a female counselor, and
almost half of the male clients also preferred a
female counselor. This change suggests that
female counselors are now being viewed more
positively by applicants for counseling.
Imaging Vividness and the Outcome of in Vivo and
Imagined Scene Desensitization


John M. Dyckman and Philip A. Cowan
University of California, Berkeley


This study reexamined the role of imaging vividness in desensitization success. Scores
on the Betts Questionnaire on Mental Imagery were used to divide 48 snake-phobic
subjects into high, medium, and low vivid groups, who were assigned to imagined
scene or in vivo desensitization treatments. Imaging vividness was assessed at scheduled
points during therapy. Significant decreases in behavioral and self-reported fear
were observed after both treatments, though in vivo desensitization produced significantly
greater fear reduction. Intherapy imaging vividness scores were significantly
correlated with therapeutic success and were superior to pretherapy ratings as
predictors of outcome.


Requests for reprints and for an extended report
of this study should be sent to John M. Dyckman,
[who is now] at the Psychiatry Clinic, Kaiser Medical
[Center, ...] Sereno Drive, Vallejo, California 94590.


Imagination of fear-relevant scenes is central
to most systematic desensitization procedures.
However, previous investigators have failed to
find a positive relation between imaging vividness
and desensitization success. To examine this apparent
paradox, we refined the method of previous
studies and conducted the following experiment.

Forty-eight snake phobics were pretested to
establish initial fear level. A behavioral avoidance
test, self-ratings of fear during this test, and a
questionnaire on attitudes toward snakes were
used to assess fear of snakes, and the Fear Survey
Schedule (Wolpe & Lang, 1964) was used
to measure general fearfulness. Subjects were
[then] separated into high, medium, and low
[imaging] ability groups on the basis of scores on
the Betts Questionnaire on imagery vividness,
and were assigned to one of two standardized,
[...]tally administered desensitization procedures:
imagined scene (conventional) or in vivo
desensitization. Subjects in in vivo treatment
were [...] to enact, rather than imagine, each
hierarchy scene. Imaging vividness during therapy
was assessed by subject self-ratings at 19 scheduled
points. Posttesting was scheduled between
[...] and 3 days after completion of the treatment,
[...] and self-report measures were [...]

Analysis of variance of Treatment × Pretherapy
Imaging Vividness Condition indicated no
significant differences among the groups on initial
fear levels. Repeated measures analyses of variance
revealed that mean posttherapy levels of
fear were significantly lower than at pretesting on
all measures. Snake phobics did not show significant
improvement on repeated testing alone
(McLemore, 1972), so the observed changes may
be attributed to the treatment.

Analyses of covariance with pretherapy levels of the
dependent measures as covariates showed that in vivo
therapy produced significantly greater mean fear
reduction than imagined scene desensitization on all
measures except the Fear Survey Schedule. In vivo
treatment was also a more stressful experience for the
subjects, who required on the average more scene
presentations to complete the hierarchy (for in vivo,
M = 81.1; for imagined scene, M = 79.4), t(46) = 2.24,
p < .05.

In imagined scene desensitization, pretherapy imaging
vividness scores correlated -.41 (p < .05) with
posttherapy avoidance of a live snake but
nonsignificantly with all other measures. Imaging
vividness measured during therapy, however, was highly
associated with improvement, correlating with
posttherapy levels of the dependent variables as
follows: -.70 (p < .001) with behavioral avoidance of a
live snake, -.60 (p < .01) with self-rated fear during
the behavioral test, -.49 (p < .05) with attitudes
toward snakes, and -.02 (ns) with the Fear Survey
Schedule.

These results suggest several conclusions. Pretherapy
imagery inventories like the Betts are currently of
limited utility in predicting desensitization success,
but imaging performance during therapy is important.
Betts scores are useful in screening out very low vivid
imagers (in our sample, those who scored above 102.5).
These patients can benefit greatly from the extra time
and expense required to arrange in vivo treatment. The
Betts, or any pretherapy imaging measure, should
include items taken directly from the proposed
hierarchy to assess the subject’s ability to clearly
imagine stressful material. Imaging ability is clearly
related to the outcome of imagined scene
desensitization. The failure of previous studies to
disclose this relation may stem from a failure to
assess imaging activity during treatment, and it
reminds us that psychotherapy research must combine
examination of process with evaluation of outcome to
produce meaningful results.


References 


McLemore, C. W. Imagery in desensitization. Behaviour
Research and Therapy, 1972, 10, 51-57.
Wolpe, J., & Lang, P. J. A fear survey schedule for
use in behaviour therapy. Behaviour Research and
Therapy, 1964, 2, 27-30.


Received July 26, 1977


Journal of Consulting and Clinical Psychology
1978, Vol. 46, No. 5, 1157-1159


Psychophysiology of Fear Imagery: Differences Between 
Focal Phobia and Social Performance Anxiety 


Theodore C. Weerts and Peter J. Lang 


University of Wisconsin 


Spider phobics and speech anxious subjects imaged fear scenes with spider and public- 
speaking content and a series of standard scenes that were constructed to vary in 
degree of emotional arousal and movement. Heart rate, skin conductance, and ocular 
activity were recorded. Spider phobics rated all imagery contents as more vivid and 
reported more scene movement than speech anxious subjects. Both groups responded 
to their own fear scenes with higher ratings of emotion and a greater physiological 
response than to the other group’s fear scenes. The arousal response of spider phobics 
to relevant fear scenes was greater than that of speech anxious subjects. The data 
suggest that the outcome of imagery-based therapies may be partly determined by
type of fear.


This project was supported in part by National Institute of Mental Health
Grant 10993.

Requests for reprints and an extended report of this study should be sent to
Peter J. Lang, Department …


The present research reexamined the finding (Lang, Melamed, & Hart, 1970) that
small animal phobics and subjects with anxiety over public speaking differ in the
vividness and intensity of their emotional imagery. Specifically, it was designed
to determine if focal phobics give higher vividness and emotional arousal ratings
than subjects with performance anxiety, are more responsive physiologically
irrespective of scene content, or whether they show this greater response only to
emotionally activating or focal fear material. It was also designed to examine
projected … movement in fear imagery, as rapidity of … change is associated with
the genesis … of small animal fears, and to consider … anxiety and general
physiological reactivity as factors prompting imagery differences between fear
groups.


Method


Two groups of undergraduate subjects were studied. The spider-fearful group
comprised … females and 2 males with a maximum response on the Fear Survey Schedule
(FSS) and with Spider Questionnaire scores at the … percentile or above (all of
these subjects professed no speech anxiety on the FSS and were below the Speech
Anxiety Questionnaire median). The speech-anxious group comprised 13 females and
3 males fearful of public speaking on the FSS and from the top 25% of students in
public-speaking fear (reporting no spider fear on the FSS and scoring below the
median on the Spider Fear Questionnaire; Klorman, Weerts, Hastings, Melamed, &
Lang, 1974).

All subjects visualized eight standard scenes, two from
each of four content categories: scenes (a) high in
emotional arousal and high in movement (e.g., panic in
a burning movie theater); (b) high in movement and low
in arousal; (c) low in movement and high in arousal;
and (d) low in both movement and arousal (e.g., sitting
at a table in the library). All subjects were also
administered four fear scenes, two relevant to spider
phobia (e.g., a spider crawling up your sleeve) and two
related to performance anxiety (e.g., presenting a
report in class).

The scenes were presented as text on a computer
oscilloscope display with a 30-sec eyes-closed period
for visualization. After each scene, subjects reported
emotional arousal during the image, perceived movement,
and image vividness.

During all scenes, horizontal and vertical eye 
movements, skin conductance, and heart rate 
(HR) were continuously monitored. In addition, 
spontaneous skin conductance activity and habitu- 
ation to a 100-dB (A) tone were assessed. 


Results 
The two fear groups did not differ in Taylor Manifest
Anxiety Scale (TMAS) scores, in spontaneous skin
conductance activity, or in skin conductance response
(SCR) habituation to pure tones.


Standard Scene Ratings


There was no significant group difference in 
report of emotional arousal to the standard 
scenes (F <1). However, spider phobics gave 
significantly higher vividness ratings than did 
speech anxious subjects to all scenes, F(1, 25) = 
7.47, p <.02. Spider phobics also tended to see 
more movement in the standard scenes than sub- 
jects with performance anxiety, F(1, 25) = 3.63, 
p<.10. 

All subjects found the low-arousal scenes to be 
less emotionally stimulating, F(1, 25) = 99.36, 
p < .001, and marginally more vivid than the 
high-arousal scenes, F(1, 25) = 4.14, p<.06. 
The highest vividness ratings were generated by 
high-movement scenes when the arousal proper- 
ties of the scene were low. High movement 
in combination with high arousal produced the 
lowest vividness ratings, Arousal X Movement 
interaction F(1, 25) = 7.00, p <.03. The latter 
findings are consonant with Lang's (1977) view 
that multiple response propositions increase the 
information-processing load in imagery and could 
thus attenuate reported vividness. 

High-TMAS subjects regardless of their specific fear
tended to give similar vividness ratings to all scenes;
low-anxious subjects showed a specific reduction in
vividness reports to high-arousal scenes, Emotion ×
Group interaction F(1, 32) = 6.07, p < .025. This
finding suggests that the lower mean vividness ratings
found for arousing stimuli than for neutral stimuli in
the total sample may be due to the low-anxiety
subjects, who are less likely to spontaneously image
arousal contents and are thus less practiced in this
task.


Fear Scene Ratings


The four fear scenes were administered ran- 
domly among the standard scenes but were sub- 
jected to a separate statistical analysis. The 
spider phobics rated fear scenes as more vivid 
than did public-speaking anxious subjects, 
F(1, 25) = 13.74, p < .05. The spider phobics 
also rated spider scenes as significantly more 
vivid than the public-speaking scenes; the two 
fear scene types did not differ in vividness for 
the speech anxious, and both their ratings were 
similar in level to the ratings assigned by spider 
phobics to speech anxiety scenes, Scene X Groups 
interaction F(1, 25) = 5.54, p < .05. 

Spider phobics tended to rate both fear-relevant and
fear-irrelevant scenes as more arousing than did speech
anxious subjects, F(1, 25) = 12.53, p < .005. However,
rating differences between the two scene types were
about equal for the two fear groups, and both groups
assigned emotional arousal ratings to their
fear-irrelevant scenes similar to those given to their
standard low-arousal scenes.

Spider phobics found significantly more movement in
their fear-relevant scenes than they did in the speech
scenes, and they found more in these spider scenes than
the speech anxious reported experiencing in either
scene type, Scene × Group interaction F(1, 25) = 15.67,
p < .001.


Physiological Response to Fear Scenes 


Eye movements did not vary significantly with either
scene type or phobic group. However, both groups tended
to produce faster HR, F(1, 23) = 7.34, p < .03, and a
greater total SCR, F(1, 25) = 7.40, p < .025, to their
fear-relevant scenes than to fear-irrelevant scenes.
Furthermore, the spider phobics had higher mean
responses in their relevant fear imagery (HR = 78.3
beats/min, SCR = .015) than the speech anxious in their
relevant scenes (HR = 74.5 beats/min, SCR = .008).


Discussion

In summary, the data support the hypothesis that small
animal phobics generate more vivid imagery than
subjects with social performance anxiety. The
fear-relevant images of spider phobics are not only
more vivid, but they also prompt somewhat stronger
physiological response and higher ratings of affect.
The results are consistent with the notion that object
movement may play a role in small animal fears. The
differences between fear groups were unrelated to
differences in anxiety or general patterns of
physiological reactivity. They are consistent with the
hypothesis that focal phobics’ … experiences of fear
situations, and perhaps their potential therapeutic
responses, are … different from those of socially
anxious subjects. These results prompt us to reconsider
research on imagery therapy that was … only one of
these populations.
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Correlations and Factor Analysis of the 
WISC-R and the Peabody Picture Vocabulary Test 
for an Adolescent Psychiatric Sample 


Allan DeHorn and Valerie Klinge
Department of …, Lafayette Clinic, Detroit, Michigan


Wechsler Intelligence Scale for Children-Revised (WISC-R) and Peabody Picture
Vocabulary Test (PPVT) scores of … adolescent psychiatric patients were correlated
… on the factors emerging from … battery. Three factors … reported factors …
suggesting that the PPVT adds little to information gained from the …

95.03 (SD = 18.90). The difference between the two
scores was significant using a t test for correlated
means, t(99) = 3.70, p < .001.

A correlational matrix was computed for the WISC-R IQ
scores and subtests and the PPVT … The
intercorrelations among the WISC-R … scores were all
significant (p < .05), except that Digit Span was not
significantly related … (r = .10, p > .05) or to
Picture Arrangement (r = .15, p > .05). The significant
… coefficients of the WISC-R … ranged from .19 to .79.
The correlation coefficients of the WISC-R scores with
… IQ ranged from .21 to .79 and were … (p < .05, at
least), including those … PPVT-B IQ and WISC-R Full
Scale … (… p < .001), the WISC-R Verbal IQ (… p < .001),
and the WISC-R Performance … (r = .65, p < .001).

… principal components factor analysis … a three-factor
varimax rotation, three factors emerged that accounted
for 93.8% of the … presents the complete … three
factors and identifies those variables that exceeded
the loading criterion of .40. Factor 1, Verbal
Comprehension, accounted for 78.7% of the variance …
determination. Factor 2, Perceptual Organization, … for
an additional 8.7% … Comprehension and … for another
6.5% of the … seen, the PPVT-B IQ was … on both
Factors …


… Analysis of the WISC-R Scores and PPVT IQ

                                  Factor
Test                        1        2        3
Information               .824     .290     .142
Similarities              .750     .447     .216
Arithmetic                .444     .395     .489
Vocabulary                .822     .310     …74
Comprehension             .649     .375     .449
Digit Span                .234     .095     .370
Picture Completion        .305     .683    -.007
Picture Arrangement       .375     .577     .122
Block Design              .368     .708     .234
Object Assembly           .197     .760     .128
Coding                    .272     .302     .572
PPVT-B IQ                 .748     .443     .048

Note. WISC-R = Wechsler Intelligence Scale for
Children-Revised; PPVT = Peabody Picture Vocabulary
Test. The loading criterion was .40.


The factors that emerged are very similar to those
reported from other samples (Kaufman, 1975; Van Hagen &
Kaufman, 1975). The first and second factors, Verbal
Comprehension and Perceptual Organization, appear
consistently in … studies and offer construct validity
to the … of Verbal and Performance scores on the
WISC-R. The third factor, including … and Coding, is
very similar to the Freedom From Distractibility factor
described by Kaufman (1975), which also includes Digit
Span. These findings suggest that the … scores of an
adolescent psychiatric sample can be compared
meaningfully to those of retarded or normal children,
since there seem to be no qualitative differences in
the structure of the intellectual abilities tested
among the three groups.

The substantial loadings of the PPVT-B IQ on the two
main factors in this study suggest that the PPVT adds
little to the knowledge gained from the WISC-R.
Although the correlation between the PPVT-B IQ and the
WISC-R Full Scale IQ is rather high, studies using
other than correlational designs indicated that the
PPVT IQ should not be used as an estimate of the WISC-R
IQ (see, e.g., Condit et al., 1976).
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"X my”) responses as described by Blatt and 
Ritzler (1974). Additionally, responses that were not
clearly transparencies (for example, “splattered
paint”) but that seemed to conform to Blatt and
Ritzler’s interpretation of transparencies as “an
attempt to establish three-dimensional representations,
but without the capacity to represent volume” (p. 282)
were scored as criterial. All Rorschach protocols were
scored independently by two judges.

When the number of transparency, translucency, and
cross-sectional responses were totaled for each
patient, the suicide group showed a significantly
greater number of criterial responses (M = 3.00) than
did the control group (M = 1.78); one-tailed t test for
matched pairs, t(13) = 2.33, p < .05. The two patient
groups were not found to differ when transparencies
(including translucencies) and cross-sectional
responses were analyzed separately.

These results replicate, in large part, the findings of
Blatt and Ritzler (1974). That Blatt and Ritzler found
differences between groups when analyzing transparency
(and translucency) and cross-sectional responses
independently while we found significant group
differences only when combining these responses does
not seem critical. Both sets of responses were seen by
Blatt and Ritzler as exemplifying the same formal
property: an inadequate representation of volume. Our
findings, in cross-validating those of Blatt and
Ritzler, provide evidence that a single sign on the
Rorschach may be a useful predictor of suicide.

Roth and Blatt (1974), in considering why
transparencies are given primarily by suicidal
individuals, reasoned that this response reflected a …
or collapse of three-dimensional representation
paralleled by a loss of self-other differentiation,
leading to the self and other becoming …ed as objects
of aggression. This interpretation is consistent with
the findings that suicidal … give transparency
responses on the Rorschach; however, it does not
explain specifically why patients who complete suicide
give transparency responses more frequently than
patients who may threaten or attempt suicide.

We suggest an alternative model for understanding the
association of suicide and transparencies. This model
is admittedly speculative, but is consistent with some
clinical literature and subject to empirical
validation.

Since the characteristic that most obviously
distinguishes completed suicides is that they are
responsible for their own death, our approach
has been to consider the meaning of death for 
these individuals. In this regard, Morse (1973) 
has described suicide-promoting fantasies. In these 
fantasies, according to Morse, death is conceptu- 
alized as a means for satisfying important wishes, 
rather than an end in itself. A presupposition of 
suicide-promoting fantasies is that the individual 
will be alive after he/she kills himself/herself. 
Morse suggests that even though all humans are 
incapable of truly conceiving of their own death, 
suicides are distinctive in the degree to which 
they accept their after-death fantasies as real. 

We suggest that transparency responses on 
the Rorschach (and, more generally, inadequate 
representations of volume) can be understood in 
the same terms as suicides’ conceptions of death. 
Just as death seems to be viewed as a transitional
phase rather than as a definite end, so too
transparencies exemplify this same penetrable quality:
An object is not represented as solid and bounded but
as something insubstantial or pregnable.

At present, there is a dearth of systematic 
research assessing the conceptions of life and 
death held by suicidal and nonsuicidal individu- 
als. Empirical assessment of the hypothesis of a 
relationship between transparencies on the Ror- 
schach and attitudes toward death is one clear 
direction for research that would facilitate an 
understanding of the present findings and perhaps 
suggest further avenues of investigation.
Premorbid Social Competence Construct Generalizability Across Ethnic
Groups: … Premorbid Social Competence Components


M. Costello
University of Texas Health Science Center at San Antonio
Veterans Administration Hospital, San Antonio, Texas

The generalizability across ethnic groups of interrelations among five variables
was studied using path analysis. Certain findings involving four variables were
consistent with theoretical expectations for the Anglo-American group but were
reversed in the Mexican-American group. … and theoretical implications are
presented.


… standard causal order, and a standard score
transformation for each variable for each group
preceded analysis. A correlation of … was required for
significance. Ten bivariate relationships were
decomposed. Nine of 42 Mexi…

… across ethnic groups were observed between education
and symptom severity; education and behavioral
adjustment; and first hospitalization age and symptom
severity. In each …, the sign of the zero-order
correlation, of the … causal effect, and of the total …
reversed across ethnic groups … magnitude of
correlation were … Relationships replicable … were
observed between … age and first admission age.
Nonsignificant relationships … were observed for the
remaining five variable …

As education and first admission age are two components
of the premorbid social competence construct (Zigler &
Phillips, 1961), this study has implications for
psychosocial developmental theory. A successful
developmental history (i.e., higher education and older
age at first psychiatric “breakdown”) should be
associated with a good … (i.e., less severe
symptomatology and … adjustment while hospitalized). …
expectations were confirmed for … three statistically
significant … Mexican American group, they … reversed
in sign. Better educated Anglos tended to report fewer
alcohol-related symptoms and were observed to make
better verbal and social … with other patients than did
poorer educated Anglos. Anglos who were older at first
psychiatric admission tended to report fewer …-related
symptoms on current admission than did Anglos who were
younger at first admission. The obverse was true for
each phenomenon within the Mexican-American group.

… can only speculate about the meaning of …
generalizability. In the Anglo-American group,
education is obviously an additional adaptational tool.
In the Mexican-American group, … a low-socioeconomic
laboring class …, however, education may have been as …
an adaptational liability as an asset, as more educated
people may have been less … to be satisfied in laboring
occupations. Or, Mexican Americans who were in a
position to acquire a more advanced education may have
experienced additional sociopsychological … not
ordinarily encountered in their ethnic … as a result of
socioeconomic/occupational upward mobility.

… regard to experimental methodology in … comparisons,
techniques such as matching or analysis of covariance
would be inappropriate controls of extraneous subject
characteristics variance if the matching
variable/covariate(s) show ethnic unreliability, as was
found in the present study.


Table 1
Decomposition of Statistically Significant Bivariate Relationships for Each Ethnic Group

                                  Causal
  (A) Total      (B) Direct     (C) Indirect    (D) = B + C     Noncausal
  covariation                                                    = A - D
   MA     AA      MA     AA      MA     AA       MA     AA       MA     AA
   .24   -.44     .36   -.42    -.12   -.02      .24   -.44     None   None
  -.01    .30    -.07    .30     .06    .00     -.01    .30     None   None
   .33   -.24     .34   -.31     .08    .08      .42   -.23     -.09   -.01
  -.34   -.17    -.21   -.17    -.13    .00     -.34   -.17     None   None
   .67    .64     .63    .64    None   None      .63    .64      .04    .00

Note. MA = Mexican American; AA = Anglo American. The five variables were
education, age at first psychiatric admission, current age, alcoholism symptom
severity, and behavioral adjustment.
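
The decomposition columns in the table follow a simple rule: the total causal effect (D) is the sum of the direct (B) and indirect (C) effects, and the noncausal component is what remains of the total covariation (A) after D is removed. The short Python sketch below only illustrates that arithmetic. The function name `decompose` and the handling of a "None" entry as zero are assumptions made for illustration, not part of the original analysis, and the input values are as read from the third and fifth rows of the Mexican-American columns.

```python
from typing import Optional


def decompose(total: float, direct: float, indirect: Optional[float]) -> dict:
    """Split a zero-order correlation into causal and noncausal parts.

    total    -- (A) total covariation (zero-order correlation)
    direct   -- (B) direct causal effect
    indirect -- (C) indirect causal effect, or None when the table lists none
                (treated as zero here, which is an assumption)
    """
    total_causal = direct + (indirect or 0.0)  # (D) = B + C
    noncausal = total - total_causal           # A - D
    return {
        "total_causal": round(total_causal, 2),
        "noncausal": round(noncausal, 2),
    }


# Third row, Mexican-American columns: A = .33, B = .34, C = .08
row3_ma = decompose(0.33, 0.34, 0.08)

# Fifth row, Mexican-American columns: A = .67, B = .63, no indirect effect
row5_ma = decompose(0.67, 0.63, None)

print(row3_ma)
print(row5_ma)
```

Both results agree with the tabled entries (.42 and -.09 for the third row, .63 and .04 for the fifth), which is also how ambiguous signs in a damaged copy of the table can be cross-checked.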

The obvious conclusion is that when predictor or
criterion variables (and the theory by which they are
subsumed) that derive from work on ethnically
homogeneous or ethnically unspecified groups are to be
used in work with groups of specified ethnic
composition different from the derivation group, the
variables must first be studied with regard to their
freedom from ethnic bias (i.e., generalizability).
Failure to do so may lead to distorted findings with
possible social ramifications. Second, the
path-analytic formulation has good potential as an
instructional device.
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Effects of the Sex of Both Interviewer and Subject 
on Reported Manifest Dream Content 


Ross B. Kremsdorf, Lucy J. Palladino, Douglas D. Polenz,
and Barbara J. Antista
Arizona State University


The present study examined the effects of the sex of both interviewer and subject
on the reported content of dreams. Three male and three female interviewers each
interviewed five male and five female subjects to elicit dream reports. In contrast
to previous findings, no sex differences in the sexual content of dreams were found,
although the dreams of male subjects were more vivid, active, and aggressive.
Opposite-sex pairing mobilized reports of conflict within dreams, whereas same-sex
pairing increased the sexual content. These results support the hypothesis that
environmental factors are …
Requests for reprints and an extended report of this study should be sent to
Ross B. Kremsdorf, Department of Psychology, Arizona State University, Tempe,
Arizona 85281.

Copyright 1978 by the American Psychological Association, Inc.
0022-006X/78/4605-1166$00.75


What is often thought to be common to waking fantasy processes and dreams is that
they both involve the psychological transformation of stored information with a
relative lack of attention to external input. However, Whitman, Kramer, and
Baldridge (1963) maintain that the conditions … little attention has been given to
the effects of the sex of the interviewer on the dream content that is reported.
Those studies … have found that this variable can influence such diverse …
acquiescence, need affiliation, … conditioning, and experimenter bias. The sex of
the interviewer has also … both separately and in interaction, … sex of the
participants. … dream content have commonly … differences between the dreams of
male and female subjects. In the most well-known investigation, Hall and Van de
Castle (1966) found dreams of males to be generally more active, to contain more
male characters than female characters, to exhibit more aggression, and to involve
more sexual content than those … females when dream diaries were submitted to a
male … Similar results have been reported … literature, though in almost every
study of dream content male interviewers were used. There has been no controlled
experimental attempt to assess the effects of the sex of both participants on the
dream content reported.

The purpose of this experiment was to systematically examine the subsequent effects
on reported dream content when male and female subjects were interviewed by either
male or female interviewers. This design allowed for observing whether the
differences generally reported between the dreams of male and female subjects might
be related to the failure of these studies to use interviewers of both sexes.
Special emphasis was given to those dream content ratings considered to reflect the
most basic dimensions of the dream (Hauri, Sawyer, & Rechtschaffen, 1967) …

The subjects were 30 male and 30 female undergraduate students enrolled in
introductory psychology courses. Male subjects ranged in age from … to 30 years,
with a mean age of 21.2 years. Female subjects ranged in age from 17 to … years,
with a mean age of … years. Interviewers were three male and three female clinical
psychology graduate students of approximately the same age and therapeutic
experience.

Each interviewer was randomly assigned five male and five female subjects and
requested each subject to report the content of the last dream … could recall.
After allowing the subject to … the spontaneous dream report, the interviewer
clarified any ambiguities. All interviews were audio recorded. Most of the dream
content dimensions used in this investigation were … from the factor analytic study
of dream … by Hauri et al. (1967). The six independent dream content dimensions
were vivid fantasy, hedonic tone, active control, verbal aggression, physical
aggression, and sexuality. Affect in the dream reports was also assessed by the
content analysis scales developed by Gottschalk, Winget, and Gleser (1969). These
dimensions were anxiety, hostility directed outward, hostility directed inward, and
ambivalent hostility.

Because of the relatively good reliabilities obtained on this free-response
material, all data … used mean ratings of the four raters. Overall, the dreams
reported by male and female subjects did not differ in either the intensity or … of
sexual content. However, same-sex pairing of subjects and interviewers resulted in
more sexual content, an interaction approaching significance, F(1, 56) = 3.82,
p < .10. The dreams reported by male subjects were more vivid, F(1, 56) = 6.76,
p < .025; active, F(1, 56) = ….58, p < .05; and physically aggressive,
F(1, 56) = 4.36, p < .05; than those of females. In addition, the dreams of males
exhibited more subjective impact than the dreams of females, F(1, 56) = 5.40,
p < .025. Opposite-sex pairing of subjects and interviewers resulted in dream
reports that reflected greater anxiety, F(1, 56) = 6.40, p < .025; hostility
directed inward, F(1, 56) = 6.84, p < .025; and ambivalent hostility,
F(1, 56) = 2.78, p < .10.
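
The error degrees of freedom in the F ratios above are consistent with a 2 (subject sex) × 2 (interviewer sex) between-subjects layout with 60 subjects, although the report does not name the analysis explicitly. The following Python sketch works through that bookkeeping under that assumption; the function name `factorial_2x2_df` and the dictionary keys are illustrative only.

```python
def factorial_2x2_df(n_subjects: int) -> dict:
    """Degrees of freedom for a 2 x 2 between-subjects factorial design."""
    cells = 2 * 2  # subject sex (2 levels) crossed with interviewer sex (2 levels)
    return {
        "subject_sex": 1,               # levels - 1
        "interviewer_sex": 1,           # levels - 1
        "interaction": 1,               # (2 - 1) * (2 - 1)
        "error": n_subjects - cells,    # N minus the number of cells
    }


# Six interviewers, each interviewing five male and five female subjects
n_subjects = 6 * (5 + 5)
df = factorial_2x2_df(n_subjects)
print(n_subjects, df["error"])
```

Under this reading, every effect is tested on 1 and 56 degrees of freedom, matching the F(1, 56) values reported.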

In contrast to previous assertions, the results of this
study argue against the assumption that …al differences
between the sexes determine the … of sexual content
within dreams. Environmental factors such as the sex of
the individual to whom the dream is reported seem
influential in the resulting dream report. It is
difficult to determine whether the differences be- 
tween these results and those of studies conducted 
over 20 years earlier are due to cultural changes 
in sex roles or to the fact that earlier studies 
used only male interviewers. The other differ- 
ences found between the dream reports of males 
and females are consistent with similar com- 
parisons made in a number of studies over the 
last 30 years. This does raise the question of 
whether some basic dream processes are some- 
how different between the sexes. 

The results pertaining to opposite-sex pairing 
and dream reports support the assertion that to 
stimulate conflict for therapeutic purposes, one 
should expose the individual to a member of the 
opposite sex. However, these findings also suggest that
when sexual problems are the area of concern, pairing a
client with a therapist of the same sex may facilitate
discussion. Finally, the researcher of dream content
and the clinician need
to consider that the dream report that they col- 
lect is not immune to the effects of the setting 
in which it is reported. 
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Androgyny and Self-Esteem in the Upper-Middle Class: 
A Replication of Spence 


Kevin O'Connor, David W. Mann, and Judith M. Bardwick
University of Michigan


A middle-aged upper-middle class sample was used in a replication of Spence,
Helmreich, and Stapp’s study of androgyny and self-esteem in an undergraduate
sample. The earlier findings were largely replicated. Self-esteem scores for the men
were substantially higher than those found by Spence et al., but the earlier
relationships of androgyny, masculinity, and femininity with self-esteem received
support. Implications of the characteristics of the sample for the generality of
current sex role and sex identity research are discussed.
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individual regardless of sex (e.g., Bem, 1977; 
Spence et al., 1975). 

Our purpose was to see whether our more tra- 
ditional, established, conservative sample would 
produce the pattern of sex role attribute ratings
and the relationship of androgyny with self-esteem
reported by Spence et al.

Specifically, Spence et al. reported findings
from several sets of analyses. We were concerned
with the following:

1. Spence et al. compared the mean scores of
men and women on each of their measures: sex
role attribute ratings (endorsement of “valued”
sex-typed attributes), measures of the willingness
to describe “typical” men and women in
sex-stereotyped terms, a self-esteem scale, and a scale
dealing with feminist attitudes toward
women. They reported significant sex differences
in all but the self-esteem scores. We tested for
sex differences using the same measures in our
sample.

2. Spence et al. correlated their measures. They
reported significant positive correlations between
men’s masculinity scores and use of masculine
stereotypes and between women’s femininity
scores and use of feminine stereotypes.
They reported that sex-appropriate sex role attribute
self-ratings tended to relate negatively to
feminism for men and women, but they found
strong negative correlations between sex stereotyping
and feminism. Finally, they found positive
correlations between sex role attribute scores
and self-esteem but no relation between self-esteem
and feminism. We examined these correlations
in our data.

3. Spence et al. placed the men and the
women into groups based on their sex role attribute
scores. For example, those above the
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median on masculinity were classified as high 
masculine; those below, low masculine. They
crossed the dichotomized masculinity and femininity
groups to produce a fourfold table for
each sex. The cells were high masculinity-high
femininity (defining androgyny), high masculinity-low
femininity (traditional male), high
femininity-low masculinity (traditional female),
and low femininity-low masculinity (which
Spence et al. call undifferentiated). Comparing
the mean self-esteem scores of the four groups
for each sex, they found that for each sex self-esteem
means were highest in the androgyny
cell and descended in the order given above.

We applied these procedures in our data to see 
whether this central aspect of Spence et al.’s 
findings would be replicated in our strikingly
different sample.
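
The median-split procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the scores are hypothetical, and treating a score exactly at the median as "low" is our choice, since the text does not say how ties were handled.

```python
from statistics import mean, median


def classify(masc, fem, masc_median, fem_median):
    """Return the fourfold sex role category for one respondent.

    Assumption: a score equal to the median counts as "low".
    """
    high_m = masc > masc_median
    high_f = fem > fem_median
    if high_m and high_f:
        return "androgynous"
    if high_m:
        return "masculine"
    if high_f:
        return "feminine"
    return "undifferentiated"


def esteem_by_category(respondents):
    """Mean self-esteem per category for one sex.

    respondents: list of (masculinity, femininity, self_esteem) tuples.
    Medians are computed within the group passed in.
    """
    m_med = median(r[0] for r in respondents)
    f_med = median(r[1] for r in respondents)
    groups = {}
    for masc, fem, esteem in respondents:
        category = classify(masc, fem, m_med, f_med)
        groups.setdefault(category, []).append(esteem)
    return {category: mean(values) for category, values in groups.items()}


# Hypothetical scores, not data from either study.
sample = [
    (30, 28, 50), (29, 15, 44), (14, 27, 40), (12, 13, 33),
    (31, 29, 52), (28, 12, 46), (13, 30, 38), (11, 14, 31),
]
means = esteem_by_category(sample)
```

With these made-up scores the category means fall in the order androgynous, masculine, feminine, undifferentiated, which is the pattern the replication set out to test.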


Method 


Procedure. Subjects were Caucasian men
(n = 43) and women (n = 48) between 40 and
50 years old. All were members of an informal
subdivision association who responded to a letter
sent to members requesting volunteers for the
study. The mean number of children in the
sample was 3.26 per family; annual income for
our sample ranged from $50,000 to well over
$100,000. Many of the women were employed
outside the home, and the men were employed
in professional and executive positions.

The self-administered questionnaires were left
at the subjects’ homes and were picked up after
[?] days during the fall of 1975. Self-esteem
was measured by the Texas Social Behavior Inventory
(TSBI); sex role attribute ratings and
sex role stereotyping, by the Personal Attributes
Questionnaire (PAQ); and degree of feminism,
by the Attitudes Toward Women Scale (AWS).
Order of presentation was counterbalanced. All
instruments are described in Spence et al. (1975)
and were used as reported there.


Results


On the whole, sex differences in the data resembled
those reported by Spence. The men in our
sample described themselves as significantly more
[...] than did the college men, t(289) =
[...]. Unlike Spence’s men, ours were
[...] to use sex role stereotypes than
[...]. And, in our sample as in Spence’s,
[...] were only slightly (insignificantly)
[...] than the men. Regarding self-esteem,
our [...] successful men scored
significantly higher than the women, t(89) =
[...], who scored at the same level as all
Spence’s subjects regardless of sex.

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Similarly, correlations generally reflected rela- 
tionships consistent with those of Spence, except 
that femininity was not related to self-esteem 
among the men. Masculinity and self-esteem 
were significantly correlated among men and 
women, r(41) = .38 and r(45) = .77, ps < .05;
femininity and self-esteem correlated only among 
women, r(45) = .46, p < .05. 

The relationship between self-esteem and 
masculinity reflects the vocational achievements 
of these men, having clearly met and surpassed 
the standards of competence embodied in the 
Spence et al. Masculinity scale. The interpersonal 
skills indexed in the Femininity scale seem not 
to contribute to self-esteem for these men, unlike 
the college sample. 

Thus, two questions remained: (a) Could any 
of our samples be classified as androgynous? and 
(b) if so, would the pattern of self-esteem
means reported by Spence et al. hold? 

The answer to both questions was a straight- 
forward yes. Our results were in close agreement 
with Spence. The androgynous men and women 
were highest in mean self-esteem, followed, 
within each sex, by masculine, feminine, and un- 
differentiated groups. 

Overall, our data lend support to and extend 
the generality of the work of Spence et al. 
Principally, androgynous self-descriptions did 
occur reliably in our sample and androgynous 
self-descriptions predicted the highest levels of 
self-esteem for men and women. The high esteem 
and masculinity scores of the men may well reflect
the broader opportunities to test their competence
that are available neither to their spouses
nor to college students.
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Effects of Experimental Manipulation of Self-Disclosure 
on Group Cohesiveness 


Barry J. Kirshner 
Jewish Social Service Agency of Metropolitan Washington 
Rockville, Maryland 


Robert R. Dies and Robert A. Brown 
University of Maryland 


Self-disclosure in 8-hour experiential groups was systematically controlled by 
providing detailed audiotaped instructions and illustrations through a series of 
structured exercises. Two levels of self-disclosure (level of intimacy) were es- 
tablished. Four eight-person heterosexually balanced groups were exposed to 
encounter group tapes that instructed them to share intimate feelings and ex- 
periences. Examples of high self-disclosure and openness were presented to 
clarify the instructions. In contrast, four comparable groups were conducted 
by encounter group tapes that furnished only moderate levels of personal dis- 
closure and interpersonal sharing. Groups in both the high and low intimacy 
conditions received the same set of exercises and differed only in the instruc- 
tions and accompanying behavioral examples. Results of the study indicate 
that higher levels of disclosure produced greater group cohesiveness, as hypoth- 
esized, on four separate measures of the dependent variable. Findings on three 
different types of self-report instruments were corroborated by an unobtrusive 
behavioral measure of cohesiveness. 


The effects of group cohesiveness in therapy
and encounter group contexts have now been
widely documented. Yalom (1975), in his
popular text on group psychotherapy, surveyed
evidence demonstrating that group cohesiveness
is an important determinant of
positive therapeutic change and produces
many results that are considered to mediate
successful therapy outcome. Yalom saw this
variable as sufficiently important to list it as
1 of 11 curative factors in group therapy, and
to devote an entire chapter to it. Other reviewers
of the experiential group literature
have arrived at similar conclusions regarding
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the importance of cohesiveness (Goldstein, 
Heller, & Sechrest, 1966). Clarification of the 
factors that produce cohesiveness could be 
enormously useful in designing more helpful 
group experiences. 

Although the consequences of group cohe- 
siveness have been relatively well established, 
its determinants remain less clear. Goldstein 
et al. (1966) offered a set of hypothesized 
variables that ostensibly enhance group co- 
hesiveness, including pregroup expectancies, 
intergroup competition, temporary inclusion of 
a “deviant plant,” resolution of subgroup dif- 
ferences, and verbal reinforcement; only a few 
studies have tested these inferences. Other 
investigators have suggested that group com- 
position (Yalom & Rand, 1966) and leader- 
ship style (Yalom, 1975) may be important 
precursors to cohesiveness. One variable that 
has particular relevance for the present inves- 
tigation is the role of self-disclosure. 
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Yalom (1975) contends that the group 
therapist greatly influences the self-disclosure 
level of group members, and that this in turn 
contributes to cohesiveness and intermember 
attraction. The leader, as technical expert, 
prompts members to share personal material 
and reinforces self-disclosure through a vari- 
ety of verbal and nonverbal acts. As a model- 
setting participant, the leader shares personal 
feelings, reactions, and experiences, and 
thereby paves the way for similar risk taking 
among group members. 

There is a small body of research relating 
self-disclosure to various measures of group 
cohesiveness and interpersonal attraction.
Certner (1973) found that subjects had a 
greater liking for those from whom they had 
received more intimate divulgements. Ana- 
logously, group members who are higher dis- 
closers early in group life have been found to 
assume high popularity in the group (Hurley, 
Note 2). In addition, research by Bean (1972) 
and Query (1964) indicated that groups high 
in interpersonal openness were also high in 
cohesiveness.

Unfortunately, the link between self-disclosure
and cohesiveness has been examined mainly
within the context of correlational research.
One exception is a study by Ribner (1974), 
who manipulated self-disclosure by means of 
a pregroup contract. Even though Ribner 
successfully demonstrated that an initial con- 
tract emphasizing self-disclosure did indeed 
influence subsequent group cohesiveness, his 
study was limited by the brevity of his group 
sessions (1 hour), the small variability of 
topical categories discussed within his groups, 
and by the analogue nature of his study. 

The present investigation was an attempt to 
extend Ribner’s (1974) findings to an actual 
experiential group setting in which partici- 
pants had convened for the explicit purpose of 
improving self-understanding and interper- 
sonal awareness. This study also endeavored 
to extend the correlational and clinical evidence
regarding the presumed association between
self-disclosure and cohesiveness by subjecting
this relationship to experimental scrutiny.


Method 
Subjects 


Subjects were recruited from undergraduate classes 
in psychology and education at the University of 
Maryland. The students were given a brief descrip- 
tion of the research project and were informed that 
the groups would be 8-hour tape-led interpersonal 
growth groups designed to facilitate awareness of 
self and others. Participation was strictly voluntary, 
without the additional incentive of extra course 
credit. The final sample consisted of 64 subjects 
divided into eight groups of 4 males and 4 females 
each. Half of the groups were assigned to the high 
intimacy condition and half to the low intimacy 
condition.


Procedure 


All groups met for one 8-hour extended session. 
Structure was provided by an audiotape similar in 
format to the Encountertapes developed by Berzon 
(e.g., Berzon, Reisel, & Davis, 1969). The tape presented
detailed instructions for the group exercises
and furnished illustrations of the behaviors expected 
in the structured group interactions. The instructions 
and examples were used to systematically control 
for levels of intimacy or self-disclosure. 

Groups in the high and low intimacy conditions 
received the same set of exercises and differed only 
in the instructions and accompanying examples. In 
the low intimacy condition, instructions and il- 
lustrations referred to levels of self-disclosure that 
were nonpersonal to mildly personal, whereas in the 
high intimacy condition the self-revelations were 
moderately to highly personal or private. The se- 
quence of structured activities on the tape was as 
follows: 

Introduction and warm-up exercises (35 minutes). 
All groups began with a brief introduction explain- 
ing the nature of the group. Subjects were informed 
that after each exercise an unrecorded segment of tape 
would continue to run, corresponding to the time 
allotted for the exercise. Following this introduction 
the group members participated in three warm-up 
exercises to help them become acquainted with each 
other and to share some relatively nonthreatening 
experiences. The structured activities were the name
[...], trust walk, and negotiated combat (Stevens,
1971).

Top secret (75 minutes). This was the first activity
to experimentally manipulate the level of intimacy
within the groups. Participants were requested
to write an anonymous “secret” about themselves on
a slip of paper. Members in the low intimacy condition
were asked to furnish secrets or personal information
that did not have to be highly revealing.
Examples were provided on the tape, for example,
“I am a junior and still haven’t decided on a major
yet.” In the high intimacy condition, the secrets
were to be private and difficult to share with most
people. Examples such as “There were times when
I felt so miserable I wondered if life was worth living
anymore” were presented on the tape. Group members
were then instructed to place their secrets in
the top of a box and then, after all the statements
had been enclosed and the box was shaken, to pull
out a secret from the bottom. The box was specially
constructed for this task and, unknown to group 
members, contained a false bottom with 12 pre- 
selected statements to further insure the proper 
level of intimacy early in the group. Participants 
were told that several additional secrets were placed 
in the box beforehand to further insure anonymity. 
In this way group members would not get suspicious 
when their secret was not read. Later questioning by 
the experimenter indicated that the deception was 
effective. The deception involved in using contrived 
secrets initially concerned the investigators. How- 
ever, the particular secrets used were comparable to 
those generally obtained with this technique in 
encounter groups, and similar to those furnished by 
the present subjects. Furthermore, subjects were not 
disturbed or resentful when the manipulation was 
revealed, and both groups reported the experience to 
be meaningful and personally rewarding. So even 
though the deception was effective, it did not seri- 
ously detract from the richness of the experience. 
After retrieving a statement, members were in- 
structed to take turns reading them to the group 
and to consider each one in turn for 5 minutes. After 
all members had read their secrets, a 20-minute group 
discussion of their reactions ensued. The tape provided
guidelines and an illustration of how to conduct
a group discussion. This portion of the tape
served to further reinforce the experimental manip- 
ulation of intimacy by providing either low or high 
levels of personal group conversation. 

Self-disclosure and cohesiveness measures (5
minutes). All group participants completed two
questionnaires at this point in the group. These were
a self-disclosure measure, Willingness to Disclose, and
a measure of group cohesiveness (Gruen, 1965). These
measures are described in detail below.

[...] (90 minutes). Group members
were instructed to choose a partner and then to engage
in a [...] conversation focused on a topic
provided by the tape. Following the conversation,
participants were to choose new partners for
another [...] discussion on a different topic
furnished by the tape. This procedure continued until
all group members had met with each other. The
topics were selected from material rated as being
either high or low in intimacy by Taylor and Altman
(1966).

Lunch (30 minutes). Participants were instructed
[...]

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Self-disclosure exercise (110 minutes). This exercise
was similar to top secret, but at this point, the
disclosures were no longer anonymous. Groups in the 
high and low intimacy conditions were again given 
differential instructions and concrete examples to 
control for level of intimacy. Then, members took 
turns reading and commenting on their statements, 
and when they had finished there was a group dis- 
cussion based on guidelines furnished by the tape. 

Group fantasy (20 minutes). This exercise was 
presented as a “recreational” activity for the group 
(Stevens, 1971). There was no attempt to manipu- 
late intimacy. 

Self-disclosure and cohesiveness measures (20 minutes).
The Willingness to Disclose and Gruen questionnaires
were readministered along with two additional
cohesiveness measures. The first was an index
of cohesiveness developed by Schutz (1958), and 
the second was the Comfortable Interpersonal Dis- 
tance Scale (Duke & Nowicki, 1972). These are 
described below. 

Closing and review of the group experience (25 
minutes). The final portion of the group experience 
was used to review what happened and to end with 
a group hug, in which members gathered close to- 
gether and silently hugged one another as a group. 
They were told to terminate the hug whenever they 
pleased. The group experience was then concluded, 
and the experimenter met with the participants to 
debrief them and to provide them with the oppor- 
tunity to express their reactions to the experience. 


Experimental Measures 


Self-disclosure was systematically manipulated at 
several key points in the sequence of structured ac- 
tivities presented on the tape. A check on the ef- 
fectiveness of the manipulation was essential, since 
the hypotheses regarding the link between self-dis- 
closure and cohesiveness could not be adequately 
tested without a clear demonstration that self- 
disclosure was indeed systematically controlled. Two 
different measures were used to verify the efficacy of 
the manipulation. 

The first measure was the Willingness to Disclose 
Questionnaire, which was administered to group par- 
ticipants after the initial manipulation of intimacy 
in the top secret exercise. The questionnaire was 
specifically constructed for this investigation and 
contained 38 statements of various levels of intimacy 
across a variety of categories. Group members were 
required to indicate all items that they would be 
willing to discuss in their experiential group at that 
time (roughly 2 hours into the session). The items 
were selected from a pool of statements rated for 
intimacy by Taylor and Altman (1966). The sum of 
the intimacy values for the items selected yielded the 
Willingness to Disclose score.
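
The scoring rule just described (sum the intimacy values of the endorsed items) can be written out as a short Python sketch. The item labels and intimacy values below are hypothetical stand-ins; the actual questionnaire used 38 statements with values taken from Taylor and Altman (1966), which are not reproduced here.

```python
def willingness_to_disclose(intimacy_values, endorsed_items):
    """Sum the intimacy values of the items a participant would discuss.

    intimacy_values: mapping of item label -> rated intimacy value.
    endorsed_items: labels of the items the participant checked.
    """
    return sum(intimacy_values[item] for item in endorsed_items)


# Hypothetical four-item pool, for illustration only.
intimacy_values = {
    "hobbies": 2.5,
    "career_plans": 4.0,
    "family_conflict": 8.5,
    "sexual_worries": 10.5,
}
endorsed = ["hobbies", "career_plans", "family_conflict"]
score = willingness_to_disclose(intimacy_values, endorsed)
```

Because the score is a weighted count, a participant can raise it either by endorsing more items or by endorsing more intimate ones.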

The second measure of self-disclosure was a modi- 
fied version of a rating procedure developed by Dies 
(Note 1) to assess intimacy level of tape-recorded 
segments of group interaction. Content was judged 
along a 7-point continuum ranging from impersonal 
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and event-oriented conversations on the low end of 
the scale to very private feelings, reactions, and 
experiences on the high end. In the present study, two 
segments of the group interactions were tape re- 
corded—the top secret exercise and the self-disclo- 
sure exercise—providing portions of early and late 
group interaction, respectively. Three-minute samples 
were taken from the middle of the first, second, and 
last thirds of each segment, yielding three early and 
three late ratings for each group. These 3-minute 
samples were coded and randomized before being 
rated by two trained judges whose scores were 
averaged to determine the final rating. Interrater 
reliability was .80 on the 48 samples evaluated in 
this study.
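
As a rough illustration of the sampling scheme, the sketch below places a 3-minute window at the midpoint of each third of a recorded segment and averages two judges' ratings. It assumes, for the example only, that the whole 75-minute top secret exercise was the recorded segment; the function names are ours, not the authors'.

```python
def sample_windows(segment_minutes, sample_minutes=3.0):
    """Return (start, end) times, in minutes, of one sample per third.

    Each window is centered on the midpoint of its third of the segment.
    """
    third = segment_minutes / 3
    windows = []
    for i in range(3):
        midpoint = i * third + third / 2
        half = sample_minutes / 2
        windows.append((midpoint - half, midpoint + half))
    return windows


def final_rating(judge_a, judge_b):
    """Average the two judges' ratings on the 7-point intimacy scale."""
    return (judge_a + judge_b) / 2


windows = sample_windows(75)

# 8 groups x 2 recorded segments x 3 samples per segment
n_samples = 8 * 2 * len(windows)
```

The last line reproduces the count of 48 samples reported for the reliability estimate.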

Several measures of the dependent variable, co- 
hesiveness, were incorporated into this investigation, 
based on prior research which suggests that cohe- 
siveness is a multifaceted variable capable of being 
evaluated from a variety of perspectives (e.g., self-report,
behavioral). The first was the Gruen (1965)
measure, administered along with the Willingness to 
Disclose Questionnaire after the top secret exercise. 
This scale contains four group tasks and six product 
outcomes for each. Participants were instructed to 
select the outcome that they thought their group 
would produce if they engaged in the task. The Gruen 
scale was the only measure of cohesiveness repeated 
in this study. 

Three additional measures of cohesiveness were 
used, however. A scale described by Schutz (1958) 
has been shown to produce fairly consistent validity. 
It is a seven-item Guttman cumulative index to 
assess favorability of attitudes toward the group and 
its members. A third measure of cohesiveness, the 
Comfortable Interpersonal Distance Scale (Duke & 
Nowicki, 1972), was developed to assess interpersonal 
closeness. This measure is in the form of a diagram 
with seven calibrated lines radiating from a central 
point. Subjects were instructed to imagine themselves 
at the center of the diagram and to respond to the 
other group members as if they were moving toward 
them along the protruding radii. They were to draw
a line on the corresponding radius indicating how
close they would allow that particular member to
advance. The final measure, the group hug, was
included to provide an unobtrusive behavioral measure
of cohesiveness. This measure was based on
research by Dies and Greenberg (1976), which indicated
that physical contact was related to here-and-now
feelings of interpersonal closeness. The group
hug came at the end of the experiential session and
was timed; the duration of the hug served as
an index of cohesiveness.
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ratings and group hug, which were designed 
as total group measures.* 


Self-Disclosure (Independent Variable) 


The analysis of self-disclosure was essen- 
tially a check on the effectiveness of the ex- 
perimental manipulation. Two separate self- 
disclosure scores were available. The Willing- 
ness to Disclose Questionnaire was admin- 
istered twice to the group. The second measure 
was the judges’ ratings of self-disclosure from
the tapes; scores from both early (top secret)
and late (self-disclosure exercise) group interaction
were available. A repeated measures
analysis of variance was carried out for the 
self-disclosure scores. The results sum- 
marized in Table 1 indicate that the treatment 
effects on both variables were highly signifi- 
cant, thus confirming the efficacy of the ex- 
perimental manipulation. 


Group Cohesiveness (Dependent Variable) 


Results summarized in Table 1 indicate that 
higher levels of intimacy or self-disclosure 
produced greater group cohesiveness on all 
four dependent measures (p < .01). The
Gruen instrument was administered at two 
different times to the group and was therefore 
analyzed with a repeated measures analysis 
of variance. The other three measures were 
administered at the conclusion of the 8-hour 
marathon. Results of the three self-report 
measures were confirmed by the unobtrusive
behavioral measure of cohesiveness, the group
hug.
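
Because the group hug was timed once per group, its analysis rests on eight group-level observations, which is why Table 1 reports df = 1, 6 for that measure. The sketch below computes a one-way F ratio for two conditions in plain Python. The durations are made-up values chosen only to reproduce the reported condition means (67.50 and 140.00); the individual group durations are not given in the article, so the resulting F does not match the published value.

```python
from statistics import mean


def one_way_f(*groups):
    """Return (F, df_between, df_within) for a one-way analysis of variance."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Hypothetical hug durations for the four groups in each condition.
low_intimacy = [50, 60, 70, 90]
high_intimacy = [110, 130, 150, 170]

f_ratio, df_between, df_within = one_way_f(low_intimacy, high_intimacy)
```

With four groups per condition the error term has 8 - 2 = 6 degrees of freedom, so a group-level test needs a considerably larger F to reach significance than the individual-level tests with 62 error degrees of freedom.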


Development Over Time 


Analysis of the Willingness to Disclose in- 
strument indicated that self-disclosure sig- 
nificantly increased over time (p < .01), and 
the tape ratings reflected a trend (p < .10) in
the same direction. Analysis of the Gruen 


1 Group analyses were, in fact, calculated for all
variables, since it could be argued that individual
scores within groups were not truly independent observations.
These group analyses supported the individual
analyses in virtually every case. Only the
individual analyses are included here to simplify the
discussion and presentation of results.

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scale, the only measure of cohesiveness re- 
peated in the investigation, also demon- 
strated a statistically reliable increase as a 
function of time (p < .01). The means and 
corresponding F values for these analyses 
appear in Table 1. 

The repeated measures design also indicated
a significant Time × Treatment interaction for
willingness to disclose (p < .01) and for
cohesiveness (p < .01). Tukey’s method for multiple
comparisons was computed to test the
differences between all means. In terms of
willingness to disclose, there was a significant
difference between treatment conditions, both
early and late in the group. However, the increase
over time was much greater for the
low intimacy condition groups than for those
groups in the high intimacy condition. In contrast,
although differences in cohesiveness were
significant between high and low conditions at
both times in the group life, the greater increase
over time on the Gruen measure was
found in the high disclosing condition. Apparently,
self-disclosure and cohesiveness
showed different developmental trends as the
groups progressed.


Discussion 


[...] practitioners agree on the importance
of self-disclosure in the formation of
meaningful interpersonal relationships. Yalom
(1975), for example, theorized that as disclosures
proceed in the group, the entire membership
gradually increases its level of involvement,
responsibility, and obligation to
each other: “If the timing is right, there is
nothing which will commit an individual to a
group more than to receive or to reveal some
intimate secret material” (p. 360). Results
of the present investigation strongly support
the hypothesized relationship between self-disclosure
and cohesiveness.

Both self-disclosure and cohesiveness
can be defined in a variety of ways. With
respect to self-disclosure, this study focused
on only one of its many salient dimensions,
level of intimacy. The influence of
other parameters (breadth, frequency, time,
and content orientation) was not addressed,
nor have they received sufficient attention by
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Table 1 

Mean Scores (Main Effects) and F Values a

Measure                                 M        F(1, 62)

Self-disclosure
  Willingness to disclose
    Treatment (A)
      Low                            180.50
      High                           244.20      [...]
    Time (B)
      Early                          197.27
      Late                           227.42      [...]
    A X B                               —        10.22*
  Tape rating
    Treatment (A)
      Low                              2.00
      High                            [...]      [...]
    Time (B)
      Early                            2.75
      Late                            [...]      [...] c
    A X B                               —        [...]
Cohesiveness
  Gruen (1965)
    Treatment (A)
      Low                             11.58
      High                            [...]      [...]*
    Time (B)
      Early                           11.21
      Late                            [...]      [...]*
    A X B                               —         7.30*
  Schutz (1958) d
      Low                              4.70
      High                             5.65      [...]
  Comfortable Interpersonal
  Distance Scale d
      Low                             33.48
      High                            39.98      [...]*
  Group hug d
      Low                             67.50
      High                           140.00      15.11 b*


a Cell means are available on request.

b df = 1, 6.

c Trend toward significance (p < .10).

d Only treatment effects are reported, since the
measures were only administered once.

* p < .01.


other researchers. Yet it is conceivable that 
these factors are just as influential, or perhaps 
even more so, than level of intimacy. 

Similar arguments could be offered for the 
concept of “group cohesiveness.” Prior re- 
searchers have often equated cohesiveness 
with attraction or liking, whereas the present 
study defined cohesiveness in several ways: 
liking, physical contact, comfort with inter- 


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personal distance, and estimates of group 
productivity. Although these measures were 
intercorrelated in this investigation, the mod- 
erate level of the correlations suggests that 
they are not identical scales. A multimethod 
approach in future studies of group cohesive- 
ness is strongly recommended. 

The specific mechanisms accounting for the 
empirical association between disclosure and 
cohesiveness have received comparatively 
scant attention in the literature. Altman and 
Taylor (1973), however, have suggested that 
the relationship between self-disclosure and 
liking (one aspect of cohesiveness) is mutually 
reciprocal and based on interpersonal rewards 
and costs. Self-disclosure could be rewarding 
and thereby enhance liking if it implies that 
the receiver is trusted (Worthy, Gary, & 
Kahn, 1969); the more intimate the disclo- 
sure, the more rewarding it is for the person 
receiving it (Certner, 1973). Moreover, dis- 
closures are rewarding if they reveal attitude 
similarity between the individuals involved 
(Berscheid & Walster, 1969), or if they imply
that the discloser likes the other person (Wal- 
ster, 1965). 

Other possible mechanisms for relating self- 
disclosure and cohesiveness pertain more to 
costs than to rewards. For example, highly 
intimate revelations, rather than being simply 
rewarding, could make the disclosers feel 
vulnerable. Cohesiveness may develop as a 
source of protection for members to insure 
that their “secrets” are kept within the 
group. As Yalom (1975) stated, 


the receiver . . . is likely to consider himself charged
with certain responsibilities or obligations to the
discloser. He generally responds to the disclosure by 
some appropriate comment . . . and then recipro- 
cates with some disclosure of his own. The receiver 
now, as well as the original discloser, is vulnerable 
and a deepening relationship usually continues, with 
the participants making slightly more open and inti- 
mate disclosures in turn until some optimal level for 
that relationship is reached. (pp. 359-360)


Clarification of the mechanisms involved in
the disclosure-cohesiveness relationship is of
considerable theoretical and practical importance.
For example, group experiences designed
to enhance cohesiveness would be
dramatically different if the major determinants
of cohesiveness were associated with
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rewards rather than with interpersonal costs 
(e.g., attitude similarity as opposed to anxiety
over vulnerability).

Even without clarification of these mecha- 
nisms, however, group practitioners who wish 
to enhance group cohesiveness may apparently 
do so by concentrating on methods of in- 
creasing self-disclosure. A variety of pro- 
cedures can facilitate this process. As a first 
step, the practitioner can pay careful atten- 
tion to group composition. Evidence suggests 
that groups composed of members with high 
interpersonal skills (including self-disclosure) 
may be more cohesive (D’Augelli, 1973). The 
group leader can also introduce a pregroup 
contract to influence members’ expectancies 
regarding the group experience. A variety of 
role-induction techniques have been studied 
including preparatory interviews, written 
contracts, audiotaped and filmed examples of 
“appropriate” client behavior, and interper- 
sonal skills practice. A review of this litera- 
ture indicates that these procedures are effec- 
tive in enhancing levels of self-disclosure, 
interpersonal feedback, and other positive 
member behaviors in both therapy and brief 
encounter group contexts (Bednar, Melnick, 
& Kaul, 1974). 

Once the group sessions have begun, the 
clinician can continue to augment member 
self-disclosure. One method, used in the 
present study, is to use verbal and nonverbal 
structured group exercises. Egan (1976), for 
instance, presented a series of structured 
activities specifically designed to foster 
intimate sharing among group participants. 
Dozens of other compendiums of similar 
techniques are now available on the market. Some 
research suggests that these structured 
techniques can heighten feelings of intermember 
closeness and willingness to take risks within 
the group setting (Dies & Greenberg, 1976). 
The group leader can also increase 
self-disclosure among group members by adopting a 
relatively transparent style of leadership 
(Dies, 1977a), by reinforcing such openness 
among group participants (Dies, 1977b), and 
by facilitating the establishment of relatively 
clear group norms (Lieberman, Yalom, & 
Miles, 1973). 

There were differences between the rates 
of increase in self-disclosure and cohesiveness. 
Even though cohesiveness increased over time, 
it apparently did not directly follow the 
growth in willingness to self-disclose. Conse- 
quently, one cannot conclude that the incre- 
ment in readiness to self-disclose over time 
produced the increased cohesiveness. It is pos- 
sible that the production of high levels of 
intimacy early in the group, as opposed to a 
gradual development over time, was the cru- 
cial factor leading to heightened cohesiveness. 
This interpretation received some earlier sup- 
port from Hurley’s (Note 2) findings that 
individuals high in disclosing behavior early 
in their group achieved high popularity later 
in their group. However, this issue is not en- 
tirely clear in the present study, since there 
were possibly ceiling effects on the willingness 
to disclose measure. 

The form of the relationship between self- 
disclosure and cohesiveness and the question 
of which comes first pose important theoreti- 
cal and pragmatic questions. Nevertheless, 
this study suggests that we can systematically 
enhance the level of these variables within 
our groups and thereby increase the proba- 
bility of positive therapeutic outcomes. 


Reference Notes 


1. Dies, R. R. Self-disclosure in group interactions. 
   Unpublished manuscript, University of Maryland, 
   [date illegible]. 
2. Hurley, J. Self-disclosure in small counseling 
   groups. Unpublished manuscript, Michigan State 
   University, 1967. 
Subtle-Obvious Ratings of MMPI Items: 
New Interest in an Old Concept 


William L. Christian, Barry R. Burkhart, and Malcolm D. Gynther 
Auburn University 


Previous attempts to devise subtle-obvious ratings for Minnesota Multiphasic 
Personality Inventory (MMPI) items have been too gross, restricted to certain 
scales, or have failed to consider the perspective of the psychologically naive 
inventory user. To correct these deficiencies, university students were asked to 
rate all 566 MMPI items, answered true and false, on a 5-point subtle-obvious 
scale. Reliability of ratings was assessed via responses to duplicate items; the 
reliability coefficients were .98 for items answered true and .91 for items an- 
swered false. Males and females gave very similar ratings; correlations were 
.94 and .90 for true and false items, respectively. Mean ratings for all subjects 
showed that MMPI scales F and Sc were most obvious and that Mf and Si 
were least obvious with regard to pathology. The correlation between obvious- 
ness ratings and desirability ratings was —.78. Analysis of previous measures 
of the subtlety-obviousness dimension in terms of the present item ratings 
indicates that previous measures did not adequately represent the very subtle or 
very obvious extremes of the distribution. Further research with these ratings 
should demonstrate whether empirically derived inventories with their subtle 
items are, in fact, useful tools for personality assessment. 


Responses to personality inventories are 
determined by instructional sets, stylistic fea- 
tures, and certain other characteristics of item 
content, that is, social desirability, sentence 
structure, ambiguity, and subtlety. Content of 
Minnesota Multiphasic Personality Inventory 
(MMPI) items was considered relatively un- 
important for many years, partly as a func- 
tion of Meehl’s (1945) influential article. 
However, the pendulum has swung the other 
way; more and more attention is now being 
paid to content (Jackson, 1971; Koss & 
Butcher, 1973; Wiggins, 1966). 

Measures of desirability have long been 
available (Edwards, 1957; Heineman, 1960; 
Messick & Jackson, 1961) for those who wish 
to explore the relations between MMPI item 
endorsement and desirability. Sentence struc- 
ture of MMPI items has been analyzed (Wiggins, 
1964), and ambiguity ratings are avail- 
able (Harris & Baxter, 1965). (All of these 
“itemmetric” data are available in Dahlstrom, 
Welsh, & Dahlstrom, 1975.) However, subtle- 
obvious ratings (despite appearances to the 
contrary) have not been developed. This lack 
is curious when one considers that the sub- 
tlety-obviousness distinction has been an in- 
tegral part of the controversy over empirical 
versus intuitive internal test construction 
strategies. 

There have been three efforts to deal with 
the subtle-obvious issue: X and O classifica- 
tion (Meehl & Hathaway, 1946), Wiener’s 
(1948) subtle-obvious keys, and Duff’s 
(1965) subtle-intermediate-obvious categories. 
X items are those that relatively few normals 
endorsed in the scored direction. Content of 
these items is usually undisguised and obvi- 
ously pathological. O items are those endorsed 
by a majority of normal subjects, but they 
are endorsed by an even higher percentage of 
clinical patients (and consequently scored in 
the direction of pathology). Content of these 
items is heterogeneous and not especially in- 
dicative of emotional disturbance. Wiener and 
Harmon (Note 1), using a rational-intuitive 
procedure, divided all MMPI items into those 
to which endorsement clearly indicated emo- 
tional disturbance and those to which emo- 
tional disturbance was not indicated by en- 
dorsement. Subtle and obvious keys were 
actually developed only for the D, Hy, Pd, Pa, 
and Ma scales (cf. Dahlstrom, Welsh, & Dahl- 
strom, 1972). Duff (1965) asked graduate 
students to identify the clinical scale from 
which an item was drawn and its keyed re- 
sponse. He used the 162 items from the Hy, 
Pd, and Sc scales plus 34 items from other 
clinical scales and 30 items not included on 
any clinical scale. Items whose scales and 
pathological responses were identified by no 
more than 10% of the judges were assigned 
to the subtle category, items identified by 
11%-50% of the judges were assigned to the 
intermediate category, and items correctly 
placed by at least 50% of the judges were 
assigned to the obvious category. 
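
Duff's three-way rule can be written out as a small function. The sketch below is only an illustration in Python: the function name and the handling of boundary values are assumptions, since the description above places exactly 50% in both the intermediate and the obvious ranges and does not say how values between 10% and 11% were treated.

```python
def duff_category(pct_identified: float) -> str:
    """Classify an item by the percentage of judges who identified
    both its scale and its keyed (pathological) response.

    Assumptions (not stated in the source): exactly 50% counts as
    "obvious", and anything above 10% but below 50% counts as
    "intermediate".
    """
    if not 0 <= pct_identified <= 100:
        raise ValueError("percentage must lie between 0 and 100")
    if pct_identified <= 10:
        return "subtle"
    if pct_identified < 50:
        return "intermediate"
    return "obvious"


if __name__ == "__main__":
    for pct in (4, 10, 11, 49, 50, 85):
        print(pct, duff_category(pct))
```
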

Cronbach (1970), in discussing the MMPI, 
stated that “separate scoring of subtle and 
transparent items is emphatically needed” 
(p. 532). He pointed out that if subtle and 
obvious keys for the different scales were 
available, it would be possible to determine 
where the discriminating power lies. If the 
subtle keys proved to be invalid or the obvi- 
ous items carried most of the discriminating 
power, as suggested by the work of Duff 
(1965) and Koss and Butcher (1973), the 
whole criterion-keyed empirical approach to 
test construction would be called into question. 
From the point of view of psychometric theory, 
the most important aspect of [illegible] is the 
provision of data whereby [illegible] can be 
investigated in a more [illegible] manner than 
previously possible. No investigator has used 
standard scaling [illegible] to determine mean 
subtle-obvious ratings [illegible] MMPI items. 
In this study, such ratings were obtained from 
individuals relatively naive with respect to 
psychological [illegible]. Although college 
students, as a group, [illegible] be more 
defensive than [illegible] and, consequently, are 
more likely to perceive items as containing 
pathological content, they are more 
representative of those to whom the inventory is 
typically administered than are graduate 
students or professional psychologists. 


Method 
Subjects 


One hundred thirty-eight undergraduate students 
enrolled in introductory psychology sections served 
as subjects. The subjects were divided into three 
groups, each containing 25 males and 21 females. 


Procedure 


The first group of subjects rated 387 MMPI items, 
namely, Items 1-283, answered true, and the first 
104 of the 209 items keyed false on the clinical 
scales answered false. The second group rated Items 
284-566 answered true and the 105 remaining items 
keyed false on the clinical scales answered false. The 
third group of subjects rated the other 357 items 
answered false. Subjects were instructed to read each 
item carefully and to decide how clearly each item 
was indicative of a psychological problem. Very 
obvious items were to be assigned a rating of 5; 
obvious, a rating of 4; neither obvious nor subtle, 
a rating of 3; subtle, a rating of 2; and very subtle, 
a rating of 1. 

Mean ratings of each item by males, females, and 
males and females combined were computed. Mean 
obviousness scores for standard MMPI scales were 
determined. Relationships between male-female rat- 
ings and desirability-obviousness ratings were evalu- 
ated by correlational analyses. The mean obviousness 
values of the X and O items; Wiener’s (1948) subtle 
and obvious items; and Duff’s (1965) subtle, inter- 
mediate, and obvious items, were calculated using our 
mean item ratings. Percentages of items in the MMPI 
clinical scales that fell into each of the five rating 
categories also were calculated. 


Results and Discussion 


Analyses of the ratings of the duplicate 
items in the MMPI showed very high agree- 
ment between the groups of judges. The 
product-moment correlation for the duplicate 
items answered true was .98 (n = 14) and for 
the duplicate items answered false was .91 
(n = 5). Means and standard deviations for 
the subtle-obvious ratings of all MMPI items 
answered true and false are contained in 




1 Although there are 16 duplicate items in the 
MMPI, the reliability coefficients are based on less 
than the full 16 items, because in dividing the items 
for rating by the groups of judges, some of the 
duplicate items were not rated by two groups of 
judges. 
Table 2 
Percentage of Items in Subtle-Obvious Categories for 
Minnesota Multiphasic Personality Inventory Clinical Scales 

                               Neither 
         Very     Somewhat    subtle nor   Somewhat    Very 
Scale*   subtle   subtle      obvious      obvious     obvious 

Hs        03       09          58           30          00 
D         08       28          33           25          05 
Hy        08       30          32           30          00 
Pd        04       20          38           26          12 
Mf        42       28          26           05          00 
Pa        02       20          20           27          30 
Pt        00       04          31           58          06 
Sc        01       00          32           46          20 
Ma        06       30          39           15          08 
Si        12       34          35           17          00 

* The rating weight for an item was assigned within any particular 
scale in correspondence to the way that item was scored in the scale 
being considered. Therefore, if an item was scored false for that 
scale, then the false value was used, and if scored true for that 
scale, then the true value was used. 


Table 1. It is interesting to note that when 
all MMPI items were rated as true, 333 of 
them were defined as subtle (i.e., mean value 
of less than 3.00) and 233 were defined as 
obvious (i.e., mean values greater than or equal to 
3.00). 
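
The subtle/obvious split reported here follows from a simple cutoff on each item's mean rating. The following is a minimal sketch in Python, using made-up ratings and hypothetical function names; none of the example values come from the study itself.

```python
from statistics import mean

# Cutoff taken from the text: a mean below 3.00 is "subtle",
# a mean of 3.00 or more is "obvious".
SUBTLE_CUTOFF = 3.00


def mean_item_rating(ratings):
    """Average the 1-5 subtle-obvious ratings given to one item."""
    if any(r not in (1, 2, 3, 4, 5) for r in ratings):
        raise ValueError("ratings must be integers from 1 to 5")
    return mean(ratings)


def classify_item(mean_value, cutoff=SUBTLE_CUTOFF):
    """Label an item from its mean rating."""
    return "subtle" if mean_value < cutoff else "obvious"


def count_categories(item_ratings):
    """Count subtle and obvious items in a {item_id: [ratings]} mapping."""
    counts = {"subtle": 0, "obvious": 0}
    for ratings in item_ratings.values():
        counts[classify_item(mean_item_rating(ratings))] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical ratings from three judges for three items.
    toy = {"item_a": [1, 2, 2], "item_b": [3, 3, 3], "item_c": [5, 4, 5]}
    print(count_categories(toy))
```
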

Although mean values were calculated sepa- 
rately for males and females, space limitations 
allowed only for presentation of the combined 
ratings. It should be noted, however, that the 
correlations between male and female raters 
were very high; the product-moment correlations [text missing] 

[text missing] 3.64; Pa = 3.52; Pt = 3.47; Hs = 3.13; Pd = 3.13; 
D = 2.94; Ma = 2.82; Hy = 2.81; Si = 2.64; L = 2.41; [text missing] 


and obviousness of item content of the scales 
is understandable. Although the ordering of 
the scales suggests that both Pa and D, but 
especially Pa, share the item characteristics 
associated with the need for the addition of 
K, McKinley, Hathaway, and Meehl (1948) 
found that the addition of K did not increase 
the predictive power of D and Pa. Explana- 
tions of this apparent anomaly are that (a) 
correction items were included on the D scale 
to minimize elevations on psychiatric cases 
whose primary diagnosis was not depression, 
and (b) more than 20% of the Pa items were 
subtle in character. (Cf. Table 2 for a com- 
parison of Pa, Pt, and Sc with regard to per- 
centages of very subtle and somewhat subtle 
items.) The correction items and the subtle 
items both function as K; hence, the addition 
of some fraction of K to these two scales was 
unnecessary, as K did not improve discrimi- 
native efficiency. 

Table 2 shows the percentages of items in 
each of the subtle-obvious categories for each 
of the MMPI clinical scales. Examination of 
this table shows that Pa contains the highest 
percentage of items rated as “very obvious.” 
However, if the two “obvious” categories are 
combined, one finds that Sc with 66% and Pt 
with 64% rank first and second on this 
dimension. Duff (1965) found that Sc contained 
a higher proportion of obvious items than 
Hy or Pd; percentages were 40, 22, and 6, 
respectively. Mf contains far more items rated 
“very subtle” (i.e., 42%) than any other 
scale. If the two “subtle” categories are 
combined, it can be seen that Mf ranks first 
and Si second in subtlety. There was also 
considerable variability in the neutral 
category; it is somewhat surprising to find that 
over half of the Hs items were considered 
neither obvious nor subtle, especially in view 
of the fact that this scale has been considered 
as a marker variable for obviousness 
(Wiener, 1948). 
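
The rankings in this paragraph can be checked directly from the Table 2 percentages. The short Python sketch below copies those percentages and combines the two subtle and the two obvious columns; the dictionary layout and function names are illustrative choices, not part of the original analysis.

```python
# Table 2 percentages per scale, in column order:
# (very subtle, somewhat subtle, neither, somewhat obvious, very obvious)
TABLE_2 = {
    "Hs": (3, 9, 58, 30, 0),
    "D": (8, 28, 33, 25, 5),
    "Hy": (8, 30, 32, 30, 0),
    "Pd": (4, 20, 38, 26, 12),
    "Mf": (42, 28, 26, 5, 0),
    "Pa": (2, 20, 20, 27, 30),
    "Pt": (0, 4, 31, 58, 6),
    "Sc": (1, 0, 32, 46, 20),
    "Ma": (6, 30, 39, 15, 8),
    "Si": (12, 34, 35, 17, 0),
}


def combined(scale):
    """Collapse the five rating categories into three for one scale."""
    very_s, some_s, neutral, some_o, very_o = TABLE_2[scale]
    return {
        "subtle": very_s + some_s,
        "neutral": neutral,
        "obvious": some_o + very_o,
    }


def rank_scales(category):
    """Order scales from highest to lowest combined percentage."""
    return sorted(TABLE_2, key=lambda s: combined(s)[category], reverse=True)


if __name__ == "__main__":
    print("most obvious:", rank_scales("obvious")[:2])
    print("most subtle:", rank_scales("subtle")[:2])
```
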
The obviousness ratings for all subjects 
were compared with the desirability ratings 
of Messick and Jackson (1961). Since Messick 
and Jackson’s ratings are for items answered 
true only, only our ratings of true 
responses were used. The product-moment 
correlation was -.78 (p < .001), which suggests 
that the subtle-obvious dimension and 
desirability have much in common. However, 
the relationship is not high enough to substitute 
desirability for obviousness ratings or 
vice versa. 

Analysis of previous subtlety-obviousness 
measures in terms of the subtle-obvious ratings 
obtained in the present study demonstrated 
that the different procedures produced 
results that are superficially similar. Mean 
ratings of X and O items² were 3.16 and 2.04, 
respectively. Statistical analysis showed that 
these differences were highly significant, 
[illegible]65) = 3.58, p < .001; however, it should 
be noted that the mean value of the X items 
falls in our “neither subtle nor obvious” 
category. Wiener’s (1948) subtle items 
(M = [illegible]) were more subtle than his 
obvious items (M = 3.45), t(221) = 2.06 
(p < .05), but neither set [illegible] ratings. 
The mean ratings of Duff’s (1965) subtle, 
intermediate, and obvious items were 2.69, 
3.38, and 3.60, respectively. Analysis of 
variance indicated that these means were 
significantly different, F(2, 185) = [illegible]. 
Further post hoc tests showed that the mean 
ratings of Duff’s subtle items were 
significantly more subtle than the [illegible] 
items, but the intermediate items were not 
significantly different from the obvious items. 
Here again it might be pointed out that the 
mean value of Duff’s subtle items falls in the 
“neither subtle nor obvious” 
category. The similarity then is more apparent 
than real, inasmuch as these previous rating 
schemes for the most part yielded values that 
tend to cluster in the neutral or obvious cate- 
gories. Very subtle and very obvious items, in 
particular, seem to be strikingly underrepre- 
sented. 

Jackson (1971) has argued that “those 
items are the best which are clearest and 
which contain referents about which people 
agree” (p. 239). Goldberg and Slovic (1967) 
compared responses to abstract designs with 
responses to items designed to measure 
achievement and affiliation and demonstrated 
that “only scales built from items of the 
highest face validity had significant cross- 
validity” (pp. 466-467). Duff (1965) found 
an inverse relationship between MMPI item 
subtlety and item discriminating power. Koss 
and Butcher (1973) showed that the MMPI 
items that characterized crisis situations dis- 
played content clearly relevant to the par- 
ticular situation (e.g., “The future seems 
hopeless to me” for the depressed-suicidal 
crisis). We have used these subtle-obvious 
ratings to show that subtle items are highly 
resistant to faking (Burkhart, Christian, & 
Gynther, 1978) and that endorsement of 
subtle items is characteristic of psychologically 
minded subjects (Burkhart, Gynther, & 
Christian, 1978). However, the most pressing 
problem is to determine the amount of trust- 
worthy information provided by the endorse- 
ment of obvious and subtle items or keys. It 
may be that obvious/subtle keys for certain 
scales provide useful information, whereas 
obvious/subtle keys for other scales do not. 
In that case, one would want to use the valid 
obvious and subtle keys to maximize the dis- 
criminative power of the inventory. However, 
if further research confirms Duff’s (1965) 
contention that subtle keys contribute no or 
very little information, development of a new 
inventory containing all obvious items (or an 
MMPI revised toward the same end) would 
be in order. 


2 These classifications are given in Appendix A of 
Dahlstrom, Welsh, and Dahlstrom (1972). It should 
be noted that the O items found in this appendix 
are not the same as the zero items used by Seeman in 
several studies (e.g., Wales & Seeman, 1969). 
Reference Note 


1. Wiener, D. N., & Harmon, L. R. Subtle and ob- 
vious keys for the MMPI: Their development. 
Factor Structure of the SCL-90 in a Psychiatric Population 


Norman G. Hoffmann and Peggy B. Overall 
University of Texas Medical Branch, Galveston 


The responses of an unselected psychiatric outpatient sample to the SCL-90, a 
self-report checklist of symptom complaints, were factor analyzed using a 
principal components procedure. Varimax rotation yielded five interpretable 
factors for which factor scores were derived. Results were compared with 
earlier studies using symptom checklists on selected outpatient diagnostic 
groups. Implications for future work with self-report symptom checklists are 
discussed. 


Most clinical research in psychiatry over 
the past decade has relied on symptom and 
behavior ratings made by professional observers 
to describe psychopathology and to measure 
therapeutic response (Lorr, Klett, & 
McNair, 1963; Overall, 1974). There are 
numerous instances, however, for which it 
would seem desirable to assess how patients 
perceive their own states. The Minnesota 
Multiphasic Personality Inventory (MMPI; 
Hathaway & McKinley, 1967), the Psychological 
Screening Inventory (PSI; Lanyon, 
1970), and other pencil-and-paper inventories 
have proved useful for description and 
classification of psychopathology, but as a general 
rule they have not appeared as useful as 
might be desired for the assessment of change. 
This is perhaps because such instruments are 
designed to measure more stable personality 
traits rather than states. A self-report 
psychiatric symptom checklist that was developed 
out of the more general item pool of the 
Cornell Medical Index (CMI) has been proposed 
as an instrument capable of measuring 
[illegible] states subject to therapeutic 
intervention and has undergone preliminary tests 
[illegible] research. Originally called the 
Hopkins Symptom Check List (HSCL), the 
instrument has undergone several revisions 
(Derogatis, Lipman, Rickels, 
Uhlenhuth, & Covi, 1974). 

Investigations have considered the factor 
structure of the original 58-item HSCL (Dero- 
gatis, Lipman, Covi, & Rickels, 1971, 1972; 
Lipman, Covi, Rickels, Uhlenhuth, & Lazar, 
1968; Lipman, Rickels, Covi, Derogatis, & 
Uhlenhuth, 1969; Mattsson, Williams, Rick- 
els, Lipman, & Uhlenhuth, 1969; Williams et 
al., 1968). Four to six factors have been re- 
ported to account for the bulk of the variance. 
Although the labels assigned to the factors 
have varied, depression and anxiety factors 
have consistently emerged. Other factors fre- 
quently noted are represented by items deal- 
ing with somatic concerns, obsessive-com- 
pulsive themes, and interpersonal sensitivity. 

The SCL-90, a 90-item revision of the 
HSCL, was created by the addition of 32 
items concerned with symptoms of more seri- 
ous psychopathology. Content of the added 
items concerns psychotic symptoms, paranoid 
ideation, phobic anxiety, and hostility (Dero- 
gatis, Lipman, & Covi, 1973). Thus, the SCL- 
90 would appear to have potential for use in 
the assessment of psychopathology in a more 
general psychiatric population. 

Lipman, Covi, and Shapiro (1977) ac- 
complished a factor analysis of the SCL-90 
using data collected in a study involving 
chemotherapy of depressed patients. They 
identified eight factors, which they labeled 
as Interpersonal Sensitivity, Phobic Anxiety, 
Retarded Depression, Anger-Hostility, So- 
matization, Obsessive-Compulsive, Agitated 
Depression, and Psychoticism. Thus, they ob- 
tained factors similar to those found most 
often in analyses of the 58-item HSCL plus a 
psychoticism dimension defined predomi- 
nantly by the added items. 

A major concern that motivated the present 
investigation arises from the fact that the 
bulk of all previous factor analyses of the 
HSCL as well as the SCL-90 have focused on 
patients selected for symptoms of anxiety 
and/or depression. Diagnostic differences have 
been shown to affect the interpretation of fac- 
tors on the HSCL (Derogatis et al., 1972), 
and similar results could be expected with the 
SCL-90. 

This may be a particular problem when the 
intent is to use the SCL-90 for assessment of 
psychopathology in more general psychiatric 
populations. Factors that appear to be rela- 
tively independent in a restricted sample may 
be too highly correlated to be identified as 
separate factors in a more heterogeneous sam- 
ple. Conversely, factors that are clearly pre- 
sent in a general psychiatric population may 
be absent in a highly selected, homogeneous 
sample. The purpose of the present article is 
to document the structure and statistical 
properties of the SCL-90 in a clinic population 
unselected for diagnosis and representative of 
the general psychiatric outpatient population 


seen in public facilities such as medical school 
clinics. 


Method 


The SCL-90 was administered to 358 patients 
seen in the Adult Psychiatry Outpatient Clinic at 
the University of Texas Medical Branch. The sample 
was unselected with the exception that an eighth- 
grade education and professed ability to read the 
newspaper were required. The examiner also heard 
the subjects read aloud the two sample items and 
respond to them. Verbal informed consent was ob- 
tained from each patient for the statistical research 
use of the test data. Demographic and diagnostic 
information was recorded to provide additional de- 
scription of the sample. 

Figure 1. Eigenvalues for primary factors. 

The factor analysis used a principal components 
procedure, and the significant factors were rotated by 
the orthogonal normalized varimax method. 

Factor scoring keys were defined to include the 
items that were found to project most highly on 
each of the rotated factors. To be included in a 
factor score, an item was required to have its 
highest loading on that factor. No items with a 
loading less than .350 were considered for factor 
scoring. Factor scores were then calculated for each 
subject by summing the values of the items in- 
cluded in the factor. The reliabilities of a single 
total score of pathology and the individual factor 
scores were estimated using coefficient alpha (Cron- 
bach, 1951), which is an internal consistency statistic 
that assesses the average of all possible split-half 
coefficients. The intercorrelations among the factor 
scores were calculated to evaluate the extent to 
which the factors really represent distinct or inde- 
pendent aspects of psychopathology. 
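
The scoring-key rule described above (assign each item to the factor on which it loads highest, drop items whose highest loading is below .350, then sum the keyed items) can be sketched in a few lines of Python. Everything concrete here is an assumption for illustration: the function names, the toy loadings, and the choice to compare raw rather than absolute loadings, which the text does not specify.

```python
MIN_LOADING = 0.350  # items loading below this are not keyed (from the text)


def build_scoring_keys(loadings, min_loading=MIN_LOADING):
    """Map each factor index to the items keyed on it.

    loadings: {item_id: [loading on factor 0, loading on factor 1, ...]}
    An item goes to the factor with its highest loading, provided that
    loading is at least min_loading. Raw (signed) loadings are compared,
    which is an assumption.
    """
    keys = {}
    for item, row in loadings.items():
        best = max(range(len(row)), key=lambda j: row[j])
        if row[best] >= min_loading:
            keys.setdefault(best, []).append(item)
    return keys


def factor_scores(responses, keys):
    """Sum one respondent's item responses within each factor key."""
    return {f: sum(responses[i] for i in items) for f, items in keys.items()}


if __name__ == "__main__":
    # Hypothetical rotated loadings for four items on two factors.
    toy_loadings = {
        "item1": [0.62, 0.10],
        "item2": [0.30, 0.34],  # highest loading below .350: unkeyed
        "item3": [0.20, 0.55],
        "item4": [0.41, 0.40],
    }
    keys = build_scoring_keys(toy_loadings)
    print(keys)
    print(factor_scores({"item1": 3, "item2": 4, "item3": 1, "item4": 2}, keys))
```
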

Inspection of the demographic variables indicates 
that the patients seen in this clinic are predomi- 
nantly younger women from the lower socioeco- 
nomic classes. Almost 75% of the sample consisted 
of women, and over half of the patients were under 
30 years of age. Ethnically, the group contained 
blacks and whites with a distinct minority of Mexi- 
can Americans who comprised 10% of the sample. 
An approximation of social class using educational 
and work levels in the “Two-Factor Index of Social 
Position” (Hollingshead, Note 1) resulted in almost 
60% of the patients being classified into the lowest 
social class and only 13% of the sample placed into 
the three upper social classes. 

In terms of psychopathology the current sample 
seems to differ substantially from those in previous 
studies of the HSCL and SCL-90. Although over 
40% of the patients had a diagnosis of depression, 
many of those would not be candidates for inclusion 
in drug studies. Many cases involved personal crises 
as precipitating events. Indeed, 34% of all the pa- 
tients seen were not prescribed any medication. Fur- 
thermore, the sample includes individuals with a 
variety of neurotic complaints and personality 
problems as well as thought disorders, as is re- 
flected in the fact that 16% of the patients were 
diagnosed as schizophrenic or schizoaffective. 


Results 


The pattern of eigenvalues obtained from 
the principal components analysis is shown in 
Figure 1. These results are presented to em- 
phasize the prominence of a single general 
factor in the SCL-90 responses of patients in 
this outpatient clinic population. The first 
unrotated factor accounted for 6.45 times as 
much variance as the next largest and more 
than twice the variance of the next six factors 
combined. This suggests that in this patient 
population, the SCL-90 tends to measure a 
unitary global complaint factor and that the 
self-report complaints of the patients are not 
highly differentiated with respect to the more 
specific factors the instrument may be capa- 
ble of measuring. 
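
The two comparisons made in this paragraph are simple ratios of eigenvalues. As a rough illustration only, the Python sketch below computes them for a made-up set of eigenvalues; the values and the function name are hypothetical and are not the eigenvalues plotted in Figure 1.

```python
def dominance_ratios(eigenvalues, n_following=6):
    """Compare the largest eigenvalue with those that follow it.

    Returns the ratio of the first eigenvalue to the second, and the
    ratio of the first to the sum of the next n_following eigenvalues.
    """
    ordered = sorted(eigenvalues, reverse=True)
    first, rest = ordered[0], ordered[1:]
    return {
        "first_to_second": first / rest[0],
        "first_to_next_sum": first / sum(rest[:n_following]),
    }


if __name__ == "__main__":
    # Hypothetical eigenvalues chosen only to mimic the pattern described.
    toy = [12.9, 2.0, 1.5, 1.2, 1.0, 0.4, 0.3]
    print(dominance_ratios(toy))
```

A first-to-second ratio this large is the kind of pattern the authors read as evidence for a single global complaint factor.
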

The display of the eigenvalues also suggests 
that five factors account for variance greater 
than the random scree level. The fifth factor 
is the first to depart in any measure from 
the nearly flat linear function that might be 
expected from random noise. 

In point of fact, rotated solutions for 3-8 
factors were examined, and the five-factor so- 
lution provided the most interpretable re- 
sults. When only 3 or 4 factors were rotated, 
the content of the items loading on each 
factor appeared heterogeneous, making 
interpretation of the factors less clear. Conversely, 
when more factors were rotated, the later 
factors tended to fragment into interpretatively 
similar factors. 

The five normalized varimax factors that 
are clearly defined and interpretable are 
Depression, Somatization, Phobic Anxiety, 
Functional Impairment, and Hostile 
Suspiciousness. Of the 90 items, 81 met the 
criterion of a loading greater than .350 and 
could be placed in one of the factors. 
Examination of the item content found 
[illegible] provides a straightforward 
[illegible] for the interpretation of 
the factors. 

The factors identified here tend to 
[illegible] with those reported by Lipman et 
al. (1977). The bulk of the items in their two 
depression factors [illegible] factor found 
here. The Somatization and Phobic Anxiety 
factors are almost identical to Lipman et al.’s 
in terms of the items with the highest 
loadings on the respective factor. These three 
factors were found to be [illegible] patients 
[illegible]. [illegible], while not showing such 
a striking similarity to those obtained by 
previous investigators, do tend to have 
considerable overlap of items. The Functional 
Impairment factor in the present study most closely 
corresponds to the obsessive-compulsive fac- 
tor in the Lipman et al. (1977) analysis. The 
bulk of the items in the Anger-Hostility and 
Psychoticism factors of the earlier study 
project on the Hostile Suspiciousness factor in 
the present study. The Interpersonal Sensitiv- 
ity factor in the Lipman et al. study had high- 
est loadings for six of its seven items evenly 
divided between the Depression and Hostile 
Suspiciousness factors described here, and it 
is the only Lipman et al. factor not related 
primarily to only one factor of the present 
study. 

Factor scores were calculated by summing 
the responses to the sets of items indicated 
under each factor. Reliability of the factor 
scores was estimated using coefficient alpha, 
which is derived from internal consistencies 
among the item responses and is equivalent to 
the mean of all possible split-half reliability 
coefficients. The reliabilities for the five fac- 
tor scores were .94, .90, .89, .83, and .94, re- 
spectively, indicating the high degree of con- 
sistency among the items that compose each 
factor. 
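
For readers who want to see how coefficient alpha is obtained from item responses, here is a minimal Python sketch of the standard formula, alpha = k/(k - 1) × (1 - sum of item variances / variance of total scores). The function name and the toy response matrices are illustrative assumptions, not data from this study.

```python
from statistics import pvariance


def cronbach_alpha(rows):
    """Coefficient alpha for a respondents-by-items table.

    rows: list of respondents, each a list of k item responses.
    Population variances are used throughout; using sample variances
    consistently would give the same ratio.
    """
    k = len(rows[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    item_vars = [pvariance([row[j] for row in rows]) for j in range(k)]
    total_var = pvariance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


if __name__ == "__main__":
    # Two items that rise together across respondents: alpha is 1.
    print(cronbach_alpha([[1, 2], [2, 3], [3, 4]]))
    # Two items that vary independently: alpha is 0.
    print(cronbach_alpha([[1, 1], [1, 2], [2, 1], [2, 2]]))
```
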

The correlations between the factor scores 
were quite high, ranging from .504 to 147. 
This state of affairs is not unexpected in view 
of the very large eigenvalue associated with 
the first unrotated factor from the primary 
analysis and the fact that many items with 
their highest loading on one factor also had 
substantial projections on other factors. Even 
though it might have been possible to use 
only those items that projected onto only one 
factor, this would have greatly reduced the 
number of items available for the factor 
scales. 

The total score on the SCL-90 was reliable 
and highly correlated with each of the fac- 
tors, further suggesting that a single global 
score might well be used as an index of psy- 
chopathology or psychological discomfort. The 
correlations of the total score with the five 
factors were .928, .746, .831, .789, and .906, 
respectively. The Spearman-Brown split-half 
reliability between odd and even items was 
.976, and the alpha coefficient for the total 
test was .975. 
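
An odd-even split-half reliability with the Spearman-Brown step-up can be sketched as follows. This Python example is a generic illustration under stated assumptions: the function names and toy responses are hypothetical, and the step-up uses the usual formula 2r / (1 + r) for a test twice the length of each half.

```python
from math import sqrt


def pearson(x, y):
    """Product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman_brown(r_half):
    """Step a half-test correlation up to full-test length."""
    return 2 * r_half / (1 + r_half)


def split_half_reliability(rows):
    """Odd-even split-half reliability for a respondents-by-items table."""
    odd = [sum(row[0::2]) for row in rows]   # items 1, 3, 5, ...
    even = [sum(row[1::2]) for row in rows]  # items 2, 4, 6, ...
    return spearman_brown(pearson(odd, even))


if __name__ == "__main__":
    print(spearman_brown(0.5))
    print(split_half_reliability([[1, 2], [2, 3], [3, 4]]))
```
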




For comparison, the factors defined by 
Lipman et al. (1977) were scored for this 
general outpatient sample. Intercorrelations 
among the eight Lipman factor scores ranged 
from .407 to .853, and the correlations of 
these factors with the total score ranged from 
.769 to .910. Again, it is apparent that at
least some of the factors are not clearly
distinguishable in the responses of patients in
this general psychiatric clinic population. If
one were to correct for attenuation due to
measurement error, some of these correlations
would approach unity. These results support
our own in suggesting that scores that are
ostensibly associated with distinct aspects of
psychopathology are, in fact, measuring the
same thing in this type of psychiatric clinic
population.
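
The correction for attenuation mentioned here divides an observed correlation by the square root of the product of the two reliabilities. The sketch below applies it to figures reported earlier for the first factor and the total score (correlation .928, factor reliability .94, total-score alpha .975); the function name is an assumption, and this pairing is chosen only because all three values appear in the text.

```python
from math import sqrt


def correct_for_attenuation(r_xy, rel_x, rel_y):
    """Estimate the correlation between true scores.

    r_xy  : observed correlation between two measures
    rel_x : reliability of the first measure
    rel_y : reliability of the second measure
    """
    if not (0 < rel_x <= 1 and 0 < rel_y <= 1):
        raise ValueError("reliabilities must lie in (0, 1]")
    return r_xy / sqrt(rel_x * rel_y)


# Observed correlation .928, factor reliability .94, total-score alpha .975.
corrected = correct_for_attenuation(0.928, 0.94, 0.975)
```

With these figures the corrected value comes out at roughly .97. Because the total score contains the factor's own items, this pairing overstates the overlap and serves only to show the arithmetic.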

Discussion 

The rotated factors derived from the SCL- 
90 responses of a sample of unselected psy- 
chiatric outpatients appeared quite similar 
to the responses of depressed and/or anxious 
patients selected in previous psychopharma-
cology studies. These similarities were noted
despite the fact that the current sample in-
cluded a substantial proportion of individuals
who were not deemed in need of medication,
as well as more disturbed individuals who
were taking antipsychotic medication. The
checklist items and format would appear to
tap some common dimensions in a broad range
of patients.

The Depression, Somatization, and Phobic 
Anxiety factors seem to be the most clearly 
defined in the present study and are most
consistent with other work on the HSCL and
SCL-90. Of the three, the Somatization factor
appears to be a slightly more independent
dimension in that its correlations with the
other factor scores and the total test score
tend to be lower.

Despite consistencies with factors identified
in previous studies, the present results point
more to a general complaint or general discomfort
dimension than to distinct dimensions of
psychopathology. Because of this, we have
reservations about the use of factor score
profiles as has been
proposed. More investigation is needed to 
determine whether factor scores of the SCL- 
90 do indeed produce profiles that provide a 
basis for differentiating different aspects of 
pathology. 

Whether or not the SCL-90 can be used for 
specialized interpretations, such as distin- 
guishing between anxiety and depression, our 
results indicate definite promise for the 
instrument as a global index of psychopathol- 
ogy or psychological distress. The total score 
derived from the summation of all items can 
be recommended for such a purpose. The in-
clusion of the SCL-90 in psychopharmacological
studies seems quite appropriate for furthering
the understanding of the properties of the in- 
strument and exploring its construct validity. 
The evolution of better criterion measures is 
certainly needed in drug evaluation research, 
and the SCL-90 may hold promise as such a 
measure. 


Reference Note 


1. Hollingshead, A. B. Two-factor index of social
position. Unpublished manuscript, Yale Univer-
sity, 1957.


Internal-External Expectancies and Health-Related Behaviors


Bonnie R. Strickland 
University of Massachusetts, Amherst 


The present article is a review of research on internal-external (I-E) locus of
control expectancies and health attitudes and behaviors. The theoretical back- 
ground of the I-E construct is described. Topics covered include I-E in relation 
to health knowledge, precautionary health practices, reactions to physical dis- 
orders, psychological responding, psychological disturbances, and responses to 
psychological treatment. Some problems and issues are also noted. 


Health concerns and health costs constitute 
a social problem of enormous magnitude in 
this country (APA Task Force on Health Re- 
search, 1976). Estimates of numbers of indi- 
viduals with symptoms severe enough to war- 
rant attention with respect to treatment range 
as high as 75% of the general population 
(Mechanic, 1972). The cost of medical care
is the fastest growing item in the U.S. con-
sumer price index, close to $100 billion an-
nually, and it is expected to more than double
over the next 5 years (“Health-Cost Crisis,”
1977). For good reason, individuals appear to 
be increasingly concerned not only with rec- 
ognizing and attending to debilitating symp- 
toms but also preventing the occurrence of 
illness or accident. The rewards to those per- 
sons who remain free of disease or disability 
are, of course, not only financial but include 
enhanced physical and emotional well-being as 
well. The findings from a broad range of 
studies demonstrating a generally greater 
adaptive functioning for those persons holding
internal as opposed to external expectancies
(Strickland, Note 1, Note 2) have clear im-
plications with respect to health.

The internal-external control of reinforce-
ment (I-E) dimension is an expectancy varia-
ble couched within Rotter’s social learning
theory (Lefcourt, 1976; Phares, 1976; Rot- 
ter, 1954; Rotter, Chance, & Phares, 1972; 
Strickland, 1977). Simply stated, I-E refers 
to the degree to which an individual perceives 
the events that happen to him/her as de- 
pendent on his/her own behavior or as a 
result of luck, chance, fate, or powers beyond
one’s personal control and understanding.
Assessment of I-E expectancies is via ques- 
tionnaires, of which the Rotter I-E scale 
(Rotter, 1966) has been the instrument of 
choice for most ongoing research with adults. 
Several multidimensional instruments to as- 
sess I-E have been devised (Collins, 1974; 
Gurin, Gurin, Lao, & Beattie, 1969; Leven- 
son & Miller, 1976), and some researchers 
have developed I-E measures specific to 
health (Kirscht, 1974; Wallston, Wallston, 
Kaplan, & Maides, 1976). Results of research 
conducted with the various instruments sug- 
gest that beliefs about internal versus external 
control are related in significant and even 
dramatic ways to health-related behaviors. 
The purpose of the present article is to review 
these relationships across several broad areas 
of health, ranging from prevention to suscepti- 
bility to remediation of physical and psycho- 
logical dysfunctions. 


The Theoretical Framework 


The implication that I-E expectancies are
related to the facilitation of health behaviors
is derived from social learning theory. Rotter
(1954; Rotter et al., 1972) postulated be-
havior to occur as a function of expectancy
and reinforcement within a specific situation.
If a situation is novel or ambiguous, then an
individual will depend on generalized expec-
tancies that have served him/her in the past.
More specific expectancies are used when the
aspects of the situation are straightforward or
routine. The I-E dimension is a generalized 
expectancy that occurs when individuals have 
learned that events are contingent or non-
contingent on their behavior. Individuals
holding internal expectancies are more likely
than externals to take responsibility for their
actions (Davis & Davis, 1972; Phares, Wil-
son, & Klyver, 1971) and to attribute respon-
sibility to agents who activate chance (Hoch-
reich, 1972; Phares & Wilson, 1972; Schiavo,
1973; Sosis, 1974). In performance task situ-
ations, internals are perceptually alert and
attentive (DuCette & Wolk, 1973; Lefcourt,
Gronnerud, & McDonald, 1973; Lefcourt &
Wine, 1969; Wolk & DuCette, 1974) and
appear to gather and process information ef-
fectively for problem solving (Davis & Phares,
1967; DuCette & Wolk, 1972; Pines & Julian,
1972). Research on social action (Gore &
Rotter, 1963; Levenson & Miller, 1976; Paw-
licki & Almquist, 1973; Sanger & Alger, 1972;
Strickland, 1965) suggests that individuals
who believe that events are related to their
own behaviors are more likely than persons
who believe in fate or powers beyond their control
to take steps to change aversive life situa-
tions. Phares (1976, p. 78) proposed that the
cognitive and motivational aspects of the I-E
dimension lead internals to a superior position
with regard to power and control over their en-
vironment. If this is the case, then I-E ex-
pectancies may have significant impact in rela-
tion to health maintenance, a most important
personal concern for many of us. Our parents
are enjoying a longer life span,
along with the infirmities of age, than
was true in the past. Even in good health,
we are daily bombarded by the public media
with information about the likelihood of
harm from such ordinary events
as breathing city air and eating bacon. We
are constantly urged to improve our health by
losing weight, jogging, and engaging in all
those inviting, energetic activities designed to
enhance our physical functioning. When we do 
experience physical or emotional distress, pro- 
fessional health care is more readily available 
to us than it has been in the past, and most 
of our friends and acquaintances have advice 
and their own favorite home remedies to 
share. One would expect that internals, in 
contrast to externals, would be more sensitive 
to health messages, would have increased 
knowledge about health conditions, would 
attempt to improve physical functioning, and 
might even, through their own efforts, be less 
susceptible to physical and psychological dys-
function.


Health Knowledge and Precautionary 
Measures 


As early as 1962, Seeman and Evans found 
evidence that hospitalized patients with tu- 
berculosis who were internal as assessed by an 
early I-E measure, with intelligence controlled, 
knew more about their disease than their 
matched external counterparts. The medical 
staff also rated internal patients higher in 
objective knowledge about tuberculosis than 
externals. In wards in which information was 
difficult to obtain, internal patients were sig- 
nificantly less satisfied with the flow of infor- 
mation than externals. The Wallstons and 
their colleagues (Wallston, Maides, & Wall- 
ston, 1976; Wallston, Wallston, Kaplan, & 
Maides, 1976) have also found that internals 
who value their health are more likely than 
others to collect information about disease 
and health maintenance when alerted to possi- 
ble hazards, such as hypertension. 

Following the growing concern in the 1960s 
about a link between cigarette smoking and 
cancer, numerous studies were conducted 
which suggested that individuals who were not 
smokers or individuals who were able to stop 
smoking were more internal than individuals 
who smoked (Coan, 1973; James, Woodruff, 
& Werner, 1965; Mlott & Mlott, 1975; Steffy, 
Meichenbaum, & Best, 1970; Straits & Se- 
chrest, 1963; Williams, 1973; Platt, Note 3). 
These results have not always been replicated 
(Danaher, 1977; Lichtenstein & Keutzer, 
1967), but taken together they do suggest 
that individuals with internal rather than ex-
ternal expectancies are more likely to take ac-
tion to improve their health habits, particu-
larly when faced with evidence that needed
changes may result in improved physical func- 
tioning. 

In a study of inoculation against influenza, 
Dabbs and Kirscht (1971) reported that col- 
lege students who were assessed as internal on 
eight “motivational” variables were more 
likely than externals to have been inoculated, 
although internals on eight “expectancy” 
variables were not likely to have taken the 
shots. In other words, subjects who said that 
they were motivated to exert control in rela- 
tion to health behavior did indeed take pre- 
caution against susceptibility to an infectious 
disease. Subjects who simply reported expec- 
tancies about contingencies between behavior 
and events were not differentiated according 
to having received flu shots. In a large-scale 
study with high school students, Williams 
(1972a) found that internal students re- 
ported greater use of seat belts when riding in 
an automobile than externals. These students 
also reported themselves to be significantly 
more likely to engage in preventive dental 
care, that is, to go to the dentist for checkups 
and maintenance even when teeth or gums 
are not sore or hurting (Williams, 1972b). 
Sonstroem and Walker (1973) found internal 
college males to hold more positive attitudes 
toward physical exercise and cardiovascular 
fitness than externals; these internal students 
were also more likely to participate in volun- 
tary exercise. 

In other research designed to investigate 
preventive health practices, Balch and Ross 
(1975) tested 34 females and reported in- 
ternal beliefs to be predictive of both success 
in and completion of an overweight treatment 
program. Wallston, Wallston, Kaplan, and 
Maides (1976) found 22 female subjects in 
weight reduction programs consistent with 
their specific I-E beliefs about health to be 
more satisfied with treatment. Results of 
weight loss for subjects in congruent condi- 
tions were in the expected direction, although 
they failed to reach significance. Other ex- 
perimenters have not been able to relate I-E 
to attempts at weight loss (Bellack, Rozen- 
sky, & Schwartz, 1974; Manno & Marston, 
1972; Tobias & MacDonald, 1977), although
some reported overweight subjects to be ex-
ternal, a finding also reported by O’Bryan
(1972). 

At least two studies suggest that females 
who are internal are more likely than exter- 
nal females to practice birth control effec- 
tively (Lundy, 1972; MacDonald, 1970). 
However, Harvey (1976) assessed almost 
200 female college undergraduates as to their 
reliance on safe (pill, intrauterine devices) 
versus risky (condom, rhythm, etc.) birth 
control methods and found no differences as 
a function of I-E expectancies. Segal and 
DuCette (1973) reported that middle-class 
white high school females who became preg- 
nant were more external than control sub- 
jects; lower class, black, high school females 
who became pregnant were more internal. 
The authors suggested that pregnancy has 
different meanings for these populations. At 
that time, in the upper-class group, preg- 
nancy might have been an undesirable event, 
and internal students would have been ex- 
pected to take precautions against pregnancy. 
Pregnancy may have been more acceptable 
among the lower-class group and for some
students might even have been a source of 
pride and success. If these assumptions are 
correct, then the discrepant I-E predictions 
make sense. Obviously, a number of complex 
factors enter into any decision, or lack there- 
of, about pregnancy. However, the I-E vari- 
able appears to be a promising one in rela- 
tion to predicting the use of birth control, 
especially if goals of family planning and 
rationale for use of birth control can be 
specified. 

With some exceptions, the bulk of the re- 
ported research on I-E and precautionary 
health practices lends credence to the ex-
pected theoretical assumptions that individ-
uals who hold internal as opposed to external
expectancies are more likely to assume re-
sponsibility for their health. Internals appear
to attempt to maintain their physical well-
being and to guard against accidents and
disease to a greater extent than individuals
who hold external expectancies. As would be
expected within Rotter’s theory, internals
who value their health (reinforcement value)
seek more information about health mainte-
nance, and when stricken with a disorder
they appear to learn more about the disease
that afflicts them. Other specific and general- 
ized expectancies as well as situational con- 
tingencies would also be expected to interact 
with what is likely a complex relationship 
between I-E and precautionary health prac- 
tices. 


Reactions to Physical Disorders 


When a person is faced with a chronic 
handicap or a debilitating illness, do I-E 
expectancies play any part in an individual’s 
reaction to this situation? MacDonald and
Hall (1971) asked healthy college students
how they might respond to various physical
handicaps in regard to social relationships
and feelings about themselves. Externals
rated physical disabilities as more debilitat-
ing than did the internals, who apparently
anticipated less severe consequences of handi-
caps. But what happens in real life? Lipp,
Kolstoe, James, and Randall (1968) inves-
tigated the responses of handicapped and
normal subjects to pictures of disabled peo-
ple. The handicapped subjects suffered a
variety of disabilities including amputation,
paralysis, fractures, arthritis, and congenital
deformities. Normal subjects were of approx-
imately the same age and sex as the handi-
capped group and were matched on the basis
of responses to the James I-E scale. Slides
of disabled and normal persons were pre-
sented tachistoscopically to all subjects. As
the investigators had hypothesized, disabled
subjects took significantly more trials to rec-
ognize the disability slides than the normal
subjects. Of interest to the present article are
the results of an interaction between three
levels of I-E and the disabled/nondisabled
classification. The external disabled individuals
were less denying of disability, as measured
by recognition time of disability slides, than
other subjects, including those subjects whose
scores fell in the middle I-E ranges. The in-
vestigators suggested that the internals are
more denying of disability.

Several other studies have been conducted
to examine I-E in relation to threat to
see if internals and externals do re-
spond differently. Phares, Ritchie, and Davis
(1968) provided subjects with either posi-
tive or negative feedback, which ostensibly
resulted from their answers to a number of 
personality tests. Consistent with a denial 
model, externals, as assessed by the Rotter 
scale, recalled significantly more of the eval- 
uative material, both positive and negative, 
than did internals. Internals did, however, 
demonstrate a greater willingness to engage 
in remedial behavior to confront their alleged 
problems. Another investigation by Houston 
(1972) gives additional information about 
the differential influence of I-E beliefs in 
regard to response to stress. Houston told 
one group of subjects that they could avoid 
electric shock in an experimental condition 
by not making mistakes on a subtest of the 
Wechsler Adult Intelligence Scale. Another 
group was told that there was no way to 
avoid shock. Subjects who perceived that 
they had some control over the shock re- 
ported less anxiety but actually evidenced 
greater physiological arousal than the no- 
control group. Although they did not re- 
port more distress, internal subjects showed 
an increase in heart rate significantly greater 
than did externals across both conditions. 
Evidently the internals were more aroused in 
the control of shock condition, but they 
denied their anxiety about the situation. Ex- 
ternals may have found it easier to “accept” 
the threat of shock and resign themselves to 
the situation. These results, of course, raise 
implications about the relation of I-E be-
liefs to a host of psychological findings on
“perceived control” (Glass & Singer, 1972).
I-E beliefs may have direct relevance, for
example, to the Type A and Type B behavior
patterns that Glass (1977) described. Glass
wrote that Type A individuals, who sound
strikingly like internals, appear to be en-
gaged in a struggle for control and try to
develop strategies for coping with uncontrol- 
lable stress. This continued attempt at mas- 
tery of life events, however, takes a tragic 
toll in that these individuals appear to be 
more prone to coronary heart disease. 
Aside from studies on reactions of in-
dividuals to perceived threat, numerous in-
vestigations with persons who have actually
suffered some traumatic event have been con-
ducted. Blind children (Land & Vineberg,
1965), children with cerebral palsy (Egg-
land, 1973), and children with severe read-
ing problems (Strickland & Hill, Note 4)
have all been assessed as more external than 
comparable, nonhandicapped children. Jones 
(1974), however, investigated orthopedically 
disabled children and found no relationship 
between I-E beliefs and degree of mobility. 
Wendland (1973) tested 80 males, ages 18- 
35, with muscular skeletal impairment. Sub-
jects who had been disabled less than 1½
years were significantly more external than 
subjects disabled 3 years or longer. Wend- 
land suggested that disabled individuals have 
a tendency to expect increased direction from 
external forces during the initial period fol- 
lowing disability onset. 

Bruhn, Hampton, and Chandler (1971) 
compared a group of 36 male hemophilics, 
ages 12 and over, with a control group of 
normals. They found that overall, the hemo-
philic group was more internal than normal
controls. But, within the hemophilic group,
a marginally severe group was significantly 
more external than either a mild or a severe 
group. These investigators suggested that the 
marginally severe hemophiliac views his clini- 
cal state as unpredictable and is more de- 
pendent on external cues to determine his 
well-being. Goldstein (1976) also compared 
24 long-term male hemodialysis patients with 
22 male patients, all of whom were recover- 
ing from minor medical problems. The hemo- 
dialysis sample patients obtained significantly 
higher denial and externality scores than 
the nonhemodialysis control subjects. 

A number of studies that are investigations 
of attempts to influence health care, once dis- 
abilities occur, are available. Weaver (1972) 
found that internal patients with severe kid- 
ney disorders who were using dialysis ma- 
chines to stay alive were significantly more 
likely than matched externals to comply with 
diet restrictions and to keep scheduled ap-
pointments. Internal patients hospitalized
with spinal cord injury had higher self-con- 
cepts and reported themselves to be less de- 
pressed than matched externals (Dinardo, 
1972). Those internals who scored high on
the repression side of Byrne’s Repression-
Sensitization scale showed the best adjust-
ment, and external sensitizers showed the
poorest adjustment to spinal cord injury.
However, Bulman and Wortman (1977)
found no differential I-E predictions in a
sample of 24 paraplegics or quadriplegics.
Ireland (1973) attempted to investigate par- 
ticipation in treatment in relation to I-E 
beliefs among pulmonary emphysema pa- 
tients. Ratings were difficult to obtain, and 
no clear findings emerged. He did find, how- 
ever, as noted earlier, that internal patients 
knew more about their disorder than ex- 
ternals, even with intelligence controlled. 
Presurgical external patients report more 
anxiety than matched internals (Lowery, Ja- 
cobsen, & Keane, 1975), and, following ab- 
dominal surgery, internal female patients are 
more likely to attempt to influence post- 
operative care in relation to obtaining more 
analgesics than external patients (Johnson, 
Leventhal, & Dabbs, 1971). If they were 
firstborns, internals also had longer hospital 
stays than externals. Johnson et al. explained 
the interaction between I-E and birth order 
as follows: They postulated that firstborns 
who are exposed to stress become frightened
and dependent and so they try to remain in
contact with the authorities who control
danger. Internal firstborns have learned to
manipulate the health personnel in order to 
remain in the hospital. External firstborns 
do not have sufficient skill to influence post- 
operative stay. Since later born patients are 
less dependent and frightened, they feel no 
need to stay longer regardless of I-E beliefs. 
In another rather complicated research 
design, Auerbach, Kendall, Cuttler, and Lev- 
itt (1976) found no relationship between I-E 
scores and anxiety about impending dental 
procedures among a group of patients sched- 
uled for surgical removal of a tooth. How- 
ever, they did find that internal and external 
patients responded differentially to specific 
and general presurgery information. Accord-
ing to dentist ratings, internals adjusted
poorly in surgery after receiving general,
marginally relevant material about the den-
tal procedure. However, internals showed
good adjustment in surgery after viewing a
tape that imparted specific information about
the procedures and sensations that they
might expect; the reverse was true for ex-
ternal patients. These investigators suggested
that internals responded favorably to the
specific information, since it provided rele-
vant input consistent with their cognitive set
that they exert control of the occurrence of
reinforcers and punishments provided them.
The authors hypothesized that the specific
information provided data that enhanced the
internal’s perception that he/she might ma-
nipulate the impending aversive situation,
whereas the general information reinforced
the ambiguity of the situation and a lack of
personal control over it. They suggested that
for externals, the specific information led to
a diminished reliance on outside sources as
the precipitating events leading to this aver-
sive situation. The general tape allowed the
external patient to avoid personal responsi-
bility, an action congruent with a defensive
posture about control of reinforcement.
In one of the most complex and well-con-
trolled studies of personality characteristics
in relation to stress, heart attack, and re-
covery, Cromwell, Butterfield, Brayfield, and
Curry (1977) manipulated nursing care, par-
ticipation in various activities, and informa-
tion about heart attack for 229 coronary pa-
tients. Eighty medical patients with illnesses
comparable in severity but without cardio-
vascular involvement served as controls. De-
pendent variables included stay in intensive
care, stay in hospital, rate of alarms (heart
rate changes while on unit), a number of
biochemical and physiological indices, re-
hospitalization, and death. Overall, in regard
to I-E scores as assessed by the Rotter scale,
coronary patients were more external than
medical controls. No patients who were
in congruent combinations of locus
of control beliefs and participation in self-
treatment (internals with high participation
or externals with low participation) re-
turned to the hospital (p < .06) or died
within 12 weeks following their hos-
pitalization. Because of the small number of
patients who returned or died and the marginal
significance levels, these findings cannot
be accepted with great confidence. However,
the results were in the predicted direction,
and the small number of cases could have
contributed to the lack of statistical
significance. Since these are long-term effects,
Cromwell et al. suggested that patients whose
hospital treatment was incongruent with per-
sonal expectancies may have resisted a de-
cision to return to the hospital when crucial
symptoms of another myocardial infarction
appeared. I-E beliefs were also related to a
number of other dependent variables. Except 
in two cases in which internality was linked 
with high anxiety, externality was always 
associated with undesirable physical charac- 
teristics such as higher temperature and 
higher sedimentation rates. Also, internal pa- 
tients were more cooperative in response to 
treatment demands, and they left the coro- 
nary unit and the hospital earlier than ex- 
ternal patients. It should be noted, however, 
that Marston (1969) found no relationship 
between I-E beliefs and compliance behavior 
among a group of coronary patients and, with 
58 male patients, Garrity (1973) found that 
subjects’ perceptions of health status, social 
class, and a belief in external control pre- 
dicted return to work after their first myo- 
cardial infarction. 

Aside from the Cromwell et al. (1977) 
study, very little research is available linking 
I-E beliefs to specific physical illnesses. How- 
ever, Naditch (1974) considered data on 
over 400 black men and women (in six 
American cities) who were diagnosed as hav- 
ing essential hypertension—a “silent” health 
hazard for large numbers of black people. 
Among externals who rated themselves as 
discontented with their lives, the rate of hy- 
pertension was 46%, more than double the 
21% rate for the total sample and consid-
erably higher than rates for all other group-
ings (e.g., a 7% rate for contented inter-
nals). In further analysis, Naditch concluded
that these results occurred primarily as a 
function of the responses of males but not 
of females in the sample. Similar to Crom- 
well et al., and as would be expected from 
the Glass (1977) research, the Naditch I-E 
findings have implications for cardiovascular 
involvement. Taken together, these studies 
suggest that I-E beliefs may be particularly
salient for understanding the influence of
belief about control on physiologically adap-
tive responses to stress within the cardio-
vascular system. (See Strickland, in press,
for a more complete review of I-E and car-
diovascular functioning.)

Darrow (1973) tested several hundred per- 
sons who presented themselves to community 
health centers for diagnosis and treatment 
of venereal disease. He found men assessed 
as internal to be significantly less likely to 
be infected with gonorrhea. No such results 
emerged from females, although internal wo- 
men were more likely to return for follow- 
up treatment with the appearance of new 
symptoms than were external females. Dar- 
row interpreted this latter finding as most 
likely being due to the tendencies of these 
women to notice physiological changes and 
to seek an explanation for the reoccurrence 
of symptoms after treatment. Olbrisch
(1975) also reported a complex relationship 
between externality and naive beliefs about 
gonorrhea, in that external subjects appear 
to have a more casual, helpless attitude 
about how venereal diseases are contracted. 
In this study, however, externals did not 
differ from internals in plans to make fur- 
ther precautions. 

Any impending or disabling disorder, 
whether chronic or temporary, has a varying 
degree of influence on the responses of the 
persons faced with the handicap. The sever- 
ity of the disorder, the time of onset, the 
current status of the patient, the support that 
he/she receives, and so on, all interact with 
what is probably a complex set of cognitions 
about the disorder. When an individual is 
more helpless than he/she once was, or is 
handicapped in relation to others, beliefs 
about locus of control would be expected to 
be, and apparently are, related to reactions 
to the disorder and the struggle to recover. 
Although chronically handicapped individ- 
uals, particularly children, appear to be more 
external than their control counterparts, in- 
ternal adults, in spite of the fact that they 
may respond to disablement with initial de- 
nial and concern, appear to know more about 
their disorder and attempt to influence health 
care to a greater extent than externals. These
results are most clear in those complex de-
signs that give attention to the interactions
of I-E expectancies and situational dimensions.
For example, internals seem to be able to
use specific information about their disease
and treatment, whereas externals respond to
general instructions. Given the initial de- 
fensiveness and denial of internals faced with 
trauma, a number of questions are raised as 
to when internal versus external expectancies 
are more adaptive. It may be that a defensive 
stance is helpful when a person who is ac- 
customed to considerable personal control is 
suddenly faced with events beyond his or 
her influence. 


I-E and Physiological Responses 


Aside from the findings of relationships of 
I-E expectancies and general health prac- 
tices, it is of interest to note the degree to 
which I-E beliefs may be related to an in- 
dividual’s ability to monitor and change spe- 
cific physiological responses. A number of 
investigators and health care personnel have 
attempted to teach individuals to improve 
physical functioning via the use of biofeed- 
back, and indications of individual differences 
in response to this technique become par- 
ticularly important. Again, a logical assump- 
tion is that persons who hold strong locus of 
control expectancies, whether internal or
external, will have differing responses to at- 
tempts to control their own internal physical 
states. Internals would be expected to be 
sensitive to internal states, alert to biofeed- 
back cues, and motivated to attempt self- 
control of bodily function. 

Results of several studies do show inter- 
nals to be generally superior to externals in 
responding to biofeedback paradigms. Inter- 
nals are better able to increase and maintain 
electroencephalogram alpha responding (Gos- 
ling, May, Lavond, Barnes, & Carreira, 1974; 
Johnson & Meyer, 1974) and to lower gal- 
vanic skin responses via biofeedback than 
externals (Wagner, Bourgeois, Levenson, &
Denton, 1974). Other biofeedback research
also demonstrates the influence of I-E ex-
pectancies on the control of vascular re-
sponses. Ray (1974) found internals to be
more proficient at increasing heart rate and
externals at decreasing heart rate than com-
parison subjects. These results were essen-
tially replicated by Gatchel (1975) for the
first training trial of a biofeedback paradigm.
Fotopoulous (1971) reported internal sub-
jects to be more capable of increasing heart
rate without either reinforcement or external
feedback, whereas externals could increase
heart rate only under a reinforcement para-
digm.


Like the clinical findings on I-E/cardio-
vascular relationships, these data on the
control of heart rate suggest that inter-
nals and externals may be using dif-
ferent strategies in biofeedback paradigms and
that effective responding might be enhanced
when individuals are in conditions that are con-
gruent with their expectancies for control. As
noted by Cromwell et al. (1977), internals
may respond to opportunities to work indi-
vidually, and externals may need conditions
of structure or outside influence to enhance
responding. At least one laboratory
study suggested that this is indeed the case.
[...] (1975) studied 24 internal and 24
external subjects under one of two aversive
shock avoidance procedures. Half of the sub-
jects could escape shock by asking for a rest
period, and half had rest periods imposed by
the experimenter. Control over initiation of
rest had an arousal-reducing effect on systolic
blood pressure for all subjects. Diastolic blood
pressure change appeared to be a function of
the interaction of I-E expectancies and the
[...]. Elevations were lowest when per-
sonal and situational control factors were
congruent, that is, for internals in conditions
of self-initiation of rest and for externals un-
der imposed rest.
In a particularly complex biofeedback de-
sign, Carlson (in press) studied 24 male and
[...] female college students who equally repre-
sented three distinct ethnic groups: Cauca-
sian, Japanese, and Chinese. The feedback
subjects acquired lower frontal electromyo-
graphic (EMG) levels than control subjects,
and internal subjects in the feedback condi-
tion acquired lower levels than externals. No
significant differences in EMG levels were ob-
served in the control condition as a function
of I-E (as assessed by the Nowicki-Strick-
land scale for adults; Nowicki & Strick-
land, 1973). These results were stable across
sexes, all three ethnic groups, and across
replications. On pretest and posttest, Carlson
found that external subjects in the
feedback conditions shifted significantly
toward internality, suggesting that the ex-
periencing of the opportunity for changing
bodily states was related to an enhanced 
internal expectancy. No change scores oc-
curred for the internal subjects, possibly
because of a ceiling effect. Internal subjects 
in the control conditions did shift toward 
externality, but this change was not statisti- 
cally significant. Additionally, I-E shifts were 
apparently not related to actual performance 
changes. Despite the fact that internals in 
the feedback condition achieved the lowest 
frontal EMG levels, they also reported feel- 
ing somewhat less relaxed during training than 
their counterparts in the control condition and 
the externals in the feedback conditions. Carl- 
son suggested that in their efforts to perform 
well in a frontal muscle relaxation task, in- 
ternals may actually sacrifice their subjective 
state of general relaxation. Again, these find- 
ings are reminiscent of the cardiovascular re- 
sults in which internals appear to deny 
arousal while actually experiencing distress 
in situations in which they have no control. 
Berggren, Ohman, and Fredrikson (1977) 
shed some additional light on the possible 
mechanisms that lead to the differential physi- 
ological responses of internals and externals to 
stimulus conditions. When college students 
with extreme I-E scores were asked to re- 
spond to a recurring tone of moderate inten- 
sity, external subjects took significantly longer 
to reach a criterion of habituation than in- 
ternals on a measure of skin conductance. Evi- 
dently, externals were exhibiting continued 
electrodermal orienting responses to nonsignal 
stimuli. The investigators then ran additional 
subjects in the same methodological proce-
dure plus a second experimental manipula-
tion: The stimulus was given more signifi-
cance by arranging that it would be a signal
for a forthcoming task. Some groups of sub-
jects were asked to press a switch at the off-
set of a recurring tone, and other groups re-
peated Experiment 1. Again, externals took
significantly longer to reach criterion of ha- 
bituation than did internals. In the signal 
condition, however, this effect was reversed 
so that internals habituated more slowly. The 
external group failed to differentiate between 
signal and nonsignal conditions. The investi-
gators interpreted these results to suggest that
externals have poorer control of attention
than internals. External subjects attended to 
irrelevant events and did not seem to differ- 
entiate between relevant and irrelevant cues. 
Internals, on the other hand, differentiated 
sharply between cues and stopped responding 
to irrelevant cues quickly. Overall, as theo- 
retically expected, internals appeared to be 
vigilant, with more active attentional pro- 
cesses and more focus on task-relevant cues. 

Physiological responding in biofeedback 
designs appears to be functionally related to 
I-E expectancies and enhanced in congruent 
conditions. It is likely that these effects occur 
because internals and externals attend and 
respond to relevant stimuli in different ways. 
Internals appear to be more motivated to 
succeed and are generally superior in perform- 
ance than externals. However, their vigilance 
and attempts to control their bodily states 
may result in immediate increased arousal 
and diminished future well-being. Again, it is 
important to know when internals should be 
encouraged to relax and/or to relinquish con- 
trol for enhanced physical functioning. These 
investigations could have direct and practical 
implications for preventive health practices, 
especially in regard to cardiovascular disease. 
Both internals and externals can learn to con- 
trol heart rate. However, they use different 
strategies to do so, and their efforts may lead 
to different long-term outcomes. More specific 
investigations are needed, but evidently cog- 
nitive mediating variables about perceived 
control are impactful in relation to basic 
cardiovascular functioning. 


I-E and Psychological Disturbances 


Research on the I-E variable and the re-
porting of psychological and/or emotional
difficulties is much more extensive than that 
on I-E and physical disorders. At a general 
level of overall functioning, internal individu-
als, including the elderly (Felton & Kahana,
1974; Wolk & Kurtz, 1975), are significantly
more likely to report themselves as satisfied
with their life situations than externals (Na-
ditch, Gargan, & Michael, 1975; Palmore &
Luikart, 1972). The relationship among I-E
and adjustive behavior and attitudes, how-
ever, is moderated by the nature of the set-
tings in which people reside (Wolk, 1976).
With regard to dysfunctional difficulties, in- 
vestigators have found a belief in external 
locus of control to be related to debilitating 
anxiety (Butterfield, 1964; Feather, 1967; 
Finch & Nelson, 1974; Platt & Eisenman, 
1968; Shriberg, 1974; Strassberg, 1973; Wat- 
son, 1967), to the holding of irrational values 
(MacDonald & Games, 1972), to mood dis- 
turbances (Kilpatrick, Dubin, & Marcotte, 
1974), and to indices of maladjustment on 
paper-and-pencil questionnaires (Duke & No- 
wicki, 1973; Hersch & Scheibe, 1967; Powell 
& Vega, 1972). With patients who have been 
hospitalized for psychiatric reasons, a number 
of researchers have reported a relationship 
between externality and severity of psychi- 
atric diagnosis (Cash & Stack, 1973; Croft, 
Johnson, & Fox, 1975; Cromwell, Rosenthal, 
Shakow, & Zahn, 1968; Duke & Mullins, 
1973; Harrow & Ferrante, 1969; Lefcourt, 
1976; Levenson, 1973; Lottman & DeWolfe, 
1972; Palmer, 1971; Shybut, 1968; C. E. 
Smith, Pryer, & Distefano, 1971). These data 
are correlative, and there is no way of know- 
ing if external beliefs accompany a predispo- 
sition to psychological difficulties or if locus 
of control beliefs occur as a function of the 
disturbances. At the least, it appears that the 
reporting of life contentment is related to
internality, whereas pathological difficulties 
are linked to external expectancies. 

A puzzling issue throughout a consideration 
of I-E and maladaptive behavior, however, 
concerns the discrepant predictions about ex-
ternality and depression. One might logically
expect that individuals who believe that they 
are responsible for the results of their be-
havior would be more likely to become de-
pressed when life events do not go well for
them than persons who are able to attribute 
traumatic events to luck, fate, God’s judg- 
ment, and so forth. Indeed, Phares (1972) has 
hypothesized that “depressions tend to be 
associated with people who possess a strong 
generalized expectancy that outcomes are 
their own responsibility” (p. 466). The guilt 
and self-punitiveness often expressed by de-
pressives would be expected to occur only if
individuals actually believe that they influ-
ence life occurrences. On the other hand, many
depressives report themselves to be powerless
about life events, to experience loss of con-
trol, and to feel helpless about influencing
future events—all perceptions that sound 
similar to externality. Strickland (Note 1) 
and Lefcourt (1976) have both hypothesized 
a relationship between depression and ex- 
ternality, and much of the empirical literature 
supports this contention (S. I. Abramowitz, 
1969; Calhoun, Cheney, & Dawes, 1974; Di- 
nardo, 1972; Emmelkamp & Cohen-Kettenis, 
1975; Goss & Morosko, 1970; Moyal, 1977; 
Naditch et al., 1975; Prociuk, Breen, & Lus- 
sier, 1976; Wareheim & Foulds, 1971; Haley 
& Strickland, Note 5). Considerable evidence 
has also accumulated with regard to I-E be- 
liefs in relation to “learned helplessness”—a 
phenomenon proposed by Seligman (Abram- 
son, Seligman, & Teasdale, 1978; Seligman, 
1974, 1975) as a model for reactive depres- 
sion. Generally, researchers find that externals 
show poorer performance in the learned help-
lessness paradigm than do internals (Cohen,
Rothbart, & Phillips, 1976; Hiroto, 1974),
although in at least one instance, individuals
responded to helplessness manipulations with
increased attempts at control (Roth & Boot-
zin, 1974). In fact, Brehm’s reactance theory
(Brehm, 1966, 1972) would predict that a 
person threatened with loss of freedom will 
become motivationally aroused to prevent
their loss. Obviously, individuals have differing
coping styles and respond to circumstances in
[...] but possibly predictable ways. In-
creased research with the I-E variable might
give additional clues as to individual responses
to [...] or traumatic life situations. Re-
search in this area must remain completely
[...], however, without presupposed judg-
ments of what is “good” or “adaptive.” As
Wortman and Brehm (1975) caution, an em-
phasis on personal causation may be danger-
ous when individuals are faced with situations
that are truly uncontrollable. Additional com-
plications in this kind of research have to do
with the difficulty of defining depressive re-
sponses. If depression is a multidimensional dis-
order, then types of depression (e.g., reactive
[...]) may be differentially
related to I-E. In fact, Strickland and Haley
(Note 6) found external expectancies to be
more strongly related to a measure of chronic
depression than to temporary depressed mood.
Further, when factors or dimensions within
the I-E construct are examined, they lead to 
differential predictions. For example, depres- 
sives take responsibility for negative but not 
positive events in their lives (Haley & Strick- 
land, Note 5). Clearly, the relationships are 
complicated and further research is neces- 
sary. The I-E variable would appear to be 
particularly useful in relation to cognitive 
theories of depression (Beck, 1976) and in 
models incorporating response-contingent hy- 
potheses (Lewinsohn, 1972). Of particular 
importance are those critical studies linking 
cognitive variables to overt behavior. 

Even though depression takes its toll in 
lessened feelings of well-being and happiness 
as well as in loss of energy, perhaps the most 
serious aspect of depression is the possibility 
of suicide that occurs for many depressed 
individuals. Several investigators have consid- 
ered the relationships among suicidal 
thought, behavior, and I-E expectancies in 
an attempt to better understand life-threaten- 
ing behaviors. Lambley and Silbowitz (1973) 
could not predict the contemplation of suicide 
via Rotter’s scale, but Williams and Nickels 
(1969), in a study of 235 college students, 
found externality to be related to suicide 
potential as measured by the Minnesota 
Multiphasic Personality Inventory. Crepeau 
(Note 7) also asked college students how 
often they contemplated suicide. Generally, he 
found suicide ideation to be linearly related 
to Collins’ (1974) I-E measure, with persons 
who reported suicidal thoughts being assessed 
as more external than students who had never 
considered suicide. 

Melges and Weisz (1971) talked with 15 
patients who had recently made serious sui- 
cide attempts. They asked them to recall as 
vividly as possible the feelings and thoughts 
that they were experiencing immediately be- 
fore the attempt. With pretest and posttest 
measures, they found increased externality 
following the specific suicide ideation. Pa- 
tients also reported more negative evaluations 
of the future and less extension of a span of 
awareness toward the future. Both of these 
variables were also related to changes toward 
externality. Thus, findings from reports of 
the reexperiencing of suicidal thoughts sug-
gest that the original suicidal impulses may
reflect a feeling of loss of control and an in-
ability to foresee a pleasant future.

Although many of the empirical results 
reported in psychological literature point to a 
relationship between beliefs about external 
control and psychopathology, this finding does 
not hold in some selected samples of maladap- 
tively functioning persons. For example, con- 
siderable confusion is apparent in studies with 
individuals who are substance abusers. Some 
investigators reported alcoholics to be more 
external than control samples (Butts & Chot- 
los, 1973; Naditch, 1975; Nowicki & Hopper, 
1974; Obitz & Swanson, 1976; Palmer, 1971), 
and others found alcoholics to be more in-
ternal (Goss & Morosko, 1970; Oziel, Obitz,
& Keyson, 1972). Caster and Parsons (1977)
found that alcoholics score higher than con-
trol subjects on the Chance Control dimen-
sion of Levenson’s scale and noted that treat-
ment success varied in relation to I-E scores.
A correlation between external Control by 
Powerful Others and depression was evident in 
successfully treated groups, whereas a rela- 
tionship between depression and Chance Con- 
trol was found for failures. Caster and Par- 
sons suggested that alcoholics who are de- 
pressed and who see their distress as occurring 
because of powerful others may respond better 
than those alcoholics for whom depression is 
psychologically related to fate or chance. 
Other investigators have suggested that I-E 
differentiates among alcoholics, with those 
holding internal expectancies being less se- 
verely disturbed. Internal alcoholics also ex- 
perience a greater magnitude of control over 
intrapersonal and interpersonal stresses than 
the external alcoholics (Donavan & O'Leary,
1975; Donavan, O'Leary, & Schau, 1975;
O'Leary, Donavan, & Hague, 1974a, 1974b;
Pryer & Distefano, 1977). Although one
might theoretically expect drug abusers to be 
external, some investigators reported them to 
be more internal than control groups (Berzins 
& Ross, 1973; Calicchia, 1974; Smithyman 
Plant, & Southern, 1974). Again, a number of 
factors may be operating here, with complica-
tions arising because of the complexities of
the disorders, difficulties in diagnosis, and the
impact of situational demands in the testing
sessions. Rotter (1975) noted that alcohol
and drug abusers may be responding to the
exhortations of staff members in treatment
programs who tell patients that the success of 
treatment is “up to them.” They may dis- 
simulate on I-E measures to present them- 
selves in a favorable light. Although Rotter 
(1966) attempted to control for social desir- 
ability in the early I-E assessment instru- 
ments, a number of investigators have re- 
ported a relationship between internality and 
socially desirable responses (Cone, 1971; 
Harris, 1975; Hjelle, 1971; Vuchinich & Bass, 
1974). Thus, some of the reported relation- 
ships between externality and the reporting of 
psychopathological symptoms may be a func- 
tion of approval-motivated response set rather 
than veridical representation of locus of con- 
trol expectancies. 


I-E and Psychological Treatment 


Assuming for the moment, however, that 
externality and maladaptive functioning are 
related raises a number of questions about 
responses to psychotherapy or treatment pro-
grams in relation to I-E expectancies. Con-
ceptually, internals might enter therapy or
treatment at a less disturbed level than ex- 
ternals. Once in treatment, internals would 
be expected to respond in adaptive ways, 
assuming responsibility for their difficulties 
and attempting to change. This assumption, 
of course, depends on the therapeutic ap- 
proach (Nowicki, Bonner, & Feather, 1972). 
Internals might be quite resistant to inter- 
ventions that they perceive as limiting their 
freedom or control. And, as mentioned earlier, 
therapeutic benefits might be most enhanced 
when individuals are in treatment situations 
congruent with their locus of control beliefs. 

Generally, results of a large number of 
studies across different treatment modalities
suggest that individuals in therapy or self-
improvement groups do become more internal
as treatment progresses (Diamond & Shapiro, 
1973; Dua, 1970; Eitzen, 1974; Gillis & Jes- 
sor, 1970; Lewis, Dawes, & Cheney, 1974; 
Kilmann & Howell, 1974; Lynch, Ogg, &
Christensen, 1975; Pierce, Schauble, & Far-
kas, 1970; Schallow, 1975; R. E. Smith,
1970). Academic underachievers also appear
to become more internal in response to coun-
seling and structured group activities (Fel-
ton & Biggs, 1972, 1973; Felton & Davidson,
1973; Felton & Thomas, 1972; Nowicki & 
Barnes, 1973; Reimanis, 1974), as do juve- 
nile felons (Moser, 1975). 

Several more studies demonstrate the more 
complicated interactive effects when subjects
are differentiated according to I-E expec-
tancies and placed in varied treatments
(Abramowitz, Abramowitz, Roback, & Jack-
son, 1974; Friedman & Dies, 1974; Kilmann,
1974; Kilmann, Albert, & Sotile, 1975; Mor-
ley & Watkins, 1974). Generally, if given an
opportunity, internals report that they prefer
more client control than do externals. Inter-
nals respond more positively to nondirective
approaches in which therapist intervention is
minimal and structure is not imposed from the
outside. Externals, on the other hand, appear
more positively influenced by structured ap-
proaches. These findings parallel those be-
havior modification results that occur when
individuals are differentiated as to I-E ex-
pectancies and exposed to different proce-
dures (Best, 1975; Wallston, Wallston, Kap-
lan, & Maides, 1976). For example, Best and
Steffy (1975) involved internals and externals
in one of two smoking modification proce-
dures. Congruence of I-E expectancies and
experimental conditions produced the most
profound changes. Internals responded to an
[...] satiation procedure, and externals
responded to an agent who decided the rate
at which smoking would be reduced.
Aside from responses of clients or patients,
some investigators have assessed I-E ex-
pectancies of mental health personnel and
suggest that locus of control beliefs
may be [...] in determining the efficacy
of health care delivery. Beckman
(1972) found [...] volunteers in a state mental
hospital to be more internal than undergradu-
ates and to be less likely to believe
that patients should be restricted in their so-
cial [...]. [...] (1973) reported
mental [...] to be more internal as
a function of a [...] training pro-
gram. Martin and Shepel (1974) trained 21
senior [...] in urban hospitals in
counseling [...], emphasizing developing a helping
relationship, identifying and exploring prob-
lem areas, and devising plans of action.
With pretesting and posttesting, they
found both an increase in counseling percep-
tiveness and a shift toward internality. A cor-
relation of .56 was found between counseling
skills and internality at the posttesting. 

Overall, the reported research for both indi- 
viduals in treatment programs and individuals 
delivering mental health care suggests that 
internal expectancies may facilitate responsi-
ble adaptive responses. Results appear to be
highly influenced, however, by the conditions 
under which counseling or treatment occurs. 
Again, congruence between locus of control 
expectancies and the structure of the thera- 
peutic endeavor appears to lead to the most 
pervasive changes. 


Problems 


The increase in research on I-E expectan- 
cies and health-related behaviors is quite 
striking since the first Seeman and Evans 
study in 1962. The blending of interest in 
individual difference variables and health most 
likely results from an increased focus on pre- 
ventive health care in this country and a 
changing social and political awareness in 
which individual responsibility for one’s own 
health is emphasized. Certainly, a number of 
health practitioners have noted that many of 
the psychological and physical disorders that 
persons bring into a physician’s or counselor’s 
office result from, or are exacerbated by, be- 
havior such as smoking, improper diet, lack 
of exercise, and substance abuse. Others have 
remarked on the striking individual differ- 
ences in response to treatment programs once 
problems or physical dysfunctions are identi- 
fied. Finally, increasing health costs have 
caused massive concern on the part of indi- 
viduals who can no longer afford health care 
and the politicians who represent them. Over- 
all, research with the I-E dimension suggests 
that beliefs about locus of control of rein- 
forcement are influential in relation to health. 

Some major problems with this research 
should be noted, however. First, much re- 
search that does not produce clear-cut or 
substantial results is not published. Implica- 
tions of the published results are important, 
but these would, of course, be attenuated if 
they reflected only a small part of the work 
that has been done but not published.

A second major problem has to do with the
relative magnitude of the I-E/health rela- 
tionships that are reported. The I-E variable 
is only one of a number of complex factors 
that may converge to predict health attitudes 
and behaviors. The amount of variance for 
which I-E accounts is probably quite small in 
many, if not most, situations. For example, 
whether persons present themselves to a phy- 
sician’s office for relief of symptoms may be 
much more a function of severity of symp- 
toms, the individual’s financial condition, and/ 
or the availability of health care than the 
person’s beliefs about control of reinforce-
ment. Thus, practical prediction and clinical
intervention must be continually accompanied 
by further investigation of the complex of 
components that accompany health behaviors. 

One of the most pervasive problems of re- 
search with the I-E dimension has to do with 
the differing definitions that have been at- 
tached to this construct and the fact that I-E 
expectancies are often, even usually, explored 
outside the theoretical net in which this con- 
cept was first described. Investigations using 
the I-E variable have proliferated, sometimes 
without a clear grounding in a precise under-
standing of the construct and its implications.
Rotter (1966, 1975) has always noted that 
I-E is only one of a number of variables that 
would be expected to predict behavior in spe- 
cific and novel situations. An equal concern 
with the nature of the situational demands 
and reinforcement value should improve pre-
diction.

Aside from theoretical concerns, a number 
of methodological weaknesses are apparent in 
much of the I-E/health research. Particularly
with the use of clinical populations, controls
are often lacking with respect to severity and
length of disorder or illness. Also, treatment
methods vary when disorders are identified.
Moreover, much of the research is correla-
tional in nature and gives no indication of
direction of causality. The findings that do
emerge demand further investigation as to
their continued viability and validity, with
particular emphasis on antecedent conditions.

Another major methodological problem has
to do with the measurement of I-E expec-
tancies. Numerous I-E assessment instru-
ments are currently being used. Most of the
research surveyed in the present article, un-
less otherwise noted, has been conducted with
the Rotter scale, which has been the instru- 
ment in most frequent use. However, I-E 
measures different from the Rotter scale were 
sometimes used, particularly in the very 
earliest and the latest studies. The conflicting 
I-E/health findings that emerge might then 
be explained as occurring because of the use 
of the different assessment instruments that 
may not be measuring the same expectancies. 
Moreover, some I-E assessment devices are 
more appropriate for and better predictors of 
health concerns than others. The scale devel- 
oped by the Wallstons, for instance, is spe- 
cific to health and would be expected to en- 
hance prediction of health-related behaviors. 

Finally, the problem of response set is al- 
ways one that haunts any self-report measure. 
This is especially telling for the I-E dimen- 
sion, since significant correlations between the 
various I-E measures and social desirability 
responding are often reported. It is difficult to 
know the degree to which I-E responses re- 
flect veridical descriptions of locus of control 
beliefs or are colored by the respondent’s at- 
tempt to present himself/herself in a favorable 
light. Particularly in situations with strong
social demand characteristics, such as treat- 
ment programs for incarcerated substance 
abusers in which responses may influence 
length of stay in institutions, individuals may 
be expected to respond in ways that are de- 
signed to please the controlling agents. Even 
within a more typical outpatient situation, 
persons may wish to present themselves
favorably to the authorities and therapists 
responsible for their care, implying that they 
are motivated, concerned, conscientious, and
so forth. Also, response bias may occur on the
various I-E measures as a function of re-
spondents’ social-cultural background (No-
wicki & Strickland, 1973), race (Gurin et al.,
1969), sex (Strickland & Haley, Note 8), and 
political ideology (MacDonald, 1972). More 
accurate assessment of the degree and speci-
ficity of I-E expectancies, independent of
mediator variables and response bias, if pos-
sible, would be enormously helpful.

In spite of the problems, research on the 
I-E dimension in relation to health appears
to have opened significant avenues of investi-
gation that should be pursued. Although re-
sults are not altogether as clear, convincing,
and as free of conflict as one might hope, the
bulk of the research is consistent in implying
that when faced with health problems, in-
ternal individuals do appear to engage in
more generally adaptive responses than do
externals. These range from engagement in
preventive and precautionary health measures
through appropriate remedial strategies when
disease or disorder occurs. Findings suggest
that the development of an internal orienta-
tion could lead to improved health practices
for some individuals who have been inclined
to believe that life events are beyond their
responsibility and more a function of external
control. One must be quite cautious, however,
in assuming that internal beliefs are always
facilitative. The continued alertness of in-
ternals and their attempts at mastery be-
havior are most appropriate when events are
actually controllable. When individuals per-
sist in efforts that bring no relief, then they
may find themselves to be actually exacerbat-
ing the undesirable characteristics of the sit-
uation in which they find themselves. Perhaps
the wisest course is that people learn to spe-
cify the reality of their life situations, their
possible responses, and the potentiality of
forthcoming reinforcement.

Another major finding emerging from this
literature is that congruence of expectancies and
situations appears to enhance behavior change.
The implications are that change agents
and health personnel will be most effec-
tive when techniques are tailored to indi-
vidual expectancies. External individuals evi-
dently respond more easily to conditions in
which structure is imposed from outside. In-
ternals prefer situations in which they can
assume responsibility and work independently.
Obviously, continued research is necessary.
The problems extant in the investigations
under review are myriad, but the already ob-
tained results hold promise for both theoreti-
cal and practical advances.


Multivariate Classification of Day-Care Patients:
Personality as a Dimensional Continuum 
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The categorical versus dimensional view of psychiatric abnormality was exam- 
ined for three groups of dysthymic neurotics, schizophrenics, and alcoholics. A 
discriminant function analysis correctly assigned 70% of the patients to their 
a priori diagnostic profiles, though the “hit rate” rose to 90% when patients 
were aligned on a continuum of severity in two-dimensional person space. Uni- 
variate F ratios showed the three phenotypes to differ on 12 of 26 parameters, 
yet 18 significant Fs emerged when the patients’ degree of neuroticism was 
taken as a putative index of general maladjustment. Results showed that 
Eysenck’s Extraversion and Neuroticism personality factors share a consider- 
able degree of collinearity with the factorially pure Tryon-Stein-Chu trait 
clusters. Many forms of psychopathology, it is argued, simply reflect gross 
deviations of the continuously variable dimensions of personality, not discontinuities
with qualitative change points. Given a broader context, a “dimensional”
model of personality functioning may well, owing to its theoretical
quantifiability, supersede the notion of psychiatric “disease” types.


For over 25 years, and perhaps a half- 
century, the controversy between advocates 
of the medical model in present-day psychi- 
atry (Panzetta, 1974; Robins, 1976) and 
its critics, those who prefer to advocate a 
dimensional system (Eysenck, 1970; Eysenck
& Eysenck, 1976; Kendall, 1968; Sjöbring,
1974) has steadily gained momentum.
The “dimensional” approach, favored by 
psychologists, repudiates the notion that 
functional disorders exist as qualitatively 
distinct entities, much in the sense that in- 
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fectious diseases do in re naturae. As such, 
psychiatric theory leads to the practice of 
viewing a problem as conspicuously present 
or absent, that is, to labeling a patient either 
as a hysteric (Briquet’s syndrome) or as 
something else. The underlying assumption 
on which such terminology and its applica- 
tion rests, inextricably linked to general med- 
icine, is not only of doubtful value but has 
led to many polemical attacks on the alleged 
usefulness of the “medical” model in the 
study and treatment of psychologic morbid- 
ity (Eysenck, 1973). 

Instead, the “dimensionalists” claim that 
a more sensible course to follow is to de- 
pict any given debilitating trait (or symp- 
tom) on a variable severity continuum, from 
normal to grossly abnormal. In its simplest 
terms, this allows the clinician to portray 
each person according to their infinitely 
graded degree of observable behavior.
Obviously, some people, usually in response to
real problems and pressures, will, at times,
appear less their usual extraverted self as
they display increasing signs of social isolation,
regress from a confident mood to one
of tension or worry, or experience an abrupt 
cognitive shift from lucid perceptivity to 
exhibit bizarre delusions of guilt, power, or 
inferiority. These deviations signal a dis- 
ruption of the normal continuity of the per- 
sonality. Any sudden change of emotion, 
attention, arousal, or perception reflects but 
one of many intermediate variations on a 
continuum between healthy and psychiatri- 
cally ill states, rather than a clear-cut sepa- 
ration between the two. By analogy, a pa- 
tient’s mental or behavioral state can be 
portrayed in a manner directly akin to how 
IQ scores are distributed in the population 
continuum, fluctuating from severe mental 
handicap to creative genius. 

Briefly, dimensional analysis is based on 
a number of orthogonal (or independent) 
personality traits that serve as reference axes, 
normally distributed and genetically based. 
Each person is represented by one or more 
diagnostic points in n-dimensional space, rel- 
ative to other patients’ points. No arbitrary 
cutoff rules are imposed on the patients’ positional
loci in multidimensional space. Nor is the
dimensional model hampered by the presence of
“mixed” or “undiagnosed” psychopathology. Other
than pure diagnostic types, the majority of patients
will be uniquely placed relative to one or more
dimensional [...] factors, enabling a reliable
probability estimate to be made of the degree to
which neurotic or psychotic personality traits are
causing the disturbed behavior seen in [...]. It is
widely known that in the [...] person space and
vari[...] do not correspond to the medical [...] as
it exists in psychiatric theory. As [...] data are
accumulated [...] generalizability, as was [...] for
norming intelligence tests, [...] dimensional profile
can guide [...] choice of treatment, [...], and need
for community follow-up. [...] the major criterion of
any dis[...], [...] from etiology, course, and
outcome, [...] one of empirical, not theoretical,
[...] between nosological types (Kendall, 1975).
This, of necessity, implies [...] not only
personality disorders, but many other diagnostic
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subtypes are “notoriously unreliable cate- 
gories”; and reviews of studies under opti- 
mal conditions of diagnosis (Spitzer & Fleiss, 
1974; Walton & Presly, 1973) clearly sup- 
port this proviso. Similarly, an extensive 
United Kingdom–United States cross-cultural
study has shown that American psychiatrists
overdiagnose schizophrenia and underdiagnose
affective disorders (Cooper, Kendall, Gurland,
Sharpe, & Copeland, 1972).
Particularly noteworthy is the lack of satis- 
factory agreement among clinicians, incon- 
sistent use of terminology, and inherent gaps 
in the nomenclature itself, further com- 
pounded by the indecision whether to rate a 
trait as pathologic or as a normal variation 
(Presly & Walton, 1973). 

For the present, at least, the psychiatric 
classification scheme and its newly proposed 
Diagnostic and Statistical Manual of Mental 
Disorders (DSM-III), though based on a 
multiaxial system, is not in a form that per- 
mits empirical testing by suitable experi- 
ments. Potentially, as every science pro- 
gresses from descriptive, symptom-based clas- 
sification to a dimensional framework (Hem- 
pel, 1967), the real issue may be more of a 
question of within what dimensionality the 
“symptom sign” patterns, or personality 
traits, lie. Like cognitive abilities, the proc- 
esses underlying arousal and clinical abnor- 


¹ If we single out phenylketonuria and color blindness,
they reflect but two mutually exclusive kinds
of biological abnormality, each occupying a dis- 
crete factor space. Yet both of these disease en- 
tities can occupy the same person space in cases 
in which they co-occur in the same individual. 
Anxiety and depression, two intermediate condi- 
tions often viewed as phenomenologically distinct, 
not only share the same person space (e.g., as
when a patient displays both), but they frequently 
load on the same factor across studies, confirming 
their factorial interdependency (Lang & Frost, 
Note 1). Unlike infectious diseases, surprisingly 
few of which truly exhibit uniquely identifiable 
causes, behavioral or personality traits often share 
a multiple, or overlapping, etiology and, in this 
sense, cannot be adequately defined by the arbi- 
trary cutoffs imposed by a medically based, noso- 
logical psychiatric system. Many other common 
illnesses such as diabetes mellitus, rubella, peptic 
ulcer, and coronary heart disease can now be more 
readily investigated in dimensional than in cate- 
gorical terms. 
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mality are often discovered to be complex 
and continuous rather than discrete. In this 
sense, they are more likely to be hierarchically
organized within dimensions and less
likely to exist in some either/or paradigm as 
predicted by psychiatric theory. The typo- 
logic model tends to confuse the predisposi- 
tion with the neuroses and psychoses while 
ignoring the underlying continuum of men- 
tal disturbance. 

Since little is actually known about the 
underlying distribution of clinical pathology 
in the general population continuum, or 
about those subclinically predisposed to de- 
velop mental conditions, a major task is to 
specify, as a probability estimate, the like- 
lihood that a person will become behavior- 
ally disturbed. The risk factor will vary and 
can, in dimensional terms, be expressed as 
a probability function, such as in the “diathesis
model of stress” proposed by Edwards (1969).

Unquestionably, the eventual resolution of
the continuum-category debate has major
ramifications for the prophylaxis and treatment
of the deviant personality and for the direction
dictated to future research trends. If the advent
of a “dimensional scientology” is already
underway, and not 300 years ahead as Robins
(1976) intimates, it must stand or fall on its
own merits. In fact, there is good reason to
believe that people vary enormously in their
predisposition to become stressed, whether
genetically or environmentally induced.
Physically, anyone can potentially break a leg
(i.e., qualitative), yet the person with weak
fibrous tissue is much more susceptible to
accidental injury (i.e., quantitative). As such
distinctions are measurable, a coordinate
dimensional view of personality would provide
valuable data to assess, in probability terms,
the degree to which a referred patient is
genetically or socially predisposed to overreact
to daily stressors. The dimensional model, owing
to its theoretical quantifiability, is obviously
superior, as it can be tested empirically.

The continued use of qualitative change
points, arbitrarily applied, ensures the kind
of diagnostic unreliability that is commonly
found. By definition, such points remain
undefinable. Perhaps, as Wing (1976) re-
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marked: “Some combination of the two... 
may eventually allow clinician and scientist 
to make the best of both worlds” (p. 394). 
There is, demonstrably, a considerable degree 
of collinearity between the dimensional per- 
sonality model and the psychiatric disease 
system (Eysenck & Eysenck, 1976) to make 
this transition possible. 

Given some of these empirical and theo- 
retical issues, the main purpose of the study 
is addressed to three questions. First of all, a 
discriminant function analysis examined the 
extent of agreement between psychiatrically 
assigned diagnoses and computer-generated 
diagnoses for different subsets of clinical 
predictor scales. A second purpose was to 
contrast the categorical-dimensional ap- 
proaches by taking the dimensionally com- 
plex Neuroticism factor as a putative index 
of severity (or level) of maladjustment to 
the exclusion of the patients’ designated di- 
agnostic codes. The third aim was to check 
the degree of “collinearity” between two of 
Eysenck’s major personality dimensions, 
namely, Neuroticism and Extraversion, and 
behavior as measured by the factorially 
pure Tryon-Stein-Chu “sign symptom” clus- 
ters, derived from the standard Minnesota 
Multiphasic Personality Inventory (MMPI) 
item pool. Recent evidence by Wakefield, 
Yam, Bradley, Doughtie, and Cox (1974) 
suggests that the Eysenck Personality In- 
ventory and MMPI converge in multivariate 
space, even though the two instruments rep- 
resent separate approaches to personality 
assessment. Much like Eysenck, Royce 
(1973) pointed out that the many primary 
factors can be adequately accounted for by 
three major “superfactors,” which Royce
calls anxiety, introversion, and superego. As 
a subsidiary hypothesis, it was expected that 
the real differences in morbidity reside not in 
the erroneous reliance on a diagnostic label 
but inherently exist within the configura- 
tions of disturbed patients at each symptom 
level, and presumably of how their scores on 
critical traits covary together. 


Method 
Subjects 


One hundred fifty-seven outpatients recently
admitted to the Holy Cross Hospital, Day Care
Treatment Centre, located at Calgary, Canada, served
as subjects. Fifty-four were male; 103 were female.
The mean age of the total group was 37.84 years
(SD = 11.83 years).
Upon initial admission to the hospital, every
patient was first seen by the duty doctor in the
emergency clinic and was then interviewed by a
senior consultant psychiatrist, who assigned patients
[...] the Diagnostic and Statistical Manual of Mental
Disorders (DSM-II; American Psychiatric Association,
1968). The classifications were 86 neurotic
depression (300.4), 38 alcoholic abuse (303.0),
and 33 paranoid schizophrenic (295.3). Clearly, it
would have been desirable to have two resident
psychiatrists separately assign the working
diagnoses, enabling a measure of interrater reliability
to be computed. Even so, under optimal conditions
with two or more raters, “there are no diagnostic
categories for which reliability is uniformly high”
(Spitzer & Fleiss, 1974, p. 344). In dealing with
such patients, the continued usage of qualitative
terms to designate arbitrary cutoff points along a
dimension ensures, almost certainly, the kind of
poor reliability that besets psychiatry as a whole.
Thus, for all patients who were psychiatrically
diagnosed, there is always the unreliability of such
diagnoses to contend with. I am well aware of the
unreliability of making such diagnostic judgments;
however, common practice used in handling the
vast majority of new admissions is to have one
senior staff psychiatrist assign a differential
diagnosis. Therefore, it is quite reasonable to contrast
its usefulness against that of the dimensional
approach. This article will attempt to clarify this
point by showing that the dimensional model is
empirically superior, owing to its theoretical
quantifiability.
Following the preliminary intake, each patient
was interviewed by the day care management team
to determine their suitability for the day care
program. The psychometric evaluation was carried out
in [...] office-type rooms, usually in two sessions
[...] 1- to 3-day interval. Most patients were on
[...] therapy. The prescribed drug was [...] a
tricyclic antidepressant (e.g., amitriptyline [...]
imipramine), an antianxiety drug (e.g., [...]
chlordiazepoxide, diazepam), or an antipsychotic
medication (e.g., pimozide, haloperidol, flu[...],
trifluoperazine) at moderate, rather than peak,
dosage levels, so as not to impair their ability to
provide suitable [...] responses. [...] for at least
10-14 days; [...] had undergone recent
electroconvulsive [...] tested. Thus, all patients
were sufficiently [...] the nature of the
questionnaire forms, and they did not display a
marked intellectual deficit. Due to their
deteriorated condition, certain highly disturbed or
acutely ill persons, irrespective of clinical [...]
(e.g., psychotically depressed, in a manic phase, or
[...] badly hallucinated), were simply [...]. Thus few
[...] psychotic patients were ipso facto included in
the study be-
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cause they, behaviorally, precluded themselves. 
Otherwise, the final sample, excluding these few, 
formed a fairly representative sampling of psy- 
chopathology from the general population con- 
tinuum. 


Measures 


Wechsler Adult Intelligence Scale (WAIS; Wechs- 
ler, 1955). Only the WAIS Vocabulary and Block 
Design subtests were administered on the basis that 
in independent analyses of dyads of WAIS subtests, 
both Maxwell (1957) and Silverstein (1970b) have 
shown that the Vocabulary–Block Design pair is
the “best” combination for computing a prorated 
full-length WAIS IQ. 

Tennessee Self Concept Scale—Clinical and Re- 
search Form (TSCS; Fitts, 1965). The TSCS 
booklet is made up of 100 first-person statements to 
which the subject responds on a 5-point Likert- 
type scale, ranging from completely true to com- 
pletely false. The TSCS contains 90 items that 
evaluate five external references of self-concept: 
“physical self,” “moral-ethical self,” “personal self,” 
“family self,” and “social self,” supplemented by 
a 10-item Self-Criticism scale. 

Eysenck Personality Inventory (Eysenck & Ey- 
senck, 1968). Form A of this questionnaire was 
used; it consists of 57 yes-no questions: 24 evaluate 
neuroticism-emotional stability, 24 evaluate intro- 
version-extroversion, and 9 items constitute the Lie 
(or “social desirability”) scale. 

Tryon-Stein-Chu (TSC) scales (Stein, 1968). This 
200-item inventory was developed by cluster ana- 
lyzing the 550 original MMPI item pool. This 
eliminated the contamination due to item overlap, 
as only the purer nonoverlap items were retained 
to reduce the effects of spuriously shared variance. 
There are eight TSC scales: Social Introversion; 
Body Symptoms; Suspicion and Mistrust; Depres- 
sion and Apathy; Resentment and Aggression; 
Autism and Disruptive Thinking; Tension, Worry, 
and Fears; and Lying (or Dissimulation).

Nurses’ Observation Scale for Inpatient Evaluation
(NOSIE; Honigfeld & Klett, 1965). This rating
scale contains 80 behavioral items, each of which is 
rated on a 5-point frequency continuum, ranging 
from never to always. Only five social adjustment 
scales were used: Social Competence, Social Interest,
Personal Neatness, Cooperation, and Irritability. A 
team of mental health workers and psychiatric nurses 
rated each patient on each of these indices follow- 
ing no less than 7 days of careful observation in vivo. 


² The mean WAIS IQ, prorated from the summed
Vocabulary and Block Design scores using
Silverstein’s (1970a) conversion formula (age corrected
for the 34- to 44-year-old group), was 103,
indicating that patients were of average intellectual
ability.
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Procedure 


Phase 1. To begin, all patients were classified into 
a priori groups in conformity with their assigned 
psychiatric diagnoses. A stepwise discriminant func- 
tion analysis was carried out to ascertain the extent 
of agreement between the categorical diagnoses and 
computer-generated profiles for three types of data 
variables (scores based on observational, self-report, 
and cognitive assessment). Three combinations of 
predictor scales were used: (a) all 26 independent 
variates; (b) the seven TSC subscales; and (c) the 
Extraversion and Neuroticism factors. 

Phase 2. Second, all patients were divided into 
three clusters along a continuum of neurotic severity. 
Of the possible choices, Eysenck’s Neuroticism was 
preferred as it represents a universally stable
personality factor. Thus, the 157 cases, regardless of
assigned diagnoses, were positioned within a dimen- 
sional framework in keeping with their level of mal- 
adjustment, namely, Neuroticism (N) scores. The 
severity continuum consisted of three levels: low 
N scores (0-7), moderate N scores (8-13) and high 
N scores (14-23). In this way, it was possible to
plot, or align, the patients along a graded, dimensional
personality factor and then contrast this approach
against the notion of a disjunctive psychiatric
entity.
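
The three-level split described in Phase 2 can be sketched in a few lines of Python. This is only an illustration: the function name `severity_level` is hypothetical, the cutoffs are the ones stated above (0-7, 8-13, 14-23), and treating any score above 13 as "high" is an assumption made here so that every valid score receives a level. The example scores are invented and are not data from the study.

```python
def severity_level(n_score: int) -> str:
    """Map an EPI Neuroticism (N) score to the severity level used in Phase 2.

    Cutoffs follow the text: low = 0-7, moderate = 8-13, high = 14 and above.
    Assumption: the text gives 14-23 as the high band; scores above 23 are
    also treated as high here.
    """
    if n_score < 0:
        raise ValueError("N scores cannot be negative")
    if n_score <= 7:
        return "low"
    if n_score <= 13:
        return "moderate"
    return "high"


# Hypothetical scores for illustration only; not data from the study.
example_scores = [3, 8, 13, 14, 21]
levels = [severity_level(s) for s in example_scores]
print(levels)
```

Grouping by a score band rather than by diagnostic code is what allows the same 157 cases to be re-analyzed without reference to their assigned diagnoses.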

Statistical treatment. Initially, Pearson product-moment
correlations were computed for all 26 variates in the raw
data matrix. The suitability of the resultant multiple
correlation matrix for further multivariate analysis was
ensured by the Dziuban and Shirkey (1974) test, which ruled
out the possibility of random fluctuation. Given each
subject’s observed series of scores, the discriminant
program computes a set of lambda weights (the larger lambda
is, the less discriminating power present) that are applied
to the raw scores of the data variables for each case,
producing a standard score for each subject (Cooley &
Lohnes, 1971). The individual test scores are then
transformed into a single discriminant score, and that
score is the patient’s locus in n-dimensional space. For
two or [...] number of discriminant functions [...]
potential maximum of g [...] define an orthogonal [...]
1972) and FORTRAN (Veld[...] programs generate an
approxi[...] based on Wilks’s lambda [...]lations denote
the ability [...]mally separate the groups. When two
discriminant [...] group centroids [...]. The
mathematical [...] dis-
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criminant analysis assumes that the data variables 
are sampled from a multivariate normal distribution 
having equal variance-covariance matrices within 
groups. However, owing to the robustness of the 
technique, these assumptions are often not strongly 
adhered to. To correct for unequal N sizes, a Bay- 
esian adjustment of the log JJ, sign was made 
based on a priori knowledge of group membership 
probabilities. The prior probability values were thus 
set equal to the cell frequencies. The classifications 
are based on the separate group covariance matrices, 
rather than the pooled within-groups covariance 
(dispersion) matrix, as illustrated by Cooley and 
Lohnes (1971, chap. 10). 
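
The role of the prior probabilities can be illustrated with a deliberately simplified sketch. It is not the program used in the study: it works on a single discriminant score rather than the full multivariate case, and the group means and standard deviations in `group_params` are invented for illustration. Only the group sizes (86, 33, 38) come from the text. Each group is given its own variance, which is the one-dimensional analogue of classifying with separate group covariance matrices.

```python
import math

# Group sizes reported in the Subjects section; priors are proportional to them.
group_sizes = {"neurotic": 86, "schizophrenic": 33, "alcoholic": 38}
total = sum(group_sizes.values())
priors = {group: n / total for group, n in group_sizes.items()}

# Hypothetical (mean, standard deviation) of one discriminant score per group.
# These numbers are invented for illustration and are NOT taken from the study.
group_params = {
    "neurotic": (0.6, 1.0),
    "schizophrenic": (-0.9, 1.3),
    "alcoholic": (-0.4, 0.8),
}


def log_normal_density(x: float, mean: float, sd: float) -> float:
    """Natural log of the univariate normal density at x."""
    return -0.5 * math.log(2 * math.pi * sd * sd) - (x - mean) ** 2 / (2 * sd * sd)


def classify(score: float) -> str:
    """Assign a case to the group with the largest prior-weighted density."""

    def log_posterior(group: str) -> float:
        mean, sd = group_params[group]
        return math.log(priors[group]) + log_normal_density(score, mean, sd)

    return max(group_params, key=log_posterior)


print({group: round(p, 3) for group, p in priors.items()})
print(classify(1.5), classify(-2.5))
```

Because the neurotic group supplies more than half of the sample, its larger prior pulls borderline cases toward that category; setting all priors equal would remove that pull.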

Moreover, the magnitudes of the weighting coeffi- 
cients, in standardized form, are closely analogous 
to beta weights in multivariate regression formulas. 
Hence, like regression coefficients, all discriminant 
function coefficients change in value with the addi- 
tion or deletion of a variable from the specified 
analysis, requiring a recomputation of the functions 
at each stage. Each standardized function coefficient
depicts the relative contribution of its associated
variable to that function. As in factor analysis, these
coefficients are often used to “label” each function
by identifying the dominant characteristic that they
measure. Based on the discriminant score, cases are
assigned to the diagnostic category for which the
computed probability density is largest. Finally,
subprogram DISCRIMINANT tests the significance of the
scale means of the measures for each subgroup by
univariate and step-down F values.


Results 
Analysis 1: Diagnostic Criterion 


Twenty-six variables. The test of overall
differentiation for the three clinical typologies,
Wilks’s lambda (the associated chi-square of
each discriminant function), and correlations
of the 26 predictors are presented in Table 1.
For the overall analysis, Wilks’s Λ = .652,
which is significant (p < .001). The nature
of these discriminant functions can be determined
by examining the large contributors to
group separation. The first function is bipolar,
being positively weighted by WAIS Vocabulary;
Neuroticism; Self-Criticism; Introversion;
Depression; Tension, Worry, and Fears; Cooperation;
and Social Interest; and negatively by Extraversion,
Personal Self, and the EPI Lie scale. This first
function, or ordinate, contains a number of the
primaries that comprise Eysenck’s Neuroticism or
Cattell’s Anxiety superfactors.
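
For readers less familiar with the statistic, the sketch below shows the standard textbook relation between Wilks’s lambda and the eigenvalues (latent roots) of the discriminant functions, and how each root’s share of between-group variance is obtained. The function names and the two eigenvalues are hypothetical and chosen only for illustration; they are not the roots from this analysis.

```python
from math import prod


def wilks_lambda(eigenvalues):
    """Wilks's lambda as the product of 1 / (1 + eigenvalue) over all roots.

    Smaller values indicate better separation between groups.
    """
    return prod(1.0 / (1.0 + ev) for ev in eigenvalues)


def percent_of_variance(eigenvalues):
    """Share of between-group variance carried by each discriminant function."""
    total = sum(eigenvalues)
    return [100.0 * ev / total for ev in eigenvalues]


# Hypothetical eigenvalues for a three-group problem (at most two functions).
roots = [1.0, 0.5]
print(round(wilks_lambda(roots), 3))
print([round(p, 1) for p in percent_of_variance(roots)])
```

With three groups there can be at most two discriminant functions, so the two percentages always sum to 100.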

The second variate, namely the abscissa, 
has its positive pole defined by Suspicion- 
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Table 1 
Correlations of 26 Variables With Two Discriminant Functions and Tests of Significance for Three
Psychiatric Subgroups

                                   Discriminant function
                              Diagnostic codesᵃ    Symptom severityᵇ
Predictor                        1        2           1        2
Age                            −.19     −.04        −.06     −.19
WAIS Vocabulary                 .36     −.07         .16     −.37
WAIS Block Design              −.13     −.08         .04     −.35
Extraversion                   −.48     −.44         .21      .45
Neuroticism                     .65      .33          –        –
Lie scale (EPI)                −.41      .35        −.05      .41
Physical Self                   .15     −.11         .05     −.13
Moral-Ethical Self             −.04      .31         .17     −.35
Personal Self                  −.43      .45        −.49      .14
Family Self                     .32     −.03        −.17     −.06
Social Self                     .24      .23        −.30   [illegible]
Self-Criticism                  .43     −.08         .67      .18
Social Introversion             .49      .05         .71      .47
Body Symptoms                   .41     −.30         .32      .14
Suspicion, Mistrust            −.23      .49        −.30      .18
Depression, Apathy              .69      .23         .63      .19
Resentment, Aggression          .19      .41        −.11      .59
Autistic Thinking              −.11      .57         .26      .61
Tension, Worry, and Fears       .66      .12         .76      .30
Lie scale (MMPI)                .45      .44        −.69      .51
Cooperativeness                 .65      .09         .31     −.10
Personal neatness               .51     −.37         .34      .43
Irritability                    .08     −.32         .12      .37
Social interest                 .38     −.45        −.32      .53
Social competence               .31     −.69        −.14     −.41
Sex (gender)                    .22     −.33         .42      .44
% variance                    58.59    41.39       74.43    25.57
Bartlett’s χ²                 83.66    35.87       93.19    53.09
df                               50       24          49       23
p                              .002     .056        .001     .012

Note. WAIS = Wechsler Adult Intelligence Scale; EPI = Eysenck Personality Inventory; MMPI = Minnesota
Multiphasic Personality Inventory.
ᵃ Wilks’s Λ = .462, F(2, 154) = 20.90, p < .001.
ᵇ Wilks’s Λ = .447, F(2, 154) = 18.78, p < .001.


Mistrust, the EPI Lie scale, Resentment-Aggression,
Autistic Thinking, and the MMPI Lie scale, whereas the
negative pole points to [...] in Extraversion, Personal
Neatness, [...], and Social Competence. Generally [...],
the subset of variates that define [...] function reflect
the more acute [...] typical of the psychoticlike [...]
presumably refers to a Psychoticism [...].

To facilitate interpretation, the group centroids
are plotted in Figure 1. As is apparent, there is
substantial overlap between the three typologies.
They are not clearly truncated, even though the
discrimination is statistically significant. For the
dysthymic neurotics, the total battery classified
them with 75.6% efficiency, misclassifying 21 of
these 86 patients. Table 2 shows how this ratio
dropped to 60.6% agreement between diagnosis and
objective tests for schizophrenics, whereas a 65.8%
concordance was achieved for the alcoholic group.
For the most part, the 30% misclassification rate
may be due largely to the speculative nature of the
defining criterion, not the standardized objective
tests. That the differential diagnoses of 47 of 157
patients showed a poor fit to the clinical data
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Table 2 


Diagnostic Classification by Discriminant Function Analysis for Different Combinations of
Predictor Equations
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Predicted group membership 


Measure           Actual group          1            2            3
26 variatesᵃ      Neurotics         65 (75.6)     6 (7.0)     15 (17.4)
                  Schizophrenics    10 (30.3)    20 (60.6)     3 (9.1)
                  Alcoholics        10 (26.3)     3 (7.9)     25 (65.8)
7 MMPI scalesᵇ    Neurotics         66 (76.8)     7 (8.1)     13 (15.1)
                  Schizophrenics    14 (42.4)    17 (51.5)     2 (6.1)
                  Alcoholics        12 (31.6)     2 (5.3)     24 (63.1)
2 EPI scalesᶜ     Neurotics         66 (76.8)     8 (9.3)     12 (13.9)
                  Schizophrenics    13 (39.4)    15 (45.4)     5 (15.2)
                  Alcoholics        11 (29.0)     4 (10.5)    23 (60.5)

Note. Numbers in parentheses are percentages. For neurotics, n = 86; schizophrenics, n = 33; alcoholics,
n = 38. MMPI = Minnesota Multiphasic Personality Inventory; EPI = Eysenck Personality Inventory.
ᵃ χ²(4) = 134.27, p < .001; 70.70% of cases were correctly identified.
ᵇ χ²(4) = 97.20, p < .001; 68.15% of cases were correctly identified.
ᶜ χ²(4) = 89.77, p < .001; 66.24% of cases were correctly identified.


is in accord with their widespread unreliabil- 
ity (Kendall, 1975). Moreover, the test(s) 
may be more valid than the criterion! 
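
The "hit rates" quoted from Table 2 are simple ratios of diagonal counts to totals, and can be rechecked with a short sketch. The counts below are those printed in Table 2; the function name `hit_rates` and the dictionary layout are illustrative choices, not part of the original analysis. With these counts the TSC and EPI rows reproduce the printed 68.15% and 66.24%, while the 26-variate row works out to 110 of 157 correct (about 70.1%), slightly below the 70.70% given in the table note.

```python
# Rows = actual group, columns = predicted group, both in the order
# neurotics, schizophrenics, alcoholics (counts as printed in Table 2).
confusion = {
    "26 variates": [[65, 6, 15], [10, 20, 3], [10, 3, 25]],
    "7 MMPI (TSC) scales": [[66, 7, 13], [14, 17, 2], [12, 2, 24]],
    "2 EPI scales": [[66, 8, 12], [13, 15, 5], [11, 4, 23]],
}


def hit_rates(matrix):
    """Return (per-group percent correct, overall percent correct)."""
    per_group = [100.0 * row[i] / sum(row) for i, row in enumerate(matrix)]
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return per_group, 100.0 * correct / total


for name, matrix in confusion.items():
    per_group, overall = hit_rates(matrix)
    print(name, [round(p, 1) for p in per_group], round(overall, 2))
```

The misclassified count is the complement of the diagonal total, which is where the figure of 47 of 157 patients for the full battery comes from.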
MMPI scales. The overall test of between-group
differences, Wilks’s lambda criterion, converted to
Rao’s F approximation, for the abbreviated MMPI
(TSC trait clusters) is highly significant
(p < .001). Of seven TSC scales, five contributed
significantly to group separation. According to
Table 3, the two best predictors were Social
Introversion and Tension, Worry, and Fears. Despite
two significant roots, the discriminant analysis was
only able to correctly identify 68.15% of patients as
members of the consensual class that they were
intended to depict (see Table 2). Classification
figures for alcoholics (63.1%) and paranoid
schizophrenics (51.5%) were uniformly poor but were
somewhat better for dysthymic neurotics (76.8%).
As the shorter MMPI (i.e., Midi-Mult) enjoys wide
usage as a diagnostic tool, the substantial overlap
runs contrary to the notion of a qualitatively
distinct, categorical entity. That a sizable number
of patients (50 of 157) do not exhibit finite,
qualitative change points in multiperson space
provides, at least, a justifiable raison d’être for
exploring the merits of a dimensional system of
personality description. Here again, the absence of
any marked points of “rarity” runs counter to
classical theory.

Figure 1. [...] variates based on 26 discriminant scores. (Latent
roots: [...], p < .002; Λ = .717 [...].) [Plot of group centroids;
recoverable axis label: Dimension I.]

EPI scales. Third, a prediction model using only a
single pair, the extraversion and neuroticism
personality traits, was used. Wilks’s lambda for the
three-group differentiation was .572, which was
significant (p < [...]) [...] conceptual categories.
Indeed, this is actually only 2% fewer than the
foregoing TSC scales and scarcely 4% less than the
overall battery. For this subanalysis, nearly one half
of the alcoholics and about 40% of schizophrenics were
erroneously assigned, much like the shorter MMPI. By
way of comparison, the 2% (or three patients) higher
“hit rate” of the pure TSC validity scales, over the
much briefer EPI, is hardly a noteworthy advantage for
an inventory directly enquiring about “signs and
symptoms.” Interestingly, each inventory classified 66
of 86 neurotic depressives (a difficult category).
Whenever the aim is to classify patients into
psychiatric clusters, certain scalar combinations may
prove more useful than any single unidimensional scale.
Inclusion of too large a number of variables may
obscure discrimination just as

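Table 3 [title not recoverable from the source]

                              Discriminant function
Scale                            1ᵃ       2ᵇ     Univariate F      p
Social Introversion             .51     −.76        22.19        .001
Body Symptoms                   .31      .14          .98        .084
Suspicion, Mistrust            −.81      .50         8.82        .009
Depression, Apathy              .67      .29         3.95        .019
Resentment, Aggression          .23      .49         1.16        .162
Autistic Thinking               .29(?)   .52         7.93        .010
Tension, Worry, and Fears       .81      .09         9.81        .003
Lying (Dissimulation)           .10      .74         4.55        .019

[Only the labels “Body Symptoms” and “Suspicion, Mistrust” are partly legible; the remaining row
labels are inferred from the scale order used in Table 4.]

Note. Wilks’s Λ = .589, F(2, 154) = 24.18, p < .001. MMPI = Minnesota Multiphasic Personality
Inventory; TSC = Tryon-Stein-Chu.
ᵃ [...] p < .001; % of variance = 71.6.
ᵇ [...] p < .001; % of variance = 29.4.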
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Table 4 

Intercorrelation Between MMPI-Derived TSC
Scales and Scores on the Eysenck Personality
Inventory

Scale                          Extraversion   Neuroticism      Lie
Social Introversion              −.58**          .49**        −.10
Body Symptoms                    −.27*           .50(?)    [illegible]
Suspicion, Mistrust               .03            .44**        −.21*
Depression, Apathy               −.39**          .61**        −.26*
Resentment, Aggression           −.14            .67**        −.37**
Autistic Thinking                −.17            .61**        −.24*
Tension, Worry, and Fears        −.28*           .70**(?)     −.28*
Lying (Dissimulation)             .04           −.22*          .47**

Note. N = 157. MMPI = Minnesota Multiphasic
Personality Inventory. TSC = Tryon-Stein-Chu.
* p < .05 (two-tailed).
** p < .01 (two-tailed).


is of interest to note the intercorrelations 
shown in Table 4. Evidently, the second-order 
Neuroticism factor is strongly correlated with 
the factorially pure MMPI validity scales. 
Had these been corrected for attenuation, the 
multiple correlation values would have been 
substantially higher. This moderately high 
relationship, in terms of shared variance, be- 
tween the two inventories partially explains 
why the comparatively short EPI suffers little 
or no loss in overall predictive power (68.15% 
vs. 66.24%). The results reaffirm that a good 
deal of “collinearity” exists between Eysenck’s 
personality traits and psychiatric disorders, 
as measured by symptom-sign questionnaires 
such as the MMPI, sufficiently so, it would
appear, to suggest that the emphasis on dimensional
over categorical criteria is not misplaced.
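
Two of the quantities mentioned in this paragraph, shared variance and correction for attenuation, follow from standard psychometric formulas and can be sketched briefly. The correlation of .61 is the Depression-Apathy with Neuroticism value printed in Table 4; the reliabilities of .80 are hypothetical placeholders, since the text reports no reliability coefficients, and the function names are illustrative.

```python
import math


def shared_variance(r: float) -> float:
    """Proportion of variance two measures share (the squared correlation)."""
    return r * r


def correct_for_attenuation(r_xy: float, rel_x: float, rel_y: float) -> float:
    """Classical correction: observed r divided by sqrt of the two reliabilities."""
    if not (0 < rel_x <= 1 and 0 < rel_y <= 1):
        raise ValueError("reliabilities must lie in the interval (0, 1]")
    return r_xy / math.sqrt(rel_x * rel_y)


r_observed = 0.61          # Depression, Apathy with Neuroticism (Table 4)
assumed_reliability = 0.80  # hypothetical; not reported in the text

print(round(shared_variance(r_observed), 3))
print(round(correct_for_attenuation(r_observed, assumed_reliability,
                                    assumed_reliability), 3))
```

Under these assumed reliabilities the corrected coefficient rises from .61 to roughly .76, which illustrates why the uncorrected values understate the overlap between the two inventories.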


Analysis 2: Continuum of Severity 


Twenty-five variables. The measure of intergroup
dispersion, Wilks’s lambda, as shown
in Table 1, was significant (p < .001). Two
roots emerged to account for 75.5% and 
24.5% of the between-group variance, respec- 
tively. The first function had moderately high 
weightings on Moral Self; Self-Criticism; 
Social Introversion; Depression; Tension, 
Worry, and Fears; and the MMPI Lie scale. 
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Figure 2. Dimensional profile of 28 low, 46 median, 
and 83 high-level Neuroticism (N) Patients based 
on 25 predictor variables. (Latent roots: M= .448 
(75.5%), p< .001; M= .764 (24.5%), p< 016.) 


This combination is somewhat suggestive of 
the intermediate forms of behavioral distur- 
bance usually attributed to the dysthymic 
personality (i.e., those patients whose prob- 
lems focus on phobic, anxious, depressive, or 
obsessive concerns). 

The second function was positively loaded
by Suspicion-Mistrust, Autistic Thinking,
Resentment-Aggression, the MMPI Lie scale,
and Extraversion, and negatively by Social
Interest. This configuration seems to allude
to the distorted reality contact, social deviance,
and depersonalization commonly found
at the higher endpoint of the normality-psychoticism
spectrum. Such vectors may well
signal the diffuse, yet quantitative, level at
which extreme forms of normal thinking and
feeling processes fade into gross
abnormality. If this is indeed so, it would be
expected that the more highly disturbed patients
would occupy this region of the severity
continuum. On any such continuum would lie
the commonly referred to, though vague,
multiple diagnostic types like pseudoneurotic
schizophrenia, schizophreniform personality,
chronic anxiety depression, or alcoholic
psychosis-paranoid type. These could well be
dropped, as they simply add to the internal
contradictions of categorical psychiatry.

An inspection of Table 5 indicates that for 
the diagnostic model, only 12 of 26 clinical 
predictor scales significantly distinguished be- 
tween the three conceptual typologies. How- 
ever, for the “degree” of severity continuum, 
18 of 25 (excluding Neuroticism itself) pre- 
dictor variables register a significant level of 
discriminant validity. Two TSC scales (Ten- 
sion, Worry, Fears and Resentment-Aggres- 
sion) were the most powerful predictors for 
the dimensional analysis. The dimensional ap- 
proach, even in the present limited context, 
yields more encouraging results, suggesting 
that patients can more easily be aligned on a 
continuous maladjustment axis. Conversely, 
the search for “pure” or sharply demarcated 
subtypes not only leads to the misapplication 
of a science but encourages widespread 
criticism. 

Of considerable interest are the individual 
points in multiperson space seen in Figure 2. 
Of the three ellipsoids, one contains all but 
3 low Neuroticism cases. The second region 
contains, with 4 exceptions, the 46 patients 
divided at their median Neuroticism levels. 
With the exception of 7 moderately disturbed patients
positioned in the second region and 1 stray
subject assigned to the lower end of the continuum,
the third zone is occupied by all of the most
debilitated patients with highly elevated Neuroticism profiles.
There is an impressive degree of separation
along the unbounded severity continuum (p
< .001 by Fisher’s exact test) for all three
levels of general maladjustment.

In terms of classification efficacy, Table 6
shows that the dimensional model is able to
portray the patients’ loci in n-dimensional
space with 90.44% accuracy taking, as a
putative index, each subject’s level of Neuroticism.
Dimensional analysis, it seems,
brings out the “real” covariations with sharper
clarity. Naturally, this illustrative procedure
remains to be carried out with other carefully
chosen factors for even broader discriminability.
If actually more realistic, the dimensional
view of personality would, obviously,
have to be augmented by sensitive tests of
physiological, biochemical, and cognitive functioning.

Table 5

Univariate Analysis of Group Means for Diagnostic Versus Dimensional Criteria
on 25 Objective Tests

                             A priori diagnosis                                 Neuroticism severity continuum
                             Neurotics  Schizophrenics  Alcoholics              Low       Moderate  High
Scale                        (n = 86)   (n = 33)        (n = 38)    F(a)        (n = 28)  (n = 46)  (n = 83)  F(a)
Age                          36.87      38.86           37.78       .17         36.00     42.57     34.96     1.99
WAIS Vocabulary              51.01      51.62           52.11       .34         55.00     59.43     50.62     3.80*
WAIS Block Design            32.93      31.08           35.22       1.48        35.12     31.73     32.39     .99
Extraversion                 9.79       10.00           13.11       3.07*       14.56     10.21     8.15      [illegible]
Neuroticism                  18.39      14.07           17.72       3.96*       --        --        --        --
Lie (EPI)                    2.70       4.31            3.20        3.41*       4.45      3.43      2.33      5.65**
Physical Self                54.58      62.08           55.45       3.93*       64.67     57.07     50.36     6.92**
Moral-Ethical Self           60.12      67.39           59.79       3.16*       59.67     66.86     60.76     2.72
Personal Self                49.02      61.23           48.78       9.25***     57.33     53.62     48.11     7.54**
Family Self                  57.45      59.63           54.69       2.01        60.66     57.93     53.20     5.46**
Social Self                  56.57      60.62           55.06       1.29        60.33     58.36     53.54     4.75**
Self-Criticism               35.92      33.23           37.68       1.83        32.00     35.07     39.75     5.83**
Social Introversion          18.01      12.31           16.16       4.36*       12.03     14.70     19.73     8.32***
Body Symptoms                17.81      15.07           16.29       1.63        11.67     15.93     21.58     8.69***
Suspicion, Mistrust          14.53      19.07           11.39       5.82**      11.18     14.43     18.92     7.18**
Depression, Apathy           20.68      15.76           20.50       4.42*       15.67     17.07     24.20     14.47***
Resentment, Aggression       12.49      10.20           12.53       1.05        8.00      10.37     16.85     20.91***
Autistic Thinking            14.03      19.38           12.72       4.52*       12.44     13.67     20.03     11.98***
Tension, Worry, and Fears    23.77      22.84           17.95       4.19*       15.33     20.57     28.69     22.83***
Lie (MMPI)                   2.95       4.09            3.28        1.08        5.07      2.79      2.46      8.54***
Cooperation                  10.28      8.69            8.72        2.51        10.93     8.72      8.05      2.73
Personal neatness            13.19      11.08           11.56       2.86        12.57     12.64     10.58     1.32
Irritability                 6.99       9.23            8.06        1.76        6.01      8.20      10.08     3.54*
Social interest              32.66      25.69           30.72       4.04*       33.44     30.21     27.41     4.01*
Social competence            43.81      38.53           41.89       2.42        43.29     41.29     39.66     1.06
Sex gender                   59         46              61          1416        49        54        62        [illegible]

Note. Patients were divided into low-, moderate-, and high-Neuroticism clusters
irrespective of diagnostic profiles. WAIS = Wechsler Adult Intelligence Scale;
EPI = Eysenck Personality Inventory; MMPI = Minnesota Multiphasic Personality
Inventory.
(a) df = 2, 154.
* p < .05.
** p < .01.
*** p < .001.

MMPI scales. The seven TSC scales were
entered into a prediction formula that produced
two canonical variates; both latent
roots were highly significant. Dimension 1
accounted for 64.3% of the variance (Λ = .666),
χ²(14) = 50.76, p < .001, whereas the
second function accounted for the residual
35.7% of trace (Λ = .793), χ²(6) = 25.42,
p < .001. The group centroids (or mean discriminant
scores) are illustrated in Figure 3.
As previously shown, taking each patient’s 
degree of neurotic variability results in an 
optimal interdimensional (as opposed to inter- 
group) dispersion to pinpoint the locus of 
85.35% of all psychiatric patients compared 
to the 68.15% correct consensus achieved 
for the diagnostic criterion. Were it not for 
the relatively high proportion of dysthymic 
neurotics positively identified, the percentage 
range would surely have fallen into the middle
50s or 60s, in line with those shown for
the alcoholic and schizophrenic groups.
Across diagnoses, then, the number of correctly
identified cases for the global dimensional
analysis, and for the seven-variable TSC battery,
is clinically superior to that of psychiatric
discriminability. Noticeably, the less impressive
evidence for the reliability of the diagnostic
concepts, as compared with objective
laboratory tests, is based on only a conservative
number of hospital patients and, as such,
needs to be replicated on larger sample sizes,
related indices of psychopathology, and alternate
experimental designs.

Table 6

Dimensional Dispersion of Patients on a Neuroticism Severity Continuum for
Global Versus Tryon-Stein-Chu Diagnostic Scales

                                               Predicted group membership
Measure             Actual Neuroticism group   1            2            3
26 variates (a)     Low                        25 (89.3)     3 (10.7)     0 (0.0)
                    Moderate                    1 (2.2)     42 (91.3)     3 (6.5)
                    High                        1 (1.2)      7 (8.4)     75 (90.4)
7 MMPI scales (b)   Low                        24 (85.7)     3 (10.7)     1 (3.6)
                    Moderate                    3 (6.5)     37 (80.4)     6 (13.0)
                    High                        2 (2.4)      8 (9.6)     73 (88.0)

Note. Numbers in parentheses are percentages. For low Neuroticism, n = 28; for
moderate Neuroticism, n = 46; for high Neuroticism, n = 83. MMPI = Minnesota
Multiphasic Personality Inventory.
(a) χ²(4) = 182.79, p < .001; 90.44% of cases were correctly identified.
(b) χ²(4) = 119.13, p < .001; 85.35% of cases were correctly identified.

Figure 3. Dimensional position of 157 psychiatric
patients on a Neuroticism (N) severity continuum
predicted from seven Tryon-Stein-Chu scales. (Latent
roots: Λ = .666 (64.3%), p < .001; Λ = .793
(35.7%), p < .001.) Legend: low (N = 28), medium
(N = 46), high (N = 83), group centroids, overlap zone.

A measure of the observed relations between
the assigned psychiatric diagnoses and the
computer classification matrix was also determined
(see Table 7). The Kappa coefficient
can be described as an intraclass correlation
that gives the proportion of diagnostic agreement
corrected for chance (Fleiss & Cohen,
1973). For the 26-variable analysis, weighted
K = .500, rising to .843 for the dimensional
approach. Clearly, Kappa for the dimensional
analysis is significantly larger (p < .01) than
Kappa for the diagnostic classification matrices.
Notably, the weighted Kappa values
for the DSM-II diagnostic codes are only
slightly higher than those reported by Fleiss,
Spitzer, Cohen, and Endicott (1972) for
similar psychiatric categories. The magnitude
of these values is generally in the lower range
expected for well-trained psychiatric personnel.
The implication for the clinician is that
it may be easier to depict patients in terms of
their degree of psychopathology along a number
of continuously variable dimensions than
to fit patients into predetermined classes.
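
As a rough check on how such agreement figures are obtained, the sketch below recomputes the proportion of correctly classified cases and a chance-corrected agreement index from the frequencies in Table 6. It uses Cohen's unweighted kappa rather than the weighted coefficient named in the text, because the weighting scheme is not described; the function and variable names are illustrative assumptions.

```python
def accuracy_and_kappa(matrix):
    """Return (proportion correct, Cohen's unweighted kappa) for a square table.

    Rows are actual groups and columns are predicted groups.
    """
    k = len(matrix)
    total = sum(sum(row) for row in matrix)
    row_sums = [sum(row) for row in matrix]
    col_sums = [sum(matrix[i][j] for i in range(k)) for j in range(k)]
    # Observed agreement: cases on the main diagonal.
    p_observed = sum(matrix[i][i] for i in range(k)) / total
    # Agreement expected by chance from the marginal totals.
    p_chance = sum(row_sums[i] * col_sums[i] for i in range(k)) / total ** 2
    kappa = (p_observed - p_chance) / (1 - p_chance)
    return p_observed, kappa


# Frequencies from Table 6 (actual low, moderate, high by predicted 1, 2, 3).
global_battery = [[25, 3, 0], [1, 42, 3], [1, 7, 75]]
tsc_scales = [[24, 3, 1], [3, 37, 6], [2, 8, 73]]

for label, table in (("global battery", global_battery),
                     ("seven TSC scales", tsc_scales)):
    acc, kappa = accuracy_and_kappa(table)
    print(f"{label}: {acc:.1%} correct, kappa = {kappa:.3f}")
```

With these frequencies the unweighted coefficients come to about .84 and .76, in line with the severity-continuum values listed in Table 7.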


Discussion 


Unlike studies that have sought to affirm
that clinical diagnostic profiles are etiologically
distinct, the present study offers some
provisional evidence that aligning psychiatric
patients along a severity continuum,
namely, “degree” of Neuroticism, is somewhat
superior to allocating them into diagnostic
categories. This may possibly apply to Eysenck’s
Psychoticism factor as well. Such a
distinction, doubtless, rests on the extent to
which conceptually derived stereotypes coincide
with the clinician’s experience of the
observed characteristics of actual patients.
In many instances, T data (objective tests)
and L data (life behavior, rated) are used
to verify the diagnostic nomenclature, yet
they report only nominal agreement. Considering
the minor tolerances of empirical
methods due to peculiarities of observation,
methods of data analysis, and varying sample
characteristics, there seems a limit to what
even a perfect diagnostic test could do when
compared against the psychiatric criterion
used. The fairly poor agreement, as reflected
in the weighted Kappa values shown
in Table 7, raises the issue of whether the
broad typologies have any clinical utility
beyond that evident for a dimensional model
of personality functioning.

In terms of clinical utility, the general
equivalence of the EPI to a factorially pure
MMPI is rather encouraging, even though
the former was not constructed specifically
with a view to replicating psychiatric diagnoses.
This congruence is reassuring, since
the instruments denote separate approaches
to personality assessment that converge at
[illegible] level. Owing to the larger
number of critical distinctions shown in
Table 5, it appears that the decision rule to
[illegible] a patient’s degree of emotional instability
on a continuum of regression from
normality “teases” out some of the real differences
much more sharply, and with less
[illegible], than a categorical reference system.
The two discriminant functions, in combined
form, capture 90.44% of the interdimensional
continuity of this major personality factor.

Table 7

Overall Agreement (Mean Weighted Kappa) Between Diagnostic Versus Dimensional
Profiles Generated by Discriminant Function Analysis for Three Psychiatric
Subgroups

                        % correctly   M reliability
Scalar combination      assigned      (weighted Kappa)
Diagnostic profile
  26 variates           70.70         .500
  7 MMPI scales         68.15         .455
  2 EPI scales          66.24         .426
Severity continuum
  25 variates           90.44         .843
  7 MMPI scales         85.35         .759

Note. The Neuroticism factor itself served as the
putative criterion, leaving only 25 predictors for the
global dimensional analysis. MMPI = Minnesota
Multiphasic Personality Inventory.

Abolition of diagnostic categories, it is
often argued, involves loss of clinical-descriptive,
social, and etiological data, including
the predictions about outcome and treatment
response implied by the original diagnosis.
Yet critics of the dimensional model
ignore the fact that modal profiles explain
lawful covariations between normal and psychologically
ill persons in a more meaningful
way. All psychiatric diagnoses have their
own “degree of certainty,” which varies according
to whether the etiology is fragmentary
or well defined. As few psychiatric
phenotypes are immutable, mutually exclusive,
or nonreflexive, the art of diagnosis remains
a fallible decision process founded on
the assumption of homogeneity of psychiatric
disease types, a still largely unproven
axiom. Evidence now emerging from the
“spectral analyses” of alcoholic, depressive,
and antisocial patient groups strongly suggests,
in many cases, that such phenotypes
are plainly heterogeneous, with fairly unknown
etiology and neurophysiologic basis.

Notably, assigning any person into a predetermined
class, apart from the harmful
effect of an erroneous diagnosis (Morrison
& Flanagan, 1978), presupposes much loss
of idiographic data, for few labels reflect
all the content. Whatever dimensional model
is eventually chosen would need to compensate
for this loss by providing specific details
on the interdimensional axes relevant to the
patient’s varied symptomology. The patient’s
prodromal signs would serve to guide clinicians
in resolving the “number of continua
needed” problem.

Some recent research has already focused
on the extent to which variable hierarchies
of personal illness do exist (Foulds & Bedford,
1975) and the degree to which symptom
patterns can be fitted within a broader
dimensional personality system (Eysenck &
Eysenck, 1976). Further study of personality
functioning within a combined hierarchical-dimensional
context may well take us
on the path to a scientific realism. This, perhaps,
is one of the foremost tasks awaiting
present-day behavioral scientists. Paradoxically,
early psychiatric theory has had to
undergo a good deal of revisional thinking,
as it found itself faced with a large number
of overlapping intermediate conditions that
did not fit the classical concept, particularly
the syndromes. In their everyday experience,
clinicians hear far too rich and varied material
to fit into discrete or bipolar categories.
The lack of strict diagnostic criteria by
which to codify the abnormal aspects of
human behavior suggests that the pretension
to a scientific psychiatry is weakly defensible.
Under more careful scrutiny, it is usually
only the pure cases (like the Cotard or Capgras
syndromes) to which the patient has
been assigned, Procrusteslike, that are
uniquely categorical. And any such unicorns
are rare indeed!

In the personality domain, certain higher
order factors such as Neuroticism (otherwise
known as anxiety or, as some prefer, general
emotionality) occur in the general population
continuum with predictable frequency. The
predisposition to being affected will vary
according to the degree of stress that each
individual can tolerate. A. E. Maxwell
(1971) showed that neurotics, affective psychotics,
and schizophrenics all exhibited a
basic core of symptoms of the type generally
referred to as neurotic. Likewise, Vaz Serra
and Pollitt (1975) found that the level of
Neuroticism for 100 depressives increased
proportionately with their level of depression
(as measured by the Beck Depression Inventory),
whereas the reverse was true for
Extraversion. This is in perfect accord with
the present findings.

More likely than not, the plotting of di- 
mensional profiles can be effectively applied 
to the treatment process by providing clini- 
cal data on symptom elevation, so as to 
schedule patients into optimal treatment re- 
gimes. Patients could simply be referred to 
a specific treatment modality depending on 
the relative position that they inhabited 
within a multidimensional grid. Apart from 
simply projecting a subject’s scores on per- 
sonality traits alone into a hypergeometrical 
space, more precise biological variables such 
as level of sedation threshold, blood gases, 
skin conductance, electroencephalogram and 
electrocardiogram anomalies, blood pressure, 
or extrapyramidal arousal could be included 
to pinpoint a person’s unique dimensional 
profile. In this way, the less debilitated per- 
son would readily be distinguished from the 
more severely disturbed person. A person’s
dimensional position, in the hyperspace defined
by the major coordinates selected,
would thus govern critical treatment decisions
about drug dosage levels, degree of
psychopathology, nature of neurological impairment,
or ability to benefit from the
various forms of therapy. The patient’s degree
of reality contact, self-assertion, or capacity
for love, as assessed by objective
laboratory tests and systematic observation
techniques, would help determine the client’s
response to treatment. As this growing body
of knowledge evolves from hypothesis to 
theory to empiricism to culminate as scien- 
tific law, the advantages of an applied di- 
mensional approach can be more formally 
addressed. 

Extending the argument, it has been shown
that schizophrenics with high levels of autonomic
arousal (high sedation threshold) exhibit
more paranoid features, whereas those
with low levels (low sedation threshold)
tend to be retarded, affectively flattened, and
socially withdrawn (Claridge, 1972). More
importantly, Claridge emphasized evidence
which suggests that certain central nervous
system parameters in those predisposed to
schizophrenia not only covary differently
in their organizational structure (rather than
appearing as simple deviations) but also show
continuity between normal and pathological thinking.
This applies to normal, neurotic, and
psychotic patients alike. Much like personality
traits, “many biological phenomena
are continuous variables” (Lader, 1975, p.
75). Others, such as Weckowicz, Yonge,
Cropley, and Muir (1971), have demonstrated
the importance of level of arousal
and symptom severity for choice of treatment,
whereas level of neuroticism or psychoticism
is a more valid predictor of treatment
and outcome of depressive illness than
is diagnosis (Kendall, 1969).


Patients who exhibit irrational mental
processes or cyclic mood fluctuations (e.g.,
elation-depression) could readily be delineated
on a sympathetic arousal continuum,
together with any other relevant features, to
distinguish both hypomanic and dysphoric
states from normal excitatory states. Aside
from simple linear deviation along an
“arousal continuum,” the hierarchical organization
of central nervous system processes,
as a causal determinant of the atypical
personality, can be studied within the
context of a dimensional framework. Psychiatric
abnormalities can be arranged in a
hierarchy (like a Thurstonian taxonomy) to
enable a more detailed study of how changes
in clinical symptomology covary with higher
order personality traits. For the most part,
many forms of bizarre behavior and psychotic
thinking processes appear to exist as
continuities of regression from normality,
not discontinuities. As such, the deviation
from normality is quantifiable, rather than
being defined by qualitative change points.
[illegible] the dysfunctional personality features
can be dimensionally rather than categorically
[illegible] with no loss of vital clinical
[illegible]. If indeed more realistic, one may
[illegible] why the dimensional view of abnormal
[illegible] functioning is not more
[illegible] adopted as a suitable replacement for
the psychiatric reference system.

Once modal profiles, based on dimensional
analysis of normal and psychiatrically ill
populations, are standardized, with wide generalizability,
the clinician will be able to
speedily calculate the probability associated 
with each level (or stage) of symptom se- 
verity, either by means of an accessible 
computer terminal or a handy desk calcula- 
tor. As data are accumulated for the under- 
lying continuum of clinical pathology, domi- 
nant symptoms, mood elevation or depres- 
sion, and fluctuating delusional states would, 
of necessity, each have their own set(s) of 
frequency distributions and tests of signifi- 
cance for the critical parameters under study. 

Above all, a purely disease-entity model 
does not reflect the “best fit” of the con- 
ceptual nosology to the actual empirical fea- 
tures of the general population continuum. 
In its place, a dimensional system
of personality functioning, sufficiently mod- 
ernized, would permit the helping profes- 
sional to portray psychiatric abnormality in 
a scientific context, and thereby facilitate 
evaluation and treatment. To this end, daily 
clinical judgments that affect treatment 
strategies and therapeutic outcome would 
thus be made with an impressive degree of 
certainty. Given a broader scope, the de- 
velopment of applied dimensional models of 
behavior and personality remains an intri- 
guing research topic. 


Reference Note 


1. Lang, R. J., & Frost, B. P. Some personality 
features of anxiety and depressive neurotics in 
a day hospital. Unpublished material, University 
of Calgary, Alberta, Canada, 1977.


Cued Recall and Discrimination of Memory Deficit


J. G. Lyle
University of Sydney, Australia


Some ambiguities in the scoring procedure for the Logical Memory subtest of
the Wechsler Memory Scale were eliminated by adopting a cued recall technique
for eliciting responses. A cued recall technique discriminated significantly
better than the Logical Memory subtest between memory-impaired and control
groups. This appears to have been due to the fact that on Logical Memory,
controls showed worse performance on the final sections of the memory
passages, whereas this did not occur in cued recall.


The authors would like to thank F. N. Simpson,
[illegible] Psychologist, Department of Veterans Affairs
(Sydney), for permission to use tests and
normative data prior to publication.

Requests for reprints should be sent to Peter
[illegible] Centre, [illegible],
New South Wales 2010, [illegible].


The Wechsler Memory Scale (WMS) has
an important place in clinical assessment;
despite the restricted size of the normative
population and the small variances
of some of the subtests, it is one of the
only memory tests that is generally available.
One of the more important of the WMS
subtests is Logical Memory. A number of
factor analytic studies have shown that this
subtest, together with Associate Learning
and Visual Reproduction, load on a factor
involving the acquisition and immediate recall
of complex stimuli, and that of the
seven WMS subtests, these three subtests
appear to be the most useful in the detection
of cerebral dysfunction (Bachrach & Mintz,
1974; Kear-Colwell, 1973).

In the case of the Logical Memory subtest,
however, there are several weaknesses
and ambiguities in the administrative and
scoring procedures that affect its reliability,
validity, and hence its clinical utility. (a)
The division of the story into memory units
is essentially an arbitrary procedure. (b) The
scoring criterion is poorly defined; it is left
to the clinician to decide whether synonyms,
paraphrasing statements, or approximations
should be credited. (c) Many studies suggest
that in recalling meaningful prose, subjects
will reconstruct a story so as to keep
the general theme intact rather than attempt
to reproduce it verbatim. Summing the number
of words or memory units reproduced
verbatim from the original passage may not
appropriately assess this style of recall.

This study began as a practical attempt to 
devise and test a modified version of the 
Logical Memory subtest that would avoid 
some of the shortcomings outlined above. A 
cued recall procedure was devised whereby 
the subject would be presented orally with 
a standard list of questions after the reading 
of each passage. Subjects no longer needed 
to be uncertain whether they were meant to 
recall the gist of the passage or attempt to 
recall it verbatim. In addition, by providing 
a detailed statement of the scoring criteria 
for each of the standard questions, the ob- 
jectivity of scoring such a test should be 
greatly enhanced. 

Subjects (all males) were selected from a 
local Veterans hospital. Excluded from the 
study were those who were disorientated or 
who suffered any motor, linguistic, or psy- 
chiatric symptoms likely to interfere with 
performance on the test battery. Subjects 
were allocated to the memory deficit and 
control groups on the basis of their perform- 
ance on two standardized memory tests, one 
involving recall of a number of pictures 
(Simpson Memory Pictures) and the other 
involving recall of a sequence of meaningless
shapes (Simpson Shapes Test). This empirical
method of subject allocation was
adopted in recognition of the fact that docu- 
mented evidence of a cerebral lesion is, of 
itself, no guarantee of memory loss. All sub- 
jects were also administered the Simpson 
Adult Vocabulary Scale as a control mea- 
sure of intelligence, since it is known that 
vocabulary tests are relatively resistant to 
aging and dementia. 

Finally, two groups of subjects were se- 
lected. The age range of the subjects was 
50-64 years; mean age of the deficit group 
was 56.73 years (SD = 3.84) and 57.40 
years for the control group (SD = 4.03). 
The mean IQ of the deficit group was 97.67 
(SD = 11.12), and for the control group it 
was 99.37 (SD = 10.75). 

Both passages of the Logical Memory subtest
of the WMS and a modified version of
this test involving cued recall by means of
standard questions were administered. The two
prose passages that comprised cued recall
differed in content from those of Logical
Memory so that there would be minimal
transfer of learning from one to the other.
All five tests were administered in a counterbalanced
order to avoid order effects; cued
recall and Logical Memory were never administered
consecutively. Raw scores of Logical
Memory and cued recall were converted
to T scores for direct comparison of means
and standard deviations. These scores were
subjected to analysis of variance.
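
The T-score conversion described above can be sketched as follows. This is a minimal illustration assuming the conventional definition (mean 50, standard deviation 10) and assuming the pooled sample serves as the reference distribution; the raw scores are invented for the example and are not data from the study.

```python
from statistics import mean, stdev


def to_t_scores(raw_scores):
    """Convert raw scores to T scores (mean 50, SD 10) within the sample given."""
    m = mean(raw_scores)
    s = stdev(raw_scores)  # sample standard deviation (n - 1 in the denominator)
    if s == 0:
        raise ValueError("scores have zero variance")
    return [50 + 10 * (x - m) / s for x in raw_scores]


# Hypothetical raw scores for one test, pooled across both groups.
raw = [8, 10, 12, 14, 16]
t_scores = to_t_scores(raw)
print([round(t, 1) for t in t_scores])
```

Placing both tests on this common scale is what allows their means and standard deviations to be compared directly despite different raw-score ranges.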

Mean scores for Logical Memory were
45.24 for the deficit group (SD = 8.05) and
55.15 for the controls (SD = 8.71). Mean
scores for cued recall were 42.42 for the deficit
group (SD = 6.10) and 57.61 for the
controls (SD = 6.22). A split-plot analysis
of variance, in which the main effects were
groups and tests, was carried out on the standardized
scores. The focus of interest in this
analysis was the Groups × Tests interaction,
which was significant, F(1, 58) = 5.48, p <
.05. This indicates that cued recall differentiated
between the groups significantly better
than did Logical Memory.

Further examination of the discriminating
value of Logical Memory and cued recall
was carried out by comparing group recall
scores for the initial, middle, and final thirds
of each separately. For Logical Memory, the
two passages were each divided into thirds
according to the total number of memory
units, and the number of units recalled from
each third was then summed across both
passages. For cued recall, three questions
related to the content of each third of each
passage; part scores were the sum of questions
correctly answered from each third of
both passages.

Split-plot analyses of variance, in which
the main effects were groups and parts, were
carried out on Logical Memory and cued
recall separately. Logical Memory significantly
discriminated between the groups,
F(1, 58) = 18.46, p < .001; the parts of the
test differed significantly, F(2, 116) = 13.47,
p < .001; and there was a significant Groups
× Parts interaction, F(2, 116) = 6.08, p <
.01. The two groups were discriminated by
Logical Memory significantly better in the
first and middle thirds of the passages than
they were in the final third. Cued recall discriminated
between the groups highly significantly,
F(1, 58) = 132.00, p < .001;
there was a significant difference between
parts of the test, F(2, 116) = 18.91, p <
.001 (with the first third being better recalled
than the final third); but the Groups
× Parts interaction was nonsignificant. It
was apparent from the analysis that the
superior discrimination of cued recall relative
to Logical Memory was due to the fact
that recall of the controls reduced sharply
in the final third of Logical Memory, whereas
in cued recall the separation of the groups
was maintained throughout the test.

In summary, it was found that a modified 
version of the Logical Memory subtest of 
the WMS, in which a cued recall technique 
was used, discriminated between a memory- 
impaired group and a control group sig- 
nificantly better than did the standard Logi- 
cal Memory subtest. This was due to the re- 
duced performance of the controls on the 
final third of the prose passages of Logical 
Memory, which may have been the result 
of proactive inhibition, whereby the recalling
of the earlier sections interfered with
recall of the final sections.

It was noted qualitatively that three types
of errors seemed to characterize the memory-impaired
group. (a) Generalization: This occurred
when patients responded in the right
category but were inaccurate; for example,
Smith in lieu of Jackson; 10 o’clock in lieu
of 2:30. The controls made 12 such errors,
whereas the deficit group made 40 such errors.
This finding is contrary to that of Talland
(1965), who found that this type of
error did not discriminate between impaired
and normal groups. (b) Fabrication: This
occurred when patients introduced material
that was not in the original passage. This
type of response occurred 8 times in the
deficit group and once in the control group.
(c) Contamination: Here, material from
Passage A was given in response to questions
pertaining to Passage B. Patients from the
deficit group made 3 such errors; control
group patients made no such errors.


References 


Bachrach, H., & Mintz, J. The Wechsler Memory 
Scale as a tool for detection of mild cerebral 
dysfunction. Journal of Clinical Psychology, 1974, 
30, 58-60.

Kear-Colwell, J. J. The structure of the Wechsler 
Memory Scale and its relationship to “brain 
damage.” British Journal of Social and Clinical 
Psychology, 1973, 12, 384-392. 

Talland, G. A. Deranged memory: A psychonomic 
study of the amnesic syndrome. New York:
Academic Press, 1965.


Received May 18, 1977




Effects of Differences in Suggestibility 
Within Self- and External-Control Conditions 


T. Souheaver and W. John Schuldt 


This research was designed to study possible effects of differences in waking
suggestibility on performance within self- and external-control conditions. Ten
high-suggestible subjects and 10 low-suggestible subjects, as measured by body
sway, were assigned to each of three experimental conditions: self-control,
external control, and no reward. Response rates of self and external groups were
higher than those of the no-reward group. However, response rates of
high-suggestible subjects in the self-control condition were not significantly
different from those of similar subjects in the no-reward group. Moreover,
performance of high- and low-suggestible subjects was not significantly different
in the external-control condition. Low-suggestible subjects were significantly
more responsive than high-suggestible subjects in the self-control condition.


Weiss, Ullman, & Krasner, 1960). Thus, 


waking suggestibility may not be related to
acquisition of responses, but it may serve as
a moderator variable for modification and/or
maintenance of existing behavioral patterns.

This view seems congruent with Thoresen 
and Mahoney's (1974) view that suggesti- 


Moreover, it seems consistent with 
Tolor's (1971) evidence that high-suggest- 
ible subjects are externally oriented. 

Although there is evidence that self-reward and
external reward occurring in [...] conditioning
procedures have differential effects on performance
(Bandura & Perloff, 1967; Liebert, Spiegler, [...]),
there have been no systematic attempts to
examine possible effects of differences in
waking suggestibility on performance
within self- and external-control conditions.
Thus, this research was designed [...]
rewards within groups
who differ in levels of waking suggestibility.
Specifically, the following hypotheses were
tested: (a) There will be more
responding in self- and external-control
conditions than in no-reward conditions, (b)
high-suggestible subjects will respond more
to external-control conditions than will
low-suggestible subjects, and (c) low-suggestible
subjects will respond more to self-control
conditions than will high-suggestible subjects.


Method

Subjects


Sixty volunteer undergraduate students, single
and between the ages of 18 and 25, participated as
subjects. Restrictions on age and marital status were
imposed, since suggestibility has been found to vary
as a function of these variables (Morgan & Hilgard,
1973).


Apparatus


The index of suggestibility was postural body
sway. An apparatus, similar to that described by
Webb (1962a), was designed to convert horizontal
body movement into movements of a marker along
a vertical scale; that is, a nylon string attached to
the subject’s collar activated an ink marker
positioned behind the subject.

The apparatus for the self-reward, external-reward,
and no-reward conditions was similar to that
described by Liebert et al. (1970). The subject was
seated in front of a panel that contained two parallel
columns of lights: 20 red lights on the left and
20 green lights on the right. Adjacent to each fifth
row of lights, the numbers 5, 10, 15, and 20 were
labeled in ascending order. The subject activated
the red lights (which served to indicate the criterion
for delivery of reward) by turning a criterion
selector knob at the center of the panel to one of
the positions corresponding to the selected criterion.
The green lights were successively lit by
turning a hand crank positioned in front of the
subject. Two complete clockwise turns activated
a green light. When the subject turned the crank
enough times to reach the preset criterion, lights on
both columns extinguished to signify the end of a
[...]. A storage container for rewards (tokens) was
located adjacent to the nondominant hand of the
subject.


The subject was initially requested to stand,
facing a blank wall, on a floor marker 4 feet (1.2 m)
from the suggestibility apparatus. After the [...]
played on a standard cassette player:


I want you to listen carefully to what this [...]
says while you go on just standing there,
still and relaxed, with your eyes closed. Now, just
keep standing there please, [...]
and relaxed, and listen to me. Now I want you
[...] you are falling forward,
[...] are falling, falling forward, falling forward
[...] the time. . . . (adapted from Eysenck, 1947)


Subjects whose body sway exceeded 2 inches
(5.1 cm) were considered high suggestible; those
who swayed less than 2 inches were considered low
suggestible.


Procedures for the self-control and external-con- 
trol conditions were similar to those of Bandura 
and Perloff (1967). Subjects in the self-control 
condition were allowed to select their own reward 
criterion and to self-administer tokens. These sub- 
jects were instructed to choose their performance 
criterion on the reward apparatus, and whenever 
the criterion was attained by turning the crank, 
they were to give themselves a token with the 
nondominant hand by placing it in the token
container. They were also told that these tokens could
be exchanged for money, at the rate of one penny
per token, at the end of the experiment. Subjects
in the self-control condition were told that they 
could change the performance criterion only once, 
either higher or lower. 

Subjects in the external-control condition did
not select a reward criterion. Rather, their
criterion was individually yoked with self-control
subjects. This allowed for a control of possible
reinforcement scheduling effects. External-control
subjects were told that they would receive tokens,
exchangeable for pennies, each time they turned
the crank a sufficient number of times to turn
off the red light. Each time the criterion [...]
experimenter placed a token in the
container.


Subjects in the no-reward condition [...]
individually yoked to subjects in the [...]
condition. They were told to turn the crank with
the dominant hand. Performance was not rewarded
with tokens.


Results 


Body sway means for high-suggestible
subjects [3.58 inches (8.9 cm)] and low-suggestible
subjects [1.01 inches] were significantly
different. Mean ages of the groups (high
suggestible = 19.10; low suggestible = 19.70)
did not differ significantly (t < 1). Moreover,
mean grade point averages of 2.84 and
2.82 for high- and low-suggestible subjects,
respectively, did not differ significantly
(t < 1).

The dependent measure for the three ex- 
perimental conditions was number of crank 
turns performed by each subject within 4 
minutes (Liebert et al., 1970). Means for 
each treatment, at both levels of suggestibil- 
ity, are presented in Table 1. 


Dunnett critical differences tests (Keppel,
1973) were used to compare responses of
no-reward subjects to subjects within each
reward condition at each level of suggestibility.
Mean performance of low-suggestible
[...] self- [...] (p < .05) than [...]
responded significantly more than did
high-suggestible [...] condition (p < .05).
However, high-suggestible [...]
treatment, t(18) = 2.11, p < .05. No
significant difference was noted between high-
and low-suggestible subjects within the
no-reward treatment, t(18) = 1.42, p > .05.


Table 1

Mean Number of Turns

                          Treatment group
Suggestibility
level              Self      External    No reward

High               218.20    250.70      172.00
Low                276.40    268.40      126.00


Discussion 


This study demonstrated that increased
responsivity resulted with either self-control
or external-control procedures when levels of
suggestibility were disregarded. However,
there was no evidence of conditioning for the
high-suggestible subjects in the self-control
condition. Thus, self-control procedures may
not be as effective as external-control
procedures when “personality” variables
are considered.

It was hypothesized that high-suggestible
subjects would respond more to external
control than would low-suggestible subjects. This
prediction was based on the conceptualization
that high-suggestible persons are more
susceptible to influence from others and,
therefore, are more amenable to external
imposition of standards and rewards to increase
performance levels. However, the results of
this study do not support this prediction;
that is, the high- and low-suggestible subjects
responded at essentially the same rate
in the external-control condition. This finding
is not congruent with the findings of
Webb (1962b) and Weiss et al. (1960), who
reported that high-suggestible subjects did
respond more than low-suggestible subjects
under response-contingent external reward
conditions. These studies, however, used
verbal praise as rewards in verbal conditioning
procedures. The present study, in contrast,
used tokens as rewards in an effortful motor
task. Thus, differences in procedure may
account for the apparently discrepant results.

Another hypothesis was that low-suggestible
subjects would respond more to self-control
than would high-suggestible subjects.
This prediction was based on the conceptualization
that low-suggestible persons are less
susceptible to influence from others and are,
therefore, less likely to require external
imposition of standards and rewards to increase
performance rates. The results were consistent
with this prediction.

Based on the results of this research, one
can speculate that low-suggestible persons
respond equally well to either self- or
external demands and rewards, whereas
high-suggestible persons are so dependent on
external factors that their performance is
minimal when self-determination of reward
contingencies is required. Moreover,
self-delivered rewards may not have reinforcement
value for high-suggestible persons when they
are allowed to determine their own standards.

Our findings, if supported by future
research, would seem to have clinical
significance. For example, it appears that either
external-control or self-control techniques would
be effective with low-suggestible clients.
Additionally, it would seem that
high-suggestible clients would not initially benefit
from psychotherapy procedures in which
self-control is advocated or imposed. However,
caution is urged in applying these preliminary
findings to clinical practice, since there
is a lack of research relating suggestibility 
arse and because this study may 

nsiderable analogue error. 


Self-Directed Treatment for Premature Ejaculation


Robert A. Zeiss 
University of Oregon 


One promising variation of the now standard Masters and Johnson approach to 
treating premature ejaculation lies in the use of self-administered treatment 
manuals. In a test of one such manual, couples with premature ejaculation 
problems were assigned randomly to (a) totally self-administered treatment; 
(b) self-administered treatment in conjunction with minimal therapist (tele- 
phone) contact; or (c) standard therapist-administered treatment. Couples were 
successfully treated by therapists or by themselves when they maintained min- 
imal contact with a therapist. Couples working on their own, with no therapist 
contact, failed to complete treatment successfully. Follow-up data indicated 
that although there was deterioration in therapeutic gain following the termina- 
tion of treatment, improvement over pretreatment responses was maintained on 
all relevant measures. An analysis of posttreatment data indicated that greatest 
improvement in ejaculatory control occurred when couples continued to use the 
squeeze or pause to delay ejaculation, but significant improvement in latency 
to ejaculation also occurred when couples used neither technique to lengthen
intercourse.


During the past few decades, a number of 
approaches to the treatment of premature 
ejaculation have appeared. These have in- 
cluded strategies relying on pharmacologic 
intervention (e.g., Aycock, 1949; Bennett, 
1961; Boneff, 1971; Mellgren, 1967), as well 
as strategies relying on psychotherapy (e.g., 
Cooper, 1969) or combining chemotherapy 
with psychotherapy (e.g., Friedman, 1968;
Schapiro, 1943). In the 1950s, Semans
(1956) and Wolpe (1954, 1958) introduced
learning-oriented approaches to the problem;
these approaches were largely ignored
until the work of Masters and Johnson
(1970) refined the approaches and revived
interest in the behavioral treatment of
premature ejaculation.


This report is based on a doctoral dissertation
submitted to the University of Oregon.

The author gratefully acknowledges the
assistance of Debra Jackson-Spangler, Robert Kurlychek,
Dennis McClure, Terry S. Trepper, and Katie
Whalen, all of whom served as therapists in this
study. The encouragement and contributions of
Antonette Zeiss were invaluable in all stages of the
study.

Requests for reprints should be sent to Robert
A. Zeiss, who is now at Valle Del Sol, 1209 [...]
First Avenue, Phoenix, Arizona 85003.


Copyright 1978 by the American Psychological Association


Since the Masters and Johnson publica- 
tion, most research in this area has focused 
on increasing the efficiency of their highly 
effective approach (e.g., Clarke & Parry, 
1973; Kaplan, 1974; Kaplan, Kohl, Pome- 
roy, Offit, & Hogan, 1974; Zeiss, Christen- 
sen, & Levine, 1978; Zilbergeld, 1975). 
However, few of these studies have presented 
any data to support contentions of thera- 
peutic effectiveness and efficiency. 


Self-Directed Treatments 


A recent and promising direction in the 
treatment of sexual dysfunction has been the 
development of behavioral self-help programs 
written to allow couples to treat sexual prob- 
lems on their own, without the extensive in- 
tervention of a professional therapist. These 
have included programs for the treatment of 
general sexual dysfunction (e.g., Kass &
Strauss, 1975) and female orgasmic dysfunc- 
tion (e.g., Barbach, 1975; Heiman, LoPic- 
colo, & LoPiccolo, 1976; Kline-Graber & 
Graber, 1975), as well as specifically for 
premature ejaculation (Lowe & Mikulas, 
1975; Vandervoort & McIlvenna, 1972; Zeiss
& Zeiss, 1978). The treatment for premature 
ejaculation seems particularly amenable to 
a self-directed format, since it enjoys a high 
probability of success, is straightforward, and 
can be easily standardized. 

If empirically validated, this type of pro- 
gram has the potential to make effective, 
yet inexpensive, treatment available to the 
general public. Unfortunately, data have 
been reported for only two of the sexual 
self-help programs (Kass & Strauss, 1975; 
Lowe & Mikulas, 1975), and only one has 
been submitted to any controlled evalu- 
ation. Lowe and Mikulas (1975) found that 
couples using a written program in conjunc- 
tion with telephone contact (of unspecified 
length) with a therapist reported significant 
improvement in latency to ejaculation, where- 
as untreated control couples reported no 
change in latency. Follow-up data were not 
reported. 

Zeiss (1977) reported on two couples who
successfully used a second self-directed
program (Zeiss & Zeiss, 1978) in conjunction
with minimal phone contact with a therapist
(less than 1 hour total per couple) to
treat premature ejaculation difficulties. An
8-month follow-up revealed that therapeutic
gains were maintained.

Even though these studies suggest that
self-directed treatment for premature
ejaculation may be effective, there were sufficient
problems with each to preclude firm
conclusions. Lowe and Mikulas (1975) relied
only on subjective estimates of ejaculatory
latency and did not report on the maintenance
of treatment gains or on the amount
of phone contact required. Zeiss (1977)
used objective indices of ejaculatory latency,
provided an 8-month follow-up, and
specified the amount of therapist contact
involved, but firm conclusions cannot be drawn
from uncontrolled case reports.

It is also unclear exactly what successful
treatment of premature ejaculation does.
Both the Masters and Johnson (1970)
intervention and the Semans (1956) intervention
involve teaching clients to interrupt the
arousal process, either by applying the
squeeze or by ceasing all stimulation of the
penis. It is not clear, however, if greater
latency to ejaculation results simply from
repeated interruption of the arousal process
with squeezes or cessation of stimulation or
if practice with these exercises enables
clients to tolerate greater and lengthier
stimulation without the need for interruption of
arousal.

To address these issues, three treatment 
conditions were compared. One group was 
treated in a “standard” sex therapy format 
in which both partners of the dysfunctional 
couple regularly saw a therapist in the clinic. 
Two treatment conditions used the Zeiss and 
Zeiss (1978) manual. Treatment was en- 
tirely self-administered for one of these 
groups; the other group had minimal thera- 
pist contact (phone) to increase the prob- 
ability of successful outcome. 


Method 
Clients 


Twenty heterosexual couples with self-defined 
premature ejaculation difficulties began treatment. 
These client couples also met the following criteria: 
(a) mean timed ejaculatory latency was less than 5 
minutes, (b) both partners agreed to participate in 
the treatment program, (c) the female had no 
severe gynecological problems, and (d) the couple 
had experienced the problem for at least 6 months. 
The 20 couples who began treatment were randomly 
assigned to one of the three treatment conditions. 

Of the 20 couples who began treatment, 2 (1 in 
each self-directed treatment condition) completed 
treatment and verbally reported success but failed to 
complete posttreatment assessment. Because there 
were no posttreatment data for these couples, they 
were excluded from all data analyses and further 
consideration. Data are reported on 18 client cou- 
ples, 6 in each treatment condition. Demographic and 
descriptive data for the three conditions are pre- 
sented in Table 1. One-way analyses of variance 
revealed that couples in the three treatment condi- 
tions did not differ significantly on any of these 
variables before treatment. 


Therapists 


Therapists included three male graduate students in 
counseling psychology and two female BA-level 
paraprofessionals, in addition to the author (a male 
graduate student in clinical psychology). Only the 
author had previously treated couples with sexual 
dysfunction; the other therapists were trained and 
supervised by him. 
Table 1
Descriptive Statistical Means for the Three Treatment Groups

                                              Standard      Phone         No
Variable                                      treatment     contact       contact

Ageᵃ
  Males                                       28.5          33.2          30.8
  Females                                     28.8          27.8          29.8
Educationᵃ
  Males                                       15.3          14.3          14.2
  Females                                     14.8          14.5          14.5
Length of marriage or cohabitationᵃ           3.2           5.5           6.8
No. children                                  1.8           2.1           1.2
Gross yearly income                           $8,883.33     $12,900.00    $14,216.67
Duration of premature ejaculation problemᵃ    7.7           14.7          9.7
Mean timed ejaculatory latency
  (pretreatment)ᵇ                             83.0          118.8         54.2
Marital Adjustment Test
  (Locke & Wallace, 1959)
  Males                                       104.2         114.5         110.2
  Females                                     110.7         105.3         100.2

ᵃ In years.
ᵇ In seconds.


Procedure


During pretreatment assessment, all couples com- 
pleted the Locke and Wallace (1959) Marital Ad- 
justment Test and a sexual background inventory. 
After receiving general information about the study 
and deciding to participate, couples were asked to 
report the male’s latency from intromission to 
ejaculation, timed by stopwatch, on two separate 
occasions of self-defined “normal intercourse.” 

At a second meeting with their therapist, couples
paid a $10 treatment fee and a refundable
contingency deposit of $30. Refund of the deposit was
contingent on regular completion of assignments and
completion of posttreatment assessment.

At the conclusion of treatment or after 15-20
weeks from the start of treatment (whichever was
earlier), couples again completed the Marital
Adjustment Test and the sexual background inventory. In
addition, couples timed the male’s ejaculatory
latency (a) twice as they “normally” had intercourse,
reporting the number of squeezes i 


mpletion of posttreat- 
Marital Adjustment Test and 
inventory were mailed to the 
Were asked to complete them. 


Treatments 


Self-administered manual (no contact). Client
couples in this condition were given the treatment
manual (Zeiss & Zeiss, 1978)¹ at their second
meeting with their therapist. This manual describes
a 12-week training program incorporating the
Masters and Johnson (1970) squeeze technique and the
Semans (1956) pause technique. During each week,
about 3 hours of specific sexual and talking
activities are assigned. The talking assignments are
intended to facilitate a couple’s communication about
sexuality and to enhance verbal intimacy. Each
week’s lesson is followed by a trouble-shooting guide
that discusses problems commonly encountered
during that week’s activities. After an introductory
discussion of premature ejaculation and the use of
the treatment program, the manual instructs couples
in the squeeze and the pause procedures, as well as
in sensate focus (Masters & Johnson, 1970), or
“pleasuring,” exercises. Subsequently, the squeeze and
pause are incorporated into the pleasuring exercises
to delay ejaculation and to prolong sexual activity
in a graduated sequence of sexual interactions. The
pleasuring exercises and extravaginal stimulation
systematically approach “normal” intercourse, with
the expectation that a couple will learn to enjoy
ejaculatory control during sexual activity of their
own choosing before completion of the program.
Couples in the no-contact condition were
encouraged to work diligently and regularly at the
treatment program and were told to contact the clinic on
completion of the program or earlier if they ran into
insurmountable problems. (None actually requested
help during treatment.) After 15-20 weeks, all
couples were asked to complete the posttreatment
assessment.


¹This is a revised version of the manual actually
used in the study. The version used by clients in the
study was a prepublication draft. This preliminary
version can be made available, at cost, to those
requesting it from the author.

Self-administered manual with minimal therapist 
contact (phone contact). This condition involved 
brief weekly phone contact with the therapist at 
prearranged times. Total phone contact averaged 71.5
minutes per couple, or about 6 minutes each week.
Phone contact was supportive in nature and served
primarily to check on couples’ progress, to con- 
gratulate their successes, to provide encouragement 
when couples felt discouraged, and to help them 
resolve minor problems when they arose. 

Therapist-administered treatment (standard treat- 
ment). In this condition, client couples were seen 
once weekly, at the University Psychology Clinic, by 
a therapist. The same specific treatment procedures, 
exercises, and schedules were used as are described in 
the manual; treatment was limited to 12-20 therapy
sessions of no more than 1 hour each for no more
than 20 weeks. The manual was not given to these
clients.


Results 
Posttreatment 


The 18 couples who began treatment and
provided posttreatment data were classified as
successful or unsuccessful in treatment
according to the following criteria: (a) Mean
timed latency in normal intercourse must be
greater than 5 minutes or must have improved
by 3 or more minutes from pretest, and (b)
both partners must report improvement in
satisfaction with ejaculatory latency. By these
criteria, 11 of the 18 couples were successful
in treatment. All couples in standard
treatment and 5 of 6 couples in the minimal phone
contact condition successfully overcame their
premature ejaculation problems, whereas none
of the 6 couples using the manual entirely on
their own was successful. Most couples [...]
dropped out early in treatment, typically
[...] that they were too busy with other
[...] to devote enough time to the treatment
[...] that they would return to it when
their [...] lives allowed. Couples who dropped
out [...] were offered treatment in
one of the successful treatment conditions,
[...] accepted the offer. Only 1 couple in
the no-contact condition completed treatment;
this couple was rated as unsuccessful,
despite improvement in latency, because the
female partner did not report improvement
in her satisfaction with the male’s ejaculatory
latency.


Table 2

Mean Timed Ejaculatory Latencies for
Successfully Treated Couples

Treatment          Pretreatment     Posttreatment

Standardᵃ          1 min 23 sec     10 min 48 sec
Phone contactᵇ     1 min 54 sec     10 min

Note. Timed latency data were not collected at
follow-up.
ᵃ n = 6.
ᵇ n = 5.

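
The two success criteria stated above lend themselves to a compact decision rule. The following is a minimal sketch in Python, not the scoring procedure actually used in the study; the class name `CoupleOutcome`, the function `is_successful`, the field names, and the example values are all hypothetical and introduced only for illustration. It assumes latencies are recorded in seconds and that "improved by 3 or more minutes" means a posttreatment minus pretreatment difference of at least 180 seconds.

```python
from dataclasses import dataclass


@dataclass
class CoupleOutcome:
    """Hypothetical record of one couple's assessment data (names are illustrative)."""
    pre_latency_sec: float            # mean timed latency before treatment
    post_latency_sec: float           # mean timed latency after treatment
    male_reports_improvement: bool    # improved satisfaction with latency (male)
    female_reports_improvement: bool  # improved satisfaction with latency (female)


def is_successful(couple: CoupleOutcome) -> bool:
    """Apply criteria (a) and (b) as described in the text."""
    # (a) latency above 5 minutes, or improved by at least 3 minutes over pretest
    latency_ok = (
        couple.post_latency_sec > 5 * 60
        or couple.post_latency_sec - couple.pre_latency_sec >= 3 * 60
    )
    # (b) both partners must report improved satisfaction
    satisfaction_ok = (
        couple.male_reports_improvement and couple.female_reports_improvement
    )
    return latency_ok and satisfaction_ok
```

Under this rule a couple whose latency improves markedly is still rated unsuccessful if either partner reports no improvement in satisfaction, which matches the one no-contact couple described above.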

Additional analyses compared the effective- 
ness of standard in-clinic treatment and self- 
directed treatment with phone contact, using 
two-factor repeated measures analyses of 
variance (unweighted-means solution) com- 
paring pretreatment and posttreatment infor- 
mation. The first of these analyses examined 
mean timed ejaculatory latencies. This analy- 
sis showed a strong main effect from pre- 
treatment to posttreatment, F(1,9) = 21.24, 
p < .005 (means appear in Table 2). There 
were no significant differences between treat- 
ments or interaction effects. A similar analysis 
of variance was computed using couples’ esti- 
mates of ejaculatory latency (averaged across 
the two partners). This analysis also indi- 
cates a significant main pre-post effect, F(1, 
9) = 66.74, p < .001, and no interaction or 
differences between treatments. Mean latency 
estimates are shown in Table 3. 

Analyses of variance revealed no significant 
change on the Marital Adjustment Test 
(Locke & Wallace, 1959) either for males or 
for females. The effects of successful treat- 
ment for premature ejaculation on the gen- 
eral sexual relationship were assessed through 
consideration of the sex quality composite 
scores. This index is composed of questions 
from the Sexual Background Inventory (pre 
or post), chosen on an intuitive, a priori 
basis.² Analyses of variance revealed
significant main pre-post effects on this variable,
both for males, F(1, 9) = 91.71, p < .001,
and for females, F(1, 9) = 41.59, p < .001.
There were no significant differences between
treatments or interactions for either sex.
Mean scores on this variable are shown in
Table 4.


²For males, the index consists of items
concerning frequency of intercourse, usual length of foreplay,
frequency of premature ejaculation, satisfaction
with ejaculatory latency, anxiety over [...]


Table 3
Mean Estimates of Ejaculatory Latency for Successfully Treated Couples

Treatment         Pre              Post             Follow-up

Standard          1 min 23 sec     8 min 22 sec     4 min 52 sec
Phone contact     2 min 8 sec      8 min 18 sec     3 min 55 sec

Note. Because of the small n, follow-up data were not included in analyses of variance.


Maintenance 


Of the 11 couples who were successfully
treated, 8 returned follow-up data at a mean
of 4.4 months (range = 3-9 months) after
posttreatment assessment. Of the 8 couples
who provided follow-up data, 4 were classified
as successfully treated at the time of follow-up
on the basis of the two criteria described
previously. Three of these couples had
received standard in-clinic treatment for
premature ejaculation, and 1 had self-directed
[...] no improvement (over pretest) in their
satisfaction with ejaculatory latency.

Of all couples completing the follow-up as- 
sessment, two couples reported the same la- 
tency as at posttreatment, and six couples 
estimated their latency to be less than at
posttreatment. The difference between the two
mean estimates (8 min 11 sec at posttreatment
and 4 min 31 sec at follow-up) was
significant by a t test for paired observations,
t(7) = 3.81, p < .005, indicating that
ejaculatory latencies did decrease in the months
following treatment. However, despite the
decrease in latencies, follow-up estimates of
latency remained significantly greater than
pretreatment estimates (M = 1 min 43 sec at
pretreatment), t(7) = 3.47, p < .01. A
correlation between the two measures (.58)
indicated that posttreatment latencies were
largely predictive of follow-up latencies.
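
The t test for paired observations used in these comparisons can be illustrated with a short sketch. The code below is an assumption-laden illustration in Python, not the author's computation: the function name `paired_t` is hypothetical, and no data from the study are reproduced. It computes the statistic as the mean of the within-couple differences divided by the standard error of those differences, with n - 1 degrees of freedom.

```python
import math


def paired_t(first, second):
    """Return (t, df) for a paired-observations t test on two equal-length lists.

    Each position holds one couple's score at two assessments; the test is
    run on the within-couple differences (second minus first).
    """
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)
    mean_diff = sum(diffs) / n
    # Unbiased sample variance of the differences
    var_diff = sum((d - mean_diff) ** 2 for d in diffs) / (n - 1)
    if var_diff == 0:
        raise ValueError("differences have zero variance; t is undefined")
    std_err = math.sqrt(var_diff / n)
    return mean_diff / std_err, n - 1
```

With 8 couples, as in the follow-up sample, the function returns 7 degrees of freedom, which corresponds to the t(7) values reported above.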

The sex quality composite scores for these
couples were also examined. The pattern
was the same as with ejaculatory latency:
improvement with treatment, followed by a
moderate deterioration by the time of follow-up,
both for males and for females. Means for
males were 26.36 at pretest, 42.57 at posttest,
and 36.18 at follow-up; means for females
were 28.50 at pretest, 45.50 at posttest, and
[...]5 at follow-up. t tests for paired observations
indicated that sex quality scores dropped
significantly between posttreatment and
follow-up assessments for both sexes: For males,
t(6) = [...]33, p < .05; for females, t(7) =
[...], p < .01. Despite this decrease, the quality
of sex at follow-up remained significantly
greater than the quality of sex before
treatment for both sexes: For males, t(6) = 2.80,
p < .025; for females, t(7) = 3.53, p < .005.
It was also found that the posttreatment sex
quality composite for males was highly
predictive of follow-up ejaculatory latency
estimates, r(5) = .87, p < .01, whereas the
composite for females was not (r = -.10).


Table 4

Mean Sex Quality Composite Scores for
Successfully Treated Couples

Treatment            Pre      Post     Follow-up

Males
  Standard           23.8     42.4     40.8
  Phone contact      25.2     38.5     30.1
Females
  Standard           25.8     43.1     37.8
  Phone contact      28.7     47.7     34.7

Note. Because of the small n, follow-up data were
not included in analyses of variance.


Mechanism of Successful Treatment


To examine the mechanism by which treatment
for premature ejaculation works, couples
at posttreatment were asked to time their
ejaculatory latency during intercourse without
squeezes or pauses; 10 couples provided these
data. Mean latency was 10 min 33 sec as
intercourse normally occurred and 4 min 21 sec
with no squeezes or pauses, t(9) = 2.90, p <
[...]. Mean posttreatment latency without
squeezes or pauses was significantly greater
than pretreatment latency, which was 1
min [...] sec for these 10 couples, t(9) = 3.82,
p < [...]5. Thus, although treatment was most
effective when couples continued to use the
squeeze or pause, there was also a significant
effect of treatment on simple ejaculatory
latency when squeezes and pauses were not
used.

Discussion 


The results of this study indicate that, in a
sample with fairly stable relationships
and histories of premature ejaculation,
couples can successfully treat themselves
using a written instructional guide if they
maintain minimal contact with a therapist. In
this instance, minimal therapist contact was
by telephone, averaged 6 minutes per week,
and generally was not specifically therapeutic
in nature. Rather, phone contacts served
primarily to check on, encourage, and
congratulate treatment progress. Without the
nonspecific therapeutic effects of ongoing contact
with a therapist, couples failed to complete
treatment successfully, generally because they
ceased to follow the treatment plan. These
results were consistent across a variety of
measures, including timed ejaculatory latency,
estimated latency, and the sex quality
composite. There did not seem to be indications of
concomitant improvement in the general
marital relationship.

The brevity and nature of phone contacts 
with minimal therapist contact couples sug- 
gest that the treatment exercises outlined by 
the Zeiss and Zeiss (1978) manual are 
sufficient for successful treatment when fol- 
lowed. However, evidence from the no-con- 
tact condition indicates that an external 
source of encouragement and motivation is 
probably necessary in order for couples to 
follow through with treatment exercises. 
Graduate student and relatively inexperienced 
paraprofessional therapists filled the therapist 
role in this study; the same function could 
probably be fulfilled by others with minimal 
training in the treatment of sexual dysfunc- 
tion. These might include physicians, psy- 
chologists, mental health paraprofessionals, 
or the clergy. 

The success of treatment with minimal 
therapist contact suggests that the cost of a 
standard, in-clinic, therapist-administered
treatment for premature ejaculation may no 
longer be justifiable for those couples with 
well-defined and simple premature ejaculation 
difficulties. Rather, couples could be offered 
the use of the self-directed treatment program 
in conjunction with minimal contact with a 
therapist. In this framework, intensive pro- 
fessional counseling could be reserved for 
those couples who fail to succeed with the 
minimal contact approach or for couples 
whose premature ejaculation difficulties are 
compounded by other clinical problems. 

A recent review of the behavioral self-help
literature (Glasgow & Rosen, 1978) indicates
that dropout problems similar to those found 
in this study have been noted with self-di- 
rected treatments in other areas such as fear 
reduction (Clark, 1973; Marshall, Presse, & 
Andrews, 1976; Phillips, Johnson, & Geyer, 
1972; Rosen, Glasgow, & Barrera, 1976), 
weight reduction (Hanson, Borden, Hall, & 
Hall, 1976; Mahoney, Moura, & Wade, 1973), 
and study skills training (Beneke & Harris, 
1972; Harris & Ream, 1972). Many of these 
studies noted dropout problems with minimal 
therapist contact applications, as well as with 
strictly self-administered treatment applica- 
tions. In the current study, the dropout prob- 
lem was restricted to the self-administered 
condition but persisted even with the existence 
of monetary contingency deposits. 

Previous work on the treatment of pre- 
mature ejaculation has not specified the 
mechanism by which treatment is effective, 
although Levine (1976) has reported that 
unlimited control is often not developed. An 
analysis of posttreatment data indicated that 
greatest improvement in ejaculatory control 
occurred when couples continued to use the 
squeeze or pause to delay ejaculation, but sig- 
nificant improvement in latency to ejaculation 
also occurred when couples used neither tech- 
nique to lengthen intercourse. This finding 
concurs with Levine’s conclusions. 

By the criteria used to define success, half 
of the successfully treated couples for whom 
there were follow-up data could no longer be 
classified as successfully treated at follow-up. 
The reasons for this decrease in latency and 
general quality of sex are not clear, although 
it may be that there is a glow of success that
tends to deteriorate in the months following 
treatment for sexual dysfunction, or couples 
may have simply decreased their use of the 
squeeze or pause after completing treatment.
It should be noted that although estimated 
latencies and the sex quality composite both 
decreased following completion of treatment,
couples remained significantly improved on
these variables at follow-up relative to
pretreatment levels.

Also of interest is the finding that posttreatment
ejaculatory latency was predictive of
ejaculatory latency at follow-up. This suggests
that even though there was a general
decrease in latencies from posttreatment to
follow-up, the decrease occurred across most 
couples, resulting in similar ranking of im- 
provement at follow-up. Even more power- 
fully predictive of ejaculatory latency at fol- 
low-up was the posttreatment sex quality 
composite for men. The composite for women 
was not predictive of follow-up results. This 
indicates that improvement in latency tends 
to maintain when the male partner is satisfied 
with the sexual relationship; if he is less satis- 
fied with the sexual relationship, improve- 
ments in his ejaculatory latency fail to main- 
tain. 

This finding may shed some light on a factor 
that has perplexed sex therapists at this clinic 
for years. Many couples who have been suc- 
cessfully treated for sexual dysfunction have 
improved on their target problem during 
treatment, but after the termination of treat- 
ment, the frequency of sexual activity tends 
to revert to very low pretreatment levels, in 
spite of reports that improvement in the 
target areas has maintained. That is, the man 
retains his newly learned ejaculatory control 
or the woman retains her capacity for orgasm 
(LoPiccolo, Note 1). This contradiction is 
perplexing because of the presumed reinforce- 
ment value of sexual activity in general and 
orgasm in particular. 

Although couples are referred with specific 
dysfunctions, and although we as behavior 
therapists eagerly label and treat their dys- 
functions, it may be that resolution of the 
specific dysfunctions does not adequately 
resolve these couples’ unspecified needs for a 
general improvement of the sexual relation- 
ship. Thus, in addition to resolving specific 
dysfunctions, therapy should probably con- 
cern itself with all aspects of the sexual rela- 
tionship. If the general relationship does not 
improve along with the target behavior,
improvements in that target behavior will not
necessarily maintain.


Reference Note 
1. LoPiccolo, J. Personal communication, August 2,


Therapeutic Effectiveness of Setting and Monitoring Goals


Russell R. Hart 
Copper Mountain Mental Health Center, Salt Lake City, Utah 


This study investigated the therapeutic effects of goal-setting strategies on pa- 
tients whose behaviors were disintegrating but not fragmented or disoriented. 
Two techniques of goal setting were introduced into the therapeutic interaction 
to effect greater beneficial changes in patient attainment of goals: (a) collabora- 
tion of patients, therapists, and “collaterals” (other persons significant to the 
patient, such as spouse or probation officer) on 3-month treatment goals and 
(b) weekly monitoring with structured feedback. Sixteen outpatients were ran- 
domly assigned to a treatment group for which the Behavioral Monitoring 
Progress Record was used. The Behavioral Monitoring Progress Record is a 
method in which the patient and therapist collaborate in setting and monitoring 
weekly therapeutic goals. Sixteen other outpatients were randomly assigned to 
the same individual therapy but without weekly goal setting or monitoring. 
Collateral persons and behavioral criteria were used at intake for collecting 
information in each problem area and were used at follow-up to validate the 
patient’s self-report. A score was used as an index of change. All patients im- 
proved from the time of intake to the time of the follow-up interview, eight 
individual therapy sessions later. There was greater success in the attainment 
of goals for patients using the Behavioral Monitoring Progress Record than 
for patients in the group not using this format. 


Because of the increasing emphasis on
accountability within community mental
health centers, Kiresuk (1973) has proposed
a method for evaluating treatment effectiveness
in psychotherapy using goal attainment
scaling procedures. Goal attainment scaling
is a process in which therapeutic goals are
set for, by, or with the subject. The possible
levels of predicted attainment of these goals
are scaled from the least favorable to the
most favorable outcomes in behavioral terms.

Although the purpose of Kiresuk’s work
was to develop a method of quantifying behavioral
change, the process of setting and
scaling goals appeared to have potential as a
therapeutic procedure.

The major objective in this study was to
evaluate the therapeutic effectiveness of setting
goals in behavioral terms while monitoring
the subject’s progress in attaining these
goals. To accomplish this objective, the answers
to two questions were sought: Did behavior
change occur from intake to follow-up
evaluation? and Was there a difference in
behavior change between the monitored and
control groups?


Method


The subjects were 32 adult patients at a community
mental health center whose mental functioning
was not sufficiently impaired to require
hospitalization. The decision to include a patient
depended on the answers to two essential questions:
(a) Is short-term (3 months) individual
psychotherapy clinically appropriate treatment for
the patient? and (b) Is the patient capable of
participating meaningfully and responsibly in a
therapy program?


Instruments 


The instruments used in this study were the 
Goal Attainment Scale (GAS; Kiresuk & Sherman, 
1968), the Behavioral Monitoring Progress Record, 
and the Follow-up Interview Schedule.


Figure 1. Behavioral Monitoring Progress Record. (Sample form completed for a patient, listing problem areas such as being unhappy with the present employer, feeling dishonest and unable to be oneself, and not standing up for rights, with prompts to keep goals specific, observable, and task oriented.)


The Behavioral Monitoring Progress Record.
The BMPR was designed to increase
the therapeutic movement of the patient by having
patients set goals and report on their progress at
each therapy session (see Figure 1). A predicted
goal was set with the patient, and successive
approximations were also determined.
For each problem area, a weekly goal
and method of attainment was specified. These
methods were carefully defined behavioral “assignments.”
The patient and therapist jointly assessed
the attainment of each goal. For example,
assertiveness might be a problem area. The 4-week
goal might be to be assertive on seven occasions
during the week; the method for the first
weekly goal might be to stand up for one’s rights
once in a restaurant, once with the boss, and once
with one’s spouse. Hence, a most important emphasis
was on setting weekly goals that were observable,
definable, and measurable and methods
that were structured in a step-by-step, reasonable,
and realistic manner and collaborated on and monitored
weekly by both patient and therapist.

The Follow-up Interview Schedule. The GAS 
Follow-up Interview Schedule (adapted from Garwick,
1974a) is a structured interview that allows
for quantitative assessment of patient functioning 
within the problem areas for which goals have been 
set. 
Patients who were selected as subjects in the 
study were asked to complete the “guide to goals” 
(Garwick, 1974b). Then an “intake history” was 
prepared on the basis of two interview sessions. 

At a third session, the patient and an experienced 
clinician “scaler,” trained in goal attainment scal- 
ing, collaboratively prepared a follow-up guide. The 
scaler interviewed “collateral” persons significant 
to the patient’s problem areas. The scaler initially 
interviewed the collateral within each problem area 
that the patient presented and in a separate inter- 
view with the patient collaboratively constructed 


the goal attainment follow-up guide.

The construction of the follow-up guide consisted
of setting treatment goals and predicting five levels 
of goal attainment with an “expected” level of 
attainment by the eighth therapy session. The pa- 
tients were then randomly assigned to the two 
treatment groups and to the psychotherapists for 
individual counseling. 
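The article does not spell out how attainment levels on the follow-up guide are combined into a single score. A minimal sketch, assuming the summary T-score formula commonly attributed to Kiresuk and Sherman (1968), is given below. The level coding (-2 for the least favorable outcome through +2 for the most favorable, with 0 as the expected level), the equal default weights, and the intercorrelation value of .3 are assumptions carried over from that convention, not details confirmed by this study; the function name `gas_t_score` is hypothetical.

```python
import math
from typing import Optional, Sequence


def gas_t_score(levels: Sequence[int],
                weights: Optional[Sequence[float]] = None,
                rho: float = 0.3) -> float:
    """Summary goal attainment T score for one patient.

    levels  -- attained level on each scale, coded -2..+2 (0 = expected)
    weights -- relative importance of each scale (defaults to equal weights)
    rho     -- assumed average intercorrelation among scales
    """
    if not levels:
        raise ValueError("at least one scale is required")
    if weights is None:
        weights = [1.0] * len(levels)
    if len(weights) != len(levels):
        raise ValueError("weights and levels must have the same length")
    weighted_sum = sum(w * x for w, x in zip(weights, levels))
    denominator = math.sqrt((1.0 - rho) * sum(w * w for w in weights)
                            + rho * sum(weights) ** 2)
    return 50.0 + 10.0 * weighted_sum / denominator


# Three equally weighted problem areas, all at the expected level.
print(gas_t_score([0, 0, 0]))
# Three problem areas, all at the least favorable level.
print(round(gas_t_score([-2, -2, -2]), 2))
```

Under these assumptions a patient at the expected level on every scale scores 50, and scores below 50 indicate attainment short of expectation.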

The monitored group received individual ther- 
apy and weekly goals using a structured feedback 
technique (BMPR). The nonmonitored group re- 
ceived individual therapy without setting goals. 

The Follow-up Interview Schedule ratings were 
completed after eight therapy sessions within a 3- 
month period by one of four master’s level psy- 
chiatric nurses. Separate posttest scores were re- 
corded for both patients and collaterals. 

The collateral person was a source of external 
validation of the patient’s self-report. Validation
included identification and definition of the pa- 
tient’s problems at intake (pretest score) and in- 
put as to the level of functioning on the attain- 
ment level of the follow-up guide at the follow-up 
evaluation (collateral posttest score).


Results 


The purpose of this study was to deter- 
mine what changes, if any, occurred from the 
time of intake to the time of follow-up be- 
tween two treatment groups, only one of 
which used a structured feedback technique 
for collaboration and monitoring goals. 

The following hypotheses were used to ex- 
amine the therapeutic effectiveness of goal 
attainment scaling: 

Hypothesis 1. There is no difference in
the mean Goal Attainment scores for the
pretest, the posttest, and the collateral posttest.

Hypothesis 2. There is no difference in
the means on the Goal Attainment scores of
the treatment and control groups.

Hypothesis 3. There is no interaction between
testings and treatment groups on the
Goal Attainment scores.

To test these hypotheses, a two-way analy- 
sis of variance (see Table 1) with repeated 
measures on one factor (Winer, 1962) was 
used. 
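For readers who want to see how the F ratios in such a design are formed, the sketch below recomputes them from the sums of squares and degrees of freedom printed in Table 1. It assumes the usual error terms for a two-factor design with repeated measures on one factor: the between-groups effect is tested against subjects within groups, and the testing and interaction effects against the B x subjects-within-groups term. The helper name `f_ratio` and the constant names are hypothetical.

```python
def f_ratio(ss_effect: float, df_effect: int,
            ss_error: float, df_error: int) -> float:
    """F = (SS_effect / df_effect) / (SS_error / df_error)."""
    return (ss_effect / df_effect) / (ss_error / df_error)


# Sums of squares and degrees of freedom as printed in Table 1.
SS_GROUPS, DF_GROUPS = 2270.20, 1                # monitored vs. control (A)
SS_SUBJ_WITHIN, DF_SUBJ_WITHIN = 2099.84, 30     # subjects within groups
SS_TESTINGS, DF_TESTINGS = 14847.10, 2           # pre, post, collateral (B)
SS_INTERACTION, DF_INTERACTION = 1449.53, 2      # A x B
SS_RESIDUAL, DF_RESIDUAL = 1359.38, 60           # B x subjects within groups

f_groups = f_ratio(SS_GROUPS, DF_GROUPS, SS_SUBJ_WITHIN, DF_SUBJ_WITHIN)
f_testings = f_ratio(SS_TESTINGS, DF_TESTINGS, SS_RESIDUAL, DF_RESIDUAL)
f_interaction = f_ratio(SS_INTERACTION, DF_INTERACTION,
                        SS_RESIDUAL, DF_RESIDUAL)

print(f"A: {f_groups:.2f}  B: {f_testings:.2f}  A x B: {f_interaction:.2f}")
```

Small differences between the mean squares computed here and those printed in the table would reflect rounding in the published values.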


Mean Total Pretest Versus Patient Posttest 
Versus Collateral Posttest Goal 
Attainment Scores 


The mean total patient (49.16) and
collateral (50.15) scores after therapy were
more than twice as great as the mean total
pretherapy score (23.29). As shown in Table
1, a significant treatment effect was found 
for pretherapy versus patient posttherapy 
versus collateral posttherapy Goal Attain- 
ment scores (F = 327.66, p = .01). 


Mean Total Control Group Versus Monitored 
Group Attainment Scores 


The mean total Goal Attainment score for 
the monitored group (45.73) was much 
greater than the mean total Goal Attainment 
score for the control group (36.01). The
monitored patients attained significantly
higher scores than the control patients (F
= 32.43, p = .01).


Interaction


Control Group Versus Monitored Group


Both the patient (56.82) and collateral
(57.72) posttherapy scores of the monitored
group were significantly higher than both
the patient (41.51) and collateral (42.59)
posttherapy scores of the control group (F =
31.99, p = .01).


Table 1

Source of variation                                   SS      df        MS         F

Between subjects                                4,370.05      31
  Monitored vs. control (A)                     2,270.20       1  2,270.20     32.43*
  Subjects within groups                        2,099.84      30     70.00
Within subjects                                17,656.01      64
  Pre, post, and collateral observations (B)   14,847.10       2  7,423.55    327.66*
  A X B                                         1,449.53       2    724.80     31.99*
  B X subjects within groups                    1,359.38      60     22.70

Note. N = 32.
* p < .01.


Table 2
Mean Initial and Follow-Up Goal Attainment Scores

                                         M initial   M score based   M score based      Total
                                         scores      on patient      on ratings by      (between
                                         (all        report at       collaterals at     groups)
Item                                     scales)     3 months        3 months

Initial follow-up guide only
  (control group)                        23.92       41.51           42.59              36.01
Weekly goals review plus initial
  follow-up guide (monitored group)      22.66       56.82           57.72              45.73
Total (within groups)                    23.29       49.16           50.15              40.87

Table 2 presents the mean Goal Attainment
scores at pretherapy, patient posttherapy,
and collateral posttherapy interactions.
There were differences between testings
and treatment groups on the Goal Attainment
scores. Patients in the monitored
group had significantly higher mean scores
in both posttestings on patient reports and
collateral ratings than did patients in the
control group, whereas mean initial scores
for both treatment groups were similar.
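The marginal totals in Table 2 can be checked against the cell means. The sketch below, which assumes equal group sizes (16 patients per group, as stated in the abstract) so that unweighted averages are appropriate, recomputes the row and column means from the six cell values. The names `CELL_MEANS`, `row_means`, and `column_means` are hypothetical.

```python
from typing import Dict, List

# Cell means from Table 2: pretherapy, patient posttherapy, collateral posttherapy.
CELL_MEANS: Dict[str, List[float]] = {
    "control": [23.92, 41.51, 42.59],
    "monitored": [22.66, 56.82, 57.72],
}


def row_means(cells: Dict[str, List[float]]) -> Dict[str, float]:
    """Mean across the three testings for each group."""
    return {group: sum(values) / len(values) for group, values in cells.items()}


def column_means(cells: Dict[str, List[float]]) -> List[float]:
    """Mean across groups for each testing (equal group sizes assumed)."""
    columns = zip(*cells.values())
    return [sum(column) / len(column) for column in columns]


for group, mean in row_means(CELL_MEANS).items():
    print(f"{group}: {mean:.2f}")
print([round(m, 2) for m in column_means(CELL_MEANS)])
```

The recomputed values agree with the printed totals to within rounding, and the gap between groups appears only in the two posttherapy columns, which is the pattern the interaction test describes.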


Discussion 


The significant change in the mean Goal
Attainment scores from pretherapy to posttherapy
suggests that positive therapeutic
changes occurred during treatment. Undesirable
behaviors decreased, and desirable
behaviors increased. The treatment goals that
were selected and the levels of outcome that
were achieved demonstrate that reasonable
and realistic goals were set and accomplished
with these patients.

The significant difference between the
mean Goal Attainment scores for patients in
the monitored group and the control group
indicates that greater beneficial changes in
patient attainment of goals were effected in
the monitored group using a structured format
for weekly patient-therapist collaboration on
goals than when the guide to goals
was used without weekly goal setting.
It appears that goal-setting strategies,
in a procedural format, allow the
patient and therapist to be more responsible
and accountable in assuring the process of
appraisal and improvement of the quality.
Behavior changed gradually and improved 
when the patient and therapist had the feed- 
back of the established goals of the prior 
week and methods to accomplish these goals. 
Feedback provided both the patient and 
therapist with information on treatment de- 
cision and outcome and provided alternative 
approaches for future goal attainment 
strategy. 

These results indicate that the goal at- 
tainment model with periodic monitoring 
may be useful in the therapeutic process to 
collect information, as an aid in organizing 
and recording the process of therapy, to de- 
sign treatment for outpatients, as an out- 
come effectiveness measure, to evaluate ther- 
apeutic progress, and to provide new data 
for setting additional therapeutic goals. 


References 


Garwick, G. An introduction to reliability and 
goal attainment scaling methodology (Program 
evaluation project report). Minneapolis, Minn.: 
Program Evaluation Resource Center, 1974 (a). 

Garwick, G. Recent findings on the use of goal- 
setting in human services. Goal Attainment Re- 
view, 1974, 1, 1-4. (b) 

Kiresuk, T. J. Goal attainment scaling at a county 
mental health service. Evaluation, 1973, 1, 12-18. 

Kiresuk, T. J., & Sherman, R. Goal attainment 
scaling: A general method for evaluating com- 
prehensive community mental health programs. 
Community Mental Health Journal, 1968, 4, 443-453.

Winer, B. J. Statistical principles in experimental 
design. New York: McGraw-Hill, 1962. 


Received July 18, 1977


Journal of Consulting and Clinical Psychology 
1978, Vol. 46, No. 6, 1246-1257 


An Observational Approach to the Assessment of 
Anxiety in Young Children 


Blair Glennon 
Harvard University 


John R. Weisz 
University of North Carolina at Chapel Hill 


Clinical research on anxiety has long relied on assessment techniques that may 
be inappropriate with young children (e.g., self-report inventories). The present 
article describes an alternative to such techniques—a scale using observational 
methodology. To assess the reliability and validity of this instrument, the Pre- 
school Observational Scale of Anxiety (POSA), preschoolers were observed and 
scored on the scale during two test sessions. Session 1, with mothers absent,
was expected to provoke relatively high anxiety; Session 2, with mothers pres- 
ent, was expected to provoke minimal anxiety. Total POSA scores assigned by 
two independent judges correlated .78 (p < .001), with highly significant inter- 
judge correlations for most of the scale items. Regarding the validity of the 
instrument, it was found that (a) POSA scores were significantly correlated 
with teachers’ and parents’ inventory ratings of children’s anxiety (all ps < .01), 
and (b) children’s POSA scores were significantly higher in Session 1 than 
Session 2 (p < .01). The findings suggest that the POSA may provide a means 
of assessing situationally induced anxiety in children who are too young to 
accurately report their internal states. 


Anxiety has long been a topic of central
importance in clinical research and practice.
Achenbach (1974) noted that among personality
traits emerging from trait theories,
“anxiety is perhaps the most frequently inferred
and measured” (p. 574); yet, demonstrably
valid and reliable measurement of
anxiety has been difficult to achieve, particularly
among children. Especially acute is the
need for accurate measures that can be used 
to assess specific situational effects on anxiety 
states in children (see Spielberger et al., 
1972). Although many investigators recognize 
the need for such measures, particularly in 
research with children who are too young to 


accurately report their own internal states, 
most researchers also recognize the difficulties 
that inhere in applying previous anxiety as- 
sessment techniques to children. 

Historically, the principal methods for mea- 
suring anxiety have been physiological mea- 
surements, projective techniques, self-reports, 
and behavior ratings by observers. Through 
direct gauging of autonomic activity, physio- 
logical measures bypass the problems of sub- 
jective judgments. However, agreement among 
physiological measures is frequently poor, 
since people have different styles of autonomic 
responding (Phillips, Martin, & Meyers, 
1972), and the physiological instruments may 
tap emotions such as anger or joy rather than 
anxiety (Lazarus, 1966). Furthermore, the
unusual instruments used for physiological
measures may distract subjects and make 
naturalistic observations difficult, particularly 
when the subjects are children. 

Projective techniques have also been used 
to obtain ratings of generalized anxiety (Mc- 
Reynolds, 1968), but most projective tech- 
niques require individual administration as 
well as substantial time and expertise on the
part of the examiner, thus limiting their usefulness.
Furthermore, McReynolds (1968)
has reported that projective techniques “are 
generally found to be unrelated to inventory 
essays of anxiety” (p. 258); and Achenbach 
(1974), after reviewing the massive literature 
on projective techniques, has concluded that 
“there is little evidence for the reliability or 
validity of most of the interpretations made 
from them” (p. 604).

Three main types of self-reports have been
used in assessing anxiety. First, direct self- 
ratings involve asking subjects specifically 
how anxious or nervous they feel. McReynolds 
(1968) has suggested that such self-ratings 
are most useful for measuring current levels 
of anxiety or changes in anxiety from one 
situation to another. A second type of self- 
report is the Adjective Check List, on which 
subjects indicate which of several adjectives 
(e.g., “jittery”) best characterize their mood.
Adjective Check Lists have been used to 
assess either current anxiety level or charac- 
teristic anxiety (McReynolds, 1968). The 
third type of self-report is the inventory 
method, with questionnaires designed to determine
how subjects feel in a variety of situations.
This method is usually aimed at assessing
characteristic rather than current anxiety
level. Inventories may focus on generalized
anxiety (i.e., across a wide variety of situa- 
tions) or anxiety related to particular types of 
situations (such as separations). Several in- 
vestigators have noted drawbacks of the vari- 
ous self-report methods (see Spielberger, 
1972); however, probably the most important 
drawback in the present context is that such 
methods assume both the ability and the willingness
of subjects to correctly describe their
own feelings. The first assumption is clearly
[...] with young children. And, with regard
to the second assumption, there is evidence
that many children give inaccurate reports
about their anxiety due to defensiveness or
[...] (Sarason, Davidson, Lighthall,
Waite, & Ruebush, 1960). Consequently,
Sarason (1966) has stated that “the verbal
response to our [children’s self-report anxiety
scales] may be telling us more about the
[...] than the affect” (p. 79).

Behavior ratings of anxiety by observers
may involve global, subjective judgments
about the subjects’ apparent nervousness or
ratings on many specific, concrete behaviors 
thought to indicate anxiety (e.g., stuttering). 
Such approaches appear to offer distinct ad- 
vantages relative to those discussed above 
(see Spielberger et al., 1972). For instance, 
unlike physiological measures, observers’ be- 
havior ratings can be made unobtrusively; 
unlike projective techniques, they do not nec- 
essarily require trained clinicians; and unlike 
self-report measures, they do not rest on 
tenuous assumptions about the abilities and 
attitudes of the subjects. In addition, behav- 
ioral observations may be particularly useful 
with children, since they appear to disguise 
expression of their feelings less effectively 
than adults (Sarason, 1966). 

The observers for the behavior rating scales 
of children’s anxiety are sometimes teachers 
or parents, with the scales including items 
relevant to school or family situations (e.g., 
cries at bedtime). However, these particular 
scales entail a risk of distortion due to the 
parent’s or teacher’s lack of care or objectiv- 
ity in observing, or to their bias or defensive- 
ness about reporting on a child whom they 
know personally (Sarason et al., 1960). The 
observers for the behavior rating scales, how- 
ever, may be trained observers (see Buss, 
Wiener, Durkee, & Baer, 1955), and the scales 
may comprise behaviors indicative of anxiety 
across situations, thus being more widely 
applicable and less subject to distortion by 
observers. The latter type of behavior rating 
scale would, in principle, permit detection of 
subtle relationships between specific events 
and ensuing anxiety in a way that satisfies 
methodological requirements and is develop- 
mentally appropriate as well (see Weisz, 
1978). 

It is true that different individuals may 
reveal their anxiety through different be- 
haviors; also, different observers may have 
difficulty achieving high agreement in their 
behavior observations. But these problems of 
behavior ratings should be surmountable. 
Clearly defining the behaviors to be observed 
and increasing the training time of observers 
should reduce problems of low interrater 
agreement. And total frequency scores on rating
scales including the gamut of behaviors
suggestive of anxiety should give good indications
of the relative anxiety of individuals
with widely varying anxiety manifestations. 

The most accurate behavior rating scale 
would require observers who are able to con- 
centrate fully on the occurrence of the target 
behaviors and to record the occurrence of 
these behaviors for later frequency analysis. 
It should be noted that in the typical clinical 
interview situation, the clinician would not 
be able to meet these requirements. However, 
a second observer could independently record 
the indicators in clinical situations in which 
precise anxiety measurements are desired, or 
the clinicians themselves could use the indi- 
cators of behavior rating scales as an aid in 
making less exact, more global ratings of the 
client’s anxiety. In the latter case, the clinicians
would not be using the scale in the
prescribed way but would probably add objectivity
to their observations by reference to
it.

In short, behavior observation measures of 
anxiety seem to have clear advantages over 
other types of anxiety measures and rela- 
tively minor drawbacks. But despite the po- 
tential usefulness of such measures, we have 
found little systematic research on behavior 
indicators of anxiety in children. In fact,
there appears to be no carefully validated 
behavior rating scale for anxiety in children. 
Perhaps the closest approach is that of Gross- 
man (1968), who used observer ratings of 
anxiety with 6-year-olds. However, his scale 
consisted of a mixture of specific objectively 
observable behaviors (e.g., nail-biting) and 
general indicators requiring subjective inference
(e.g., “reactions suggesting that the
child was frightened”); as described by the 
author, the scale appeared to contain only six 
items, thus showing limited sensitivity to the 
range of behaviors through which children 
may display anxiety. Validity data consisted 
of only two nonsignificant correlation coefficients
relating the behavior scale to the General
Anxiety Scale for Children (Sarason,
Davidson, Lighthall, & Waite, 1958), and to 
our knowledge, no independ
children facing surgery. Again, only very limited
validity data on this scale are available.
Also, only four items of the scale were pub- 
lished, and these represent an exceedingly 
modest sampling of the possible behavioral 
manifestations of anxiety. 

In the present study, a much more de- 
tailed list of behavior items was assembled 
by means of a systematic search of the anxiety 
literature. This list was used to form the Pre- 
school Observation Scale of Anxiety (POSA). 
The interobserver reliability of the POSA was 
assessed by two independent judges. The 
validity of the scale was determined (a) 
through assessment of its relation to three 
independent inventory measures of anxiety 
and (b) through an experimental manipula- 
tion of stressors. Significant positive relations 
between the POSA scores and the inventory 
scores were anticipated, but these relations 
were expected to be modest, since (a) the in- 
ventory measures and the POSA were de- 
signed to assess somewhat different aspects of 
anxiety, and (b) the inventory measures relied 
on parents’ and teachers’ judgments, whereas 
the POSA involved relatively objective obser- 
vations by trained observers. The experi- 
mental manipulation involved a contrast be- 
tween an initial mother-absent condition and 
a subsequent mother-present condition. Be- 
cause the format and experimental procedure 
would be more familiar to the children in the 
second session, and because mothers would be 
present, it was predicted that children would 
show lower levels of anxiety in Session 2 than 


in Session 1, and thus score lower on the 
POSA. 
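Interobserver reliability of the kind described here is typically summarized with a Pearson correlation between the two judges' total scores. The sketch below is a minimal, standard-library illustration; the function name `pearson_r` and the two lists of judge totals are hypothetical and are not data from this study.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson product-moment correlation between two paired score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 pairs")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a list has no variance")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical total scale scores assigned to six children by two judges.
judge_a = [12, 7, 15, 3, 9, 11]
judge_b = [10, 8, 14, 5, 9, 13]
print(round(pearson_r(judge_a, judge_b), 2))
```

The same function could be applied item by item to obtain per-item interjudge correlations.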


Method 
Subjects 


The 36 Cornell University Nursery School children 
who formed the sample ranged in age from 32 to 59 
months (M = 47 months; SD = 7.7 months). Mean
Hollingshead (Note 1) socioeconomic status (SES)
was 1.47 (1 = highest; 7 = lowest; SD = .94). There
were 21 girls and 15 boys.


Procedure 


The Preschool Observation Scale of Anxiety. The
first step in devising the scale was a systematic search
of the Education Index (1929-1976), Psychological
Abstracts (1927-1976), Resources in Education/Research
in Education [1966 (year of origin) to 1976],
and Current Index to Journals in Education [1969
(year of origin) to 1976], for studies using or mentioning
behavioral indicators of anxiety in children
or adults. Next, three child-clinical psychologists
examined a list of items based on the literature search
to suggest additions, to make minor modifications,
and to give their final approval to the scale. Some
behavior indicators suggested by the literature or
the clinicians (e.g., hand perspiration) were not included
in the final scale because pilot testing indicated
that they were too difficult to observe accurately and
reliably. Table 1 includes a description of each behavioral
indicator of the final POSA, along with a
reference to the article(s) supporting the use of each
indicator in the scale.
Parent and teacher questionnaires: Independent
anxiety measures. Several independent measures of
anxiety were obtained for the purpose of validating
the POSA. One was a questionnaire, the Parent
Anxiety Rating Scale (Doris, McIntyre, Kelsey, &
Lehman, 1971), completed by each child’s parents
and comprised of six questions about the child’s
separation anxiety (PARSEP) and 19 questions about
the child’s general anxiety (PARGEN). Two weeks
before school began, this questionnaire was sent to the
children’s parents, who completed it by the beginning
of the nursery school year. A second independent
measure of anxiety was the Teachers’ Separation
Anxiety Scale (TSAS; Doris et al., 1971), comprised
of 11 items about the child’s reaction to separation
from his/her mother or father when left at the
nursery school at the beginning of the school day.
One of two teachers (one for each of two nursery
school sessions) rated the child on each of the first
10 consecutive days of the child’s attendance at
nursery school for the year. Ratings were made at
the end of each school day and pertained to the
period from the child’s arrival at the nursery school
with parent to the parent’s departure. If the child
was not delivered to the nursery school by his/her
parent, the child was not rated that day, and his/her
TSAS score was prorated.
Experimental manipulation of anxiety. In addition
to obtaining questionnaire scores of the children’s
anxiety, we used a manipulation designed to
induce differing levels of anxiety in two experimental
sessions, both involving cognitive tests. The
first session was expected to be more anxiety arousing
because it involved an unfamiliar adult who
individually tested each child in an unfamiliar setting
[...] the first few days of the nursery
school [...]; the second session occurred
[...] weeks following the first session,
after the children had had a chance to settle in to
their new surroundings; furthermore, the child’s
mother was present at the second session, along with
the same examiner who had tested the child at the
first session. Finally, the children were given the
same tests in the second session that they
had already taken in the first session.
Tests were given because it was felt that an
evaluative atmosphere would make the experimental
situation more anxiety arousing, especially in the
first session when the materials would be unfamiliar. 
Also, performance on the cognitive tests by the 
high-anxious children compared to the low-anxious 
children was of interest, since existing evidence is in 
conflict over this question. Some evidence sug- 
gests that high anxiety should interfere with test 
performance, especially when the tests are fairly 
difficult ones for the subjects (e.g., Feldhusen & 
Klausmeier, 1962; Tamaroff, 1976; Young & Brown, 
1973). However, other evidence (e.g., Denny, 1966; 
Hodges & Durham, 1972; Katahn, 1966; Spielberger, 
1966) suggests that anxiety may have facilitating 
effects on performance for children of higher
socioeconomic and intellectual levels. (See other
conflicting evidence in Fischer & Awrey, 1973; Mazzei
& Goulet, 1969.) Since, according to the evidence, 
anxiety might lead to performance enhancement in 
some children and performance decrements in other 
children, no prediction was advanced regarding the 
relation between anxiety and task performance in 
the present study. However, the relation between 
task performance and anxiety scores was reported in 
an effort to shed light on the controversy just 
described. 

Global anxiety self-ratings and ratings by the 
examiner. In the Introduction to this article, we 
emphasized the disadvantage of both self-reports by 
children and global (and thus subjective) ratings by 
adults. To determine whether our negative assess- 
ment was correct, we included two such measures 
in the present study, so that their effectiveness might 
be contrasted with that of the more specific and 
presumably more objective POSA. At the end of 
each of the two testing sessions, the examiner rated 
the child’s general anxiety level during the session 
on a scale of 1 to 6. Also, a teacher asked each child 
to choose one of six pictures depicting progressively 
more fearful facial expressions to show how anxious 
the child felt during the testing session. The teacher 
rather than the examiner asked this question, since 
it was felt that the children would be more candid 
about their feelings with their teacher. The word- 
ing of the question was the following: 


These pictures show a picture of a child who is 
more and more scared as you go from this top 
picture to the bottom picture here [point]. You 
see, up here the child is not scared at all [point]; 
here [second picture from top] the child is a 
little more scared but still pretty happy; here 
[third from top], he’s getting more scared; and 
here [bottom], very, very scared. Which picture
would show how scared you felt when you went 
downstairs to the testing room with that lady? 
One of these up here where the child isn’t scared 
at all, or one of the bottom ones showing a child 
who is more and more scared? 


The observation periods. At the beginning of 
the first session, the examiner, a 26-year-old ex- 
perienced female teacher, approached each child 
individually in the nursery school and told the child

Table 1
Items of the Preschool Observation Scale of Anxiety and Interrater Agreement
During the First 10 Minutes of the First Session

                                               Range of   Interrater    Concor-   Concor-
Item                                           frequency  reliabilityᵃ  dance 1   dance 2

 1. Physical complaint: Child says he or she
    has a headache, stomachache, or has to go
    to the bathroom (B,C,D,E,G,K,Q).           0-0        ncᵇ           nc        nc
 2. Desire to leave: Child says he or she
    wants to leave the testing room or makes
    excuses about why he or she must leave;
    desire or “need” to leave must be
    explicit (D,E,P,Q).                        0-5        .99*          .99       .46
 3. Expression of fear or worry: Child
    complains about being afraid of or
    worried about something; must use the
    word “afraid,” “scared,” “worried,” or a
    synonym (F).                               0-0        nc            nc        nc
 4. Cry: Tears should be visible
    (G,L,K,P,Q).                               0-2        nc            nc        nc
 5. Scream (P).                                0-0        nc            nc        nc
 6. Whine or whimper (G,P).                    0-2        .69*          .97       [illeg.]
 7. Trembling voice (F,G,I,M).                 0-1        nc            nc        nc
 8. Stutter (A,F,G,H,M,O).                     0-0        nc            nc        nc
 9. Whisper: Child speaks softly, without
    vocal cords; should not be a playful
    whisper (E,G).                             0-11       .67*          .87       .39
10. Silence to one question in the interval
    (E).                                       [illeg.]   [illeg.]      [illeg.]  [illeg.]
11. Silence to more than one question in the
    interval (E).                              0-5        [illeg.]      [illeg.]  [illeg.]
12. Nail-biting: Child actually bites his or
    her nails in the testing room (F,G,I).     0-3        [illeg.]      [illeg.]  [illeg.]
13. Lip-licking: Tongue should be visible (G). 0-13       [illeg.]      [illeg.]  [illeg.]
14. Fingers touching mouth area: not counted
    if bites nails while touching mouth.       0-17       .96*          .91       .19
15. Sucking or chewing object: not
    fingernails (G,P).                         0-1        .47*          .99       .33
16. Lip contortions.                           0-13       .67*          .76       .44
17. Trembling lip (B).                         [illeg.]   [illeg.]      [illeg.]  [illeg.]
18. Gratuitous hand movement at ear area
    ([...]J,N,P).                              [illeg.]   [illeg.]      [illeg.]  [illeg.]
19. Gratuitous hand movement at top of head
    (G,I,J,N,P).                               [illeg.]   [illeg.]      [illeg.]  [illeg.]
20. Gratuitous hand movement at an object
    separate from body or at a part of [...]
    from body ([...]J,N,P).                    [illeg.]   [illeg.]      [illeg.]  [illeg.]
21. Gratuitous hand movement at some part
    of [...] mouth, or [...].                  [illeg.]   [illeg.]      [illeg.]  [illeg.]
22. Gratuitous hand movement (N).              [illeg.]   [illeg.]      [illeg.]  [illeg.]
23. Gratuitous leg movement (M,N).             [illeg.]   [illeg.]      [illeg.]  [illeg.]
24. Gratuitous foot movement: below ankles,
    distinguish from foot merely moving
    along with leg (M,N).                      0-20       [illeg.]      [illeg.]  [illeg.]
25. Trunk contortions (e.g., arching back).    [illeg.]   [illeg.]      [illeg.]  [illeg.]
26. Rigid posture: Part of body is held
    unusually stiff or motionless for the
    entire 30-sec interval (B,G,N).            0-11       [illeg.]      .94       .16
27. Masturbation: touches genital area (K).    0-3        .81*          .99       .50
28. Fearful facial expression (E,I).           0-4        .92*          .98       .26
29. Distraction: Must be indicated by a
    verbal reminder by the examiner to the
    child to pay attention (B,I,L,P).          0-7        .86*          .94       .38
30. Avoidance of eye contact: Examiner
    should be having clear trouble making
    eye contact with child (G,M).              0-1        −.04ᶜ         .98       .00
    [item not identifiable]                    0-3        .74*          .98       .50
    [item not identifiable]                    0-20       .89*          .81       .68
    Total                                      26-107     .78*          .92       .58


Note. Letters in parentheses following each item description refer to studies suggesting that item as an
anxiety indicator. The following code was used: A: Boland (1953); B: Buss, Wiener, Durkee, and Baer
(1955); C: Cowen, Zax, Klein, Izzo, and Trost (1965); D: Endler, Hunt, and Rosenstein (1962); E: Fink
(1956); F: Grossman (1968); G: Insel and Spencer (1972); H: Kasl and Mahl (1965); I: McReynolds
(1965); J: Melamed and Siegel (1975); K: Miller, Barrett, Hampe, and Noble (1971); L: Nottelman
(1975); M: Paul (1966); N: Raskin (1962); O: Santostefano (1960); P: Tamaroff (1976); Q: Wolff (1969).
Items 14 and 16 were suggested by our clinician consultants. The range of frequency column gives the lowest
score obtained by any subject and the highest score obtained by any subject. The interrater reliability
column gives the correlation coefficients for the Pearson correlations between the scores of the two observers.
The Concordance 2 column gives the interrater agreements on the occurrence of indicators in pairs of
adjoining intervals (the number of agreements divided by the sum of agreements plus disagreements).
The Concordance 1 column gives the interrater agreement on both the occurrence and nonoccurrence of
indicators in pairs of adjoining intervals (again, the number of agreements divided by the sum of agreements
plus disagreements).
ᵃ n = 33 for interrater reliability and concordance calculations.
ᵇ nc = not calculated due to infrequent occurrence (i.e., no children or only one child had nonzero scores
on the indicator).
ᶜ Based on only two children with nonzero scores.
* p < .001.


that she wanted him or her to go with her to another
room “to do some tests.”¹

The testing room was 5.7 m × 3.5 m, with one-way
mirrors on top of its four walls and a microphone
about 30 cm above the child’s head for transmitting
sounds from the testing room to the observation
areas. A male and a female observer, both in their
early 30s, sat behind the one-way mirrors at a
distance of about 1.2 m from the child. The observers
sat around a corner from each other, separated by
about 1.5 m and a partition. Each observer spoke
softly into a tape recorder each time he or she
observed the child emit one of the behavioral
indicators. The two observers could not hear each
other speaking. Also, the observers were blind to the
independent anxiety ratings of the children whom
they observed and to the fact that lower anxiety
ratings were expected for the second session in
comparison to the first session.

A standard time-sampling procedure involving 30-sec
intervals was used. A Davis Scientific Instruments
[...] Purpose Time Interval Generator ([...])
emitted to each observer a beep and a red-light
flash at the 30-sec intervals. Aside from preventing
observers from habituating to the behavioral
indicators, this procedure was used to aid data
analysis: Behaviors were given a score of 1 for each
interval in which they occurred.

During the testing session, the examiner gave
each child the three tests (Digits, Blocks, and
Sentences) in standard form and in as neutral a
manner as possible. The examiner was instructed to try
to keep the children in the testing room for at least
10 minutes. The Digits test was taken from the
Illinois Test of Psycholinguistic Abilities (Kirk,
McCarthy, & Kirk, 1968) and involved repeating a
series of orally presented digits from memory.
Sentences were taken from Tamaroff’s (1976) adaptation
of the Sentences subtest of the Wechsler Preschool
and Primary Scale of Intelligence (Wechsler, 1967).
This test requires the child to repeat increasingly
complex orally presented sentences from memory.
Finally, Blocks was taken from Tamaroff’s (1976)
adaptation of the Block Construction Test of the
Yale Scale of Child Development. It requires the
child to copy visible block constructions with
separate blocks, and it is timed.

¹ Two children refused to do any tests in the first
session, and one refused to do any in the second
session.

In the second testing session, the same procedure
was followed, except for two variations. First, the
child was accompanied to the testing room by his
or her mother as well as by the examiner, and
the mother sat unobtrusively reading a magazine
in the corner of the testing room during the session.
Second, only the female observer was behind the
one-way mirrors, since the second observer was
used just during the first session to obtain
interrater reliabilities. Of the 36 children in Session 1,
32 participated in Session 2. One child refused,
and the parents of 3 children were unable to
participate. Technical difficulties with recording
equipment further reduced the sample (for whom
complete data on both sessions were available) to 29.

Training for the observers. Both observers spent 
2-3 hours memorizing the behavioral indicators and 
6 hours in observation-training sessions. In training, 
the observers watched videotapes of three children 
being tested as other children would be tested in 
the actual experiment. Training also involved pilot 
observations of three children in the actual setting 
and conditions of the experiment. Finally, the
observers spent approximately 2 hours discussing
disagreements in their observations and ways to achieve
better consensus. The detailed description of the 
items of the POSA given in Table 1 includes all of 
the details that the observers devised in order to 
maximize agreement. 


Results 


Scores for all subjects on the POSA were 
calculated by using the number of 30-sec 
intervals in which a given indicator occurred 
during the first 10 min (20 intervals) of 
the sessions. The examiner had been
instructed to try to keep each subject in the
testing room and working for a full 10 min; 
thus, a 10-min interval was chosen as the 
target time, in part, because of concern that 
the experimenter’s behavior might have 


changed significantly after the 10-min period
had expired.²
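
The scoring rule just described can be sketched in a few lines of Python. This is a minimal illustration, not the authors' procedure: the event format (seconds from session start paired with an indicator number), the function name `posa_scores`, and the assumption that the total score is the sum of the per-indicator scores are all hypothetical.

```python
from collections import defaultdict

INTERVAL_SEC = 30   # length of one observation interval
N_INTERVALS = 20    # first 10 min of the session

def posa_scores(events):
    """Score one child's session.

    events: iterable of (seconds_from_start, indicator_number) pairs
            (a hypothetical record format, not taken from the article).

    An indicator earns 1 point for each interval in which it occurs at
    least once, no matter how often it recurs within that interval.
    Returns (per-indicator scores, total score).
    """
    intervals_hit = defaultdict(set)
    for seconds, indicator in events:
        idx = int(seconds // INTERVAL_SEC)
        if 0 <= idx < N_INTERVALS:          # ignore anything after 10 min
            intervals_hit[indicator].add(idx)
    per_indicator = {ind: len(hit) for ind, hit in intervals_hit.items()}
    # Assumption: the total is the sum of the indicator scores.
    return per_indicator, sum(per_indicator.values())

# Hypothetical record: indicator 9 (whisper) twice in interval 0 and once
# in interval 1; indicator 14 once in interval 2 and once after the cutoff.
events = [(5, 9), (12, 9), (40, 9), (65, 14), (610, 14)]
scores, total = posa_scores(events)
print(scores, total)
```

Counting intervals rather than raw occurrences caps each indicator at 20 points, so a single behavior repeated rapidly cannot dominate the total.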


Interrater Reliabilities 


BLAIR GLENNON AND JOHN R. WEISZ 


[...] important correlation coefficient, that for the
30 indicators together, was .78 (p < .001). 
The intraclass correlation was .77 (p< 
.001). 

Even though the preceding analysis an- 
swered the central question about reliability 
of overall POSA and individual indicator 
scores, we also sought to learn the level of
specificity at which observer agreement took 
place. Toward this end we used a demanding 
procedure designed to gauge the degree of 
concordance between observers within ob- 
servation intervals. The procedure involved 
grouping every 2 adjoining intervals (result- 
ing in 19 interval groups) and counting how 
often the observers agreed or disagreed as 
to whether an indicator occurred within each 
interval group. Adjoining rather than single 
intervals were used for this agreement mea- 
sure, since the observers sometimes reported 
the same behavior at slightly different times 
so that the interval cutoff occurred between 
their reports. Using this method of measure- 
ment, a quotient of concordance was calcu- 
lated by dividing the number of agreements 
between the observers by the number of 
their “agreements” plus “disagreements” for 
each indicator separately and for the total 
of all 30 indicators. Table 1 (last two col- 
umns) shows the results of these calculations 
(a) when agreements between observers that 
a given indicator did not occur were included 
in the “agreement” score and (b) when such 
negative agreements were excluded from the 
agreement score. Considering the rigorous 
nature of the procedure (particularly b 
above), the percentages shown in Table 1 
represent rather substantial agreement
between observers at the level of brief
observation intervals.
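
The quotient of concordance can be sketched as follows. This is a hedged illustration for one indicator and one child: the 0/1 list format and the function name `concordance` are assumptions, and the article does not state how agreements were pooled across children.

```python
def concordance(obs_a, obs_b):
    """Agreement between two observers on a single indicator.

    obs_a, obs_b: equal-length lists of 0/1 flags, one per 30-sec
    interval (a hypothetical format). Every 2 adjoining intervals form
    one group, so 20 intervals yield 19 overlapping groups.

    Returns (concordance_1, concordance_2):
      concordance_1 counts agreement on occurrence and nonoccurrence;
      concordance_2 counts agreement on occurrence only.
    Either value is None when its denominator is zero.
    """
    if len(obs_a) != len(obs_b):
        raise ValueError("observers must cover the same intervals")
    both = neither = disagree = 0
    for i in range(len(obs_a) - 1):
        a = bool(obs_a[i] or obs_a[i + 1])   # A reported it in this group
        b = bool(obs_b[i] or obs_b[i + 1])   # B reported it in this group
        if a and b:
            both += 1
        elif not a and not b:
            neither += 1
        else:
            disagree += 1
    groups = both + neither + disagree
    c1 = (both + neither) / groups if groups else None
    c2 = both / (both + disagree) if (both + disagree) else None
    return c1, c2

# Hypothetical example: observer A reports the behavior in intervals 3
# and 10, observer B one interval later than A's first report.
a = [0] * 20
b = [0] * 20
a[3] = a[10] = 1
b[4] = 1
c1, c2 = concordance(a, b)
print(c1, c2)
```

In this example the many groups in which neither observer saw the behavior push the first quotient to 15/19, while the second falls to 1/5, which shows why excluding negative agreements is the more demanding criterion.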


² The alternate scoring method of dividing the
frequency of intervals in which each indicator
occurred during the total session by the number of
intervals in the total session would have posed an
additional problem, in that fatigue and familiarity
could have affected the scores of children having
longer sessions. Unfortunately, using the 10-minute
cutoff, not all subjects were engaged in the same
activities during the target time, but this seemed to
be a less significant consideration than those noted
above.

Table 2
Correlations Among Anxiety Measures and Test Scores

Measure               2       3       4       5       6       7       8       Sentences
1. POSA               .37*    .30*    .47**   −.12    −.02    −.04    .38*    −.16
2. PARSEP                     .41**   .36*    .01     .06     .05     .28     .24
3. PARGEN                             .22     −.18    .07     .03     .05     .05
4. TSAS                                       −.05    .19     −.07    .07     −.18
5. Self-rating                                        .14     −.45**  −.35*   −.24
6. Examiner rating                                            −.24*   −.10    −.13
7. Blocks                                                             .42**   [illeg.]
8. Digits                                                                     [illeg.]

Note. POSA = Preschool Observation Scale of Anxiety; PARSEP = questions about the child’s separation
anxiety; PARGEN = questions about the child’s general anxiety; TSAS = Teachers’ Separation Anxiety
Scale.
* p < .05.
** p < .01.
*** p < .001.


Correlations With Independent Measures
of Anxiety


As an initial step in assessing the validity 
of the POSA, the correlations of the POSA 
with the PARSEP, PARGEN, and TSAS were 
calculated using data from the first session 
(since correlations with data of the second 
session would have been less meaningful due 
to the mothers’ presence). As Table 2
indicates, all three correlations were significant.
Other correlations noted in the table are also
of interest. Note that the self-rating and
examiner ratings were not correlated with one
another or with POSA, PARSEP, PARGEN, or
TSAS scores. Thus, even though the POSA
did meet the first set of validity criteria that
we established (i.e., significant correlations
with the three inventory measures), the
self-ratings and examiner ratings did not. The
significant negative correlations between
self-ratings and two of the cognitive tests
suggest that self-ratings may have been
influenced by the children’s awareness of the
quality of their test performance.

To assess the contributions of individual
indicators to the correlations between
the POSA and the three inventory measures,
the correlations of each of the 30 items with
the inventory scores were calculated.
Most of these individual correlations were
nonsignificant, suggesting that the predictive
power of the indicators lies mainly in their
combination with one another. Those
correlations that did attain statistical significance
were lip contortions with PARSEP (.39, p <
.01) and TSAS (.43, p < .01), gratuitous
hand movement in the ear area with PARGEN
(.32, p < .05), gratuitous hand movement
at the top of the head with PARSEP (−.38,
p < .01), gratuitous hand movement toward
object with TSAS (.53, p < .001), gratuitous
hand movement at other body part (−.31,
p < .05), gratuitous leg movement with
PARSEP (.34, p < .05) and TSAS (.35, p < .05),
gratuitous foot movement with PARSEP (.35,
p < .05) and TSAS (.46, p < .01), trunk
contortions with PARSEP (.29, p < .05) and
TSAS (.42, p < .01), and masturbation with
PARGEN (.39, p < .01) and TSAS (.58, p <
.001).


Anxiety Scores in the First and 
Second Sessions 


As a second way of assessing the validity 
of the POSA, the mean POSA score for all 
children in the first session was compared 
with the mean score for all children in the 
second session (designed to be less anxiety 
producing than the first). As predicted, the 
children obtained significantly higher POSA 
scores in the first than in the second session, 
t(28) = 2.53, p < .01 (one-tailed). Of the
29 children, 22 had higher scores in the first
than in the second session, χ²(1) = 6.76, p
< .01. Further comparisons between the first
and second sessions were made for children 




Table 3
Comparisons of POSA Scores Between Sessions

                          Session 1       Session 2
Group                     M       SD      M       SD      t         df
All children              61.4    21.3    51.3    14.0    2.93*     28
High PARSEP scorers       70.0    19.6    49.7    15.5    4.35**    14
High PARGEN scorers       64.3    19.2    49.0    14.3    [illeg.]  15
High TSAS scorers         61.4    22.4    47.6    14.0    2.66*     17

Note. POSA = Preschool Observation Scale of Anxiety; PARSEP = questions about the child’s separation
anxiety; PARGEN = questions about the child’s general anxiety; TSAS = Teachers’ Separation Anxiety
Scale.
* p < .01, one-tailed.
** p < .001, one-tailed.


who scored above the median on the PARSEP, 
PARGEN, and TSAS, since it was thought that 
these children might be especially sensitive 
to the situational manipulations of stressors. 
Again, the high scorers on the three
questionnaire scales showed significantly more
behavioral indicators of anxiety in the first
than in the second session. The results for
these analyses are shown in Table 3. Note
that in each of the three groups, the
magnitude of the Session 1–Session 2 difference is
greater than for the entire sample. In fact,
of the four groups, children rated by their
parents as high in separation anxiety showed
the highest mean POSA scores during Session
1 with mothers absent and the largest
Session 1–Session 2 difference, more than
doubling the difference shown by the full sample.

Session 1–Session 2 differences were also
calculated for the children’s self-ratings and
the examiner ratings. The change in the
children’s self-ratings from Session 1 to Session
2 indicates a nonsignificant increase in
anxiety, whereas the change in the examiner’s
ratings indicates a highly significant decrease.
Anxiety ratings on the children’s 6-point
picture scale averaged 2.22 in Session 1 and 2.59
in Session 2 (p = .28). Ratings on the
examiner’s 6-point scale averaged 3.07 in
Session 1 and 2.00 in Session 2, t(29) = 3.74,
p < .001. Thus, both the POSA and examiner
ratings of anxiety met the second validity
criterion, that is, significantly higher scores
in Session 1 than in Session 2.
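
The chi-square value reported earlier in this section for the 22 of 29 children who scored higher in the first session can be checked with a short calculation. The sketch below assumes a one-degree-of-freedom goodness-of-fit test against an even split; the article does not say whether a continuity correction was applied, and the function name `chi_square_even_split` is hypothetical. Under that assumption, the continuity-corrected statistic reproduces the reported 6.76.

```python
def chi_square_even_split(k, n, continuity=True):
    """Chi-square (df = 1) testing k successes out of n against a 50/50 split.

    With continuity=True, Yates's correction subtracts 0.5 from the
    absolute deviation in each of the two cells before squaring.
    """
    expected = n / 2
    deviation = abs(k - expected)
    if continuity:
        deviation = max(deviation - 0.5, 0.0)
    # Both cells share the same expected count and the same deviation.
    return 2 * deviation ** 2 / expected

corrected = chi_square_even_split(22, 29)
uncorrected = chi_square_even_split(22, 29, continuity=False)
print(round(corrected, 2))    # 6.76
print(round(uncorrected, 2))  # 7.76
```

Either version exceeds 6.63, the critical value for p = .01 with one degree of freedom, so the conclusion does not depend on the correction.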
Finally, Session 1–Session 2 differences
for each of the 30 POSA indicators were
calculated in an effort to gauge the
contribution of the individual indicators to the
overall sessions difference in total POSA
scores. As was true in the correlational
analysis reported earlier, most individual item
effects were nonsignificant, suggesting that
the discriminative power of the indicators
lies principally in their combination with one
another. However, there were five indicators
that had significantly different frequencies in
Sessions 1 and 2 (all in the predicted
direction): silence to one question (p < .05),
touching mouth area (p < .01), gratuitous
arm movement (p < .001), trunk contortions
(p < .001), and rigid posture (p < .05).


Discussion 


The results of the present study support 
the use of the POSA as a measure of anxiety 
in young children. Independent judges 
achieved strong agreement on both total 
POSA scores and total scores for most of 
the 30 individual indicators. The large num- 
ber of indicators used seemed to interfere 
with interobserver concordance at the micro 
level of 1-min observation blocks (at least
by the most rigorous method of analysis); 
and this suggests that the two observers may 
have differed frequently in the particular be- 
havioral incidents that they observed. Yet, 
this is a relatively trivial limitation
considering the high interobserver correlations
obtained for total POSA and individual
indicator scores. In sum, the difficulties
introduced by requiring observers to watch for
30 indicators appear to be outweighed by the
value of including a variety of potential 
anxiety manifestations in order to capture 
indications of anxiety in individuals with 
differing expressive styles. 

Two types of evidence support the view 
that the POSA yields a valid index of anxi- 
ety. As predicted, the scale was significantly 
correlated with all three inventory measures 
of anxiety. Also as predicted, the POSA 
yielded significantly higher scores during a 
presumably high-anxiety test session than 
during a session designed to provoke less 
anxiety. The data generally supported our 
original belief that broad-based measures 
such as the POSA would outperform simpler 
approaches such as self-ratings by young 
children and global judgments by examiners. 
Children’s self-ratings and examiner ratings 
did not correlate significantly with any of 
the three inventory measures, with POSA 
scores, or with each other. Self-ratings 
showed very slight differences (and in the 
wrong direction) between Sessions 1 and 2; 
however, Session 2 ratings by the examiner 
were significantly lower than Session 1 rat- 
ings. It is uncertain whether this latter find- 
ing derived from an expectation by the ex- 
aminer that children’s anxiety would be lower 
in the second session with mothers present. 
However, whatever the basis for the finding, 
it constitutes the only bit of evidence
supporting the use of either self- or examiner
ratings. This pattern of findings seems to
indicate the superiority of structured
observations of carefully delineated behaviors
over global, unstructured, and thus
subjective ratings, though this conclusion must be
qualified by the fact that the structured
observations were made by trained observers,
whereas the global ratings were not.

As noted in the Introduction, the
relation between anxiety and problem-solving
performance appears to be quite complex.
In the present study none of the inventory
measures was significantly related to test
performance, and the POSA was significantly
related only to Digits performance (r = .38).
Thus, anxiety, as measured by the POSA and
inventory scores, did not appear to have a
debilitating effect on test performance for
children in the present study, and may have
had an enhancing effect on one test, perhaps 
due to factors related to the high socioeco- 
nomic status of the present subjects. 

Further research with the POSA should 
include investigations of the scale’s capacity 
to reflect the effects of stressors other than 
those devised in the present study. Children 
varying more widely in age and other demo- 
graphic characteristics than those of the 
present sample should be included. And, more 
importantly, there is a need to assess the 
scale’s usefulness with clinical populations of 
children who (unlike those of the present 
sample) suffer from pronounced behavior 
problems. In addition, research using longer 
observation periods than the 10-minute-plus 
sessions of the present study could be useful; 
some behaviors that did not occur in the 
relatively brief sessions of the present in- 
vestigation might prove to be useful indices 
of anxiety if children were given a lengthier 
opportunity to display them. Even the be- 
haviors that occur infrequently might well 
prove to be potent anxiety indicators when 
they do appear. Finally, the degree of anxiety 
is apt to be reflected not only by the fre- 
quency of anxiety behaviors but also by their 
intensity. Although intensity may be difficult
to quantify, its potential for increasing the 
precision of anxiety measurement would seem 
to justify efforts in this direction. 

For the present, however, the POSA rep- 
resents a potentially useful approach to the 
assessment of anxiety in children. Given the 
sensitivity of the POSA to situational
variations in stressors, the scale itself should be
particularly useful to researchers concerned
with the interplay between specific situa- 
tional conditions and affective states (e.g., 
Spielberger et al., 1972). In addition, the 
POSA could contribute usefully to our under- 
standing of sex differences in anxiety mani- 
festations at various ages (see Maccoby & 
Jacklin, 1974, pp. 182-190), and to our 
capacity to evaluate therapeutic techniques 
for children (see Achenbach, 1974, pp. 606-650).
Research on these topics, focusing on
anxiety states in children, clearly must
circumvent the problems of projective
techniques, self-reports, global and subjective
observer reports, and physiological measures,
all outlined in the Introduction. The evidence 
presented in the present study suggests that 
instruments such as the POSA may provide 
logical, valid, and reliable alternatives to 
these more traditional approaches. 


Reference Note 


1. Hollingshead, A. B. Two-factor index of social
position. Unpublished manuscript, Yale
University, 1957.


The work of A. R. Luria has been recognized as a major contribution to
neuropsychology. Among his accomplishments, Luria has devised an extensive set of
procedures used for neuropsychological evaluation. Luria’s tests permit the full 
identification of the specific deficits underlying a disorder and can be com- 
pleted in about 2 hours. The most significant flaw in the battery is a lack of 
standard administration and scoring that has precluded an assessment of its 
validity. The present study was an attempt to overcome these deficiencies by 
developing an objective form, combining Luria’s procedures with the advan- 
tages of a standard test battery. The resultant test was evaluated using 50 
medical control and 50 neurological patients. Of the 285 measures in the bat- 
tery, 253 significantly discriminated at the .05 level, and only 16 failed to 
discriminate at the .2 level. A discriminant analysis, using the 30 most effective 
items, yielded a hit rate of 100%. The battery’s potential and the future
research necessary are discussed.


The work of A. R. Luria, the Russian
neuropsychologist, has been internationally 
recognized as a major contribution to ex- 
perimental and clinical neuropsychology. 
Luria has made extensive theoretical con- 
tributions (e.g., Luria, 1963, 1966, 1970, 
1973) and has devised numerous clinical
diagnostic and rehabilitation procedures (e.g.,
Luria, 1963, 1966). Many of Luria’s diag- 
nostic procedures have recently been pub- 
lished by Christensen (1975a, 1975b) along 
with the equipment necessary to perform 
these tests (Christensen, 1975c). 

Luria’s tests are based on his theoretical 
contributions to neuropsychology. Luria en- 
visions the brain divided into three principal 
units responsible for arousal, sensory input 
and integration, and behavioral planning and 
execution. In turn, the areas within each of 


[...] information on the battery should be sent to Charles T.
Golden, Nebraska Psychiatric Institute, University
of Nebraska Medical Center, Omaha, Nebraska [...].


these units have specific functions. For ex- 
ample, one area within the second functional 
unit is responsible for auditory input, and 
another area is responsible for the integration 
of visual and auditory stimuli. 

Within Luria’s system, overt behavior is 
the result of the cooperation among different 
areas of the brain. The pattern of interacting 
areas responsible for a given behavior is 
called a functional system. Each area of the 
brain participates in numerous functional 
systems.

The effect of any brain injury is to inter- 
rupt the execution of any functional system 
that includes the injured area(s). Thus, any
brain injury will affect the performance of
numerous behaviors. The generality of the
behavioral loss depends on the importance of 
the functional systems interrupted, as well 
as the availability of alternate functional 
systems to replace the injured system. 

The role of the neurodiagnostician is to 
isolate the specific point at which a client’s 
functional systems have been interrupted. 
This requires a detailed evaluation of a sub-
ject’s intact and interrupted functional sys-
tems. From the pattern of performance
deficit discovered, the specific locus of the
brain injury can be determined, as well as
such factors as type and severity of the
disorder.

Copyright 1978 by the American Psychological Association, Inc. 0022-006X/78/4606-1258$00.75

Thus, Luria’s diagnostic tests consist of 
numerous specific procedures. These are de- 
signed to isolate dysfunction, compared to 
the more global assessments characteristic of 
many neuropsychological tests. In addition, 
Luria’s procedures tend to be qualitative in 
nature, rather than quantitative. 

Luria’s procedures have several practical
advantages. First, they provide a more ex-
tensive breakdown of behavior than can be
determined from more global tests. Thus, the 
clinician can derive a more specific analysis 
of the deficits present in a patient. This in- 
formation can be used for both neurodiagno- 
sis and for specific rehabilitation planning 
based on an individual’s pattern of test re- 
sults (Golden, 1976, 1978; Luria, 1963). 

A second advantage of these test proce-
dures is the ability to do a full examination
in less than 2.5 hours (Luria, 1966). In
contrast, a test battery like the Halstead-
Reitan, which can reveal similar deficit pat-
terns, may take from 6 to 8 hours. A third
advantage of the Luria tests is their porta-
bility. Luria’s procedures were specifically set
up to be done at bedside with a minimum of
inexpensive equipment. Again, this is in con-
trast to many current test procedures that
require a laboratory setting or great amounts
of equipment, sometimes costing several hun-
dred dollars.

The battery includes an evaluation of
all of the areas necessary for a
neuropsychological exam (Ben[...]).
This includes evaluation of motor
skills, sensory skills (auditory, tactile,
and visual), verbal skills (expressive
and receptive speech, reading, and writing),
spatial skills, mathematical abilities,
memory, and intellectual skills.

Despite these advantages, the Luria tests
have been widely criticized. One significant
limitation has been the lack of any direct
evaluation of the tests beyond Luria’s own
observations and conclusions. In this regard,
Reitan (1976) has written that Luria’s “re-
ports essentially represent evaluation of ‘crit-
ical cases’ based upon his own observations,
conclusions, and statements of significance”
(p. 199). This lack of systematic validation
has greatly limited the generalizability of
Luria’s results.

A major problem in a validation study is 
the lack of any standardized method of giv- 
ing or scoring Luria’s test procedures. In 
many cases, the description of administra-
tive procedures is vague. Moreover, Luria
acknowledges changing procedures for in- 
dividual patients. Even when administrative 
procedures are clear, scoring procedures are 
not. Scoring is determined by the personal 
assessment of the clinician based on experi- 
ence and knowledge rather than on any 
normative data. 

Despite these major problems, the Luria 
battery clearly possesses a high degree of 
face validity as well as a strong theoretical 
foundation. With proper standardization, 
scoring, and validation, the Luria tests could 
become a major tool of both clinical and 
experimental neuropsychology. 


Development and Rationale 


The intention of the present study was to 
use the material presented by Luria (1966, 
1973) and Christensen (1975a, 1975b, 
1975c) to form a standardized, objectively 
scored version of Luria’s neuropsychological 
procedures. The new battery was designed
to retain, as closely as possible, the qualita- 
tive nature of Luria’s tests and the sampling 
of all the major areas of neuropsychological 
performance. 

At the same time, the standardization of 
the items and the objective scoring would 
allow for careful validation and replication 
of the test results, as well as for the sys- 
tematic collection of data on a wide variety 
of neurological disorders. This would allow 
the development of scales representing spe-
cific loci of injury or underlying causes in
order to establish a firmer interpretation of
the battery.

Consequently, the battery, as envisioned,
would have the advantages of a detailed,
qualitative analysis of a client’s behavior, as
well as the advantages of a standardized 
quantitative battery and the systematic di- 
agnostic research that can be completed with 
such a battery. In addition, the battery 
would have the advantage of a relatively 
short testing time. 

The first step in designing the battery 
was to develop standardized items, keeping 
as close as possible to the original proce- 
dures while creating items that could be ob- 
jectively scored. After the development of 
the initial items, they were administered to 
a clinical population to evaluate their prac- 
ticality. 

As the project progressed, numerous items 
were rewritten or deleted as they were found 
to be ineffective or duplicative. Others were 
eventually discarded when a reliable scoring 
system could not be developed for them.
Some items were eventually found to be too
difficult to administer to an impaired pop-
ulation. After 6 months of effort, a final
version of the test to be used in this study
was developed.


Present Study 


The purpose of the present study was an
initial validation of the standardized battery.
Fifty normal hospitalized subjects were com-
pared with 50 brain-injured subjects. t tests
were calculated on each score within the
battery. In addition, a discriminant analysis
was used to measure the overall effectiveness
of the battery.


Method 
Subjects 


The subjects were 100 hospitalized patients
in hospitals in Sioux City, [...]
Dakota; and Sioux Falls, South Dakota.
[...] or neurosurgeon, [...]
a variety of medical [...]
injuries, infectious diseases, [...].
The average age of the control subjects was 42.0
years [...], and the average age of
the neurological subjects was 44.3 years (SD =
18.8 years). The difference in age was not signifi-
cant, t(98) = .7, p > .40. Overall, there were 49
females and 51 males. No significant differences in
sex distribution were present in the two groups.

The control group had 12.21 years (SD = 2.86 
years) of education, and the neurological group 
had 10.30 years (SD = 2.84 years). The difference 
between the groups was significant, t(98) = 3.51,
p < .01.
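
As a rough check on the education comparison, the t statistic can be recomputed from the reported means and standard deviations. The sketch below assumes 50 subjects per group (as stated under Present Study) and a pooled-variance t test; the function name `pooled_t` is illustrative only. Under those assumptions the value comes out near 3.35 rather than the printed 3.51, a gap that may reflect rounding or transcription of the printed figures; either value exceeds the two-tailed .01 criterion for 98 degrees of freedom.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Independent-samples t statistic with pooled variance; returns (t, df)."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    standard_error = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (mean1 - mean2) / standard_error, df


# Reported education values: control group first, neurological group second.
# The group sizes of 50 are an assumption taken from the study description.
t_edu, df_edu = pooled_t(12.21, 2.86, 50, 10.30, 2.84, 50)
print(f"t({df_edu}) = {t_edu:.2f}")
```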


Test Battery 


Overall, the Luria-South Dakota Neuropsycho- 
logical Test Battery consists of 285 measures and 
can be administered in less than 2½ hours to a sig-
nificantly impaired individual. The battery requires
several pieces of inexpensive equipment. First, a 
series of cards with pictures and word items pub- 
lished by Christensen (1975c) are needed. Several 
additional pictures (available from the first author) 
are necessary to replace some of the Christensen 
items that were found to be ineffective with an 
American population. 

In addition, the battery requires the following 
objects: 1 (a) a 13-cm black comb; (b) a 24X# 
inch (5.5 X 6 cm) rubberband; (c) a paper clip, 
jumbo size; (d) a Bow Compass 5178 (available 
from the Empire Pencil Company, Shelby, Tennes- 
see 37160); (e) a Pedigree Quality Eraser 2910 
(also available from the Empire Pencil Company); 
(f) a key (WR2 Curtis 177); (g) a straight pin; 
(h) a quarter; (i) a metric ruler; and (j) an 
audiotape for some of the rhythm and verbal 
items. 

The items in the battery are generally adapted 
from Christensen (1975a, 1975b, 1975c)² with
slight modifications as necessary in order to estab- 
lish a standard administrative or scoring proce- 
dure. The items fall roughly into 10 categories: 


Motor functions. This section includes a series 
of tasks requiring the reproduction of simple mo- 
tor movements with the hands, mouth, and tongue, 
both when a model is provided and under verbal 
instructions alone. The section also evaluates simple 
coordination, optical-spatial organization, complex 
sequencing of behavior, and the ability to draw.
Sample items on this scale include: 


1. Using your right hand, touch your fingers in
turn with your thumb as quickly as you can
while you count them. [With palm facing up,
demonstrate and then have the subject practice
before timing. Allow 10 sec. Score the number
of complete times the subject does the sequence
accurately.]


1 Information on obtaining the materials used
in the battery, copies of the battery, and the cur-
rent test manual can be obtained from the first
author.

2 The items in the battery have been modified
from Luria’s Neuropsychological Battery with per-
mission. (Copyright 1975 by Anne-Lise Christensen
and Munksgaard, Copenhagen, Denmark, and Spec-
trum Publications, Inc., 175-20 Wexford Terrace,
Jamaica, New York 11432.)


5. Close your eyes and place your right hand in 
the same position as I place it first. [Press thumb 
against fifth finger for 2 sec, then return to 
normal position. Score correct or incorrect.] 


11. Do as I do. [Place right hand under chin 
with fingers bent.] 


22. With your hands in front of you, tap your
right hand twice and your left hand once, chang-
ing smoothly from one hand to the other like
this. [Demonstrate and allow the subject to
practice.] Do this as quickly as you can until
I tell you to stop. [Allow 10 sec. The score is
the number of fully correct sequences.]


25. Show me how to work with scissors. 


36. Without lifting your pencil from the paper, I 
want you to draw the best square you can. [Score 
for time and quality. All quality items have ob- 
jective grading requirements in the test manual.] 


44. [Show picture of a square]. Draw this. [Score 
as above]. 


50. If I knock once, raise your right hand; if I
knock twice, raise your left hand. [Give four 
trials, alternating one and two knocks. Score 
number of errors.] 


Rhythm (acoustico-motor functions). This sec-
tion includes items requiring the individual to
differentiate between sounds with different pitch
and rhythmic relationships. The subject must indi-
cate whether sounds are the same or different as well
as reproduce rhythmic and pitch pat-
terns. Sample items included:


[?]. Now you are going to hear two tones on a
tape. Tell me whether the tones are the same or
different. [Play six pairs of tones. The score is
the number of errors.]


[?]. Again, you will hear two tones. Tell me which
is higher, the first or second tone. [Play tape
with five pairs. Score errors.]


58. Tell me how many beeps you hear. [Play
four groups of beeps from tape. Score number of
errors.]


62. You will now hear a rhythm on the tape.
When I tell you that the rhythm is over, I want
you to tap with your hand the rhythm you
heard on the tape. [Play three rhythms. Score as
right or wrong.]


Cutaneous and kinesthetic functions (tactile).
This section evaluates complex cutaneous function,
muscle and joint sensations, and stereognosis. Kin-
esthetic assessment requires a blindfolded subject to
identify the direction of limb movements and re- 
produce limb positions. Cutaneous assessment in- 
cludes evaluation of threshold, localization, stimu- 
lus identification, and two-point finger discrimina- 
tion. The assessment for stereognosis requires a 
blindfolded subject to identify common objects 
placed in the palm of the hand under both active 
and passive palpating conditions. Sample items 
include: 


64. Tell me where I am touching you. [Have 
blindfolded subject in sitting position with hands 
in front and palms facing up. Touch the subject 
with the eraser end of pencil, alternating among 
right and left fingers (numbered 1-5), palms 
(P), forearms (F), and shoulders (S). If uncer- 
tain of where the subjects report touch from 
verbal report, have the subjects indicate the 
place touched with their opposite hand. Touches 
should be in the following order: 


Right hand: 1 F 3 5 P 2 S 4
Left hand: P 2 3 S 5 4 F 1
Score for number of errors on each hand.]


80. Now I will put your left arm in a certain 
position; try to put the other arm in the same 
position. [Extend left arm of blindfolded sub- 
ject in front at 90°.] 


82. Feel this object and tell me exactly what it 
is. [Instruct the subject to hold their right palm 
up and place objects on the fingers. Alternate 
between hands in this manner. Allow 10 sec per 
item. Score for correctness of answer, and time 
each response. Objects include quarter, key, eraser, 
and jumbo paper clip.] 


Visual functions. This section includes a series of 
tasks assessing the integrity of visual-spatial per- 
ception, including the identification of objects and 
pictures, identifying the missing elements in complex 
geometric configurations (similar to the tasks in 
Raven’s Progressive Matrices) and constructing geo-
metric patterns from blocks. Subjects must identify
time on clocks with no numbers and show spatial 
and directional orientation. Finally, the ability of a 
subject to perform spatial rotations and transforma- 
tions is assessed. 

Sample items include: 


86. What do you call this object? [Examiner 
presents the subject with objects one at a time. 
Allow 10 sec per item. Items are pencil, eraser,
rubber band, and quarter.]


87. What is this picture supposed to be? [Present
pictures one at a time and allow 10 sec for each.
Pictures include a purse, a nut cracker, a glass
vial, a camera, and an egg carton. Score the
number wrong.]


88. [Show clock faces.] Tell me what time these
clock faces show. 


97. This drawing shows a stack of blocks in three 
dimensions. Tell me how many blocks are in the 
stack. [Be sure to include those you see as well 
as those you don’t see.] 


Impressive speech. This section assesses a sub- 
ject’s ability to discriminate basic English phonemes 
and to reproduce the discriminations orally or by 
writing; to name familiar and unfamiliar objects 
among a series of pictures; and to respond to state- 
ments and questions that require the understanding 
of genitive, prepositional, comparative, and complex 
grammatical constructions. 

Sample items included: 


100. You will hear some sounds on the tape. 
What I want you to do is first repeat exactly 
the sound you hear and then write down the 
letter of the alphabet that goes along with the 
sound. For example, if you hear the sound “ta”
you would say “ta” and then write down the
letter “t”. [Score oral and writing errors sepa-
rately.]


110. I will place some pictures before you. I want 
you to point at the shoe, the candle, the stove, 


114. Put your hand on your head.


125. Which boy is shorter if John is taller than 
Peter? 


130. Is the following sentence said by a disciplined 
or an undisciplined person? “I am unaccustomed 
to disobeying rules.”


Expressive speech. This section includes tasks 
requiring the articulation of simple speech sounds, 
familiar and unfamiliar words of varying lengths, 
and phrases or sentences of varied length and
complexity. The test also requires the subject to
name and classify objects and to produce narrative
speech.


Sample items include: 


133. Repeat after me: (a) a (as in late); (b) i
(as in light); (c) [...]; (d) [...] (as in
baby); and (e) sh (as in [...]).


137. Repeat after me: (a) hairbrush, screwdriver,
laboro[...]


138. Repeat after me: house-ball-chair. [Say all
three as one item.]


157. What objects do these pictures represent?
[Show five pictures.]


163. Say the days of the week backwards starting 
with Sunday. 


164. Tell me what’s happening in this picture. 
[Present picture. Score time to respond and num- 
ber of words in first five seconds of response.] 


Reading and writing. This section requires the 
subject to break words into their component sounds 
or letters; to integrate sounds or letters into words; 
to copy letters and words; to write words of vary- 
ing complexity from dictation; and to read sounds, 
words, phrases and paragraphs. 

Sample items: 


178. Copy these in your own handwriting. [Pre- 
sent card.] 


183. Write these words [dictate]: wren, knife.


188. What sound is made by the letters (a) 
g-r-o and (b) p-l-y?


192. Read these sounds. [Present card with the
sounds po, cor, cra, spro, and prot.] 


197. Read these sentences. [Present cards.] 


Arithmetical skills. This section requires a sub- 
ject to identify Arabic and Roman numerals, to 
identify the significance of digit placement, to 
compare numbers of varying size, and to do simple 
arithmetic operations (multiplication, addition, sub- 
traction) and simple algebraic manipulations. The 
ability to form arithmetic series is also evaluated.
Items are presented both orally and visually.

Sample items:


203. Write these numbers: 71, 17, 69, 96. 
206. Read these numbers [on card]: 7-9-3, 3-5-7. 


210. Tell me which is larger: 17 or 68? 23 or 56? 
189 or 201? 


216. Add these numbers in your head: 5, 9, 7. 


219. What is the missing number? [Present card
with the following:]


Mnestic processes. This section involves a series 
of tasks assessing an individual’s retention and 
retrieval skills for visual, acoustic, and kinesthetic 
inputs. Subjects must work with both verbal and 
nonverbal material. The effects of retroactive and
proactive interference are also examined.

Sample items:


227. I am going to show you a card, and I want 
you to look at it carefully. When I remove the 
card, I want you to draw as much from it as you 
can remember. [Present card for 7 sec.] 


231. I want you to remember some words that I 
am going to say: house-tree-cat. Now, look at 
this picture; what do you see? [Allow the subject 
to describe the picture for 15 sec.] What were the 
words? 


Intellectual processes. This section requires the 
subject to interpret the themes of pictures, to dem- 
onstrate vocabulary skills, to form concepts, to 
classify objects, to understand analogies, to com- 
plete complex arithmetical problems, and to show 
logical reasoning skills. 

Sample items: 


245. What is meant by the expression “iron 
hand”? “green thumb”? [Score for quality.] 


249. In what way are table and sofa alike? In
what way are an ax and a saw alike? 


256. What has the same relation to good as high
has to low? 


258. Peter had two apples and John had six 
apples. How many did they have together? 


Other items in the test, in addition to the exam-
ples shown, test the same processes, varying the
difficulty, modality involved, instructions, or hand
that is used.


Procedure

In the case of each potential subject, written
permission was obtained to review medical records
in order to assess whether the subject met the
[...] requirements of a confirmed diagnosis for
[...]. Individuals whose diagnosis was ques-
tionable as to the presence or absence of brain
injury were not included in the study, unless a
[...] diagnosis was later established.
Once a patient had assented to participate, ad-
ministration was arranged at a time that did not
interfere with the hospital schedule. The majority
of administrations occurred at bedside, although
rooms for testing were occasionally available.

Because of the relatively short time needed to ad-
minister the battery, the testing was done in a single
session. In some cases, however, the session was
divided by visitors or hospital tests.


Scoring

Items were scored by a variety of methods
according to the nature of the item and the quali-
tative dimension or dimensions that were to be
assessed by the item. As can be seen in the sample
items already presented, many items could be
scored as right or wrong, or the overall number of
errors could be counted on an objective basis. Time
to complete an item or latency of response was
also frequently used. Other scoring criteria included
frequency of response in a specified time, trials to
correct performance, and number of items com-
pleted. In some cases, quality of response was
judged. The test manual gives extensive instruc-
tions on the evaluation of such responses.

For all items, the raw scores were recorded and 
then converted into a 0, 1, 2 system. A score of 0 
was intended to represent the performance char- 
acteristic of a normal individual. A score of 1 rep- 
resented performance intermediate to that of nor- 
mals and brain-damaged patients. A score of 2
represented performance characteristic of brain 
damage. The scoring for each item was established 
by finding cutoff points that maximized the dis- 
criminative effectiveness of the item in a group of 
75 subjects collected early in the development of 
the battery (37 normals, 38 brain damaged). This 
was done by maximizing the chi-square that com- 
pared the number of subjects in each category who 
were diagnosed correctly or incorrectly. 
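
The cutoff search described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure actually used: it assumes a raw error score on which higher values indicate poorer performance, tries every pair of cutoffs that maps raw scores to 0, 1, or 2, and keeps the pair giving the largest Pearson chi-square on the resulting group-by-category table. The function names and the raw scores are hypothetical.

```python
from itertools import combinations


def chi_square(table):
    """Pearson chi-square for a table given as a list of rows of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:  # skip empty categories
                stat += (observed - expected) ** 2 / expected
    return stat


def to_category(raw, low_cut, high_cut):
    """Map a raw error score to 0, 1, or 2 (higher raw score = worse)."""
    if raw < low_cut:
        return 0
    if raw < high_cut:
        return 1
    return 2


def best_cutoffs(normal_raw, injured_raw):
    """Try all cutoff pairs; return (chi_square, low_cut, high_cut)."""
    candidates = sorted(set(normal_raw) | set(injured_raw))
    best = None
    for low_cut, high_cut in combinations(candidates, 2):
        table = [[0, 0, 0], [0, 0, 0]]
        for raw in normal_raw:
            table[0][to_category(raw, low_cut, high_cut)] += 1
        for raw in injured_raw:
            table[1][to_category(raw, low_cut, high_cut)] += 1
        stat = chi_square(table)
        if best is None or stat > best[0]:
            best = (stat, low_cut, high_cut)
    return best


# Hypothetical raw error counts for a single item (not data from the study).
normal_raw = [0, 0, 0, 1, 0, 1, 0, 0, 2, 0]
injured_raw = [3, 4, 2, 5, 1, 4, 3, 6, 2, 5]
chi2, low_cut, high_cut = best_cutoffs(normal_raw, injured_raw)
print(low_cut, high_cut, round(chi2, 2))
```

Because the cutoffs are tuned on the same subjects they are later applied to, a sketch like this will tend to overstate discrimination unless the cutoffs are checked on a separate sample.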

Alternate scoring systems were also tried that 
contained wider ranges, including a 0, 1, 2, 3 sys-
tem and a 0, 1, 2, 3, 4 system. However, these
failed to add any additional discriminative validity 
to the tests. Hence, it appears that the simpler 
system should be used. For researchers wishing to 
compare scores on tests more finely, the raw score 
data allow this to be accomplished. 

To evaluate the reliability of the scoring system, 
the test was administered by one examiner in the
presence of a second examiner. Each examiner 
scored the test independently. This procedure was 
repeated for five patients. In each repetition, a 
different pair of examiners was used; thus the 
procedure involved 10 examiners. Overall, there 
were 1,425 pairs of scores available. The rate of 
agreement was over 95%. 
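
The agreement figure is a simple percentage of matching score pairs, and the count of 1,425 pairs follows from 285 scores on each of five patients. A minimal sketch, with hypothetical examiner scores standing in for the real protocols:

```python
def percent_agreement(scores_a, scores_b):
    """Percentage of items on which two examiners gave the same score."""
    if len(scores_a) != len(scores_b):
        raise ValueError("score lists must be the same length")
    matches = sum(a == b for a, b in zip(scores_a, scores_b))
    return 100.0 * matches / len(scores_a)


ITEMS_PER_PROTOCOL = 285
PATIENTS = 5
total_pairs = ITEMS_PER_PROTOCOL * PATIENTS  # pairs of scores across patients

# Hypothetical 0/1/2 scores from two examiners on ten items.
examiner_1 = [0, 1, 2, 0, 0, 2, 1, 0, 2, 2]
examiner_2 = [0, 1, 2, 0, 1, 2, 1, 0, 2, 2]
agreement = percent_agreement(examiner_1, examiner_2)
print(total_pairs, agreement)
```

Raw percentage agreement does not correct for matches expected by chance, which matters when most scores fall in a single category.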


Results 


t tests were run between the control and 
neurological group on all 285 scores gen- 
erated by the test. Of the 285 comparisons 
made, 253 were significant at the .05 level 
(df = 98). In all cases, the neurological 
group performed less effectively (a higher 
score) than the control group. Of the 32 
items not significant at the .05 level, 16 ex-
ceeded the .20 level of probability. On these
32 items, the neurological group performed
more poorly on 30 items and showed identical per-
formance on two items.

3 Copies of the test, a test manual describing
scoring criteria, and procedures for each item are
available from the first author.

Because of the educational differences 
between the groups, one-way analyses of 
variance were calculated using education as 
a covariate. In no case did any significant 
item become nonsignificant because of the 
control of the educational difference, al- 
though education contributed a small amount 
of variance to each item. In no case did a 
nonsignificant item become significant be- 
cause of the use of the covariate. 

To further evaluate the potential effective- 
ness of the battery, a discriminant analysis 
was run using all 285 measures. It was found 
that a weighted, linear combination of 30 
variables was sufficient to separate the 
groups with 100% accuracy. 
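
For readers unfamiliar with the technique, a two-group linear discriminant can be sketched in a few lines. The example below is an illustration only: it uses two hypothetical variables rather than 30, Fisher's rule with a pooled covariance matrix and a midpoint cutoff, and invented scores; it is not a reconstruction of the analysis reported here.

```python
def mean_vector(rows):
    """Column means of a list of (x, y) observations."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def pooled_covariance(group_a, group_b):
    """Pooled within-group 2x2 covariance matrix."""
    sums = [[0.0, 0.0], [0.0, 0.0]]
    for rows in (group_a, group_b):
        mean = mean_vector(rows)
        for row in rows:
            dev = [row[0] - mean[0], row[1] - mean[1]]
            for i in range(2):
                for j in range(2):
                    sums[i][j] += dev[i] * dev[j]
    df = len(group_a) + len(group_b) - 2
    return [[sums[i][j] / df for j in range(2)] for i in range(2)]


def fisher_discriminant(group_a, group_b):
    """Return weights w = S^-1 (mean_b - mean_a) and the midpoint cutoff."""
    mean_a, mean_b = mean_vector(group_a), mean_vector(group_b)
    s = pooled_covariance(group_a, group_b)
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    inv = [[s[1][1] / det, -s[0][1] / det],
           [-s[1][0] / det, s[0][0] / det]]
    diff = [mean_b[0] - mean_a[0], mean_b[1] - mean_a[1]]
    w = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
         inv[1][0] * diff[0] + inv[1][1] * diff[1]]
    mid = [(mean_a[0] + mean_b[0]) / 2, (mean_a[1] + mean_b[1]) / 2]
    cutoff = w[0] * mid[0] + w[1] * mid[1]
    return w, cutoff


def hit_rate(group_a, group_b, w, cutoff):
    """Percentage of subjects placed in their own group by the rule."""
    def score(row):
        return w[0] * row[0] + w[1] * row[1]
    hits = sum(score(r) < cutoff for r in group_a)
    hits += sum(score(r) >= cutoff for r in group_b)
    return 100.0 * hits / (len(group_a) + len(group_b))


# Hypothetical summed item scores on two variables (not data from the study).
controls = [(2, 1), (3, 2), (1, 1), (4, 3), (2, 2)]
injured = [(8, 6), (9, 5), (7, 7), (10, 8), (6, 5)]
weights, cutoff = fisher_discriminant(controls, injured)
rate = hit_rate(controls, injured, weights, cutoff)
print(rate)
```

A hit rate computed on the same subjects used to derive the weights is generally optimistic, and more so as the number of variables approaches the number of subjects.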


Discussion 


The results of the current study clearly
support the validity of the standardized bat-
tery. Nearly 90% of the items suggested by
Luria, and modified so as to be objective,
significantly discriminated between the
brain-injured and normal subjects. In addi-
tion to the excellent results with the indi-
vidual items, the discriminant analysis
achieved a 100% hit rate using only 30 of
the 285 scores. This result is comparable to 
the results achieved by a neuropsychological 
test battery of any kind. It is recognized 
that the hit rate might not be as high in a 
cross-validation. Despite this, the results 
clearly illustrate the strong potential of the 
assessment approach reflected in the battery. 

An evaluation of those items that failed to
discriminate between groups revealed no con-
sistent pattern. Some of the items were too
easy and were missed by no one, such as
“Show me your teeth.” Others were clearly
quite difficult: “Which is correct: ‘The earth
is illuminated by the sun’ or ‘The sun il-
luminates the earth’?” (Correct answer:
Both are correct). In other cases, no fully
comprehensive scoring method could be
devised. In this category, one deceptively
simple item, “Show me how to frown,” resisted
all attempts to devise an effective
scoring procedure.

Despite the impressive results of the cur-
rent study, there is a need for more exten-
sive research before the battery can be used 
in clinical situations on a regular basis. One 
important research goal will be the estab- 
lishment of a basic summary scoring system 
so that performance in general areas (spatial 
skills, speech, cognition, among others) can 
be easily determined and compared. 

A second research area is the demonstra- 
tion of the effectiveness of the battery in a 
population including psychiatric patients, a 
setting in which extensive neuropsychological 
evaluation takes place. It is necessary to 
examine the effectiveness of the battery with 
psychiatric patients who have a chronic 
history and a long duration of hospitaliza- 
tion, subjects for whom neuropsychological 
tests are often highly ineffective (Golden, 
1977, 1978; Lezak, 1976). 

A third research area is assessing the abil- 
ity of the battery to localize injuries and aid 
in the identification of underlying neuro-
logical processes. Luria’s theoretical system
suggests that the battery should be highly
effective in this regard, perhaps more effec- 
tive than any other comparable neuropsycho- 
logical battery. 

In this regard, it may also be possible to 
develop scales designed to specifically mea- 
sure such factors as laterality, localization, 
or process. This scale development would be 
analogous to the process used with the Min- 
nesota Multiphasic Personality Inventory and 
other similar tests. 

The final area of necessary research is to 
examine the effects of variables such as age, 
education, medication, intelligence, chronic- 
ity, and severity of a disorder. As the effects 
of these variables are determined, appropri- 
ate clinical procedures can be devised to cor- 
rect for these factors or to include them in a 
clinical analysis. 

In addition to its usefulness in neurodiag-
nosis, the battery is a potentially useful in-
strument in rehabilitation planning. As was
seen in the Method section, the battery evalu-
ates a wide range of abilities, allowing the
clinician to identify the specific areas in
which an individual has problems as well as
the underlying deficits in an individual’s
functional systems caused by a brain injury.
There is also the ability to evaluate those abili-
ties that remain intact. This information can
be used to plan rehabilitation tasks designed 
to retrain a lost ability or to use intact abili- 
ties to reformulate a functional system 
(Golden, 1976, 1978; Luria, 1963). This 
potential of the battery requires careful and 
systematic evaluation by rehabilitation psy- 
chology. 

At present, we are carrying out further 
studies aimed at meeting some of the im- 
portant research requirements outlined above. 
The present study clearly indicates that the 
standardized Luria battery has tremendous 
potential and may be the extensive, highly 
effective, economical, and standardized bat- 
tery that neuropsychology will need as the 
field grows into more diverse settings. Future 
research should determine the extent to which
the battery fulfills this potential.


This study examines the ability of a standardized battery of tests suggested by
the extensive work of A. R. Luria to discriminate between brain-injured and 
schizophrenic patients. An earlier study reported 93% effectiveness for the 
standardized battery in discriminating brain-injured patients and normal con- 
trols. In the present study, the battery was administered to 100 schizophrenic 
and brain-injured patients. Chronicity was 121 months (SD = 125) for the 
schizophrenic group and 56 months (SD = 116) for the neurological patients. 
Of the 282 items in the battery, schizophrenics showed significantly better per- 
formance on 72 items (p <.05). A discriminant analysis using 60 items dem- 
onstrated 100% diagnostic accuracy. Schizophrenics performed better on 10 of 
14 summary measures (p < .01). A discriminant analysis using the 14 summary 
measures achieved 88% diagnostic accuracy. The accuracy shown by the battery
is as high as the results obtained using other tests in a comparable chronic
schizophrenic sample. The usefulness of the battery in schizophrenic and brain-
injured patients is discussed.


University of Nebraska Medical Center

Requests for copies of the test should be sent to Charles J. Golden,
Nebraska Psychiatric Institute, University of Nebraska Medical Center,
Omaha, Nebraska 68105.


The differential diagnosis between psycho-
logical problems due to schizophrenia and
those due to brain injury is a significant issue
in the field of clinical neuropsychology. Sev-
eral diagnostic instruments have been devel-
oped for this purpose, ranging from short,
individual test procedures such as the Bender-
Gestalt to long test batteries such as the
Halstead-Reitan.

Despite many attempts to find the ideal
test or tests, no instrument yet developed
consistently separates schizophrenic from
brain-injured groups. Although many studies
have reported statistically significant group
differences (Yates, 1954, 1966), these results
do not reflect the real clinical utility of a
test. A test can discriminate between groups
on the basis of mean differences in perform-
ance, but there may still be too much over-
lap between members within each group to
allow accurate individual diagnosis. There-
fore, the major measure of clinical effective-
ness is the hit rate, the percentage of pa-
tients accurately diagnosed by a test (Yates,
1954).
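
The distinction between a group mean difference and a hit rate can be made concrete with a small sketch. The scores below are hypothetical and chosen so that the groups differ by a full point on average yet overlap heavily; the helper names are illustrative. Even the best single cutoff then classifies only a modest share of the cases correctly.

```python
def classification_rate(scores_a, scores_b, cutoff):
    """Percent correct when scores at or above the cutoff are called group B."""
    hits = sum(s < cutoff for s in scores_a)
    hits += sum(s >= cutoff for s in scores_b)
    return 100.0 * hits / (len(scores_a) + len(scores_b))


def best_cutoff(scores_a, scores_b):
    """Cutoff among the observed scores that gives the highest percent correct."""
    candidates = sorted(set(scores_a) | set(scores_b))
    return max(candidates,
               key=lambda c: classification_rate(scores_a, scores_b, c))


# Hypothetical test scores (higher = more impaired); not data from any study.
schizophrenic = [4, 5, 6, 6, 7, 7, 8, 8, 9, 10]
brain_injured = [5, 6, 7, 7, 8, 8, 9, 9, 10, 11]

mean_difference = (sum(brain_injured) / len(brain_injured)
                   - sum(schizophrenic) / len(schizophrenic))
cutoff = best_cutoff(schizophrenic, brain_injured)
best_rate = classification_rate(schizophrenic, brain_injured, cutoff)
print(mean_difference, cutoff, best_rate)
```

With larger samples a mean difference of this size could easily reach statistical significance while the individual classification accuracy stayed the same.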

Unfortunately, many studies do not report 
hit rates (Spreen & Benton, 1965). Studies 
that have reported hit rates when evaluating 
the diagnostic effectiveness of an assessment 
technique in discriminating between schizo- 
phrenic and brain-injured patients have indi-
cated a wide scope of results ranging from
chance levels of 50% to over 90% accuracy.
Generally, no test or test battery has shown
consistently good results in studies using
psychiatric patients (see Golden, 1978).

Three primary factors have been identified
by various reviewers to explain the incon-
sistent results in the literature: differences in
chronicity of schizophrenia, the comprehen-
siveness of assessment procedures, and the
methods of data analysis (Davison, 1974). In
general, chronicity is positively related to
decrements in adaptive functioning as mea-
sured by a wide variety of psychometric tests.
Long-term chronic schizophrenics are likely to
reveal deficits similar to those seen in neuro-
logical patients. The majority of studies re-
porting nonsignificant results or low hit rates
have used long-term chronic schizophrenics.
For example, Watson and his collaborators
(Watson, 1968, 1972; Watson, Thomas, An-
dersen, & Felling, 1968; Watson, Thomas,
Felling, & Andersen, 1968; Watson, Thomas,
Felling, & Andersen, 1969; Watson & Uecker,
1966) tested schizophrenics who had an
average length of hospitalization of over 10
years. Those studies reporting positive re-
sults have used schizophrenics with gen-
erally short hospitalizations of under 1 year
and less severe disorders (e.g., Golden,
1977).
A second factor is the ability of the tests 
to assess all neuropsychological skills. Or- 
ganicity cannot be considered as a unitary 
entity resulting in identical disturbances 
across patients (Boll, 1974). In a mixed 
brain-injured group, the differences in the 
nature and loci of lesions for each individual
result in unique patterns of disturbed func- 
tioning. Procedures not providing compre- 
hensive assessment may fail to reveal lesions 
that manifest disturbances in abilities other 
than those included in the assessment. Accordingly,
the Halstead-Reitan, presently the
most comprehensive standardized test battery,
has been able to demonstrate the highest
hit rates with neurologically impaired
populations (Golden, 1977).
Another problem exists with some of the
widely used tests of brain damage. These
tests are comprised of items that require the
integrated working of several neuropsychological
areas (Lezak, 1976). These types
of items are often so complex that even
non-brain-injured patients, particularly schizophrenics,
may perform poorly. As a result, many patients
without neurological problems may be
falsely diagnosed as organically impaired.
For example, poor performance on the Bender-Gestalt
may not be a positive indicator
of organicity; it may simply reveal
psychological dysfunction without regard to
etiology (Golden, 1977).


The method of analyzing test results is a
third factor to consider in evaluating the
efficacy of various diagnostic instruments.
Most studies of tests that do not demon- 
strate significant group differences have 
based their comparisons on global measures 
representing overall test performance. Per- 
formance on individual test items is obscured 
in these global measures, and much poten- 
tially useful diagnostic information is lost. As 
such, global quantitative comparisons may 
not be a fair method of assessing a test’s true 
diagnostic effectiveness (Lezak, 1976). For 
example, Goldstein and Neuringer (1966) 
argued that a qualitative analysis of the 
Trail Making Test, part of the Halstead- 
Reitan, increased hit rates to nearly 80%, 
whereas global scores were only able to 
achieve separation at chance levels. Hewson 
(1949a, 1949b) demonstrated up to 90%
diagnostic accuracy using analyses of Wech- 
sler Adult Intelligence Scale (WAIS) sub- 
test relationships, whereas studies comparing 
group mean performances did not show con- 
sistent significant differences (Matarazzo, 
1972). 

The statistical methods that are used may 
also affect the results of a study. Many stud- 
ies have used univariate tests to detect group 
differences for each measure under consid- 
eration. This practice may result in lower hit 
rates than when maximum information is 
obtained by combining tests. A study by 
Golden (1978) demonstrated more reliable 
discrimination between schizophrenics and 
brain-injured subjects when the results of all 
measures in a test battery were statistically 
combined by a discriminant analysis than 
when each measure was considered separately
by t tests.
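The advantage of combining measures can be illustrated with a small worked example. The sketch below fits a two-variable Fisher linear discriminant in plain Python and compares its hit rate with that of each measure used alone. The data, the group labels, and all function names are hypothetical, and the hit rates are computed on the same cases used to fit the rule, so they are optimistic; this is not the analysis reported by Golden (1978).

```python
def mean(values):
    return sum(values) / len(values)


def fisher_rule(group_a, group_b):
    """Fit a two-variable Fisher linear discriminant.

    Returns (weights, threshold); a case is assigned to group A when
    weights[0] * x + weights[1] * y exceeds the threshold.
    """
    ma = (mean([p[0] for p in group_a]), mean([p[1] for p in group_a]))
    mb = (mean([p[0] for p in group_b]), mean([p[1] for p in group_b]))
    dof = len(group_a) + len(group_b) - 2
    sxx = syy = sxy = 0.0
    for points, m in ((group_a, ma), (group_b, mb)):
        for x, y in points:
            sxx += (x - m[0]) ** 2
            syy += (y - m[1]) ** 2
            sxy += (x - m[0]) * (y - m[1])
    # Pooled within-group covariance matrix [[sxx, sxy], [sxy, syy]].
    sxx, syy, sxy = sxx / dof, syy / dof, sxy / dof
    det = sxx * syy - sxy * sxy
    if det == 0:
        raise ValueError("pooled covariance matrix is singular")
    dx, dy = ma[0] - mb[0], ma[1] - mb[1]
    # weights = inverse(covariance) * (mean of A - mean of B)
    weights = ((syy * dx - sxy * dy) / det, (sxx * dy - sxy * dx) / det)
    threshold = sum(w * (a + b) / 2 for w, a, b in zip(weights, ma, mb))
    return weights, threshold


def combined_hit_rate(group_a, group_b):
    """Percentage of cases classified correctly by the fitted discriminant."""
    w, thr = fisher_rule(group_a, group_b)
    hits = sum(w[0] * x + w[1] * y > thr for x, y in group_a)
    hits += sum(w[0] * x + w[1] * y <= thr for x, y in group_b)
    return 100.0 * hits / (len(group_a) + len(group_b))


def single_hit_rate(values_a, values_b):
    """Percentage correct using one measure and the midpoint between group means."""
    ma, mb = mean(values_a), mean(values_b)
    thr = (ma + mb) / 2
    if ma > mb:
        hits = sum(v > thr for v in values_a) + sum(v <= thr for v in values_b)
    else:
        hits = sum(v < thr for v in values_a) + sum(v >= thr for v in values_b)
    return 100.0 * hits / (len(values_a) + len(values_b))


# Hypothetical scores on two measures for two groups of four patients each.
group_a = [(0, 1), (1, 3), (2, 3), (3, 4)]
group_b = [(1, 0), (3, 1), (3, 2), (4, 3)]

print(single_hit_rate([p[0] for p in group_a], [p[0] for p in group_b]))
print(single_hit_rate([p[1] for p in group_a], [p[1] for p in group_b]))
print(combined_hit_rate(group_a, group_b))
```

With these invented scores each measure alone classifies 75% of the cases correctly, whereas the weighted combination separates the groups completely.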

A. R. Luria, the Russian neuropsycholo- 
gist, has developed a set of procedures in- 
tended to assess systematically each area of 
neuropsychological functioning. Based on 
Luria’s extensive theoretical contributions to 
neuropsychology (e.g., Luria, 1966, 1973),
the items yield a qualitative assessment of the 
major neuropsychological skills. 

Luria’s procedures have several advan- 
tages. First, they allow for a comprehensive 
assessment of neuropsychological abilities. 
Second, complex behaviors are systematically 
analyzed by items assessing specific neuropsychological
skills. This feature provides information
about the qualitative nature and
degree of a disturbance that can be useful 
for diagnostic and rehabilitative purposes 
(Luria, 1963). Finally, a relatively short 
administration time makes it more likely that 
a patient’s attention and interest can be 
sustained (Golden, 1978; Smith, 1975). 
These characteristics suggest that it should 
be a highly useful diagnostic instrument, 
even in chronic schizophrenic populations. 
Despite these advantages, the tests have 
been criticized. Stressing the importance of 
flexibility when testing neurological patients, 
Luria does not provide a standard manner in 
which to administer his tests. Instead, he 
recommends modifying procedures as neces- 
sary to account for patient needs and specific 
referral questions. In addition, Luria does
not provide for rating responses. Judgment 
and clinical intuition are the methods of 
test interpretation rather than comparisons 
to normative standards. Reitan (1976) has 
criticized the lack of standardization and the 
lack of scoring criteria, contending that 
Luria’s opinion is the only measure of va- 
lidity that has been used with the battery. 
As a result of these criticisms, we (Golden,
Hammeke, & Purisch, 1978) developed an
alternate version of Luria’s battery. We attempted
to eliminate the weaknesses discussed
above while maintaining the positive
features. Standard administration procedures
were defined, and a scoring system was
developed. The scoring included separate
scores for the important dimensions of each
response. Another consideration was to keep
administration time as short as possible without
sacrificing comprehensiveness.

Table 1
Frequency and Percentage of Diagnostic Subtypes
for Brain-Injured and Schizophrenic Subjects

Diagnostic subtype                   F     %

Neurological diagnosis
  Cerebral trauma                   10    20
  Neoplasms                          6    12
  Infectious diseases                3     6
  Cerebral vascular disorder        14    28
  Degenerative diseases              6    12
  Epilepsy                           4     8
  Metabolic and toxic disorders      3     6
  Congenital disorders               4     8

Schizophrenic diagnosis
  Catatonic                          2     4
  Hebephrenic                        3
  Paranoid                          19
  Simple
  Undifferentiated
  Schizoaffective

Hammeke, Golden, and Purisch (in press)
have shown the standardized battery to be
highly effective in discriminating brain-injured
and hospitalized control patients. We
found that the individual items in the test
discriminated experimental and control patients
100% of the time. We also found that
a combination of 14 summary scores could
achieve a 93% hit rate. These rates are as
good as or better than the results seen in any 
other single test or test battery (e.g., see 
Golden, 1977; Lezak, 1976). 

Since the discrimination of brain-damaged 
and schizophrenic patients appears to be the 
most difficult task for a neuropsychological 
test battery or procedure, the present study 
is an attempt to evaluate the effectiveness of 
the standardized version of Luria’s tests 
when comparing these two populations. If the 
test is able to make this discrimination, then 
the clinical value of the Luria in diagnostic 
work would be strongly increased. 


Method 
Subjects 


The subjects were 100 hospitalized patients. All 
were approached at the recommendation of their 
physician, and all volunteered to participate in the
study. Fifty subjects had confirmed neurological
diagnoses made on the basis of medical examination
by a qualified physician, usually a neurologist or
neurosurgeon. The 50 other subjects were patients
diagnosed by psychiatrists as having a schizophrenic
disorder. No schizophrenics were included for whom
there was a possibility of organicity as indicated by
a history of seizures, alcoholism, or head trauma.
Table 1 presents the diagnostic subtypes of both
the schizophrenic and brain-injured subjects.

Overall, there were 56 male and 44 female subjects.
No significant difference in the sexual distribution of
the two groups was present. Other background data
were obtained from the subjects’ hospital files after
securing their signed informed consent. Table 2
presents the means and standard deviations of each
group for age, education, length of hospitalization,
duration of illness (chronicity), age of onset of
illness, and number of previous hospitalizations.


Table 2
Means, Standard Deviations, and t Tests for Demographic Indices

                                   Brain injuredᵃ       Schizophrenicᵃ
Index                                M        SD          M          SD       tᵇ
Age (years)                        44.36    18.83       41.32      14.52      .90
Education (years)                  10.30     2.84       11.38       2.59     1.97
Length of hospitalization (days)   19.62    43.38      410.42   1,569.32     1.76
Chronicity (months)                 6.24   116.40      121.02     125.03     2.67**
Age of onset (years)               39.42    22.34       32.08      12.26     2.04*
Previous hospitalizations           1.50     2.05        4.29       3.48     4.79***

ᵃ n = 50.
ᵇ df = 98.
* p < .05.
** p < .01.
*** p < .0001.


This table indicates that schizophrenics had significantly
higher means for chronicity and number of
previous hospitalizations. Schizophrenics also had a
significantly earlier mean age of onset compared to
the brain-damaged subjects. Length of present hospitalization
was considerably longer for the schizophrenics,
but this difference was not significant. No
significant group differences were found for age
or years of education.
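The t values in a table of this kind can be checked from the printed means and standard deviations. The sketch below assumes an independent-samples t test with pooled variance, which the reported df of 98 suggests but the text does not state; the function name is hypothetical. It is applied to the age row (brain injured: M = 44.36, SD = 18.83; schizophrenic: M = 41.32, SD = 14.52; n = 50 per group).

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Independent-samples t statistic (pooled variance) from summary statistics."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    standard_error = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (mean1 - mean2) / standard_error, df


# Age row of the demographic table.
t_age, df = pooled_t(44.36, 18.83, 50, 41.32, 14.52, 50)
print(round(t_age, 2), df)
```

The result agrees with the tabled value of .90 to two decimal places, which supports the reading of the age row.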


Test Battery 


The items in the test battery were adapted
from Christensen (1975a, 1975b, 1975c) with modifications
in order to establish standard administration
and scoring procedures. The battery consists of 282
items and can be administered in less than 2½
hours. The items fell into 10 categories.¹

Motor functions. This series of tasks requires the
production of simple motor movements with the
hands, mouth, and tongue both when a model is
presented and from verbal instructions alone. The
section also evaluates simple coordinated abilities,
spatial organization, complex sequencing of
behavior, and the ability to draw.

Acoustico-motor (rhythm) functions. These
items require the individual to differentiate between
sounds of different pitch and different rhythmic
relationships. The subject must indicate whether
sounds are the same or different and reproduce
tonal or rhythmic relationships played from
tape.

Cutaneous and kinesthetic (tactile) functions.
This section evaluates complex cutaneous functions,
muscle and joint sensations, and stereognosis. Kinesthetic
assessment requires a blindfolded subject to
identify the direction of limb movements and reproduce
limb position. Cutaneous assessment includes
evaluation of threshold localization, stimulus
identification, and two-point finger and palm discrimination.
The assessment for stereognosis requires
a blindfolded subject to identify common objects
placed in the palm of the hand.


Visual functions. These tasks assess the integrity 
of visual-spatial perception, including the identifi- 
cation of objects and pictures, identifying the miss- 
ing elements in complex geometric configurations 
(similar to the tasks in Raven’s Progressive Ma- 
trices), transposing pictures of blocks with no num- 
bers, and showing spatial and directional orienta- 
tion. Finally, the ability of a subject to perform 
spatial rotations and transformations is assessed. 

Expressive speech. This section includes tasks 
requiring the articulation of simple speech sounds, 
familiar and unfamiliar words of varying lengths, 
and phrases or sentences of varied length and com- 
plexity. The test also requires the subject to name 
and classify objects and to produce narrative de- 
scriptions. 

Impressive speech. The tasks require the articu- 
lation of simple speech sounds, familiar and un- 
familiar words of varying lengths, and phrases or 
sentences of varied length and complexity. The 
subject must name and classify objects and produce 
narrative descriptions. 


Reading and writing. The subject must break 
words into their component sounds or letters;
integrate sounds or letters into words; write words
of varying complexity from dictation; and read
letters, words, phrases, and paragraphs. 

Arithmetic skills. The subject is required to identify
Arabic and Roman numerals, to identify the
significance of digit placement, to compare numbers
of varying size, and to perform simple arithmetic
operations (multiplication, addition, subtraction)
and simple algebraic manipulations. The
ability to form arithmetic series is also evaluated. 
Items are presented both orally and visually. 


¹Due to time limitations and the desire to allow
each article to be independently read, the present 
article reproduces some material from the previous 
article (Golden, Hammeke, & Purisch, 1978) on the 
standardized Luria test battery. 
Mnestic processes. The tasks assess an individual’s
retention and retrieval skills for visual, acoustic,
and kinesthetic inputs. Items involve both verbal
and nonverbal material. The effects of retroactive 
and proactive interference are also examined. 

Intellectual processes. The final section requires 
the subject to interpret the themes of pictures, to 
demonstrate vocabulary skills, to form concepts, to 
classify objects, to understand analogies, to under- 
stand complex arithmetical problems, and to show 
logical reasoning skills. 

The battery requires several pieces of inexpensive 
equipment. First, a series of cards with pictures 
and word items published by Christensen (1975c) 
are needed. Several additional pictures are neces- 
sary to replace some of the Christensen items that 
were found to be confusing for an American population.
In addition, the battery requires (a) a 13-cm
black comb; (b) a 2¼ × ¼ inch (5.5 × .6 cm)
rubber band; (c) a paper clip, jumbo size; (d) a
Box Compass 5178 (available from Empire Pencil 
Company, Shelby, Tennessee 37160); (e) a Pedigree 
Quality Eraser 2910 (also available from the Em- 
pire Pencil Company); (f) a key (WR2 Curtis 
177); (g) a straight pin; (h) a quarter; (i) a 
metric ruler; (j) an audiotape for some of the 
rhythm and verbal items; and (k) a blindfold.


Procedure 


In the case of each potential subject, written 
permission was obtained to review medical records 
in order to assess whether the subject met the re- 
quirements of a confirmed diagnosis for the study. 
An individual whose diagnosis was questionable as
to the presence or absence of brain injury was not
included unless a definitive diagnosis could be
established.

Once a patient had consented to participate, ad- 
ministration was arranged at a time that did not 
interfere with the hospital schedule. The majority 
of the test administrations occurred at bedside 
although free rooms for testing were occasionally 
available. Because of the relatively short time
needed to administer the battery, most of the testing
was done in a single session. In some cases,
however, the session was interrupted by visitors or
medical tests.

Scoring. Items were scored by several
methods according to the nature of the item and
the qualitative dimension that was being assessed.
These scoring methods included accuracy of response,
frequency of response, adequacy of response,
number of errors, time for performance, trials to
correct performance, and number of items completed.
For many items, more than one scoring
method was used to measure different dimensions.

All raw scores were recoded into a 3-point (0,
1, 2) scaled score. A scaled score of 0 was intended
to represent the performance characteristic of a
normal individual. A scaled score of 1 represented
an intermediate level of performance seen in both
brain-injured and normal individuals. A scaled 
score of 2 represented the performance character- 
istic of brain-injured subjects. The scoring for each 
item was established by examining the performance 
of a mixed diagnostic group of 75 subjects collected 
in the first stages of this project. 
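The recoding of a raw score into the 3-point scale can be sketched as follows. The cutoff values and the function name are hypothetical, since the actual item cutoffs are not given here, and the sketch assumes a raw score on which higher values indicate poorer performance (for example, number of errors).

```python
def scale_score(raw, normal_max, intermediate_max):
    """Recode a raw score (higher = poorer performance) into a 0/1/2 scaled score.

    0: performance characteristic of a normal individual
    1: intermediate level seen in both brain-injured and normal individuals
    2: performance characteristic of brain-injured subjects
    """
    if normal_max > intermediate_max:
        raise ValueError("normal_max must not exceed intermediate_max")
    if raw <= normal_max:
        return 0
    if raw <= intermediate_max:
        return 1
    return 2


# Hypothetical item: 0-1 errors scaled 0, 2-3 errors scaled 1, 4 or more scaled 2.
for errors in (0, 2, 5):
    print(errors, scale_score(errors, normal_max=1, intermediate_max=3))
```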

Summary scoring indices. The scores of the 
items in each section were summed to yield a sum- 
mary index for that function. Ten summary indices 
were created in this manner. 

Four additional scoring indices were also created, 
The first measure summed the 34 most effective 
indicators of brain damage. These were items that,
when scored as 2, were nearly always indicative of
brain damage. These items were selected on the
basis of the performance of a mixed neurological,
schizophrenic, and normal population collected during
the development of the battery (Golden et al.,
1978). This index was labeled the pathognomic 
index. 

The second index was the sum of all items that 
required right-hand motor or tactile/kinesthetic 
function. The third index was similarly devised for 
all left-hand items. These indices were labeled left 
hemisphere and right hemisphere, respectively. 

The final index represented overall performance. 
The scores on each of the 13 previous summary 
indices were converted into z scores using the means 
and standard deviations of data previously col- 
lected on a hospitalized normal control group. These 
13 z scores were then summed into a final index, 
labeled a total score. 
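The construction of the total score can be sketched in a few lines. The control-group means and standard deviations below are invented for illustration (the actual norms are not reported in this article), only two of the 13 indices are shown, and the function name is hypothetical.

```python
def total_score(indices, control_means, control_sds):
    """Sum of z scores, each index standardized against control-group norms."""
    total = 0.0
    for name, value in indices.items():
        total += (value - control_means[name]) / control_sds[name]
    return total


# Hypothetical patient scores and hypothetical control norms for two indices.
patient = {"motor": 30.0, "visual": 12.0}
norm_means = {"motor": 20.0, "visual": 10.0}
norm_sds = {"motor": 10.0, "visual": 4.0}

print(total_score(patient, norm_means, norm_sds))  # z of 1.0 plus z of 0.5
```

Because each index is expressed in control-group standard deviation units before summing, indices with many items do not dominate the total simply because their raw sums are larger.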


Results 


Two-tailed t tests were computed for the
scaled scores on all 282 measures between the
schizophrenic and brain-injured groups. Of
the 282 comparisons, the schizophrenics performed
significantly better on 72 items at
the .05 level (df = 98). The brain-injured
group demonstrated better performance on
2 items at the .05 level of significance (df =
98). Using a stepwise discriminant analysis
of the individual items, it was found that a
hit rate of 100% could be achieved with 40
items.

The means and standard deviations were
calculated for each of the 14 summary indices.
Differences between the two groups
were then determined by two-tailed t
tests, as reported in Table 3. The schizophrenics
performed significantly better than
the brain-injured subjects at the .01 level
on 9 of the indices and at the .001 level on
1 other index. Four indices failed to significantly
discriminate between the groups.
Table 3
Means, Standard Deviations, and t Tests for Summary Indices

                        Brain injuredᵃ      Schizophrenicᵃ
Index                     M       SD          M       SD       tᵇ
Motor                   44.36   19.70       34.20   17.53     2.73*
Rhythm                  12.21    5.59       12.26    5.70     -.05
Tactile                 19.32   12.00       13.62    8.58
Visual                  16.14    5.53       12.82    5.32     3.06*
Impressive speech       23.67   12.82       19.48   10.49     1.79
Expressive speech       33.34   16.59       24.86   14.34     2.73*
Reading and writing     24.67   11.91       17.51   10.51     3.19*
Arithmetic              16.15   12.43       10.07    9.21     2.78*
Memory                  16.25    6.05       15.84    6.61      .33
Intellectual            38.32   15.53       33.24   13.80     1.73
Pathognomic             34.46   12.10       22.20    9.46     5.65**
Left hemisphere         18.82   10.91       13.24    8.37     2.87*
Right hemisphere        18.00   11.12       12.44    8.35     2.83*
Total                   27.06   21.04       15.51   17.95     2.95*

ᵃ n = 50.
ᵇ df = 98.
* p < .01.
** p < .001.


Cross-tabulations were calculated to determine
the optimal cutoff point for each of the
summary indices. The percentage of correct
diagnoses for each index was determined
using the cutoff point. A discriminant analysis
was calculated to determine the overall
effectiveness of the summary indices. As can
be seen in Table 4, even though none of the
individual indices were able to discriminate
at a hit rate of better than 74%, the combination
of indices was able to discriminate at
an overall hit rate of 88%.

Table 4
Percentage of Correct Classifications Yielded by Cutoff Value for the Summary Indices

                                     % correctly classified
                                   Brain       Schizo-
Index                  Cutoffᵃ    injuredᵇ    phrenicᵇ    Totalᶜ
Motor                    40          54          66         60
Rhythm                    8          78          34         56
Tactile                  14          66          66         66
Visual                   14          68          66         67
Impressive speech        22          52          70         61
Expressive speech        26          70          68         69
Reading and writing      20          58          74         66
Arithmetic                9          64          64         64
Memory                   16          62          54         58
Intellectual             35          58          64         61
Pathognomic              26          70          78         74
Left hemisphere          13          68          62         65
Right hemisphere                     38          92         65
Total                    13          74          56         65
Overall                              84          92         88

ᵃ Cutoff point was chosen to maximize percentage of total classifications. Subjects with a score less than or
equal to the cutoff were classified as schizophrenic.
ᵇ n = 50.
ᶜ n = 100.
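The cutoff procedure can be sketched as a simple search over candidate values. The scores and the function name below are hypothetical, and the sketch follows the classification rule stated for the cutoffs: a score less than or equal to the cutoff is classified as schizophrenic, and a higher score as brain injured. Only observed scores are tried as cutoffs, and ties are resolved in favor of the lowest cutoff; both choices are assumptions.

```python
def best_cutoff(brain_scores, schiz_scores):
    """Return (cutoff, percent correct) maximizing total correct classifications.

    Scores <= cutoff are classified as schizophrenic; scores > cutoff
    are classified as brain injured.
    """
    n_total = len(brain_scores) + len(schiz_scores)
    best = None
    for cutoff in sorted(set(brain_scores) | set(schiz_scores)):
        schiz_hits = sum(s <= cutoff for s in schiz_scores)
        brain_hits = sum(b > cutoff for b in brain_scores)
        percent = 100.0 * (schiz_hits + brain_hits) / n_total
        if best is None or percent > best[1]:
            best = (cutoff, percent)
    return best


# Hypothetical summary-index scores for five patients in each group.
brain = [30, 25, 18, 40, 12]
schiz = [10, 14, 20, 8, 16]

print(best_cutoff(brain, schiz))
```

For these invented scores the search selects a cutoff of 16, which classifies 8 of the 10 cases correctly. A cutoff chosen in this way is fitted to the sample at hand and would be expected to perform somewhat less well in a new sample.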


Discussion 


The results of this study demonstrate a 
high effectiveness for the standardized bat- 
tery in discriminating between schizophrenia 
and brain injury. No other test or set of 
tests reported in the literature has achieved
a comparable 100% hit rate. The 88% hit 
rate using the 14 summary indices is also 
higher than the results reported in any com- 
parable studies. Most of the positive results 
reported for other tests were obtained using 
schizophrenic populations of slight to moder- 
ate chronicity compared to the long-term 
chronic schizophrenics used in this study. 

Summary indices were devised to make the 
test easier to interpret with individual pa- 
tients. Of the 14 indices, only the rhythm, 
impressive speech, memory, and intelligence 
summary measures did not significantly dis- 
criminate between the groups. The schizo- 
phrenic group performed in the brain-dam- 
aged range on these four scales, compared 
to a previously tested normal group (Golden 
et al., 1978). The acoustico-motor items re- 
quired greater sustained concentration and 
attention than other items, which could have 
resulted in difficulties for schizophrenics, who 
found it hard to maintain this focus. The 
memory index included many interference 
tasks that may have triggered irrelevant in- 
terfering associations. The intellectual in- 
dex, assessing higher abstract thinking and 
verbal-reasoning abilities, and the impressive
speech index, which is disproportionately
weighted with items presenting complex verbal
relationships, were relatively difficult for
the schizophrenics compared to the more
basic indices.

On the other hand, schizophrenics demonstrated
superior performance on the motor,
tactile, visual, left-hemisphere, and right-hemisphere
indices. Similarly, most of the
items from other indices on which schizophrenics
performed better than the brain-injured
patients did not require complex symbolic
manipulations, sustained attention, or
higher abstract ability. These results are consistent
with the theoretical position that neurological
damage results in impaired functioning
for both simple and complex tasks,
whereas cognitive impairment associated with 
schizophrenia results in greater disruption 
on tasks requiring complex verbal abilities 
than on tasks requiring little verbal media- 
tion. 

The hit rates obtained for each of the 14 
indices demonstrate a 10%-20% decrease in 
diagnostic accuracy compared to the results
obtained using hospitalized normal controls. 
It is clear that none of these indices used 
alone would be diagnostically accurate with 
long-term chronic schizophrenics. Neverthe- 
less, these results are equal to or better than 
those found for other tests in similar popula- 
tions. In addition, the 88% hit rate obtained 
is comparable to the 93% with the hospital- 
ized normal controls (Hammeke et al., in 
press). This result underscores the necessity
of considering all indices in determining dif- 
ferential diagnoses between long-term chronic 
schizophrenic and brain-injured patients. 

The ability to comprehensively assess neu- 
ropsychological functioning in relatively little 
time is one of the major advantages of the 
battery. Yet, the development of effective 
summary measures from the rich item pool
may further shorten administration time 
when there are specific diagnostic questions. 
For example, an index comprised of items 
chosen for their sensitivity to the general 
effects of brain damage could be developed as 
a simple screening device when there is a
general question of presence or absence of
brain damage. Indices used to answer other
questions can be developed in a similar
manner.

In addition to its clinical usefulness, the
battery also shows good research potential.
Unlike most other tests, the scoring of many
items reflects a single qualitative dimension
of a response, allowing component neuropsychological
abilities to be more easily isolated
and analyzed. Empirical relationships
can be easily established between specific 
neuropsychological deficits associated with 
specific neurological disorders, with schizo- 
phrenia, and with such variables as medi- 
cation and prognosis. 
DISCRIMINATING BRAIN INJURY AND SCHIZOPHRENIA


Overall, the results demonstrated that the 
standardized battery is able to make the 
difficult discrimination between long-term 
chronic schizophrenics and brain-injured pa- 
tients. The battery’s main advantages are 
short administration time, comprehensive- 
ness, and a systematic breakdown of com- 
plex neuropsychological abilities. These fea- 
tures make the battery particularly well 
suited both for use as a research instrument 
in clarifying many of the presently unanswered
questions about the underlying processes
of schizophrenia and as a clinical tool
for the neurodiagnostician working with 
schizophrenic and brain-injured patients. 
Self-Administered Relaxation Training and Money Deposits


in the Treatment of Recurrent Anxiety 


Clifford E. Lewis, Anthony Biglan, and Elizabeth Steinbock 
University of Oregon 


This study evaluated two self-administered relaxation manuals and a money 
deposit in the treatment of recurrent, nonphobic anxiety in a college population.
Subjects were randomly assigned to a self-monitoring-only control group
or one of four active treatment conditions. Subjects in active conditions re- 
ceived a progressive relaxation manual or a manual that called for the client 
to devise his or her own relaxation method and were assigned to deposit or 
nondeposit conditions. Improvement did not differ for the two relaxation pro- 
cedures, but relaxation training groups improved significantly more than self- 
monitoring-only subjects on both self-report questionnaires and self-monitored 
measures of anxiety. The money deposit did not produce greater amounts of 
relaxation practice or adherence to the program, although subjects in the money 
deposit condition did report being more relaxed in practice sessions and im- 
proved more on two pre-post measures of anxiety. Subjects’ locus of control 
scores were significantly related to a number of practice, adherence, and out- 
come variables, but subjects’ ratings of the likelihood that they would practice 
and benefit from the program proved to be as good predictors. The study suggests
the value of self-monitoring and relaxation practice as treatment for
recurrent, nonphobic anxiety.


Although considerable progress has been 
made in our ability to treat phobias (cf. 
Bandura, 1969; Kazdin & Wilcoxon, 1976; 
Paul, 1969), we have not yet identified an 
effective strategy for treating nonphobic, anx- 
ious clients. Despite the large number of 
phobic treatment studies, there have been 
surprisingly few systematic investigations of 
how we might effectively assist clients who 
experience recurrent anxiety for which no
stimuli can be identified or for which stimuli 
are too numerous to allow the use of phobic 
treatment procedures. In this study, we eval- 
uated two forms of self-administered relaxa- 
tion training as treatments for recurring, 
nonphobic anxiety in a college population 
and examined whether a money deposit in- 
creases the number of practice sessions that 
clients complete. 

Perhaps the most promising approach to 
nonphobic anxiety involves the use of some 
form of relaxation training. There is evidence 
from case studies that relaxation can benefit 
chronically anxious persons (Dawley, 1975; 
Raskin, Johnson, & Rondestvedt, 1973). 
Sherman’s work (Sherman, 1975; Sherman & 
Plummer, 1973) suggests that persons can be 
taught to continue to use relaxation in stressful
situations. Both Zeisset (1968) and Goldfried
and Trier (1974) have presented evidence
that relaxing as an active “coping
skill” is beneficial at least for situationally
specific anxieties. Thus, a program of relaxation
training with instructions to relax in
problematic situations could be an effective 
strategy for treating persons who suffer from 
recurrent, nonphobic anxiety. 

It may not be necessary, however, to use 
the progressive or deep muscle relaxation 
procedure in such a program. It is possible 
that any procedure that helps clients inter- 
rupt chains of anxious responses will benefit 
them. Numerous responses other than deep 
muscle relaxation have been shown to reduce 
anxiety, including oriental defense exercises 
(Gershman & Stedman, 1971), muscle ten- 
sion (Wolpin & Raines, 1966), and medita- 
tive procedures (Benson, Beary, & Carol, 
1974). Kass, Rogers, and Feldman (1973) 
reported success in six of seven desensitiza- 
tion cases in which clients used anxiety-in- 
hibiting responses that were already in their 
repertoires. This approach could be better
than progressive relaxation training, since its 
instructions are simplified and subjects may 
be more likely to continue to use the tech- 
niques after treatment. In the present study, 
we examined whether persons who suffer 
from recurrent anxiety can achieve improvement
simply through a self-administered program
that instructs them to develop their
own method of getting relaxed at regular
practice sessions.

Perhaps the major problem in using self- 
administered materials is the lack of client 
adherence to the program (Glasgow & Rosen, 
1978). In Rosen, Glasgow, and Barrera’s 
(1976) study of self-administered desensitization
of snake phobias, less than half of the
subjects in self-administered conditions completed
half of their hierarchies, and the number
of program steps completed was highly
correlated with a variety of outcome measures.
Thus, getting subjects to continue
working on self-administered programs may
increase their effectiveness.

Money deposits have been used extensively
as a method of achieving compliance with
various aspects of treatment programs.
They have been found to be effective in
getting clients to keep appointments (Grove
..., Note 1), lose weight (Hall,
1972), continue in treatment (Bellack, 1976;
Bellack, Schwartz, & Rozensky, 1974; Hagen,
Foreyt, & Durham, 1976), and adhere
to program procedures (Eyberg & Johnson, 
1974). We could find no studies that evalu- 
ated the effects of a money deposit on per- 
sistence in a self-administered program. 
Therefore, we experimentally evaluated 
whether a money deposit would increase cli- 
ents’ practicing in the two self-administered 
relaxation programs that we wished to eval- 
uate. 

Finally, it would be useful to establish 
client characteristics that are associated 
with successful use of self-administered pro- 
grams. One often-cited variable, locus of con- 
trol (Rotter, 1966), has been suggested as a 
predictor of program adherence and outcome 
(Abramson, 1973; Balch & Ross, 1975; 
Friedman & Dies, 1974; Mahoney & Thore- 
sen, 1974). In addition to the possibilities 
of prediction from locus of control, it may 
be that clients can predict their own success 
directly (cf. McReynolds & Stegman, 1976). 
A straightforward approach to the problem 
of predicting adherence, such as ratings of 
the likelihood that they will comply with 
instructions, may be the best and cheapest 
method of selecting those who will use a 
self-administered program successfully. 

In summary, this study evaluates the ef- 
fectiveness of two self-administered manuals 
for relaxation training as treatments for re- 
current anxiety. The manuals are evaluated 
in comparison with a self-monitoring control 
condition. The usefulness of a money deposit 
in achieving program adherence and anxiety 
reduction is also evaluated. Finally, locus of 
control and clients’ own predictions are evaluated
as predictors of program adherence and
outcome. 


Method 
Subjects 


Subjects were recruited from the University of 
Oregon and a nearby community college. The pro- 
ject was described as offering treatment for anxiety 
and tension. At initial phone contact, 
declined to participate, and 3 additional persons 
declined to participate at the first appointment. 
Among the remaining 61 subjects who entered the 
study, there were 38 females and 23 males. They 
ranged in age from 18 to 44 years, with a mean 
of 24 and a mode of 20 years. 


Interventions


During the initial interview subjects were instructed
on methods of self-monitoring their anxiety.
Following a week of self-monitoring, subjects
were randomly assigned to one of five conditions. 
Assignments were done separately for men and 
women in order to equate the proportion of men 
and women in each condition. 
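
The assignment procedure described above (random assignment carried out separately for men and women) can be sketched as follows. This is only an illustrative sketch: the condition labels, the function name `stratified_assignment`, and the round-robin dealing within each sex are assumptions made for the example, since the article does not report the mechanics of the randomization.

```python
import random
from collections import defaultdict

# Hypothetical labels for the five conditions (two manuals, each with or
# without a money deposit, plus the self-monitoring-only control).
CONDITIONS = [
    "progressive_relaxation",
    "progressive_relaxation_deposit",
    "client_devised_relaxation",
    "client_devised_relaxation_deposit",
    "self_monitoring_only",
]


def stratified_assignment(subjects, conditions=CONDITIONS, seed=0):
    """Randomly assign subjects to conditions separately within each sex.

    subjects: list of (subject_id, sex) pairs.
    Returns a dict mapping subject_id to a condition label.
    """
    rng = random.Random(seed)
    by_sex = defaultdict(list)
    for subject_id, sex in subjects:
        by_sex[sex].append(subject_id)

    assignment = {}
    for sex in sorted(by_sex):
        ids = by_sex[sex]
        rng.shuffle(ids)            # random order of subjects in the stratum
        order = list(conditions)
        rng.shuffle(order)          # random order of conditions in the stratum
        # Deal subjects out in turn, so cell sizes within a sex differ by at most 1.
        for position, subject_id in enumerate(ids):
            assignment[subject_id] = order[position % len(order)]
    return assignment
```

Dealing subjects out in turn within each sex keeps the proportion of men and women roughly equal across conditions, which is the stated purpose of the separate assignments.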

Self-administered progressive relaxation. Subjects 
in this condition received a manual on deep muscle 
relaxation, which was written by Rosen (1975, 
1976). The manual contained complete instructions 
for learning to relax. In addition, it described 
methods of becoming able to relax in tension- 
and the second is designed to measure the respondents’
tendency to be anxious in general. The pretreatment
assessment questionnaire asked subjects
about previous experience with relaxation training, 
the importance of learning to relax, and the amount 
of therapist time that they would ideally like to 
have in learning to relax. They were then asked to
list three situations that occurred at least once each
week and during which they would like to be more
relaxed. They rated their comfort in each of these
situations at its most recent occurrence. Finally,
the questionnaire asked them to rate the extent to
which they believed that they could learn to relax
using an entirely self-administered program. All
ratings were done on 7-point scales. 

The State-Trait Anxiety Inventory was administered
at posttest, 3 weeks following the initial interview,
at the same time and under the same circumstances
as had existed at pretest. A postassessment
questionnaire was also administered. It asked sub- 
jects to rate (a) their ability to relax, (b) their 
sense of self-control, (c) the benefit that they 
felt they had derived from the program, and (d) 
the likelihood that they would use the techniques 
that they had learned for relaxing in the future.
Finally, subjects rated their comfort during the 
most recent occurrence of each of the three situa- 
tions that they had specified at pretest. 

Internal-external Locus of Control (I-E) Scale.
A modified version of the I-E scale (Rotter, 1966) 
was administered at the initial session and at post- 
testing (Cohen, Rothbart, & Phillips, 1976). The 
scale was administered at posttest to determine 
whether outcome in the relaxation program was 
associated with changes in locus of control.

Self-monitoring. During the intake interview sub- 
jects were asked to identify two situations in which 
they experienced anxiety. One of the designated situations
(Situation A) must have occurred daily. The
second situation was one that occurred at least 3
times each week (Situation B). Clients were given
pocket-size booklets and were instructed to record
the occurrence of Situations A and B, their tension
levels in these situations, and any other occurrences
of tension. Ratings of tension were done on a 100-point
scale, where 0 indicated no discomfort and
100 indicated the most extreme tension that the


3 weeks of the program.

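Expectancy questionnaire. Subjects who received
relaxation manuals were asked to rate their expectations
for the program immediately after
reading their relaxation instructions. They
rated (a) whether they would be able to learn to relax using
the program, (b) whether they would do the practice sessions,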


‘at this manual are available from the 


and (c) how easy they thought it
would be to follow the program. Ratings were
done on 7-point scales.

Practice log. Subjects who received the
relaxation instructions were asked to record the
and length of each practice session, and
to rate on 10-point scales how easy it
was to get relaxed and how relaxed they actually


Subjects were seen on three occasions. During
the first appointment the study was described and
each subject was asked to read and sign a statement
of informed consent. During this session, subjects
filled out all pretreatment assessment instruments,
were assisted in identifying situations that made
them anxious (Situations A and B), and were given
instructions on self-monitoring their anxiety. After
a week of self-monitoring, subjects returned
and met with the experimenter. At this time
subjects in the active treatment conditions were
given relaxation manuals, and those in the deposit
condition were asked to put up an amount of money
that they “would not want to lose” (Grove & Fredricks,
Note 1). Subjects in the relaxation conditions
also received the expectancy questionnaire and
practice logs. Those in the self-monitoring-only
condition simply received further self-monitoring
materials and were reminded that self-monitoring
will help them to reduce anxiety. The third session
occurred following 2 weeks of relaxation practice.
At this time subjects were asked to complete
posttreatment instruments as described above.
Subjects who had been in the self-monitoring-only
condition received copies of the Rosen manual and
instructions for its use at this session.

The amount of time that the experimenter spent
with each subject was limited. The total time in
which the experimenter was in contact with the
subject over all three sessions was typically 1 hour
and 10 minutes. The bulk of this time (approximately
60 minutes) was spent in subjects’ completing
the relevant questionnaires. Thus, the procedures
of this study approximate a wholly self-administered
treatment program.


Results 
Analyses 


Subjects’ State Anxiety and Trait Anxiety
scores were compared with published norms
for these scales (Spielberger et al., 1970).
For State Anxiety, this study’s sample scored
at the 73rd and 76th percentiles for
undergraduates. For Trait Anxiety,
the sample was at the 90th percentile. Comparison
with norms for clients seeking assistance
at a university counseling center indicated
that the present sample was essentially
equal to counseling center clients in
State Anxiety and 6 points higher on Trait 
Anxiety. Of the 61 subjects, 39 had sought 
treatment previously. The mean length of
time that clients had been anxious was 58 
months. 

As a check on randomization, one-way 
analyses of variance across the five treatments
were conducted on all pretest measures.
No significant differences were found
(all ps > .15). One-way analyses of variance 
for sex differences at pretest and posttest 
were conducted as well as analyses of variance
for the Sex × Treatment Conditions interaction.
For all but one measure, no sex-related
differences were found (all ps > .70).
The only significant sex-related difference 
was found for pretest and posttest locus of 
control scores (both ps < .02). Men scored 
significantly more internally than women, a 
difference that has been found by other in- 
vestigators (Feather, 1968; Phares, 1976; 
Platt, Pomeranz, Eisenman, & DeLisser, 
1970). 
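
The randomization check described above rests on a one-way analysis of variance. A minimal sketch of the F ratio it uses is given here; the function name `one_way_anova_f` and the data in the usage note are illustrative assumptions, not the study's data, and the p value would still have to be read from an F distribution with the returned degrees of freedom.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way analysis of variance.

    groups: list of lists, one list of pretest scores per treatment condition.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Variability of the condition means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variability of individual scores around their own condition mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )

    df_between = k - 1
    df_within = n_total - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within
```

For example, `one_way_anova_f([[1, 2, 3], [2, 3, 4], [3, 4, 5]])` gives an F of 3.0 on 2 and 6 degrees of freedom. A small F on a pretest measure is what one hopes to see if random assignment produced comparable groups.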


Statistical Analyses 


Tests of the effects of the experimental 
manipulations were conducted by planned 
comparisons of change scores between pre- 
testing and posttesting. These included (a) 
tests for differences between the two relaxa- 
tion manuals; (b) tests for differences be- 
tween the relaxation conditions and the self- 
monitoring-only condition; and (c) tests for 
differences between relaxation with money 
deposit and relaxation without a money deposit.
These comparisons were made for both
questionnaire anxiety measures and for self-monitored
data. For self-monitored data,
means were computed for Weeks 1 and 3, and 
change scores were derived by taking the 
differences between these means. 
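
The logic of a planned comparison on change scores can be sketched as follows. This is an illustration under stated assumptions rather than the authors' computation: the function names are hypothetical, the change score is taken as pretest minus posttest so that positive values mean reduced anxiety, and the error term is pooled across all conditions. The pooled error term is an assumption; it would be consistent with the 47 degrees of freedom reported for many of the tests if 52 subjects in five conditions supplied complete data.

```python
import math


def change_scores(pre, post):
    """Pretest minus posttest for each subject (positive = improvement)."""
    return [a - b for a, b in zip(pre, post)]


def planned_contrast_t(groups, weights):
    """t statistic and error df for a planned contrast among condition means.

    groups:  list of lists of change scores, one list per condition.
    weights: contrast weights, one per condition, summing to zero.
    """
    if len(groups) != len(weights):
        raise ValueError("one weight is needed per condition")
    if abs(sum(weights)) > 1e-9:
        raise ValueError("contrast weights must sum to zero")

    means = [sum(g) / len(g) for g in groups]
    df_error = sum(len(g) for g in groups) - len(groups)
    ss_error = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    ms_error = ss_error / df_error      # pooled within-condition variance

    estimate = sum(w * m for w, m in zip(weights, means))
    std_error = math.sqrt(
        ms_error * sum(w * w / len(g) for w, g in zip(weights, groups))
    )
    return estimate / std_error, df_error
```

With five conditions ordered as four relaxation groups followed by the self-monitoring-only control, weights such as `[1, 1, 1, 1, -4]` would compare relaxation training with self-monitoring only, and `[1, 1, -1, -1, 0]` would compare the two manuals.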


Effects of Relaxation Training 


Table 1 presents basic pretreatment and 
posttreatment means for each condition.

C. LEWIS, A. BIGLAN, AND E. STEINBOCK


1278


Table 1
Pretreatment and Posttreatment Means by Condition
(Table body not recoverable from the source. Columns: M, SD, and n for each of the five conditions. Rows: self-monitoring measures and pre-post measures, each at pretreatment and posttreatment.)
Note. STAI = State-Trait Anxiety Inventory; A-Trait = Anxiety Trait Scale; A-State = Anxiety State Scale.


Self-administered progressive relaxation versus
client-devised relaxation. Planned comparisons
of progressive and client-devised
relaxation revealed no significant differences
on any of the self-monitored or pre-post measures
of anxiety. However, subjects who received
progressive relaxation did rate themselves
as able to relax at will to a significantly
greater extent than did subjects in the client-devised
condition, t(47) = 2.26, p < .05
(two-tailed). The groups did not differ on
rated expectations.
Relaxation training versus self-monitoring
only. Planned comparisons between relaxation
training and self-monitoring-only conditions
revealed a number of significant differences.
Subjects who received relaxation training
were significantly more improved on three
self-monitored variables. They evidenced
greater improvement on the frequency of
Situation B, t(13.9) = 2.25, p < .025,² the
total frequency of all anxiety-arousing situations,
t(14.6) = 2.46, p < .025, and end of
the day estimates of the total frequency of
anxiety-arousing situations, t(19.8) = 2.22,
p < .025. At pretreatment and posttreatment
assessment, subjects receiving relaxation
training were significantly more improved
than the self-monitoring-only subjects on (a)
State Anxiety scores, t(47) = 2.40, p < .01;
(b) ratings of the discomfort that they experienced
in Situation A, t(47) = 1.78, p <
.05; and (c) ratings of their discomfort in
Situation B, t(47) = 1.78, p < .05.

Three additional problem situations were
rated at pretreatment and posttreatment but
were not monitored during the program
period. These unmonitored situations may be
viewed as measures of the generalization effect
produced by treatment conditions. Changes
in the ratings of these situations did not
differ among active treatment conditions.
However, subjects in active treatment conditions
evidenced significantly greater improvement
in their ratings of discomfort in these
situations than did the self-monitoring-only
group, t(47) = 1.73, p < .05, t(47) = 2.44,
p < .05, and t(47) = 2.01, p < .05.


Effects of Money Deposit


Subjects in the money deposit condition
had to participate in a minimum of 15 practice
sessions to get 75% of their money back.
Subjects in this condition were asked to deposit
an amount of money that they “would
not want to lose.” Amounts deposited ranged 
from $.10 to $50. The mean amount of money 
deposited was $6.01, and the median was 
$1.50. 

Practice and program adherence. Subjects 
in deposit and nondeposit conditions did not 
differ in the total number of practice sessions 
or in the average length of practice sessions. 
An examination of the group means for length 
of practice session indicated that subjects in 
the progressive relaxation without money 
deposit and the client-devised relaxation with 
deposit conditions spent significantly more 
time in practice sessions than did subjects in 
the other two groups, t(31) = 2.60, p < .01.
Subjects in relaxation conditions rated their 
depth of relaxation in each practice session as 
well as the ease with which they were able to 
relax. Those in the money deposit condition 
had significantly higher ratings on ease, t(32)
= 2.60, p < .01, and depth of relaxation,
t(32) = 2.54, p < .01, than did nondeposit
subjects. 

Program adherence was assessed in terms 
of the number of days on which subjects kept 
self-monitored records, the number of steps in 
the program that they completed, and the 
number of days that they remained in the 
program. Deposit and nondeposit subjects 
did not differ on any of these variables. Nor 
were there significant differences for the 
proportion of subjects in each condition who 
dropped out of the study.

Changes in anxiety measures. Subjects in 
the money deposit condition improved sig- 
nificantly more than nondeposit subjects on 
two of the self-monitored ratings of anxiety. 
Deposit subjects improved significantly more 
on their ratings of tension in Situations A and 
B, t(33) = 1.82, p < .05; t(5.1) = 2.63, p <
.05. There were no differences between deposit
and nondeposit conditions on any of the
questionnaire anxiety assessments. 

Relation of amount deposited to other
variables. Finally, it was found that the
amount of money that subjects in the deposit
condition put up was correlated with
their rating of the importance of their learning
to relax, r(25) = .40, p < .05, but was
not related to any outcome or practice measures.


² When the Bartlett-Box F was significant, separate
variance estimates and degrees of freedom
were calculated.
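
The fractional degrees of freedom reported for some tests, such as t(13.9) and t(5.1), are what a separate-variance t test produces. The following sketch shows one standard way to compute it, the Welch statistic with Satterthwaite's approximate degrees of freedom. It is offered as an assumption about the kind of calculation the footnote describes, not as the authors' exact procedure, and the function names are hypothetical.

```python
import math


def sample_variance(xs):
    """Unbiased sample variance (divides by n - 1)."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / (len(xs) - 1)


def separate_variance_t(a, b):
    """Welch t statistic and Satterthwaite degrees of freedom for two groups."""
    na, nb = len(a), len(b)
    mean_a, mean_b = sum(a) / na, sum(b) / nb
    # Squared standard error contributed by each group, without pooling.
    se2_a = sample_variance(a) / na
    se2_b = sample_variance(b) / nb

    t = (mean_a - mean_b) / math.sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (na - 1) + se2_b ** 2 / (nb - 1))
    return t, df
```

When the two groups have equal sizes and equal variances, the approximate degrees of freedom equal the usual pooled value; as the variances diverge, the degrees of freedom shrink and are generally not whole numbers.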


Predictors of Program Adherence, 
Practice, and Outcome 


Adherence and practice. Locus of control 
was significantly correlated with the number 
of days on which subjects self-monitored their 
anxiety, r(61) = -.30, p < .025, and the
number of steps that they completed in the
program, r(61) = -.34, p < .025. Locus of
control was also significantly correlated with
the number of days that the subject remained
in the program, r(61) = -.29, p <
.025. All of the correlations were in the direction
of “internals” showing greater program
adherence. Locus of control scores were not
significantly related to practice variables such
as number of relaxation sessions completed,
length of practice sessions, or rated ease and
depth of relaxation.
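
The correlations reported in this section can be related to a significance test as sketched below. The function names are hypothetical, and the sketch assumes that the number in parentheses after r is the number of subjects; the article does not say whether it is the sample size or the degrees of freedom.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def r_to_t(r, n):
    """Convert a correlation based on n pairs to a t statistic with n - 2 df."""
    df = n - 2
    return r * math.sqrt(df / (1.0 - r * r)), df
```

Under that assumption, `r_to_t(-0.30, 61)` returns a t of about -2.42 on 59 degrees of freedom. The negative sign reflects the scoring of the I-E scale, on which lower scores indicate a more internal orientation.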

Subjects rated their expectations that they 
would (a) be able to learn to relax using the 
program, (b) do the practice sessions, and 
(c) find it easy to follow the program. These 
ratings were significantly correlated with both 
adherence and practice variables. With respect
to adherence, subjects who predicted that the
program would enable them to relax kept
self-monitored records for more days, r(43)
= .38, p < .006, and those who predicted
that they would do the practice sessions kept
self-monitored records for more days, r(43)
= .48, p < .001, completed more steps of the
program, r(43) = .39, p < .005, and remained
in the program longer, r(43) = .44,
p < .002. With respect to practice, it was
found that subjects’ ratings of the likelihood
that they would practice were significantly
correlated with the number of practice sessions
that they completed, r(43) = .45, p <
.001. Moreover, subjects who believed that
the program would help them relax had significantly
greater ease in relaxing, r =
.38, p < .05.


Outcome measures. The relationship of
outcome to locus of control was evaluated by
correlating locus of control scores with subjects’
change scores on the four pre-post
anxiety measures and the seven self-monitor 
variables. Significant relationships were found 
for three variables. Those scoring in the 
internal direction on locus of control showed 
greater reduction in trait anxiety, r(52) =
-.28, p < .02, and on self-monitored data
they showed greater reductions in the total 
number of times that they were anxious per 
day, r(42) = .27, p < .04, and their end of 
day rating of tension, r(43) = .30, p < .025. 

A similar analysis was conducted to deter- 
mine whether subjects with high expectations 
for the program improved significantly more 
than subjects whose expectations were low. 
Subjects who predicted that the relaxation 
program would enable them to relax were 
significantly more improved on state, r(40) =
.50, p < .001, and trait, r(40) = .40, p <
.006, anxiety and on their daily rating of
tension, r(34) = .29, p < .05. No other cor- 
relations were significant. 


Program Adherence as a Predictor of Outcome 


Finally, analyses were conducted to determine
whether program adherence variables
were related to outcome. No significant results
were found. Differences among subjects
in the number of steps that they completed
in the program, the length and number of
their relaxation practice sessions, their rated
ease and depth of relaxation during these
sessions, and the number of days on which
they completed self-monitoring records were
not related to any pre-post or self-monitored
anxiety measures.


Discussion 


The results of this study suggest that self- 
administered relaxation training, when com- 
bined with self-monitoring, can be of signifi- 
cant benefit to persons who suffer from 
recurring anxiety. Moreover, since few differ- 
ences were found between progressive and 
client-devised relaxation groups, it appears 
that extensive, structured training in progressive
relaxation is not necessary to achieve
these benefits. Even though relaxation groups
were superior to the self-monitoring-only condition,
it should be noted that the latter
condition could itself be considered an active
treatment. Indeed, clients in this condition 
improved on four of the seven self-monitored 
measures and on three of the four pre-post 
anxiety measures. Three of these improve- 
ments were statistically significant. This is 
consistent with other evidence in the litera- 
ture that self-monitoring can produce changes 
in behavior (Nelson, Lipinski, & Black, 1975; 
Richards, McReynolds, Holt, & Sexton, 
1976). 

The improvements of the self-monitoring- 
only subjects also suggest that the beneficial 
effects of the relaxation conditions were dependent,
in part, on these subjects’ self-monitoring
of anxiety. Relaxation training may
not be of benefit in the absence of self-monitoring.

In both sets of relaxation training materials,
subjects were explicitly instructed to attempt
to actively use their relaxation when
they began to feel anxious. Although it was 
not experimentally evaluated in the present 
study, this instruction may have been essen- 
tial to the success of the program. The fact 
that relaxation subjects were significantly 
more improved than self-monitoring-only 
subjects in the unmonitored anxiety-arousing
situations suggests that relaxation subjects
were able to generalize the relaxation responses
that they were learning to a variety
of problematic situations. 

It should be remembered that subjects in 
this study were college students. Although 
comparison with available norms suggests
that they were moderately to highly anxious,
the results of this study may not generalize
to a more severe population such as chronically
anxious outpatients in a clinical setting.

Although subjects in the relaxation training
groups improved significantly more than
those in the self-monitoring-only control condition,
intersubject variability in improvement
was quite high on many of the variables.
Thus, even though the present findings
point to the value of further developing relaxation
treatments for recurrent anxiety,
they do not represent clinically significant
improvements for all relaxation subjects. One
strategy for treating anxious clients would
be to have a variety of techniques available.
The relaxation technique that has proven effective
for the greatest number of people
could be used first, but in the event that the 
first intervention was not successful, addi- 
tional procedures should be available. 

This study provides only limited support 
for the use of money deposits in self-ad- 
ministered programs. Despite the fact that 
the return of 75% of the money deposit was 
contingent on subjects’ completing 15 or 
more relaxation practice sessions, money de- 
posit groups did not practice significantly 
more than nondeposit groups. In fact, the 
mean number of practice sessions for deposit 
conditions was less than 15, which was lower 
than the comparable nondeposit conditions. 
Subjects who deposited money did rate the 
ease and depth of relaxation in practice ses- 
sions higher than did nondeposit subjects, 
but these results may be due to a nonsig- 
nificantly higher dropout rate among de- 
posit subjects. Similarly, differential dropout 
may account for the few outcome differences 
that favored the nondeposit condition. 

It would be premature, however, to com- 
pletely abandon the use of money deposits in 
self-administered programs. Most subjects
in the deposit condition put up very small 
amounts of money. The mean deposit was 
only $6.01, and only six subjects put up 10 
or more dollars. In a project to develop self- 
administered materials for depressed clients, 
we have generally found that persons seeking 
treatment put up larger amounts of money 
and generally do comply with our programs. 
Even though it is true that the amount of 
money deposited did not correlate with prac- 
tice or outcome variables in this study, this 
result may be due to the fact that few sub- 
jects put up amounts of money that could 
have had an impact on their subsequent ad- 
herence to the program. 

Our findings regarding the prediction of 
program success suggest the practical utility 
of simply asking people whether they think 
they will comply with and succeed in a self- 
administered program. The single rating of 
likelihood that subjects would complete the 
practice sessions was significantly related to 
the number of practice sessions completed as 
well as to all three measures of program adherence.
Similarly, the rating of the likelihood
that the program would enable them to
relax was significantly related to improve- 
ments on state anxiety and trait anxiety as 
well as daily tension ratings. Although locus 
of control scores were also related to pro- 
gram adherence measures and to three out- 
come measures, these relationships were less 
numerous and consistently less strong than 
those for the subjects’ predictions about their 
own behavior. These results are consistent 
with the hypothesis that the I-E scale is it- 
self measuring the extent to which people 
say that they will exert effort and can suc- 
ceed on a variety of tasks. Thus, the scale 
need not be viewed as a measure of an in- 
ternal, unitary construct. Rather it can be 
considered to assess the extent to which the 
people say, in a variety of ways, that their 
effort makes a difference. The scale’s ability 
to predict success in self-administered pro- 
grams may simply be due to the fact that 
working on a self-administered program is 
similar to the kinds of activities that the I-E 
scale asks about. Those who say that prepa- 
ration makes the difference on a test will 
also say that they can use a self-administered 
program successfully. And, to some extent, 
these verbal responses are related to actual 
behavior (Skinner, 1957).

In concluding, we note several limitations 
of the present study that define some of the 
requirements for further research. First, the
data were restricted to self-report question- 
naires and self-monitoring. It is essential that 
we investigate whether such measures are ac- 
curately related to anxiety as it actually 
occurs in the daily life of clients. Second,
the absence of relationship between program
adherence and outcome variables leaves us
unable to pinpoint the effective components
of the relaxation treatment conditions. It
would be useful to experimentally manipulate
variables such as number of practice sessions
and use of self-monitoring to elucidate their
contribution to outcome. Finally, ratings of
subject expectation for outcome were not
obtained from self-monitoring-only subjects
tion conditions, it can be argued that the
superiority of relaxation groups was more a 
matter of social influence than of the specific 
effects of relaxation and self-monitoring. It 
will be important in subsequent research to 
assess whether the self-monitoring condition 
does produce expectations for improvement 
that are comparable to those for relaxation 
conditions. 


Reference Note 


1. Grove, D. N., & Fredricks, H. D. Teaching re- 
search parent training: A clinic model. Unpub- 
lished manuscript, Teaching Research Infant and 
Child Center, Monmouth, Oregon, 1974. 


Goal Definition by Staff Consensus:
A Contribution to the Planning,
Delivery, and Evaluation of Mental Health Services


Theodore W. Lorei and Eugene M. Caffey, Jr. 
Veterans Administration, Washington, D.C. 


To provide a basis for the evaluation of the Veterans Administration mental
health services, a survey was conducted of staff opinion regarding the importance
of several specific goals for these services. Nine goals were formulated
and submitted to 6,435 central office and field facility staff to obtain their
ratings of the importance of each of these goals, with an “of no importance”
rating being possible. The goals dealing with (a) the development of patient
skills necessary for being self-supporting, (b) the elimination of psychological
disorders, and (c) the protection of patients and others from violence received
the highest average ratings. The remaining goals were considered as having
some importance, although there were substantial differences among the sample
about the degree of importance. Interoccupational group differences in importance
ratings were statistically significant but small. Although the goals formulated
and ratified by staff were quite general, they were much more specific
than previous goal statements. Because of this relative specificity, they provide
useful guidance for planning, managing, delivering, and evaluating mental
health services.


More than 6,000 professional staff members
are involved in planning and delivering
mental health services in Veterans Administration
(VA) hospitals and clinics, as well
as in the national administrative office. To
what extent do these people have a common
understanding of the objectives of these services?
To find out, we surveyed their opinions
about nine carefully formulated goals
describing the desired impact of mental
health services on the well-being of patients
and communities. The results, analyzed for
the group as a whole and for six professional
subgroups, are reported here.


The study was developed from a previous
application of decision theory concepts to decisions
about releasing patients from psychiatric
hospitals (Lorei, 1970). This application
involved the formulation of a set of
brief statements describing the possible outcomes
of releasing or retaining patients. The
importance of attaining or avoiding these
outcomes was rated by clinical and administrative
staff in 13 hospitals. These ratings
suggested that staff differed substantially in
their underlying objectives. For several reasons
that are discussed more thoroughly
later, it seemed important to inquire about
these underlying objectives directly.

The research reported is related to the
work just described but differs in several important
respects. First, the central objects of
the study were goals rather than outcomes.
(For our purposes the relationship between
outcomes and goals is one in which a goal is
defined as a positively valued outcome.) Second,
we are concerned not with a specific set
of alternative acts (release or retention from
a hospital) but with a complex program of 
activities—all mental health services provided 
by the VA. And finally, in establishing the 
importance of goals for these services, staff 
from the entire VA mental health system, 
including those in the national administra- 
tive office, were consulted. 

Although our original interest in outcomes 
and goals derived from decision theory, lit- 
erature from several other areas also stresses 
the importance of goal clarity. Examples of 
these areas include management theory 
(Raia, 1974), education (Mager, 1975), psychotherapy
(Mahrer, 1967), behavior modification
(Bandura, 1969), and program evaluation
(Weiss, 1972). The reason for the
emphasis is clear: Individuals and groups
cannot successfully manage organizations,
devise curricula, modify the behavior of
others, or evaluate programs unless the goals
motivating the managers, the educators, the
behavior modifiers, and the program operators
are clear. The administration of mental
health services (their planning, implementation,
and evaluation) should profit from the
clarification of goals found essential in so
many fields.

The research described here was intended
to contribute to the clarification and specification
of the goals of all major mental health
services. Our strategy was simple: We formulated
a provisional set of goals and then
asked the operators of the VA mental health
services what they thought of these goals.
The specific questions we sought to answer were (a)
How important did VA staff consider each
of the goals presented? (b) What additional
goals did staff suggest? (c) How much
agreement was there about the importance of
each goal? and (d) What fundamental value
dimensions appear to determine the goal importance
ratings?


Method 
Sample 


The sample included staff both in the VA
national administrative office and in all hospitals
and free-standing clinics serving substantial numbers
of psychiatric patients. The national office sample
included staff with responsibilities for mental health
programs. The hospital and clinic target sample was
designed to include two subsamples: (a) manage- 
ment staff—hospital (or clinic) director and as- 
sistant director; the chief and assistant chief of 
staff; and the chiefs and assistant chiefs of psy- 
chiatry, psychology, social work and nursing and 
(b) clinical staff—all physicians (usually psychia- 
trists), psychologists, social workers, nurses, reha- 
bilitation medicine senior therapists, and chaplains 
who spent at least 50% of their time working 
with psychiatric patients. 


Instrumentation 


Nine goals were written by the first author on 
the basis of the previous research on staff opinions 
about possible outcomes of release from or reten- 
tion in psychiatric hospitals, review of agency docu- 
ments, and discussions with VA staff in both the 
national office and in the field. The formulations 
were guided by Hatry’s (1970) advice that pro- 
gram goals should emphasize outputs (in this case, 
effects on people) rather than inputs, such as num- 
ber of patients treated, services provided, and so 
forth. For example, a statement such as “involve
more patients and families in family therapy
programs” was not regarded as a goal. However,
included in the list of goal statements was one
possible result of such programs: “to minimize stress
on families resulting from living with and/or being
responsible for veterans with psychiatric
disabilities.” The nine goal statements were written in
questionnaire format. In the first section, the re- 
spondent was asked to rate the importance of each 
goal on a 6-point category scale. In the second 
section, goals were presented in sets of three, and 
the respondent was asked to rank them in order of 
their importance. Using a computer program de- 
veloped by Gulliksen and Tucker (1961), which 
made the computation feasible, an incomplete pair- 
comparison procedure was used to derive interval 
scale values. (Only the results of the rating pro- 
cedure are reported here.) 


Results 
Total Staff Opinions about Goal Importance 


The mean importance ratings for all staff 
combined are shown in Table 1. Scale points 
ranged from 0 (of no importance) to 5
(extremely important). Three goals received an
average rating greater than 4 (of great
importance): (a) to develop the self-care,
interpersonal, and work skills necessary for
veterans to become or remain self-supporting 
in the community; (b) to eliminate or re- 
duce disorders of perception and thinking, 
severe emotional distress, drug addiction, and 
Table 1
Mean Importance Ratings for Goals of Services
for Psychiatrically Disabled Veterans

Goal                                              M      SD

How important is it for the VA to:

4. Develop the self-care, interpersonal,
   and work skills necessary for veterans
   to become or remain self-supporting in
   the community?                                4.62    .69

5. Eliminate or reduce disorders of
   perception and thinking, severe
   emotional distress, drug addiction,
   and alcoholism?                               4.37    .88

6. Minimize the chances that severely
   disturbed veterans will injure or kill
   themselves or others?                         4.21   1.09

9. Minimize restriction of veterans’
   personal liberty by requiring
   hospitalization only when absolutely
   necessary?                                    3.89

8. Provide a sheltered environment for
   those veterans unable to live in the
   community without serious distress or
   damage to themselves?                         3.61   1.17

1. Teach skills useful for living within
   an institution (self-care, work
   details, etc.) to those patients who
   will probably always require
   institutional care?                           3.57

2. Provide enough money to live in the
   community for those veterans who are
   too disabled to work?                         3.27   1.28

3. Minimize stress on families resulting
   from living with and/or being
   responsible for veterans with
   psychiatric disabilities?                     3.26   1.20

7. Minimize the chances that veterans
   will be “public nuisances” by virtue
   of odd behavior, chronic drunkenness,
   disorderly conduct, vagrancy?                 2.72   1.42

Note. N = 6,435. VA = Veterans Administration.
Scale points were defined as follows: 0 = of no
importance; 1 = slightly important; 2 = moderately
important; 3 = quite important; 4 = of
great importance; 5 = extremely important.


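alcoholism; and (c) to minimize the chances that
severely disturbed veterans will injure or kill
themselves or others. The lowest rated goal, “to
minimize the chances that veterans will be public
nuisances,” was rated 2.72 (between moderately and
quite important).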


Additional Goals 


An informal content analysis was made of 
additional goals that 17% of the respondents 
had proposed (as they were invited to do on
the questionnaire). Practically all suggestions
described means for achieving the goals pre- 
sented, such as expanding outpatient pro- 
grams, supporting community living, provid- 
ing greater continuity of treatment, and re- 
moving the requirement that only disabilities 
resulting from military service can be treated. 
The few suggested goals were not substan- 
tially different from the original set. 


Staff Consensus 


Total group. An indication of the agree- 
ment among staff about the importance of 
each goal is given by the standard deviations 
in Table 1. Theoretically, the standard
deviations could range from 0 (perfect
consensus) to 2.5 (equal numbers in the two
extreme categories of the scale). Judged on this
scale, the degree of staff agreement about all
goals was quite high. The small deviations,
of course, indicate greater consensus; and, as
would be expected, there was more agreement
about the high-rated goals than there was
about the others.
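
The bounds on the standard deviation can be checked numerically. The sketch below assumes the population formula for the standard deviation and the 0 to 5 rating scale described in the Results; the function name and the number of hypothetical respondents are illustrative only and do not come from the study.

```python
from statistics import pstdev


def max_sd(low: float, high: float, n_per_extreme: int = 50) -> float:
    """Population SD when raters split equally between the two scale extremes.

    n_per_extreme is a hypothetical count; the result does not depend on it.
    """
    ratings = [low] * n_per_extreme + [high] * n_per_extreme
    return pstdev(ratings)


# Perfect consensus: every rater gives the same rating, so the SD is zero.
print(pstdev([4, 4, 4, 4]))

# Maximum disagreement on the 0-5 scale: half the raters at each extreme.
print(max_sd(0, 5))
```

On this reading, the largest observed standard deviation in Table 1 (1.42) sits a little above the midpoint of the possible range.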

Occupational subgroups. The deviations 
of each of the means of the goal importance 
ratings for six occupational groups from the 
means for the total staff group are presented 
in Table 2. This presentation makes it easy
to see, for example, that of the six
occupational groups, physicians regarded the
development of skills necessary for self-support
(Goal 4) as less important than did
psychologists.

As can be seen from the F values (one- 
way analysis of variance) in Table 2, the 
mean importance ratings varied significantly 
across groups for all goals except “eliminate
psychological disorders.” Although most of
the differences were small, there were medium 
or close to medium sized differences for the 
following three goals: (a) to minimize the 
chances that veterans will be public nuisances 


Table 2
Mean Goal Importance Ratings for All Staff and Deviations of Means
for Occupational Subgroups

                                     Grand                             Psychol-
Goal                                   M    Chaplain  Nurse    MD      ogist    RMST    SW     η^a     F

4. Develop skills necessary for
   self-support                       4.62    -.11     .05    -.14      .10      .04   -.05    .12   17.64*
5. Eliminate psychological
   disorders                          4.37     .07     .01    -.06     -.06      .08    .02    .05    2.78
6. Minimize chances of violence
   to self or others                  4.21     .08     .16     .01     -.52      .03   -.13    .19   46.85*
9. Minimize restrictions on
   personal liberty                   3.89    -.31     .20    -.33      .04     -.17   -.02    .17   38.51*
8. Provide a sheltered
   environment                        3.61     .03     .09    -.12     -.36      .15    .05    .13   20.22*
1. Teach skills for
   institutional living               3.57     AS      .26    -.19     -.43      .27   -.35    .24   74.20*
2. Provide money                      3.27     .16     .01    -.18     -.37     -.06    .40    .17   37.91*
3. Minimize family stress             3.26     .11    -.02    -.19     -.34     -.07    .49    .20   54.10*
7. Minimize chances that veterans
   will be “public nuisances”         2.72     .26     .23     .02     -.84     -.35   -.33    .25   84.44*

Note. N = 6,292. RMST = rehabilitation medicine senior therapists; SW = social workers. Scale points
were defined as follows: 0 = of no importance; 1 = slightly important; 2 = moderately important; 3
= quite important; 4 = of great importance; 5 = extremely important.
^a Values of .10, .24, and .37 may be regarded as indicating small, medium, and large differences among
occupational groups, respectively (Cohen, 1969).
* p < .01.


by virtue of odd behavior, chronic drunken- 
ness, disorderly conduct, vagrancy; (b) to 
teach skills useful for living within an in- 
stitution (self-care, work details, etc.) to 
those patients who will probably always re- 
quire institutional care; and (c) to minimize 
stress on families resulting from living with
and/or being responsible for veterans with
psychiatric disabilities. Table 1 shows that these
goals also had three of the highest standard 
deviations (measures of disagreement) for 

the total group. 
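
The effect-size column of Table 2 can be approximately recovered from the reported F ratios. The sketch below rests on assumptions that are not stated in the text: that the column is the correlation ratio (eta) from a one-way analysis of variance, that there are six occupational groups, and that N = 6,292, giving 5 and 6,286 degrees of freedom. The function name is illustrative.

```python
import math


def eta_from_f(f_ratio: float, df_between: int, df_within: int) -> float:
    """Correlation ratio (eta) implied by a one-way ANOVA F ratio."""
    explained = f_ratio * df_between
    eta_squared = explained / (explained + df_within)
    return math.sqrt(eta_squared)


# Assumed degrees of freedom: six groups, N = 6,292 (see Table 2 note).
DF_BETWEEN = 6 - 1
DF_WITHIN = 6292 - 6

for goal, f_ratio in [
    ("Develop skills necessary for self-support", 17.64),
    ("Eliminate psychological disorders", 2.78),
    ("Minimize chances of public nuisances", 84.44),
]:
    print(goal, round(eta_from_f(f_ratio, DF_BETWEEN, DF_WITHIN), 2))
```

Under these assumptions the computed values agree with the tabled ones to two decimals, which supports reading the largest F ratios as medium-sized differences on the Cohen (1969) benchmarks cited in the table note.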
Facilities. The mean goal importance ratings
for the total staff group at each facility
(hospitals and clinics) differed at a statistically
significant (p < .01) level for all goals
except “to eliminate psychological disorders.”
Practically, these differences were small.
There was most interfacility variation
regarding the importance of “minimizing
restrictions on personal liberty” and “minimizing
chances that veterans will be public
nuisances.”


Value Determinants of Goal Importance 
Ratings 


Although the nine goals were written to 
deal with relatively independent issues, it 
was apparent that there were interrelation- 
ships among them. To explore the possibility 
that some more general value orientations 
might underlie the way staff rated the im- 
portance of the nine goals, these goals were 
intercorrelated and factor analyzed. The 
varimax factor structure appears in Table 3. 

Factor 1 is defined by a special concern 
with minimizing the chances of violent be- 
havior and other behavior that would con- 
stitute a public nuisance. Factor 2 is char- 
acterized by attaching importance to mini- 
mizing family stress and providing money for 
disabled veterans to live in the community.
Factor 3 emphasizes the importance of (a)
providing a sheltered environment to patients,
(b) teaching skills useful for institutional
living to patients with poor prognoses,
and (c) minimizing the chances that patients
will be public nuisances. These goals have
usually been termed custodial. Factor 4 is

Table 3
Varimax Factor Structure

                                                   Factor
Goal                                       1      2      3      4      5

1. Teach skills for institutional
   living                                 .21    .11    .44    .11    .15
2. Provide money                          .07    .48    .28    .08    .18
3. Minimize family stress                 .16    .52    .08    .13    .09
4. Develop skills necessary for
   self-support                          -.08    .06    .14    AT     .24
5. Eliminate psychological disorders      .25    .15    .03    .51    .04
6. Minimize chances of violence to
   self or others                         .68    .09    .19    Ali    .07
7. Minimize chances that veterans
   will be “public nuisances”             .54    .29    .32    .04    .00
8. Provide a sheltered environment        .31    .29    .50    .09    .07
9. Minimize restrictions on
   personal liberty                       .05    .13    .09    .13    .42

Note. N = 6,435. This factor structure was obtained by rotating five principal factors extracted from a
correlation matrix in which squared multiple correlations had been entered as communality estimates. The
number of factors selected for rotation was suggested by examination of the latent roots.


precisely the goals of people for either their
personal or professional activities. As Quade
(1975) has noted,


We suspect that this observation may have
applied to some of our respondents. And
even if they always had a clear conception of
their goals (an unlikely supposition), it
probably would be hard in many cases to
express them on the brief instrument used.
Nevertheless, having acknowledged these
problems, we feel that our results provide at
least a first approximation of the goals of
VA staff for the mental health services that
they provide.

The results suggest that a large proportion 
of staff considered the nine goals important, 
although their degree of importance varied 
considerably. Goals for producing patient
change took precedence over those concerned
with the welfare of relatives or others in the
community and over those that implied
acquiescence to permanent disability (such as
teaching skills useful for institutional living).
The most important objective involved making
patients self-supporting, a position
consistent with Jahoda’s (1958) statement:


One value in American culture compatible with 
most approaches to a definition of positive mental
health appears to be this: An individual should be 
able to stand on his own two feet without making 
undue demands or impositions on others. (p. xi) 


The same value probably accounts for the
importance attached to teaching skills useful
for institutional living. Even though to be
institutionalized is to be dependent, there
may be some opportunity for even an
institutionalized person to “stand on his own
two feet.”

In addition to determining whether the
goals that we formulated were considered
important by staff, we were interested in the
degree of staff consensus about goal importance.
This interest derived from a perspective
that has been well stated by Tawney,
writing in an entirely different context.
He stated that


the condition of effective action in a complex
civilization is cooperation. And the condition of
cooperation is agreement, both as to ends to which
effort should be applied, and the criteria by which
success is to be judged. (p. 232)


Survey results suggest that agreement
about the ends to which effort should be applied
is fairly high. However, the generality
of the goals may exaggerate the appearance
of consensus. Some suggestion that this occurs
is provided by studies by Prothro and
Grigg (1960) and by McClosky (1964) indicating that
people’s agreement with social issues is
greater the more abstract the issue.

Since the generality of the goal statement
may mask important differences of opinion
among staff, attention should be paid to even
small indications of staff disagreement. For
instance, the fact that psychologists rated
most of the nine goals as less important
than did other occupational groups may point
to important differences in values.
Multidisciplinary discussion of these differences and


such as teaching institutional skills
(nurses), developing independence
(psychologists), and providing money (social
workers), might improve communication and
strengthen commitment to common goals.

Although the nine goals are more specific
than traditional “goals,” such as providing
treatment, rehabilitation, care, and custody,
the process of specification needs to be
continued. For example, the meaning of being
“self-supporting” needs to be further clarified
as to the prerequisite self-care, interpersonal,
and work skills. Considerable guidance
could be obtained from educational literature
on formulating instructional objectives. Ma- 
ger (1975), for example, gives the following 
illustration: “Given a compass, ruler, and 
paper, [the student should] be able to con- 
struct and bisect any given angle larger than 
five degrees. Bisections must be accurate
to one degree” (p. 79). This example in- 
cludes a description of the behavior that is 
to occur as a result of instruction, the condi- 
tions under which it is to occur, and the 
standard of performance that is to be met. 
Such thinking seems to be directly trans- 
latable to the behavior that patients should 
manifest (or not) as a result of treatment.¹

Finally, it is probably worth noting that 
a commitment to clarifying the goals of 
mental health services does not imply pref- 
erence for any particular paradigm for con- 
ceptualizing abnormal behavior, for example, 
medical model, psychoanalytic, or behavior- 
istic. The focus of concern about goal clarity 
is on the end product of treatment and not 
the intermediate conditions (such as devel- 
opment of insight) for achieving it. Perhaps 
the greatest contribution of the goal-oriented 
approach is that it provides a common frame- 
work for communications among people of 
diverse professional and theoretical positions. 


1 Although space limitations prevent describing 
our applications of the results of the goal-setting 
process reported here to the evaluation of day hos- 
pitals, day treatment centers, and drug-dependence 
treatment centers, further information can be ob- 
tained from the first author. 
Comparison of Electromyographic Feedback and
Progressive Relaxation Training in
Treating Circumscribed Anxiety Stress Reactions

This study examined the effects of electromyographic (EMG) feedback and
progressive relaxation training on the anxiety stress reactions of patients having 
recurrent, negative reactions to dental treatment. Twenty-one subjects selected 
from the patient files of one dentist were randomly assigned to one of three 
groups: EMG feedback, progressive relaxation, or control. Four dependent 
measures, EMG level, Dental Anxiety Scale (DAS), and State-Trait Anxiety 
Inventory (A-State and A-Trait), were collected at prerelaxation and
postrelaxation training dental appointments. Results showed significant,
comparable decreases in EMG levels across dental appointments for both EMG
feedback and progressive relaxation groups but not for the control group. On the DAS and
A-State measures, significant decreases in all groups were found. Although the 
decreases shown by the EMG feedback and progressive relaxation groups did 
not differ significantly from each other, they were both significantly greater 
than the decrease shown by the control group. 


Even though human reaction to stress has
been the focus of much psychological
investigation, there is little research in the area
that has been carried out under natural,
stress-provoking conditions. Lazarus (1966)
has noted that the dental context provides
an excellent area in which to study both
physiological and psychological aspects of
stress. Due to the nature of the procedures
involved, anxiety stress reactions are
experienced by many patients undergoing dental
treatment. The dental setting thus provides
an opportunity to study firsthand the
effectiveness of psychotherapeutic techniques
aimed at relieving stress reactions.

This article is based on a dissertation submitted
by the first author to the University of Oklahoma
in partial fulfillment of the requirements for the
degree.

Requests for reprints should be sent to Martha
T, who is currently in the private practice
of psychology at 4900 North Portland, Suite 112,
Oklahoma City, Oklahoma 73112.

Relaxation of the body musculature has
been suggested as one means by which
anxiety stress reactions can be reduced
(Jacobson, 1938; Schultz & Luthe, 1959; Wolpe,
1958). Emphasizing the important role of
relaxation in treating a number of stress-re- 
lated disorders including anxiety states, Ja- 
cobson (1938) developed a verbal technique 
(progressive relaxation) designed to syste- 
matically train individuals to relax. More re- 
cently, relaxation training using electromyo- 
graphic (EMG) feedback has been suggested 
as a way to reduce anxiety stress reactions 
(Budzynski & Stoyva, 1969; Green, Green, 
& Walters, 1973). 

Several studies have dealt with whether 
muscular relaxation learned through progres- 
sive relaxation training leads to the reduc- 
tion of anxiety. Comparing the effects of pro- 
gressive relaxation and hypnosis, Paul (1969) 
found that anxiety indicators in normal sub- 
jects were reduced significantly in the re- 
laxation group in contrast to the hypnosis 
and control groups. Partial substantiation of 
the relaxation training-anxiety reduction
hypothesis was shown in studies by Wilson
and Wilson (1970) and Connor (1974). In
the former study, only the high-anxiety re- 
laxation group showed significant decreases 
in anxiety indicators, whereas in the latter 
study, the relaxation group showed reduc- 
tions in physiological (but not self-report) 
stress indicators. Although progressive re- 
laxation has long been held to be an effective 
treatment for reduction of anxiety stress re- 
actions (Haugen, Dixon, & Dickel, 1958; 
Jacobson, 1938, 1970), little research has 
been directed toward the empirical validation 
of this technique as an effective treatment 
mode. 

The assessment of EMG feedback as a 
treatment for the reduction of anxiety stress 
reactions has been the focus of several recent 
investigations. A single group study by Ras- 
kin, Johnson, and Rondestvedt (1973) indi- 
cated that EMG feedback is of limited value 
in the treatment of chronic generalized anxi- 
ety. Stoyva and Budzynski (1974), however, 
reported successful use of EMG feedback in 
the clinical treatment of “several dozen” 
pervasive anxiety patients. Several controlled
studies comparing the effectiveness of EMG 
feedback and modified types of progressive 
relaxation training were in general agreement 
that EMG feedback was superior to pro- 
gressive relaxation training with regard to 
speed of learning and depth of relaxation 
obtained (Coursey, 1975; Haynes, Moseley, 
& McGowan, 1975; Reinking & Kohl, 1975). 
These latter studies contained normal sub- 
jects who were not involved in a stress situa- 
tion. It is clear that the comparative effec- 
tiveness of EMG feedback and progressive 
relaxation training in reducing anxiety under 
natural, stress-provoking circumstances has
yet to be determined.


The purpose of this study was twofold— 
(a) to ascertain the 
in anxiety stress reactions as compared to
self-relaxation control procedures. 

2. EMG feedback relaxation training will 
produce significantly greater stress reduction 
than will progressive relaxation training.


Method 
Subjects


A list of patients classified as prone to anxiety
stress reactions in a dental setting was compiled by
a dentist from his total active file of patients on the
basis of the dentist’s observations over time and also
on patients’ verbal reports to him. The patients were
contacted by phone by the dentist to determine their
interest in participating in a study of treatment
techniques aimed at the reduction of stress reactions
in dental situations. Those patients who were
interested were scheduled for an initial interview that
included a brief medical-dental history and a dental
examination. Any patient who either was currently
using drugs that might affect the results of the study
or who was being seen regularly by other health
service providers was excluded from the study. The
subjects who were included required dental work of
a simple restorative nature entailing at least two
separate dental appointments. The final list included
21 subjects (17 females, 4 males). The disproportionate
male/female ratio resulted from two factors:
(a) Twice as many females (n = 28) as males (n =
14) were originally identified by the dentist as stress
prone in a dental setting, and (b) the males who
were contacted were more reluctant to participate in
the study than were the females. Subjects ranged in
age from 21 to 48 years, with a mean age of 35
years.


Apparatus 


EMG measures were recorded with an Autogen
1500 feedback myograph using standard frontalis
placements 2 inches (5.08 cm) on either side of the
center of the forehead and 1 inch (2.54 cm) above
each eyebrow (Venables & Martin, 1967). A ground
electrode was secured to the forehead midway
between the other electrodes.

Connected to the Autogen unit were stereophonic
headphones through which subjects in the EMG
feedback group received auditory feedback of
ongoing muscular tension. This feedback was
presented in the form of clicks that were logarithmically
proportional to the level of EMG activity being
monitored. All meter readings were based on average
integral microvolts.


Procedure and Measures


During the first phase of the initial dental
appointment in which actual dental work occurred,
baseline frontalis EMG readings were taken. After a
j-min adaptation period while the patient was 
seated in the dental chair, the baseline EMG read- 
ings were recorded once every 10 sec for a 3-min 
period and were averaged to obtain a single score. 
After removal of the electrodes, each subject com- 
pleted a set of self-report measures including the 
Dental Anxiety Scale (DAS; Corah, 1969) and the 
State-Trait Anxiety Inventory (STAI; Spielberger, 
Gorsuch, & Lushene, 1970). Upon completion of 
these measures, the scheduled dental treatment began. 
At the conclusion of the initial dental appointment, 
all subjects were scheduled for 10 training or control 
sessions to extend over a 4-week period. Both dental 
appointments and all training sessions were sched- 
uled at approximately the same time of day, with a 
maximum deviation of 2 hours for any 1 subject. 
Subjects were then randomly assigned to one of 
three groups (n = 7 each): EMG feedback training,
progressive relaxation training, or self-relaxation
control.
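
The baseline score described above (readings taken once every 10 sec over a 3-min period and averaged) can be sketched as follows. This is an illustration only: the readings are hypothetical values in microvolts, the function name is invented, and the text does not say whether readings at both endpoints of the 3-min period were included, so the count of 18 readings is an assumption.

```python
from statistics import fmean


def baseline_emg_score(readings_uv: list[float]) -> float:
    """Average a series of EMG readings (microvolts) into a single score."""
    if not readings_uv:
        raise ValueError("at least one EMG reading is required")
    return fmean(readings_uv)


# Hypothetical readings, one every 10 sec for 3 min (18 values assumed).
readings = [3.4, 3.6, 3.5, 3.7, 3.3, 3.5, 3.6, 3.4, 3.5,
            3.8, 3.6, 3.5, 3.4, 3.3, 3.5, 3.6, 3.7, 3.5]

print(len(readings), baseline_emg_score(readings))
```

Collapsing each appointment to one score per subject is what allows the later Groups × Appointments analysis to treat EMG level as a single dependent measure.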

Subjects in the progressive relaxation group
received 10 training sessions 10-40 minutes in length
in the manner standardized by Bernstein and
Borkovec (1973). The longer time periods were necessary
for completion of training during the first three
sessions, with session length decreasing as the training
progressed. All training was completed while the
subjects’ eyes were closed. The progressive relaxation
training was “live,” rather than prerecorded on tape,
as Paul and Trimble (1970) have shown the former
method to be significantly more effective in reducing
stress responses.

In the EMG feedback group, subjects also received
10 training sessions. Each session was 20 minutes
long (Budzynski & Stoyva, 1969). EMG subjects
were told that they would be hearing clicks through
the headphones that would be proportional to their
moment-to-moment muscular tension (i.e., the clicks
would increase in speed as their tension level
increased and would decrease in speed as they
relaxed). EMG subjects were then instructed to close
their eyes and to slow down the speed of the clicks.

As with subjects in the treatment groups, control
subjects met with the experimenter for 10 sessions.
Each session was 20 minutes in length. The control
subjects were told, as were subjects in the other
groups, that they were participating in a study
designed to see whether practice in relaxation might
help people feel more comfortable during future
dental appointments. After being instructed to close
their eyes, control subjects were asked to relax
as best they could. These subjects were
not informed of the control aspect of their
participation.

Treatment and control sessions were conducted in
the therapy room of a physician’s office
adjacent to the office of the participating dentist. The
room was well insulated from sound on all sides.
During the relaxation sessions all subjects reclined on a
hospital bed with the experimenter seated in a nearby
chair. In all cases, the second dental appointment
was scheduled within 2 weeks after completion
of the relaxation sessions. EMG recordings of
muscle tension levels and self-report data were
collected in an identical manner as that collected
during the first dental appointment. The dentist who
recorded EMG levels and collected self-report data 
was unaware of the experimental group to which 
each patient was assigned. 

Routine dental checkup notices were mailed to all 
subjects approximately 1 year after their second ex- 
perimental dental appointment. Those subjects who 
responded to this call for checkup appointments and 
who on examination required dental treatment other 
than routine cleaning were scheduled for a follow-up 
appointment. EMG levels and self-report data were 
collected in the same manner as in the first and sec- 
ond dental appointments. 


Results 
EMG Levels 


Training sessions. The mean EMG levels
in microvolts for all groups were determined
for each of the 10 training or control sessions
that occurred during the period between
dental appointments. Figure 1 shows the EMG
trends for each group across these 10 sessions.
If it is assumed that learning to relax is
accompanied by progressively lower EMG levels
across sessions, significant linear trends for
each group would be one indicator that
learning had occurred. Separate tests for trends
across the 10 training sessions for each group
showed significant linear trends for the
progressive relaxation, F(1, 18) = 8.93, p < .01,
and the EMG feedback, F(1, 18) = 6.31, p <
.05, groups but not for the control group (F


[Figure 1: line plot of EMG levels in microvolts across the 10
training sessions for the EMG feedback (BF), progressive
relaxation (PR), and control (C) groups.]

Figure 1. Trends for training session electromyogram
(EMG) levels.
Table 1
Analysis of Variance of the Effects of EMG Feedback, Progressive Relaxation,
and Self-Relaxation on Training Session EMG Levels with
Planned Trend Comparisons

Source                                    df     MS       F

Groups (A)                                 2    2.07     .52
Error (within groups)                     18    3.98
Training sessions (B)                      9     .63     Bie
  Linear trend                             1    4.54   34.92***
A × B                                     18     .19     IIN
  Differences in linear trends             2     .30    2.31
Error (B × Subjects within Groups)       162     ar
  Linear error                            18     .13

Note. EMG = electromyograph.
* p < .05.
** p < .01.
*** p < .001.
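
The legible F ratios in this table can be cross-checked against their mean squares, since each F is the effect mean square divided by the matching error mean square. The sketch below assumes the flattened cells read as .52 for the Groups F and .13 for the linear error mean square; the function name is illustrative.

```python
def f_ratio(ms_effect: float, ms_error: float) -> float:
    """F ratio as effect mean square over error mean square."""
    if ms_error <= 0:
        raise ValueError("error mean square must be positive")
    return ms_effect / ms_error


# Groups (A) tested against the within-groups error term.
print(round(f_ratio(2.07, 3.98), 2))

# Linear trend and differences in linear trends tested against linear error.
print(round(f_ratio(4.54, 0.13), 2))
print(round(f_ratio(0.30, 0.13), 2))
```

The three ratios reproduce the tabled values of .52, 34.92, and 2.31, which suggests the trend components were tested against the separate linear error term rather than the pooled within-subjects error.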


= .46). Further analysis of the data was
accomplished through use of a 3 × 10 (Groups
× Training Sessions) analysis of variance
(ANOVA), with the results being presented in
Table 1. Of particular interest is the Groups
× Training Sessions interaction, which
indicates significant differences in trends among
groups. Although a significant difference
among linear trends would be an expedient
way of showing differential learning among
groups, it can be seen from Table 1 that this
linear trend difference was not significant. As
a result, trends other than linear must be
considered in order to account for the
significant Groups × Training Sessions
interaction. Therefore, although significant
differences among linear trends were not shown,
the progressive relaxation and EMG feedback
groups did show significant linear
decreases in EMG levels while the control
group did not.

Table 2
Analysis of Variance of the Effects of EMG Feedback, Progressive Relaxation,
and Self-Relaxation on Dental Appointment EMG Levels

                                                                      Appointment
Source                                                   Condition     1       2

et EAA groups) SA gt .23                                    BF        3.50    2.08
es a wag 1 4227 O 813" O                                    PR        3.70    1.86
Error (A X Subjects within Groups) 18 pe ae                 C         2.35    2.43

Note. EMG = electromyogram; BF = EMG feedback; PR = progressive relaxation; C = control.

Dental appointments. The mean EMG
level in microvolts for each group across
dental appointments is presented in Table 2,
as are the results of a 3 × 2 (Groups ×
Appointments) ANOVA. A significant decrease in
EMG level across dental appointments was
found. Even though the Groups ×
Appointments interaction was not significant, planned
analysis of the data using an ANOVA for simple
main effects showed that significant decreases
in EMG levels from the first to the second
dental appointment occurred in the EMG
feedback (p < .05) and the progressive
relaxation (p < .05) groups but not in the
control group. Although EMG levels for both the
Table 3
Analysis of Variance of the Effects of EMG Feedback, Progressive Relaxation,
and Self-Relaxation on Dental Anxiety Scale Scores

                                                                        Appointment M
Source                                  df      MS        F     Condition    1       2

Groups (A)                               2     23.74     3.05      BF      14.00    8.00
Error (within groups)                   18      7.79
Appointments (B)                         1    201.53    70.20**    PR      12.43    7.71
A × B                                    2     11.45     3.99*
Error (A × Subjects within Groups)      18      2.87               C       13.86   11.43

Note. EMG = electromyogram; BF = EMG feedback; PR = progressive relaxation; C = control.
* p < .05.
** p < .001.


EMG feedback and progressive relaxation
groups were reduced significantly across dental
appointments, these decreases were not
significantly different from each other.


Dental Anxiety Scale 


Table 3 consists of the mean score on the
DAS for each group across appointments and
the results of a 3 × 2 (Groups × Appointments)
ANOVA. A significant decrease in DAS
scores occurred between dental appointments,
as demonstrated by the significant main effect.
Further, a significant Groups × Dental
Appointments interaction was also shown. A
planned ANOVA for simple main effects accounted
for the interaction effect by demonstrating
that first, all groups evidenced a
significant decrease in scores across appointments
(for EMG feedback, p < .001; for
progressive relaxation, p < .001; for controls,
p < [illegible]) and, second, the groups’ scores,
although not differing on the first dental
appointment, did differ significantly on the
second (p < .05). Individual comparisons
indicated that even though there was no
significant difference between the EMG feedback
and progressive relaxation scores at the
second dental appointment, the mean of
these scores differed significantly (p < .05)
from that of the control group. A t test on the
difference scores between appointments for
EMG feedback and progressive relaxation
groups was not significant. Thus, DAS scores
for all groups showed significant reductions;
however, the reductions shown by the EMG
feedback and progressive relaxation groups,
while not differing significantly from each
other, were significantly lower than those in
the control group.
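
The F ratios reported for the DAS analysis in Table 3 can be checked against the tabled mean squares. The sketch below assumes the usual arrangement for a Groups × Appointments design with repeated measures on appointments: the groups effect is tested against the within-groups error, and the appointments and interaction effects against the A × Subjects within Groups error. The variable and function names are illustrative only; the numbers are transcribed from the table.

```python
# Mean squares transcribed from the DAS analysis of variance (Table 3).
MS_GROUPS = 23.74          # Groups (A), df = 2
MS_ERROR_BETWEEN = 7.79    # Error (within groups), df = 18
MS_APPOINTMENTS = 201.53   # Appointments (B), df = 1
MS_INTERACTION = 11.45     # A x B, df = 2
MS_ERROR_WITHIN = 2.87     # Error (A x Subjects within Groups), df = 18


def f_ratio(ms_effect: float, ms_error: float) -> float:
    """Return the F ratio: effect mean square over its error mean square."""
    return ms_effect / ms_error


# Between-subjects effect uses the between-subjects error term.
f_groups = f_ratio(MS_GROUPS, MS_ERROR_BETWEEN)
# Repeated-measures effects use the within-subjects error term.
f_appointments = f_ratio(MS_APPOINTMENTS, MS_ERROR_WITHIN)
f_interaction = f_ratio(MS_INTERACTION, MS_ERROR_WITHIN)

if __name__ == "__main__":
    print(f"Groups:       F = {f_groups:.2f}")
    print(f"Appointments: F = {f_appointments:.2f}")
    print(f"A x B:        F = {f_interaction:.2f}")
```

The recomputed ratios agree with the tabled F values to within rounding of the printed mean squares.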

STAI-State. In Table 4 the mean STAI A-State
scores for each group and the results
obtained from a 3 × 2 (Groups × Appointments)
ANOVA are presented. Main effects for
groups (p < .05) and appointments (p <
.001) and the interaction effect (p < .001)
were significant. A planned ANOVA for simple
main effects revealed significance for groups
at both levels of appointments and for appointments
at all levels of groups. Individual
comparisons showed that at the first appointment,
the score of the progressive relaxation
group was significantly lower than that of the
EMG biofeedback group (p < .05); however,
the score of neither the progressive relaxation
group nor the EMG biofeedback
group differed significantly from that of the
control group. At the second dental appointment,
the score of the EMG biofeedback
group did not differ significantly from that
of the progressive relaxation group, although
the scores of both the EMG biofeedback (p
< .01) and the progressive relaxation groups
(p < .01) differed significantly from that of
the control group. A t test of difference scores
between appointments for the EMG biofeedback
and progressive relaxation groups was
not significant. Thus, all groups showed significant
decreases in A-State scores; however,
even though the reductions shown by the
EMG biofeedback and progressive relaxation
groups did not differ significantly from each
M. MILLER, P. MURPHY, AND T. MILLER

Table 4
Analysis of Variance of the Effects of EMG Feedback, Progressive Relaxation,
and Self-Relaxation on A-State Scores

                                                           Appointment M
Source                                df      MS          F         Condition     1       2
Groups (A)                             2     848.86       5.77*     BF          55.43   30.57
Error (within groups)                 18     147.14
Appointments (B)                       1   2,288.09   2,138.40**    PR          43.43   27.43
A × B                                  2     405.82     379.27**
Error (A × Subjects within Groups)    18       1.07                 C           52.71   49.29

Note. EMG = electromyogram; A-State = State Anxiety scale of the State-Trait Anxiety Inventory;
BF = electromyogram feedback; PR = progressive relaxation; C = control.
*p < .05.
**p < .001.


other, they were both significantly greater 
than the decreases shown by the control 
group. 

Trait Anxiety. Table 5 includes mean
Trait Anxiety scores for all groups across
dental appointments and the results of a 3 × 2
(Groups × Appointments) ANOVA. A significant
reduction was shown between scores on
the first and second dental appointments.
However, the Groups × Appointments interaction
was not significant. Even though the
subjects reduced their trait anxiety from the
first to the second dental appointments,
there was no differentiation among the three
groups in the trait anxiety reduction effect.


Table 5
Analysis of Variance of the Effects of EMG Feedback, Progressive Relaxation,
and Self-Relaxation on A-Trait Scores

                                                        Appointment M
Source                                df     MS      F      Condition     1       2
Groups (A)                                          1.00    BF
Error (within groups)                 18   150.88
Appointments (B)                       1    77.36   5.49*   PR          36.29   34.71
A × B                                  2    10.
Error (A × Subjects within Groups)    18    14.08           C           39.29   37.43

Note. EMG = electromyogram; A-Trait = Trait Anxiety scale of the State-Trait Anxiety Inventory;
BF = EMG feedback; PR = progressive relaxation; C = control.
*p < .05.


Discussion

Both the training session data and the pre-post
appointment data lend support to the
first hypothesis, which contends that EMG
feedback and progressive relaxation training
will lead to significant reductions in anxiety
stress reactions as compared to self-relaxation
control procedures. The second hypothesis,
which holds that EMG feedback relaxation
training will produce significantly greater
anxiety stress reduction than progressive relaxation
training, was not confirmed. Both
treatment groups were equally effective in
reducing transitory, situational anxiety such
as that found in the dental setting.

A review of the data suggests that patients
having recurrent anxiety stress reactions to
dental settings may obtain a significant degree
of relief from relaxation training with
either EMG feedback or progressive relaxation
techniques. The EMG levels of both the
feedback and the progressive relaxation
groups showed significant decreasing linear
trends across training sessions suggestive of
learning. No such trend, however, was observed
in the control group. Also, from the
first to the second dental appointments, the
EMG feedback and the progressive relaxation groups
showed significant decreases in scores on three
dependent measures (EMG, DAS, A-State).
Even though significant decreases
were also shown by the control group on the
latter two measures, these decreases were significantly
less than those shown by the EMG
feedback and progressive relaxation groups. No
indication of a clear-cut superiority of one
type of relaxation training over the other
was shown on the EMG, DAS, or A-State
measures as pertaining to situational anxiety.
Thus, on the basis of this evidence, it
would seem that patients in both the EMG
feedback and progressive relaxation groups
were more comfortable, physiologically and
psychologically, when exposed to the immediate
threat of dental stimuli after receiving
relaxation training.

The consistency of findings for both treatment
groups suggests a mechanism for the
reduction in anxiety shown by the
treatment groups. The learned
reduction in EMG levels across sessions,
shown by the decreasing linear trends, transferred
to the posttreatment dental appointment.
In addition to the reduced EMG levels
that the patients exhibited in the dental
setting immediately prior to the dental work,
they reported less dental anxiety and state
anxiety at that time. Apparently, the learned
reduction in muscular tension that was a result
of either the EMG feedback or the progressive
relaxation procedures produced the
reductions in both physiological and self-report
measures of anxiety.

An important point to note is that the
research under discussion here is a clinical
treatment study. This represents a major
difference between it and other studies that
have investigated the comparative effects of
relaxation techniques, notably the Coursey
(1975) study and the Reinking and Kohl
(1975) study, both of which contained normal
subjects. As mentioned earlier, the subjects
selected in the current study were tested
under natural conditions, which to them were
highly stressful. Beyond the observations of
the dentist who identified the dental-anxious
subjects, representing less than 1% (original
subject pool N = 42) of an available patient
population (N > 5,000), other factors pointed
to a high discomfort level among the subjects 
selected. Indicative of this discomfort were 
pretraining DAS scores and A-State scores, 
which ranked at about the 90th percentile 
when compared to scores found in normal 
populations (Corah, 1969; Spielberger et al., 
1970). Further, the pretraining muscular 
tension levels of the subjects were approxi- 
mately 32% higher than the pretraining lev- 
els of the normal subjects in the Coursey 
(1975) study and 23% higher than those in 
the Reinking and Kohl (1975) study. (For
the purpose of comparison, the peak-to-peak 
microvolt readings reported in the latter two 
studies were divided by the constant 3 in 
order to approximate the average integral 
microvolt readings used here.) 
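
The unit conversion and percentage comparison described in the parenthetical can be sketched as follows. Only the divisor of 3 comes from the text; the microvolt values in the example are hypothetical placeholders, since the readings from the comparison studies are not reproduced here, and the function names are illustrative.

```python
# Divisor stated in the text for approximating average integral microvolts
# from peak-to-peak microvolts.
PEAK_TO_PEAK_DIVISOR = 3.0


def to_average_integral(peak_to_peak_uv: float) -> float:
    """Approximate an average integral microvolt reading from a peak-to-peak one."""
    return peak_to_peak_uv / PEAK_TO_PEAK_DIVISOR


def percent_higher(value: float, baseline: float) -> float:
    """Return how much higher `value` is than `baseline`, as a percentage."""
    return 100.0 * (value - baseline) / baseline


if __name__ == "__main__":
    # Hypothetical numbers, chosen only to illustrate the arithmetic.
    hypothetical_normal_peak_to_peak = 7.5   # microvolts, peak to peak
    hypothetical_patient_level = 3.3         # microvolts, average integral

    baseline = to_average_integral(hypothetical_normal_peak_to_peak)
    difference = percent_higher(hypothetical_patient_level, baseline)
    print(f"Converted baseline: {baseline:.2f} microvolts")
    print(f"Patient level is {difference:.0f}% higher than the baseline")
```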

In conclusion, direct clinical implications 
that lead from the results of this study point 
to EMG feedback relaxation training and 
live, therapist-directed progressive relaxation 
training as effective treatments for relatively 
short-lived but recurrent bouts of anxiety that 
are bound to a particular stimulus event. 

Hemispheric Asymmetry and Jewish Intelligence Test Patterns


Colin Martindale
University of Maine


On the basis of intelligence test performance, Dershowitz and Frankel have
hypothesized that Jewish subjects are less psychologically differentiated (more
field dependent) than Protestant and Catholic subjects. It is argued that differential
intelligence test patterns are better explained in terms of differential
emphasis on abilities mediated by the left and right cerebral hemispheres. The
hypothesis is advanced that Jewish subjects exhibit a tendency toward left-hemisphere
dominance in comparison to Protestant and Catholic subjects. Evidence
for, and implications of, the hypothesis are discussed.


Dershowitz and Frankel (1975) have recently
summarized the results of a number of
studies showing that Jewish subjects tend to
be characterized by relatively low scores on
some of the performance subtests of the Wechsler
Intelligence Scale for Children (WISC)
and of the Wechsler Adult Intelligence Scale.
Scores on Picture Completion, Picture Arrangement,
Block Design, and Object Assembly
are low in relation to scores on the verbal
subtests of Comprehension, Information,
Arithmetic, and Similarities. Studies are cited
as showing a similar but less extreme pattern
in Protestant children and an even less extreme
pattern in Catholic (Irish and Italian)
children. Dershowitz and Frankel related
these patterns to Witkin’s (1967) concept of
psychological differentiation. According to
their hypothesis, Jewish subjects are less psychologically
differentiated than Protestant or
Catholic subjects.

Even though low levels of psychological
differentiation have been shown to be related
to poor performance on three of the subtests
on which Jewish subjects exhibit relative deficits,
Dershowitz and Frankel (1975) themselves
admit that they are unable to account
for the poor Picture Arrangement scores of
Jewish subjects with this explanation. Further,
they cite only one unpublished study (Litman,
Note 1), in which a version of the Embedded
Figures Test was used, that would directly
support the differentiation hypothesis. Two
published studies (Dershowitz, 1971; Wendt
& Burwell, 1964) reported failures to find significant
differences between Jewish and non-Jewish
children on Embedded Figures Test
performance, although differences were in the
expected direction.

Requests for reprints should be sent to Colin Martindale,
Department of Psychology, University of
Maine, Orono, Maine 04473.

Dershowitz and Frankel’s (1975) findings 
are more parsimoniously explained by the 
hypothesis that Jewish subjects show a rela- 
tive superiority on tasks that are dependent 
on the left cerebral hemisphere and a relative 
deficit on tasks that are dependent on the 
right cerebral hemisphere, whereas Protes- 
tants show a similar but less extreme pattern 
and Catholics are relatively balanced in their 
performance. Recent research using a number 
of strategies—such as split visual-field stimu- 
lation, dichotic listening, and recording elec- 
troencephalogram during task performance— 
has led to the hypothesis that the left hemi- 
sphere is specialized for linguistic, analytical, 
and sequential tasks, whereas the right hemi- 
sphere is specialized for tasks requiring ho- 
listic, spatial abilities. To my knowledge, no 
one has used these strategies to test for dif- 
ferential performance as a function of sub- 
jects’ religious backgrounds. However, a num- 
ber of studies of the intellectual effects of 
unilateral brain damage have been conducted. 
(See Goldstein, 1974, for a review.) Although 
there are a few exceptions, almost all have 
shown that left-hemisphere damage leads to
deficits on the verbal subtests of the Wech- 
sler-Bellevue and the Wechsler Adult Intelli- 
gence Scale, whereas right-hemisphere damage 
leads to deficits on the Performance subscales 
(including Picture Arrangement) of these 
tests. 

Dershowitz and Frankel (1975) used the 
slope of the regression line relating scaled 
score to rank of subtest difficulty (across all 
of their groups) as an index of degree of per- 
formance subtest deficit. The order of sub- 
test performance from worst to best was as 
follows: Object Assembly, Picture Comple- 
tion, Block Design, Picture Arrangement, 
Similarities, Arithmetic, Information, Comprehension.
Thus, the steeper the slope, the
greater the performance subtest deficit. Pooling
the means presented in Dershowitz and
Frankel’s Table 1 (p. 127), I obtained slopes 
of .55 for 546 Jewish subjects, .31 for 30 
Protestant subjects, and .25 for 94 presuma- 
bly Catholic (Irish and Italian) subjects. (All 
of the slopes are positive because the tests 
were ordered on the basis of the performance 
of all groups in the first place.) 
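
The slope index described above is an ordinary least-squares slope of mean scaled score on subtest rank, with ranks running from 1 (worst-performed subtest) to 8 (best). The following is a minimal sketch of that computation, not the original authors' procedure; the function name is illustrative and the scaled scores in the example are hypothetical, since the pooled means from Dershowitz and Frankel's table are not reproduced in this text.

```python
# Subtest order from worst to best performance, as listed in the text.
SUBTESTS_WORST_TO_BEST = [
    "Object Assembly",
    "Picture Completion",
    "Block Design",
    "Picture Arrangement",
    "Similarities",
    "Arithmetic",
    "Information",
    "Comprehension",
]


def rank_slope(mean_scaled_scores):
    """Least-squares slope of mean scaled score regressed on rank (1..n)."""
    n = len(mean_scaled_scores)
    ranks = list(range(1, n + 1))
    mean_rank = sum(ranks) / n
    mean_score = sum(mean_scaled_scores) / n
    # Sum of cross-products and sum of squares about the means.
    sxy = sum((r - mean_rank) * (s - mean_score)
              for r, s in zip(ranks, mean_scaled_scores))
    sxx = sum((r - mean_rank) ** 2 for r in ranks)
    return sxy / sxx


if __name__ == "__main__":
    # Hypothetical group means, one per subtest in the order above.
    hypothetical_means = [9.0, 9.5, 10.0, 10.5, 11.0, 11.5, 12.0, 12.5]
    for name, score in zip(SUBTESTS_WORST_TO_BEST, hypothetical_means):
        print(f"{name:20s} {score:5.1f}")
    print(f"slope = {rank_slope(hypothetical_means):.2f}")
```

A group whose verbal subtest means exceed its performance subtest means by a wider margin yields a steeper positive slope, while a flat profile yields a slope near zero.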

Several studies of unilateral brain damage 
include full tables giving mean subtest scaled 
scores of left- and right-hemisphere-damaged 
subjects on the Wechsler Adult Intelligence 
Scale and on the Wechsler-Bellevue Scale. If 
the Dershowitz and Frankel (1975) technique 
is applied to these means, slopes of .54 for 23 
right-hemisphere-damaged subjects and of .06 
for 21 left-hemisphere-damaged subjects are 
obtained for the means reported by Simpson 
and Vega (1971). Slopes of .35 for 31 right- 
hemisphere-damaged subjects and —.33 for 
29 left-hemisphere-damaged subjects are ob- 
tained using the figures reported by Den- 
nerll (1964). Unfortunately, in neither of 
these studies is the religious background of 
the subjects specified. The slope of .55 for 
Jewish subjects is essentially the same as one 
of the slopes obtained for right-hemisphere-damaged
subjects and is somewhat higher
than the other slope for right-hemisphere-damaged
subjects. This is, of course, consistent
with the hypothesis of a relative left-hemisphere
dominance in Jewish subjects.

The notion that Jewish subjects exhibit a
pattern of relative left-hemisphere dominance
is not necessarily inconsistent with the hypothesis
that they tend to be field dependent.
In fact, on the basis of a review of the effects 
of right-hemisphere damage, Pizzamiglio and 
Carli (1974) have hypothesized that right- 
hemisphere damage leads to field-dependent 
behavior. The implication of the hemispheric 
dominance hypothesis would be that the field- 
dependent test behavior of Jewish subjects 
does not arise from a lack of psychological 
differentiation on the level of personality but 
is, rather, an artifact of a more basic pattern 
of cognitive abilities. The hemispheric dominance
hypothesis would seem, further, to be
the more general and preferable one in that it
can subsume the findings explained by the
psychological differentiation hypothesis as
well as others not explained by the latter
hypothesis. Finally, the hemispheric dominance
hypothesis predicts relative strengths as
well as weaknesses for each of the religious
groups.

In regard to intellectual test performance, 
the perceptual differentiation hypothesis can- 
not account for the poor performance of Jew- 
ish subjects on the WISC Picture Arrange- 
ment subtest, but the hemispheric dominance 
hypothesis can: Right-hemispheric damage 
depresses Picture Arrangement scores more 
than does left-hemispheric damage (McFie, 
1960). It is not well established that Jewish 
subjects exhibit poor performance on the 
Embedded Figures Test. Only one of three 
studies obtained significant results in this di- 
rection. This may be because this test is 
dependent on both linguistic (left-hemi- 
sphere) and spatial (right-hemisphere) abili- 
ties: Aphasia causes decrements on this task 
(Teuber & Weinstein, 1956), but there is evi- 
dence that in patients without aphasia, right- 
hemisphere lesions cause greater decrements 
than do left-hemisphere ones (Russo & Vignolo,
1967). One study suggests that Jewish
subjects perform more poorly than non-Jewish
subjects on the Body Adjustment Test
(Dershowitz, 1971). This is, of course, consistent
with the psychological differentiation
hypothesis. I know of no studies of the effects
of lateralized lesions on this task, but the
mediation of a wide variety of spatial orientation
behaviors by the right hemisphere (cf.
Luria, 1973) suggests that this finding should
cause no problems for the hemispheric dominance
hypothesis. Finally, the finding that
the figure drawings of Jewish children suggest
field dependence (Dershowitz, 1971) is paralleled
by the finding that right-hemisphere
lesions produce a similar effect (Pizzamiglio,
Note 2).

The hemispheric dominance hypothesis 
leads to a number of predictions that cannot
be derived from the psychological differentia- 
tion hypothesis. On the basis of findings with 
lateralized lesions, we would expect Jewish 
subjects to perform worse than non-Jewish 
subjects on tasks, such as three-dimensional 
size discrimination (cf. Weinstein, 1964), 
identification of faces (cf. Hécaen & Ange- 
lergues, 1963), and tactual form identification 
(cf. Pizzamiglio & Carli, 1974) but better 

than non-Jewish subjects on tasks involving 
verbal fluency and abstraction (cf. Luria, 
1973). 

The hypothesis that Jews, Protestants, and 
Catholics differ in degree of emphasis on 
processes dependent on one or the other hemi- 
sphere is consistent with a number of cultural 
and religious practices. For example, Jewish 
and Protestant religious rituals emphasize 
linguistic stimuli and deemphasize complex 
spatial stimuli such as images, stained glass 
windows, and elaborate architectural detail.
Just the opposite is true of Catholic ritual. In
old Hebrew, only consonants were written.
Interestingly, the superiority of the left hemisphere
in recognition of linguistic material is
strongest for consonants (Studdert-Kennedy &
Shankweiler, 1970). The right-to-left writing
of Hebrew is also consistent with the finding
that right-hemisphere weakness leads to perceptual
left-side “neglect” (Luria, 1973).

These and other consistencies do not, of
course, prove the existence of the hypothesized
hemispheric asymmetry, nor do they tell
us whether it causes the cultural differences
or vice versa. (It seems likely that cultural
differences bring about the differential emphasis
on one or the other hemisphere, since
Dershowitz and Frankel, 1975, presented evidence
that the degree of Jewish weakness on
the performance subtests differs between acculturated
and “traditional” subjects.) If the
three groups do in fact tend to process and
code information in the differential ways suggested,
this would shed some potentially helpful
light on the reasons for their historical
conflicts. On a more prosaic level, it seems
clear that religion and ethnic background may
be rather potent nuisance variables in use
of intellectual test patterns in connection with
diagnosis of left- and right-hemisphere brain
damage.


Reference Notes 


1. Litman, G. Personal communication to Z. Dersho- 

Therapeutic Relationship in Behavior Therapy:
An Empirical Analysis 


Julian D. Ford 
University of Delaware 


The determinants and predictive utility of the client’s perception of the ther- 
apeutic relationship (CPTR) were investigated in the context of a behavior 
therapy clinical research project evaluating three approaches to assertion train- 
ing. Individual differences in therapists were a significant determinant of CPTR. 
Neither therapy type nor therapy session accounted for much variance in 
CPTR, although CPTR ratings by individual clients varied from session to 
session. CPTR was an effective predictor of dropping out when measured early 
in therapy, and of immediate posttherapy client gains when measured in a mid- 
to late therapy session, but not of long-term maintenance of client improve- 
ments. Patterns of therapist behavior that were predictive of CPTR at three 
time points in therapy are delineated. It is speculated that CPTR is largely a 
function of the degree to which the client's expectation of the therapist and the 
consequences of therapy are being fulfilled. It is concluded that CPTR has sig- 
nificant predictive value, and perhaps also causal impact, in behavior therapy. 


The therapeutic relationship between cli- 
ent and therapist is widely acknowledged as 
a central factor in producing positive out- 
comes in psychotherapy and counseling (cf. 
Howard & Orlinsky, 1972; Kiesler, 1973;
Marsden, 1971; Meltzoff & Kornreich, 1970; 
Mitchell, Bozarth, & Krauft, 1977). Within 
the behavior therapy framework, increasingly 
greater attention has been paid to the thera- 
peutic relationship in recent years (cf. Wil- 
son & Evans, 1976). However, very little 
empirical research has been reported to di- 
rectly evaluate the impact of variations in 
the therapeutic relationship on the outcome 
of behavior therapies. This study represents 
an initial attempt to address that issue. 

Although the therapeutic relationship can 
be assessed from several vantage points (i.e.,
the client, the therapist, experienced clinician
observers, nonexpert observers), and in terms
of either specific client-therapist behavioral
interactions or the global experiencing of the
client and therapist, only one approach to
the measurement of the therapeutic relationship
has been well validated as a predictor of
therapy outcome: the client’s perception of
the therapeutic relationship (CPTR) (cf.
Gurman, 1977). Despite optimistic early reviews
(Truax & Mitchell, 1971), there is no
consistent evidence that ratings by nonexpert
observers of the therapeutic relationship are
predictive of outcome in therapy (cf. Mitchell
et al., 1977). All other approaches to
the measurement of therapy processes have
received little, if any, empirical evaluation 
as predictors of therapy outcome, with very 
few promising results (e.g., Dietzel & Abeles, 
1975; Mintz & Luborsky, 1971; Rice, 1965). 

Although CPTR has been found to be an 
excellent predictor of therapeutic outcome in 
client-centered, psychoanalytic, vocational
guidance, and “personal” therapy and counseling
(Gurman, 1977), this has been done
only three times in a behavior therapy context,
and these studies have had important
methodological limitations. In an analogue
desensitization study conducted with mildly
phobic volunteer subjects, Carmichael (cited
in Gurman, 1977) found CPTR to be inversely
predictive of decreases in avoidance
behavior and not predictive of changes in
subjective fear. Ryan and Gizynski (1971)
found that retrospective CPTR reports were
correlated with therapists’ global ratings of
client improvement after receiving unspecified
types of therapy by self-identified behavior
therapists. Sloane, Staples, Cristol,
Yorkston, and Whipple (1975) also showed
that retrospective CPTR reports were predictive
of outcome in unspecified types of
therapy administered by self-identified behavior
therapists, using global ratings by
clients, therapists, and expert judges as their
outcome criteria.

The designs and results of these studies 
leave a major question unanswered. Is CPTR 
predictive of changes in therapy clients’ specific
behaviors and self-perceptions that are
caused by participation in a clearly described
naturalistic program of behavior therapy?
Although the value of CPTR as a predictor
of global improvement in behavior therapy
has been demonstrated, it is unclear whether
it is predictive of the more behaviorally specific
criteria that are essential in behavior
therapy. And the therapy types used in
these studies were either too incompletely
described to permit replication or too artificial
to be generalized to actual clinical
practice.

Further, these studies all measured
CPTR retrospectively, so it is not possible
to infer whether it would, if assessed prior
to the termination of therapy, serve as a
clinically viable predictor of therapy out- 
come. Nor can we say whether CPTR re- 
mains constant across the course of therapy 
or whether it is differentially effective as a 
predictor at different time points. 

Finally, there has been very little research 
in any therapy approach, and none in a 
behavior therapy context, that delineates the 
determinants of CPTR (cf. Gurman, 1977). 
If CPTR is important, how does it come 
about? Research on client and therapist personal
and professional characteristics (e.g.,
age, psychological status, expertise, mood, 
dogmatism) as predictors of CPTR is sparse 
and inconclusive at present (see Gurman, 
1977). When observers’ ratings of therapist 
in-session behavior are tested as predictors, 
many verbal, vocal, kinesic, and proxemic 
behaviors have been examined, but very few 
have emerged even in one study (let alone in 
replications) as viable predictors of CPTR 
(e.g., therapist interruptions; vocal expression
of concern and involvement; eye contact,
tact, forward trunk lean; see Gurman, 1977). 
Clearly, very little is known about the de- 
terminants of CPTR in any therapy context. 

One study that has examined this issue 
using a factorial analysis of variance design 
demonstrated that the emergent events in 
each therapy session were the major causal 
factor underlying CPTR (Howard, Orlinsky, 
& Perlstein, 1976). But what are those 
events (e.g., What is it that the therapist is 
saying or doing that facilitates positive 
CPTR?), and does this hold true in a be- 
havior therapy context as well? These issues 
are examined in this study. 


Method 


Clients 


The clients were 39 volunteers from scattered
suburban communities in the vicinity of a major
east coast university, who responded to announcements
in the media of a time-limited therapy program
for assertion training. Persons responding to
the announcements were sent a brief description of
the program, which included the following information:


We hope to achieve two goals: first of all, to
assist people like yourself who have difficulty
acting assertively or who experience anxiety or
nervousness in situations where they want to
act assertively. At the same time, we hope to
obtain valuable information regarding the types
of treatment methods that are most effective
for different individuals. All of the treatment
procedures we will be using in the assertion
program have been shown to be effective for
some individuals.


The therapists will be PhD candidates in clinical
psychology trained and supervised by Dr. Goldfried.
. . . We are unable to accept anyone into
the program who is currently being seen in
psychotherapy. . . . The treatment sessions will
consist of weekly individual sessions carried out
over a 2-month period, with each session lasting
approximately 1 hour. . . . Because the program
is being funded by the National Institute of
Mental Health, there will be no charge to you.
. . . We want to again emphasize that your
questionnaire responses and what transpires during
the actual assertive training sessions will be
kept strictly confidential.


Persons applying to the program were screened
(a) to eliminate client sex as a factor, and since
the vast majority of applicants were women, only
females were accepted as clients; (b) university
students were not accepted in order to obtain a
representative sample of community adults; (c)
persons concurrently undergoing psychotherapy were
not accepted, nor were persons who showed evidence
of severe marital problems or thought disorder;
and (d) only applicants who scored below
zero on the Rathus (1973) Assertiveness Schedule
and above three on an assertion screening inventory
designed especially for this study (Personal
Reaction Inventory, Form D) were accepted as
participants. All persons who volunteered were
screened in an interview conducted by an experienced
clinical psychologist. Applicants who met the
screening criteria were accepted as clients until all
therapists participating in the study were matched
with three clients; thereafter, applicants were referred
to other clinics and assertion training programs
in the area with the explanation that the
program could no longer accept further applications.

All clients reported in the screening interview
that problems with assertiveness had generalized
debilitating effects on their lives (e.g., self-confidence,
marriages, vocations). The average pretherapy
score on the Rathus (1973) Assertiveness Schedule
was -46.3, markedly below the [illegible]
reported by Rathus for [illegible].
The clients’ average age was [illegible] years (range =
22-60), and they had an average education level of
2 years in college (range = 10-17 years of education).
Seventy percent were married. Ten were
unemployed outside the home, 8 had part-time 
jobs, and 12 held full-time jobs. 


Therapists 


The 13 therapists were advanced graduate stu- 
dents in a clinical psychology training program 
(M age = 28.6). Eight were male, and 5 were fe- 
male. All but 1 were inexperienced (Mdn=seven 
prior cases). 


Therapy Types 


Clients were matched using within-group proce- 
dures on the basis of age and scores on all pretest 
measures (see Table 1) and were then randomly 
assigned to one of three therapy types (Linehan, 
Goldfried, & Goldfried, Note 1): interactive, in 
which therapists focused on providing nondirective 
support to the client; instigation, in which home- 
work assignment to engage in assertive behaviors 
was provided as a supplement to therapist support; 
and, rational restructuring, in which clients were 
trained to restructure their cognitive self-statements 
(cf. Ellis, 1970) through modeling and behavior 
rehearsal, in addition to receiving behavioral instigation
and emotional support. All clients received
an explanation of why people are unassertive
and how one can become more assertive that was
congruent with their therapy types in the initial
session.

Each therapist conducted each type of therapy 
with an individual client over eight 1-hour weekly 
sessions. Two therapists had one client drop out, 
one therapist had two clients drop out, and one 
therapist had one client drop out and two clients 
who failed to provide pretest data. Thus, there were 


32 clients who completed the entire therapy pro- 
cess.1 


1 One of the clients who completed therapy was 
not included in the data analyses because her thera- 
pist had both of his other clients drop out, and 
was thus not representative of the other therapists 
(none of whom had more than one client drop out). 


Measurement of the Perceived Therapeutic 
Relationships 


The Relationship Inventory Form G (RI-G), as 
developed by Gurman (1973), was used to assess 
the client's perception of the therapeutic relation- 
ship at three points in the therapy process (i.e., 
immediately after the third, sixth, and final therapy 
sessions). The RI-G is a 30-item true-false written 
questionnaire that was derived from similar pre- 
vious instruments. Ten-item subscales can be scored 
for empathy, warmth, and genuineness, but reli- 
ability has been assessed only for the entire scale; 
split-half Spearman-Brown reliability was reported 
to be .86 by Gurman (1973). 
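
As a rough illustration of the split-half procedure mentioned above, the Spearman-Brown formula steps the correlation between two test halves up to an estimate for the full-length scale. The sketch below is generic; the half-test correlation of .75 is a hypothetical value chosen for illustration and is not taken from Gurman (1973).

```python
def spearman_brown(half_test_r: float, length_factor: float = 2.0) -> float:
    """Estimate the reliability of a test lengthened by `length_factor`.

    With length_factor = 2 this is the usual split-half correction:
    r_full = 2 * r_half / (1 + r_half).
    """
    return (length_factor * half_test_r) / (1.0 + (length_factor - 1.0) * half_test_r)


if __name__ == "__main__":
    # Hypothetical half-test correlation; not a value reported in the study.
    print(round(spearman_brown(0.75), 2))  # prints 0.86
```

A half-test correlation near .75 would therefore correspond to a full-scale reliability of about .86.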


Measurement of Therapist Behaviors 


Twelve 3-minute samples were taken from audio- 
tapes of each of the 31 complete therapy cases. 


Table 1 
Outcome Measures 


Variables measured pre, post, and follow-up 

  Rathus Assertiveness Scale (Rathus, 1973) 
  Personal Reaction Inventory Form D (Linehan, 
    Goldfried, & Goldfried, Note 1) 
  Self-Esteem Inventory (Robinson & Shaver, 1973) 
  Fear of Negative Evaluation Scale and Social Avoidance 
    and Distress Scale (Watson & Friend) 
  Bem Sex-Role Inventory (Bem, 1974) 
    Masculine 
    Feminine 
    Androgyny 
  Multiple Affect Adjective 
    Checklist (Zuckerman & Lubin, 1965) 
    Anxiety 
    Depression 
    Hostility 
  S-R Inventory of Anxiety (Endler & Hunt, 1968) 
    Refusal situations 
    Initiation situations 
    Total score 
  S-R Inventory of Hostility (Endler & Hunt, 1968) 
    Refusal situations 
    Initiation situations 
    Total score 

Variables measured pre and post only 

  Role-playing self-ratings 
    Nervousness 
    Baseline nervousness 
    Anger 
    Guilt 
  Role-playing observer-rated behavior 
    Eye contact duration 
    Loudness/affect (first response) 
    Loudness/affect (M of all responses) 
    Disfluency (first response) 
    Disfluency (M) 
    Content: assertiveness level 
    No. challenges by confederate 

Variables measured post only 

  Stress test observer-rated behavior 
    Content: assertiveness level 
    No. challenges by confederate 
      (Friedman, [...]) 


Therapy Sessions 1, 3, 6, and 8 were sampled, with 
one 3-minute sample taken from the middle of the 
first (i.e., early), second (i.e., middle), and final 
(i.e., late) third of each session. Eight trained re- 
search assistants rated the therapist’s behavior from 
the audiotaped time samples and written transcripts. 
Each rater rated an equal number of tape seg- 
ments from all therapists, clients, therapy types, 
therapy sessions, and in-session time periods. Raters 
were assigned to teams of two or three, and each 
team rated one third of the behavior categories so 
as to minimize rater overload. 

The therapist behavior code that was used for 
ratings consists of 48 behavior categories, each 
operationally defined in detail and derived from 
existing therapy process measurement instruments 
(see Table 2). Each category was rated once every 
20 sec, or nine times in each 3-min sample. The 
data were condensed by means of arithmetic aver- 
aging to provide one score for each 3-min time 
sample. 

Reliability was assessed by having each rater re- 
rate 48 (for the three-person teams) or 72 (for 
the two-person team) 3-minute segments already 
rated by each other rater in his/her team. These 
segments were equally distributed across therapists, 
clients, therapy types, therapy sessions, and in- 
session time periods. Reliability was assessed by 
means of Pearson product-moment correlations for 
variables that were scored as frequency counts or 
ordinal ratings. Reliability coefficients ranged from 
.70 to .99, with a median of .82. For the variables 
that were scored simply as occurrence or nonoccur- 
rence, Cohen’s (1960) Kappa was used as a reli- 
ability estimate, since it accounts for chance agree- 
ments when the more typically used percentage 
agreement score does not. When Kappa was calcu- 
lated for interrater agreement on both occurrence 
and nonoccurrence, reliability ranged from .77 to 
1.00, with an average of .87. With the much more 
conservative test of agreements on occurrences only, 
the levels were .40 or higher (i.e., well above 
chance), with only one exception (i.e., topic change, 
.28), and with an average of .73. 
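
The contrast drawn above between percentage agreement and Cohen's Kappa can be sketched as follows. This is a generic implementation of the standard Kappa formula, not the study's own scoring procedure, and the two rating sequences are invented for illustration (1 = category occurred in a 20-sec interval, 0 = did not occur).

```python
import math
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Proportion of intervals on which the two raters gave the same code."""
    return sum(a == b for a, b in zip(rater_a, rater_b)) / len(rater_a)


def cohens_kappa(rater_a, rater_b):
    """Cohen's (1960) Kappa: agreement corrected for chance agreement."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    observed = percent_agreement(rater_a, rater_b)
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: product of the raters' marginal proportions per code.
    expected = sum(freq_a[c] * freq_b[c] for c in set(freq_a) | set(freq_b)) / (n * n)
    if expected == 1.0:
        return math.nan  # Kappa is undefined when both raters use one code only.
    return (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    # Hypothetical occurrence/nonoccurrence codes for ten rating intervals.
    rater_1 = [1, 1, 0, 0, 1, 0, 1, 0, 0, 0]
    rater_2 = [1, 1, 0, 0, 0, 0, 1, 0, 0, 1]
    print(percent_agreement(rater_1, rater_2))       # raw agreement
    print(round(cohens_kappa(rater_1, rater_2), 2))  # chance-corrected
```

In this invented example the raters agree on 8 of 10 intervals, yet Kappa is noticeably lower because much of that agreement would be expected by chance when nonoccurrence is the more common code.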


Measurement of Therapeutic Outcome 


Thirty separate variables were used to assess 
therapy outcome (see Table 1). Seventeen variables 
were derived from written questionnaires completed 
by clients before, immediately after, and 2 months 
after therapy. These self-report measures assess a 
variety of components of assertion, including self- 
perception of assertiveness, anxiety, hostility, de- 
pression, and masculinity-femininity as general 
traits and as behavioral reactions to specified situa- 
tions. All of the measures have been well re- 
searched and have been extensively used to evaluate 
outcome in prior assertion training research (cf. 
Rich & Schroeder, 1976), although none have clearly 
demonstrated external validity. 

Eleven variables were taken from the client’s 
subjective emotional reactions to, and observer rat- 
ings of client behavior in, six role-playing simula- 
tions requiring assertion and were measured only 
at pretest and posttest (Table 1). 


Table 2 
Therapist Behavior Code Categories 


Frequency count categories 

  No. therapist utterances 
  Verbal productivity 
  Encouragements to continue or complete utterance 
  Interruptions 

Ordinal rating categories 

  Nonverbal 
    warmth 
    certainty and sincerity 
    control and relaxation 
    energy 
    emotional responsivity 

Occurrence-nonoccurrence categories 

  Laugh (during client utterance) 
  Informational statement 
    fact 
    possibility 
  Request information 
    closed ended 
    open ended 
  Request action 
    command 
    suggestion 
  Contentless communication 
  Simultaneous speech 
  Implements technique 
  Models assertive thoughts 
  Reflects client's feelings/fears 
  Implied question or statement 
  Restatement 
  Agreement 
  Disagreement 
  Client strength emphasized 
  Client weakness emphasized 
  Topic change 
  Interpretation 
    based on client's viewpoint 
    based on therapist's viewpoint 
  Reassurance 
  Clarification 
  Disfluency 
  Filled pause 
  Silent pause 
  First person singular ("I" statement) 
  First person plural ("We" statement) 
  Content 
    present 

Finally, two variables were assessed at posttest 
only: observer ratings of 2 categories of client be- 
havior in an analogue “stress test.” 

Trained research assistants served as observers 
for the behavioral outcome measures. Interrater 
agreement levels were calculated by product-moment 
correlations between two observers for each vari- 
able. Reliability levels were as follows: role-playing 
assertiveness content (M = .85, range = .84-.87); 
role-playing loudness/affect (M = .92, range = .78- 
.97); role-playing disfluency (M = .88, range = .83- 
.92); and stress test assertiveness content (M = .84, 
range = .69-.96). 

Pretest, posttest, and follow-up scores were con- 
verted to outcome scores through the use of re- 
sidual gain scores (Manning & DuBois, 1962), 
which were used rather than raw difference scores 
because of their greater reliability and their con- 
trol for regression effects due to pre-post or pre- 
follow-up correlation.2 
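
One common way to form residual gain scores is to regress posttest scores on pretest scores and keep each client's residual, so that the outcome score is uncorrelated with the pretest level. The sketch below assumes that unstandardized formulation; the text does not say whether the residuals were also standardized, and the scores shown are invented.

```python
def residual_gain_scores(pre, post):
    """Residuals from the least-squares regression of post on pre.

    A positive residual means the client ended higher than predicted
    from the pretest score; a negative residual means lower.
    """
    if len(pre) != len(post) or len(pre) < 2:
        raise ValueError("need paired scores for at least two clients")
    n = len(pre)
    mean_pre = sum(pre) / n
    mean_post = sum(post) / n
    sxx = sum((x - mean_pre) ** 2 for x in pre)
    sxy = sum((x - mean_pre) * (y - mean_post) for x, y in zip(pre, post))
    slope = sxy / sxx
    intercept = mean_post - slope * mean_pre
    return [y - (intercept + slope * x) for x, y in zip(pre, post)]


if __name__ == "__main__":
    # Hypothetical pretest and posttest scores for four clients.
    pretest = [1.0, 2.0, 3.0, 4.0]
    posttest = [2.0, 4.0, 5.0, 9.0]
    print([round(r, 2) for r in residual_gain_scores(pretest, posttest)])
```

By construction the residuals sum to zero and are uncorrelated with the pretest scores, which is the property that removes regression effects from the outcome measure.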


Results 


Variation in the Perceived Therapeutic 
Relationship 


Three two-way analyses of variance 
(ANOVAs) were conducted, with RI-G scores 
as the dependent measure, to determine the 
proportions of variance in CPTR accounted 
for by differences among therapists, therapy 
types, and therapy sessions: (a) Therapists 
× Therapy Types (pooling over therapy ses- 
sions); (b) Therapists × Therapy Sessions 
(pooling over therapy types); and (c) Ther- 
apy Types × Therapy Sessions (pooling over 
therapists). A single three-way ANOVA was 
not used, since it would have involved one 
data score per cell, thus confounding the 
Therapist × Therapy Type and Therapist × 
Therapy Type × Therapy Sessions interac- 
tion terms with the error term. Variance pro- 
portions were calculated based on expected 
mean square formulas (Winer, 1971) and 
the assumption that therapist is a random 
between-subjects factor, therapy type is a 
fixed between-subjects factor, and therapy 
session is a fixed within-subjects (repeated 
measures) factor (cf. Endler & Hunt, 1968). 


2 Those correlations were noteworthy in this 
study, with 75% falling in the .32-.78 range (p < 
.05) and more than half greater than .52. 


Table 3 
Determinants of Variance in the Perceived Therapeutic Relationship 

Factor                 A × B   A × C   B × C   Overall 
Therapist (A)           .20     .43     --      .315 
Therapy type (B)        .02     --      .01     .015 
Therapy session (C)     --      .08     .07     .075 
A × B                   .18     --      --      .180 
A × C                   --      .05     --      .050 
B × C                   --      --      .07     .070 
Residual                .60     .44     .85     .295 

Note. Determinants are expressed as proportions, varying between .00 and 1.00. Overall variance propor- 
tions for therapist, therapy type, and therapy session were determined by numerically averaging the two 
values for each. The overall variance proportion for the residual term was calculated by subtracting the 
overall variance proportions of all other factors from 1.00. The residual term includes the variance due to 
the highest order interaction term and error. 
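
The arithmetic described in the note to Table 3 can be checked with a short sketch. The proportions below are the table entries as read from the source; the dictionary layout and function name are illustrative assumptions.

```python
# Variance proportions from Table 3, keyed by factor and by the two-way
# ANOVA in which each factor appears (entries as read from the source).
TABLE_3 = {
    "Therapist (A)": {"AxB": 0.20, "AxC": 0.43},
    "Therapy type (B)": {"AxB": 0.02, "BxC": 0.01},
    "Therapy session (C)": {"AxC": 0.08, "BxC": 0.07},
    "AxB": {"AxB": 0.18},
    "AxC": {"AxC": 0.05},
    "BxC": {"BxC": 0.07},
}


def overall_proportions(table):
    """Average each factor over the ANOVAs it appears in; residual = 1 - sum."""
    overall = {factor: sum(v.values()) / len(v) for factor, v in table.items()}
    overall["Residual"] = 1.0 - sum(overall.values())
    return overall


if __name__ == "__main__":
    for factor, value in overall_proportions(TABLE_3).items():
        print(f"{factor:22s} {value:.3f}")
```

Run on these entries, the sketch reproduces the Overall column (.315, .015, .075, .180, .050, .070) and the overall residual of .295.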


The results (Table 3) demonstrate that 
differences between the individual therapists 
had a significant effect on CPTR, the inter- 
action of therapists and therapy types had a 
moderate impact, and all other factors ex- 
erted only a minimal influence. Only one 
factor achieved a significant F in any ANOVA 
(i.e., Therapist, in the Therapist × Therapy 
Session ANOVA). 

A t test was conducted to examine whether 
the therapist’s gender was a significant de- 
terminant of CPTR ratings. The results 
showed that male and female therapists did 
not receive significantly different CPTR rat- 
ings, t(91) = .1, and this is reflected in the 
averages for males (27.1) and females 


Perceived Therapeutic Relationship as a 
Predictor of Outcome 


Correlation coefficients were calculated be- 
tween RI-G scores and the scores for each 
outcome measure separately for Sessions 3, 
6, and 8. The results (Table 4) show that 
RI-G scores were not consistently predictive 
of posttest outcome at Session 3, with only 
[...] correlations at or near significance. 
However, Session 6 RI-G scores were consist- 
ently predictive of posttest outcome, with 
[...] correlations at or near (i.e., p = 
.06) significance and all in the expected direction. 
[...] RI-G scores were not consistently 
predictive of follow-up outcome, with only 
[...] correlations significant. Session 8 
RI-G scores were not consistently predictive 
of outcome, with only 4 of 47 correlations 
significant. 

CPTR was assessed as a predictor of pre- 
mature termination by comparing the Ses- 
sion 3 RI-G scores of the five dropouts with 
those of the 31 full-term clients. Mean RI-G 
levels for the two groups were 14.6 and 27.3, 
respectively. A t test showed that the clients 
who completed therapy gave significantly 
higher RI-G ratings for Session 3 than did 
the dropouts, t(34) = 8.1, p < .001. Few 
false positives or negatives were found, as 
well: RI-G scores for three dropouts were 
much lower than those for any full-term 
client; the fourth dropout’s RI-G score was 
equal to the lowest RI-G score for any full- 
term client; and, the fifth dropout’s RI-G 
score was lower than all but two of the full- 
term clients’ RI-Gs. 
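
The degrees of freedom reported for the dropout comparison are what a two-group test on 5 dropouts and 31 full-term clients would give (5 + 31 - 2 = 34). A minimal sketch of such a test follows, assuming a pooled-variance (Student) t statistic; the text does not state which variant was used, and the scores in the example are invented rather than the study's data.

```python
import math


def pooled_t(group_a, group_b):
    """Independent-samples t statistic with pooled variance, plus its df."""
    na, nb = len(group_a), len(group_b)
    if na < 2 or nb < 2:
        raise ValueError("each group needs at least two scores")
    mean_a = sum(group_a) / na
    mean_b = sum(group_b) / nb
    var_a = sum((x - mean_a) ** 2 for x in group_a) / (na - 1)
    var_b = sum((x - mean_b) ** 2 for x in group_b) / (nb - 1)
    df = na + nb - 2
    pooled_var = ((na - 1) * var_a + (nb - 1) * var_b) / df
    standard_error = math.sqrt(pooled_var * (1.0 / na + 1.0 / nb))
    return (mean_a - mean_b) / standard_error, df


if __name__ == "__main__":
    # Hypothetical RI-G scores; the study's raw scores are not reported.
    completers = [27, 29, 25, 30, 26]
    dropouts = [14, 16, 12]
    t_value, df = pooled_t(completers, dropouts)
    print(round(t_value, 2), df)
```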


Therapist Behavior as a Predictor of CPTR 


The therapist behavior categories were 
grouped into 14 composite behavioral 
“styles” on an a priori basis to permit 
relatively reliable stepwise multiple regres- 
sion analyses with the small sample size 
(see Table 5). The multiple regression anal- 
yses were conducted separately for each of 
the three therapy sessions for which RI-G 
ratings were obtained. 
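
The a priori composites listed in Table 5 are signed combinations of the original behavior categories. The sketch below shows one plausible reading for the Supportiveness composite, assuming unit-weighted sums of the 3-min sample scores; the weighting, the key names, and the sample values are assumptions, since the text does not say whether component scores were standardized first.

```python
# Signs follow the Supportiveness row of Table 5; key names are hypothetical.
SUPPORTIVENESS_SIGNS = {
    "encourages_to_continue": +1,
    "agreement": +1,
    "client_strength_emphasized": +1,
    "reassurance": +1,
    "interruptions": -1,
    "disagreement": -1,
    "client_weakness_emphasized": -1,
}


def composite_score(sample_scores, signs):
    """Unit-weighted signed sum of component scores for one time sample."""
    return sum(sign * sample_scores[name] for name, sign in signs.items())


if __name__ == "__main__":
    # Invented averaged ratings for a single 3-min sample.
    sample = {
        "encourages_to_continue": 3.0,
        "agreement": 0.5,
        "client_strength_emphasized": 0.25,
        "reassurance": 0.25,
        "interruptions": 1.0,
        "disagreement": 0.0,
        "client_weakness_emphasized": 0.5,
    }
    print(composite_score(sample, SUPPORTIVENESS_SIGNS))  # prints 2.5
```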

Results from the nine multiple regression 
analyses are presented in Table 6. Overall, 
therapist behaviors did not account for the 
major part of the variance in RI-G scores, 
generally explaining between 15% and 30% 
of the variance in CPTR. However, in the 
early phase of both Sessions 6 and 8, thera- 
pist behavior accounted for substantial pro- 
portions of RI-G variance (i.e., 48% and 
53%). 
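
In a stepwise regression, the variance credited to each predictor is the increase in cumulative R² at the step where it enters. As a small worked check, the sketch below uses the cumulative R² values for Session 8, Time 1 as read from Table 6 (.16, .35, .45, .53); the function name is an illustrative assumption.

```python
def r2_increments(cumulative_r2):
    """Increase in R-squared contributed at each step of a stepwise model."""
    increments = []
    previous = 0.0
    for r2 in cumulative_r2:
        increments.append(round(r2 - previous, 2))
        previous = r2
    return increments


if __name__ == "__main__":
    # Cumulative R-squared after each predictor entered (Session 8, Time 1).
    session_8_time_1 = [0.16, 0.35, 0.45, 0.53]
    print(r2_increments(session_8_time_1))  # prints [0.16, 0.19, 0.1, 0.08]
```

The four increments sum to .53, the 53% of RI-G variance cited for the early phase of Session 8.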

Although there was much variation from 
session to session, and time period to time 
period, certain styles were consistently posi- 
tively related to favorable CPTR (i.e., focus 
on behavior or cognitions; nonverbal style; 
cognitive restructuring), whereas one was 
often inversely related to RI-G scores (i.e., 
significant others). 


Table 4 
Therapeutic Relationship as a Predictor of Outcome 


RI-G session 
Outcome measure 3 6 8 

Posttest outcome 
Rathus Assertiveness Schedule .29* 
Fear of negative evaluation -.32** 
Social avoidance/distress RE 
MAACL anxiety -.39** 
Bem SRI: Masculine .29* 
S-RIA Initiation -.45*** 
S-RIA Refusal — 534+ 
S-RIH Initiation -.43*** 
S-RIH Refusal -.50*** 
S-RIA Total "5440 
S-RIH Total —_4gee« 
Role play 
Nervousness -.34* -.52*** 
Baseline nervous —.32%* —34ee« 
Anger — 52 
Eye contact joe 
Disfluency (1st) “ere 
Content 39%" aoe 
Challenges : 43 

Follow-up outcome 
Fear of negative evaluation 
MAACL Anxiety 
MAACL 
S-RIH Total 
Bem SRI: Feminine 


Discussion 


The fact that the client’s perception of 
the therapeutic relationship seems to be a 
potent factor in behavior therapy is con- 
sistent not only with prior research (Gur- 
man, 1977) but also with two important cur- 
rent developments in behavior therapy. First, 
there is growing recognition that the (os- 
tensibly) same technique (e.g., desensitiza- 
tion), when applied by different therapists 
or at different time points or in different 
settings or with different clients, may be 
implemented quite differently and with dif- 
ferent outcomes. Behavior therapists have 


Table 5 
Composite Therapist Behavior Variables 


New variable: Combination of original therapist behavior variables 

Nonverbal voice quality: Nonverbal warmth + nonverbal certainty + nonverbal control + nonverbal 
    energy + nonverbal responsivity - simultaneous speech - disfluency 
    - filled pause - silent pause 

Supportiveness: Encourages to continue or complete + agreement + emphasis on client 
    strengths + reassurance - interruptions - disagreement - emphasis on 
    client weakness 

Instigation/prescription: Information statement: fact + information statement: possibility + request 
    action: command + request action: suggestion + interpretation: client 
    view + interpretation: therapist view + content: client + content: future 

Assessment: Request information: closed ended + request information: open ended 
    + content: outside session + content: client + content: past 

Nondirectiveness: Contentless communication + restatement - number of utterances - ver- 
    bal productivity - implied question or statement 

Technique implementation: Implements technique + models assertive thoughts + reflects client fears 

Self-disclosure: Content: therapist 

Here-and-now: Content: present + content: in session 

Affect: Content: emotions 

Cognitions: Content: cognitions 

Behavior: Content: behavior 

“I” statement: First-person singular 

“We” statement: First-person plural 

Significant others: Content: significant others 


developed precautions (e.g., Paul, 1969) 
against the “therapist uniformity myth” 
(Kiesler, 1966) for many years now. How- 
ever, it is recognized that it is not sufficient 
to only control for “nonspecific” effects: We 
must attempt to determine what it is in the 
moment-to-moment behaviors of therapists 
that affects clients and how these specific be- 
haviors interact with behavior change tech- 
niques—and differences in therapists, clients, 
settings, and target objectives—to produce 
more or less successful outcomes (cf. Wil- 
son & Evans, 1976). Only then will sys- 
tematic training and research replication be 
possible. 

Second, the significance of the client’s per- 
ceptions and cognitions (i.e., attributions, 
self-statements, expectations) in behavior 
change and maintenance has been convinc- 
ingly established by the cognitive therapists 
(e.g., Beck, 1970; Ellis, 1970) and cogni- 
tive behavior therapists (e.g., Lick & Boot- 
zin, 1975; Mahoney, 1974). Self-perception, 


self-reinforcement functions, expectancy of 
gain, cognitive distortions, and cognitive cop- 
ing strategies have been extensively consid- 
ered in recent behavior therapy theory and 
research. Now, CPTR can be added as a 
researchable cognitive factor of potential 
practical importance. 


Variance in CPTR 


Individual differences among therapists 
emerged as the primary determinant of 
CPTR, consistent with previous research on 
therapeutic process in contexts other than 
behavior therapy (cf. Meltzoff & Kornreich, 
1970). Further research will be needed to 
pinpoint just what it is that differentiates 
therapists who evoke more or less CPTR 
(e.g., mood states; cognitive styles). A t test 
conducted to explore the possibility of sex 
differences was nonsignificant, indicating that 
therapist sex was not a factor in this study. 
It should also be noted that replication with 


1310 JULIAN D. FORD 


Table 6: Therapist Style as a Predictor of the Client's Perception of the Therapeutic Relationship 


Predictor                R     R²    R² increase   Coefficient 

Session 3, Time 2 
  Behavior              .46    .21      .21            9.8 
  Significant others    .61    .37      .16           -7.6 

  Nonverbal quality     .38    .15      .15            2.1 
  Significant others    .51    .25      .10           -6.4 

Session 3, Time 3 
  “We” statements       .39    .15      .15            9.1 
  Instigation           .49    .25      .10           12.5 

  “We” statements       .39    .15      .15            9.1 
  Nonverbal quality     .47    .22      .07            1.2 

Session 6, Time 1 
  Cognitions            .41    .17      .17           13.4 
  Significant others    .59    .35      .18          -10.1 
  Here and now          .69    .48      .13           -8.1 

Session 6, Time 2 
  Emotions              .45    .20      .20            9.05 

Session 8, Time 1 
  Behavior              .40    .16      .16           19.1 
  Significant others    .59    .35      .19          -11.7 
  Supportiveness        .67    .45      .10          -24.6 
  Nondirectiveness      .73    .53      .08           24.9 

Session 8, Time 2 
  Cognition             .36    .13      .13            8.0 

Session 8, Time 3 
  “I” statements        .36    .13      .13            7.3 


Note. There were no predictors for Session 3, Time 1 and Session 6, Time 3. 


more experienced therapists will be necessary 
to rule out the possibility that individual 
differences among therapists are only factors 
with relatively inexperienced therapists. 

Surprisingly, therapy type had virtually no 
effect on CPTR. Apparently, the same thera- 
pist can establish rapport as effectively when 
applying systematic cognitive and behavioral 
interventions as when focusing exclusively 
on offering nondirective emotional support. 
This result is consistent with prior research 
[...] behavior therapists are equally 
[...] positive CPTR as compared 
to therapists of other orientations (e.g., 
[...] Kickertz, Hubbard, & Gray- 
ston, [...]; Sloane et al., 1975). 

The particular therapy session also accounted 
for relatively little of the variance 
[...] to previous 
[...] measures have excellent temporal 
[...] stability (Gur- 
man, [...]) [...] the correlations between RI-G 
scores at the three different time points were 
.05 (Session 3 to Session 6), .05 (Session 3 
to Session 8), and -.56 (Session 6 to Ses- 
sion 8). The data suggest that CPTR should 
be measured at several time periods across 
the course of therapy when used as either a 
research or clinical tool. 

Because [...] signifi- 
cance [...] 
relatively small sample sizes, replication with 
larger Ns seems advisable to confirm the 
finding that only therapist factors contrib- 
uted substantially to variance in CPTR, and 
to rule out the possibility of chance findings. 


CPTR as a Predictor of Outcome 


CPTR was shown to have value as a pre- 
dictor of two key therapeutic outcomes: stay- 
ing in therapy and making changes in be- 
havior and self-perception. Although this is 
consistent with previous research (e.g., Ryan 
& Gizynski, 1971; Sloane et al., 1975), rep- 
lication is certainly needed with other be- 
havior therapies, more experienced therapists, 
clinic-referred rather than volunteer clients, 
longer term and open-ended therapies, and in 
a naturalistic rather than research context 
before this conclusion can be generalized be- 
yond the present study’s context. However, 
the results have good external validity for 
several significant settings, populations, and 
therapy types, including novice therapists-in- 
training, brief time-limited therapies (Watz- 
lawick, Weakland, & Fisch, 1974), thera- 
peutic and educational interventions in which 
assertion training is the primary vehicle and 
target (Rich & Schroeder, 1976), and most 
behavior therapy outcome research projects 
(O’Leary & Wilson, 1975). 

CPTR was never an effective predictor of 
long-term maintenance of client improve- 
ments as measured at the 2-month follow-up. 
This suggests that CPTR facilitates and/or 
is predictive of short-term therapeutic change 
but that long-term behavioral maintenance 
requires special therapeutic intervention 
(e.g., training in self-management skills; re- 
structuring the client’s natural environment). 
This is consistent with a major principle of 
behavior therapy: Behavioral generalization 
and maintenance must be programmed rather 
than lamented (cf. Baer, Wolf, & Risley, 
1968). It also reconfirms the notion that 
therapist facilitativeness is a necessary but 
not sufficient ingredient in effective counsel- 
ing and psychotherapy. 


Therapist Behavior as a Predictor of CPTR 


Given the importance of positive CPTR, 
the critical question remains as to what a 
therapist can do to establish rapport with 
each client (i.e., What is the RI-G really 
measuring?). In Session 3, there were no 
significant predictors from the first in-session 
time period, but in the middle part of the 
session, a focus on behavior or a warm, re- 
laxed, and energetic nonverbal style, plus a 
focus on the client and not on her significant 
others, was optimal for CPTR. In the final 
time period, a stress on a collegial client- 
therapist relationship (i.e., “We” statements) 
and either encouragement to take action or 
a positive nonverbal style were associated 
with higher RI-G scores. Thus, it seems that 
CPTR is maximized in this early therapy ses- 
sion when the therapist communicates ver- 
bal and/or nonverbal encouragement, in- 
volvement, concern, and respect for the 
client. The behavioral correlates of CPTR 
in this session closely parallel those identi- 
fied by previous research (Gurman, 1977). 
This is not surprising, since much of the 
previous research was done using behavior 
samples from sessions early in therapy. This 
type of verbal and nonverbal approach is 
thought to be essential to creating “rapport” 
in early sessions by behavior therapists (e.g., 
Wilson & Evans, 1976) as well as by thera- 
pists of other theoretical orientations. 

In the early phase of Session 6, a focus 
on the client rather than on her significant 
others, and in particular on her cognitive 
responses to extratherapy session events and 
situations, was related to higher CPTR rat- 
ings. As in Session 3, CPTR in Session 6 is 
maximized when the therapist shows par- 
ticular interest in the client (and minimized 
when the focus is on her significant others), 
but more specifically on her extratherapy 
cognitive reactions and coping strategies. 
Starting off on the right foot seems important 
in Session 6, because therapist behavior is 
relatively insignificant as a predictor of 
CPTR in the middle and late phases of this 
session. 

Similarly, in the final therapy session, 
therapist behavior is strongly related to RI-G 
ratings only in the early phase. At that point, 
a focus on the client and not on her signifi- 
cant others, with the therapist talking rela- 
tively little and largely reflecting back the 
client’s messages, and with little direct sup- 
port explicitly communicated to the client, 
were predictive of higher RI-G ratings. At 
the middle time point, a focus on cognitions 
was optimal, and in the final part of the 
final session, “I” statements were the sole 
predictor of positive CPTR. This suggests 
that therapists enhanced CPTR in Session 8 
by taking actions that facilitated a smooth 
and, for the client, complete transition be- 
tween therapy and termination: allowing the 


client to take charge and wrap up any un- 
finished business that she might have; not 
attempting to belatedly bolster the client’s 
self-perceptions through supportive state- 
ments; dealing with the client’s expectations, 
concerns, and fears; and providing personal- 
ized feedback to the client. 

Thus, consistent with past research in a 
nonbehavioral therapy context (e.g., Bar- 
rington, 1961), the pattern of therapist be- 
haviors that emerged as predictive of CPTR 
was not static; it changed systematically 
from session to session. Clearly, therapists 
must tailor their style of interaction to fit 
the changing goals and needs of the client 
at different time points in therapy. 

These data suggest a potential explanation 
for why clients’ RI-G ratings varied from 
session to session, and why RI-G scores were 
differentially effective as predictors of dif- 
ferent outcomes at different time points in 
therapy. The data do not support the possi- 
bility that clients were basing their RI-G 
ratings on the same therapist behaviors in 
each session and that their therapists simply 
behaved very differently in different sessions: 
With one exception (i.e., a focus on signifi- 
cant others was detrimental to CPTR in all 
sessions), RI-G ratings were associated with 
different patterns of therapist behaviors in 
each session and at different time periods. 

A more plausible explanation is that clients 
based their RI-G ratings on how well the 
therapist and the therapy were fulfilling 
their expectations, and that different factors 
influenced the clients’ expectations at differ- 
ent points in therapy. Early in therapy, a 
global good-bad therapist/therapy dimension 

that has been hypothesized based on empiri- 

cal data by previous researchers (cf. Mitchell 
et al., 1977), and that has little to do with 
subsequent client gains, but that serves as 

a basis for the client to either stay in or 

drop out of therapy seemed to be most 

important; hence, the importance of such 
behaviors as nonverbal style and emphasis 
on a collegial client-therapist relationship at 
this point. At the mid to late portion of 
therapy, CPTR seemed to be a func- 
tion of the gains that the client saw herself 
making toward her objectives (i.e., becoming 
more assertive and less anxious in extra- 
therapy social situations). The therapists 
seemed to enhance, and perhaps elicit, these 
positive self-perceptions by encouraging the 
client to discuss her successful attempts at 
assertion from the past week and the cog- 
nitions that accompanied such occurrences. 
Finally, in the last therapy session, RI-G 
scores appeared to assess the client’s feelings 
about ending therapy. Those who felt fin- 
ished and ready to move ahead on their own 
gave high ratings, whereas those who felt 
unready to end the supportive relationship 
expressed this fear and dissatisfaction 
through low RI-G scores. Therapist actions 
that prevented the latter and facilitated the 
former cognitive set (e.g., allowing the client 
to deal with the issues that she feels are 
essential, giving meaningful feedback) en- 
hanced CPTR. The significant inverse rela- 
tionship between RI-G scores for Sessions 6 
and 8 seems to have occurred because some 
of the clients who experienced significant 
gains in therapy (and thus tended to give 
high RI-G ratings during Session 6) were 
nevertheless not ready to terminate therapy 
and therefore gave lower RI-G ratings in 
Session 8. Conversely, some of the clients 
who did not make significant gains (and who 
gave low CPTR ratings in Session 6) were 
simply insufficiently assertive to resist the 
strong “hello-and-goodbye” demand charac- 
teristics that generally accompany the end 
of therapy or simply adjusted their expecta- 
tions and self-perceptions so that despite 
failing to make significant changes, they did 
indeed feel ready to leave therapy. A scan of 
the data for individual clients supported this 
explanation: 80% of the clients increased or 
decreased their RI-G ratings from Session 6 
to Session 8, and they were relatively evenly 
distributed in four groups paralleling those 
hypothesized (i.e., some clients who made 
significant gains increased their RI-G ratings 
while some lowered them; similarly, some cli- 
ents who failed to improve significantly in 
therapy raised their RI-G ratings and others 
lowered them). 

Clearly, more detailed research must evalu- 
ate these hypotheses before they can be con- 
sidered as more than speculative. In particu- 
lar, a more direct assessment of the factors 
that are hypothesized as bases for the client’s 




RI-G ratings at different points in therapy 
is necessary (e.g., in addition to administering 
the RI-G, clients could be asked specifically 
about their global evaluations of the thera- 
pist and therapy, their perceptions that they 
were benefiting from therapy, and their 
readiness to end therapy). Further refinement 
of the therapist behavior code is another im- 
portant future direction to eliminate the 
need for a priori groupings of the separate 
therapist behavior code variables. The mea- 
sures that contributed most to the prediction 
of CPTR would seem to be excellent foci for 
such research (e.g., significant others, non- 
verbal quality, “We” statements, cognitions). 
Also, other potential determinants of CPTR 
must be examined in light of the finding that 
therapist behavior rarely accounted for as 
much as 50% of the variance in RI-G scores 
(e.g., client expectations and self-perceptions; 
the nature of the client-therapist interac- 
tion; and therapist characteristics). 

To conclude, this study has demonstrated 
that CPTR is an effective predictor of drop- 
ping out and short-term client gains in a 
behavior therapy context. Differences among 
individual therapists contributed significantly 
to variance in CPTR, but therapy type and 
session did not. Further, a set of specific 
therapist behaviors, which were remarkably 
similar to those postulated by Rogerian cli- 
nicians (Mitchell et al., 1977), were found to 
be predictive of positive CPTR. The limita- 
tions of this study mandate its replication 
before the results can be considered as defini- 
tive, yet the importance of the therapeutic 
relationship in behavior therapy seems clear. 
However, it must be recalled that CPTR was 
never an effective predictor of the funda- 
mental criterion of therapeutic effectiveness: 
long-term client gains. We cannot rule out the 
possibility that in view of the potential reac- 
tivity of the self-report and role-playing 
measures of outcome (Ciminero, Calhoun, & 
Adams, 1977), the strong demand character- 
istics implied by a facilitative therapist may 
have been responsible for the pre-post client 
gains that were related to CPTR. The fact 
that [...] outcome measures showed a 
[...] with CPTR militates against but 
does not rule out this hypothesis. CPTR thus 
appears to facilitate the process of change in 
behavior therapy, but it is clearly not a 
sufficient basis for fully effective intervention. 


Reference Note 


1. Linehan, M., Goldfried, M. R., & Goldfried, A. P. 
A systematic evaluation of three approaches to 
assertion training. Unpublished manuscript, State 
University of New York at Stony Brook, 1977. 
Effects of Aging, Organicity, Alcoholism, and
Functional Psychopathology on WAIS Subtest Profiles 


John E. Overall, Norman G. Hoffmann, and Harvey Levin 
Department of Psychiatry 
University of Texas Medical Branch, Galveston 


Multivariate analysis of variance was used to examine the independent effects 
of several factors on Wechsler Adult Intelligence Scale subtest patterns. A dis- 
criminant function contrasting organic brain syndromes with lesser functional 
psychiatric disorders was defined. Alcoholic patients were found to occupy a 
position close to the organic brain syndrome group on that continuum. Aging 
was associated with deficits in specific subtests, whereas the organic brain syn- 
drome pattern involved more generalized deficits. 


The primary purpose of this investigation
was to examine the similarities and differences
in Wechsler Adult Intelligence Scale
(WAIS) profile patterns for various organic
and functional psychiatric diagnostic groups,
with the effects of several other relevant factors
controlled statistically. In recent years,
there has been increasing interest in the nature
of changes in patterns of intellectual
abilities associated with aging and with various
organic and functional disorders, but the
two have seldom been clearly separated.
Most authors agree that in cross-sectional
studies at least, age decrements in intellectual
performance begin to appear rather early
(Jones, 1959; Wechsler, 1958). Such age-related
changes tend to be confounded with
changes that are produced by disease
processes, which are also differentially age related.

The potential diagnostic contributions of
the WAIS have been said to range from the
differentiation between primary depression
and mood alterations that are frequently
associated with senile dementia (Post, 1975)
to discrimination between patients with
Alzheimer’s disease and those having a
strictly functional psychiatric disorder (Mal- 
amud, 1975). The possible differential diag- 
nostic significance of the WAIS in elderly 
patients is supported by the findings of a 
number of other investigators who have cor- 
related changes in WAIS profile patterns 
with estimated degree of cerebral integrity 
(Thompson, 1976), slowing of dominant fre- 
quency of the electroencephalogram (Wang, 
1973), and with reduced cerebral blood flow 
(Wang, Obrist & Busse, 1970). Most studies 
suggest that the Performance subtests that 
correspond to the factor of “perceptual or- 
ganization” (Cohen, 1957) are particularly 
sensitive to neurologically based intellectual 
deterioration in elderly patients. A problem 
with interpretation of such results is that 
aging, from very early on, produces pattern 
changes in the absence of any evidence of 
neuropathology. In most studies, the effects 
that have been reported in elderly patients 
suffering from neurological dysfunctions of
one type or another are almost certainly con- 
founded with the effects of normal aging, 
even though age-group norms have frequently 
been considered. 

Somewhat different from the question of 
diagnostic utility is the hope that subtle dif- 
ferences in profile patterns can lead to a 
better understanding of the nature of cogni- 
tive, perceptual, and phenomenological defi- 
cits experienced by psychiatric patients. 
Overall and Gorham (1972) questioned 
whether the impairment present in normally
aging individuals is similar to that observed 
in patients with diagnoses of chronic brain 
syndrome. Using the method of multiple dis- 
criminant analysis with WAIS subtest pro- 
files, they found that normally aging in- 
dividuals seemed to differ on one dimension, 
whereas the chronic brain syndrome patients 
differed from the normal aging groups along 
a separate dimension. Specifically, in the in- 
stitutionalized samples that they studied, 
simple aging produced greater pattern vari- 
ability, whereas old patients with chronic 
brain syndromes evidenced a more general 
deficit that was reflected in both Verbal and 
Performance subtests. 

Subsequently, Williams, Ray, and Overall 
(1973) used the WAIS aging and organicity 
functions defined in Overall and Gorham 
(1972) to investigate the nature of possible 
subtle changes in the phenomenological world 
of the alcoholic patient. They questioned 
whether the preclinical changes associated 
with alcohol abuse are more consistent with 
accelerated mental aging or with a develop- 
ing organic brain syndrome, They found that 
the WAIS subtest profiles for alcoholics in a 
state hospital sample deviated from age norms 
along both the mental aging and organicity 
dimensions. Such findings are compatible with 
other work involving the effects of chronic 
alcohol abuse on intellectual functioning
(Matarazzo, 1972). 

The present study attempts to use a more
appropriate and powerful multivariate statistical
method to separate out and clarify the
nature of WAIS profile pattern changes that
are associated with chronic organic brain
syndrome, alcoholism, and various functional
psychiatric conditions independent of the
effects associated with aging and with background
factors that might also contribute to
apparent diagnostic group differences. The
need for such complex statistical methods
arises because diagnostic groups differ in
average age and in other factors that are
known to affect WAIS profile patterns. Left
uncontrolled, such factors can produce differences
in the WAIS profile patterns that are in
no way characteristic of the disease processes
themselves.


Table 1
Twelve Major Diagnostic Groups and
Their Frequencies

Diagnosis                      n
Organic brain syndrome        29
Drug abuse                    13
Alcoholic                     29
Paranoid schizophrenia        43
Schizophrenia                 52
Manic                         20
Schizoaffective-manic         16
Schizoaffective-depressed     41
Psychotic depression          29
Depressive reaction           91
Situational reaction          33
Personality disorder

A multivariate analysis of variance
(MANOVA) procedure was used to partial out
the effects of several partially confounded
nuisance variables, to estimate the WAIS
profile patterns typical of diagnostic groups
with age held constant, and to estimate the
effects of simple aging with diagnostic group
differences held constant. The prototype for
the MANOVA is a factorial design with equal
numbers of subjects in all cells. The equal cell
frequencies ensure that the effects of any
particular factor, such as age, are equally
represented in each diagnostic group. In practice,
it is not necessary to have each factor balanced
with regard to presence in the various
levels of each other factor. A nonorthogonal
design can be analyzed by least squares
regression methods to obtain estimates of the
same effects that would be observed in a
balanced (orthogonal) design involving the
same factors (Overall, Spiegel, & Cohen,
1975).


Method 


WAIS subtest profiles for a sample of 414 psychiatric
patients were obtained from the files of the
Psychometric Laboratory at the University of Texas
Medical Branch. Final clinical diagnoses were used
to group the patients into 12 major diagnostic
categories, as listed in Table 1. The organic group is of
special interest in this investigation, since it represents
a generally younger and somewhat less chronic
sample with regard to the question of differences
between mental aging and organicity than previously
studied by Overall and Gorham (1972). The
organic brain syndrome group in the present study
includes patients who were given diagnoses of either
chronic or acute organic brain syndrome, but all
were either newly admitted inpatients or outpatients
who were referred for diagnostic testing. No focal
lesion, trauma, or temporal lobe epilepsy diagnoses
were included.


Table 2
Summary of Tests for Main Effects in a Five-Way MANOVA Design

                Hotelling trace     Wilks's    Pillai-Bartlett
Variable          χ²       df          Λ          F        df
Age             142.16     33        .6942      4.19**   33, 1158
Ethnicity       118.37     22        .1376      5.56**   22, 770
Sex              84.50     11        .8045      8.48**   11, 384
Parental SES     39.86     22        .9026      1.80*    22, 770
Diagnosis       204.25    121        .5951      2.53**   121, 4334

Note. MANOVA = multivariate analysis of variance; SES = socioeconomic status.
* p < .05.
** p < .001.

The independent effects of age and diagnosis were 
of primary interest in the analyses of the WAIS 
subtest profiles for these psychiatric patients. Sex 
and race of the patient and social class of the pa- 
tients’ parents were selected as control variables that 
should substantially affect the expected IQ and 
profile pattern without being influenced by the 
cognitive abilities or clinical status of the patient. 
The desire was to control statistically for premorbid 
differences so that the actual effects of age and 
psychopathology could be seen more clearly without
partialing out anything that might represent an
effect of the variables of interest. A five-way Age ×
Race × Sex × Socioeconomic Status (SES) × Diagnosis
MANOVA was used to test the effects of each
factor independent of effects on WAIS profile pat- 
terns that might be attributed to other factors in 
the design. Univariate analyses of variance were also
calculated for each WAIS subtest using the same
five-way design. A computer program by Woodward
and Overall (1974) was used to accomplish the
multivariate and univariate analyses in a single pass.


Table 3
Adjusted Mean WAIS Subtest Profiles for Four Age Groups
in the Psychiatric Population

Subtest                 <30    30-39   40-49    50+     F(a)
Information             7.84    8.22    7.84    7.98     .65
Comprehension           8.01    8.71    7.85    7.49    2.43
Arithmetic              7.07    7.48    7.03    6.61    1.31
Similarities            9.50    8.62    7.17    7.20    9.91*
Digit Span              7.12    8.36    7.44    7.25    2.35
Vocabulary              8.32    8.54    7.86    8.03     .97
Digit Symbol            7.80    6.97    5.56    4.47   28.11*
Picture Completion      8.41    8.20    7.00    6.36   11.46*
Block Design            7.69    7.48    6.23    5.58    9.93*
Picture Arrangement     7.95    7.75    6.54    6.19    7.02*
Object Assembly         7.90    7.63    6.68    6.05    5.66*

Note. WAIS = Wechsler Adult Intelligence Scale.
(a) df = 3, 394.
* p < .01.


Figure 1. Configural relationships among diagnostic
groups (organic brain syndrome, 1; drug abuse, 2;
alcoholic, 3; paranoid schizophrenia, 4; schizophrenia,
5; manic, 6; schizoaffective-manic, 7;
schizoaffective-depressed, 8; psychotic depression,
9; depressive reaction, 10; situational reaction, 11;
and personality disorder, 12).


Results


The MANOVA resulted in the recognition
that age, race, sex, and SES all have highly
significant independent effects on WAIS subtest
profiles and that WAIS subtest profiles
for the 12 clinical diagnostic groups differ
significantly after the effects of the other
factors are partialed out. A summary of results
from the tests of main effects in the
five-way MANOVA is presented in Table 2.
Three different test statistics are shown for
each effect: an approximate chi-square test
based on the sum of all roots of the
$|B - \lambda W| = 0$ determinantal equation
(Hotelling, 1951), a chi-square test based on
Wilks's likelihood ratio criterion lambda
(Wilks, 1932), and the Pillai-Bartlett F
approximation (Olson, 1976). The latter test is
widely accepted as best for MANOVA purposes,
although in the present case all tests lead to
the conclusion that each factor has a highly
significant independent effect on WAIS subtest
profiles.
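
As an illustration of how the three criteria named above relate to one another, the following minimal sketch computes each of them from a set of roots of the determinantal equation. The roots used are hypothetical values chosen for illustration, not values from this study, and the function name `manova_criteria` is an assumption of the sketch. The chi-square and F approximations that convert these criteria into the test statistics of Table 2 depend on the design's degrees of freedom and are not reproduced here.

```python
import math


def manova_criteria(roots):
    """Return (Hotelling trace, Wilks' lambda, Pillai-Bartlett trace)
    computed from the nonzero roots of the determinantal equation."""
    hotelling_trace = sum(roots)                              # sum of all roots
    wilks_lambda = math.prod(1.0 / (1.0 + r) for r in roots)  # product of 1 / (1 + root)
    pillai_trace = sum(r / (1.0 + r) for r in roots)          # sum of root / (1 + root)
    return hotelling_trace, wilks_lambda, pillai_trace


# Hypothetical roots for illustration only (not taken from the study).
roots = [0.25, 0.10, 0.02]
trace, wilks, pillai = manova_criteria(roots)

print(f"Hotelling trace = {trace:.4f}")
print(f"Wilks' lambda   = {wilks:.4f}")
print(f"Pillai trace    = {pillai:.4f}")
```

All three criteria are functions of the same roots, which is why they usually, though not always, lead to the same conclusion.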

Adjusted mean subtest profiles for patients 
in the four age groups are presented in Table 
3, together with the univariate F ratios for 
testing the significance of effects of age on 
each subtest separately. The interpretation 
of “adjusted means” in a nonorthogonal design
should perhaps be clarified at the outset.
The adjusted means are estimates of the
means that would obtain if each age group
included equal numbers of males and females,
as well as equal numbers in each of the three
social classes, the three ethnic groups, and
the 12 diagnostic groups. Statistically, they
estimate the same effects that one would
otherwise estimate by sampling in such a
manner that all cells in the complex factorial
design contained the same N (Overall et al.,
1975). It is in this manner that the effects of 
other factors are balanced out so that the 
effect of any one factor can be evaluated in- 
dependently of any confounding with other 
factors in the design. 
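
The logic of adjusted means can be sketched with a small numerical example. The sketch below is not the authors' program: it assumes a hypothetical two-factor design (age by diagnosis) with unequal cell sizes, invented additive effects, and no error term, and it fits the additive model by ordinary least squares. All names and numbers are illustrative assumptions. Because older patients are overrepresented in the organic cell, the raw age-group means exaggerate the age effect, whereas the adjusted means recover it.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


# Hypothetical unbalanced design: (old, organic) -> number of patients in the cell.
cell_counts = {(0, 0): 8, (0, 1): 2, (1, 0): 2, (1, 1): 8}


def true_score(old, organic):
    # Invented additive effects: aging costs 2 points, organicity costs 3 points.
    return 10.0 - 2.0 * old - 3.0 * organic


# Build the dummy-coded design matrix (intercept, old, organic) and the scores.
rows, y = [], []
for (old, organic), n in cell_counts.items():
    for _ in range(n):
        rows.append([1.0, float(old), float(organic)])
        y.append(true_score(old, organic))

# Least squares fit via the normal equations (X'X) b = X'y.
xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
xty = [sum(r[i] * v for r, v in zip(rows, y)) for i in range(3)]
b0, b_old, b_org = solve(xtx, xty)


def raw_mean(old):
    """Observed mean for an age group, ignoring the imbalance across diagnoses."""
    vals = [v for r, v in zip(rows, y) if r[1] == old]
    return sum(vals) / len(vals)


def adjusted_mean(old):
    """Fitted mean for an age group, averaged over diagnoses with equal weight."""
    return sum(b0 + b_old * old + b_org * org for org in (0, 1)) / 2


raw_young, raw_old = raw_mean(0), raw_mean(1)
adj_young, adj_old = adjusted_mean(0), adjusted_mean(1)

print(f"raw means:      young {raw_young:.2f}, old {raw_old:.2f}")
print(f"adjusted means: young {adj_young:.2f}, old {adj_old:.2f}")
```

In this constructed case the raw difference between age groups is 3.8 points, while the adjusted difference equals the 2 points built into the example.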

It is apparent in Table 3 that age had se- 
lective effects on WAIS subtest profiles in 
the psychiatric population, just as has been 
reported in the general population. The age 
effects were pronounced on all Performance 
tests and on the Similarities subtest of the 
Verbal section. Age effects were not observed 
on Information, Arithmetic, and Vocabulary 
subtests, and there was only a nonsignificant 
trend toward a decrement in Comprehension 
and Digit Span. Thus, considering that the 
Similarities subtest is the only Verbal sub- 
test requiring “fluid intelligence” (Cattell, 
1963), deficits associated with aging in the 
psychiatric population are clearly represented 
in an increase in the discrepancy between 
“crystalized” verbal abilities and cognitive- 
perceptual-motor behavior requiring flexibil- 
ity, speed, and agility. 

Adjusted subtest means for the 12 clinical 
diagnostic groups are presented in Table 4. 
In this case, the profiles have been adjusted 
to reflect a common average age, race, sex,
and SES. Independent of age and the other
factors, patients in the different diagnostic
groups evidenced significantly different WAIS
profile patterns. The diagnostic group differences
can be seen to be represented in both
Verbal and Performance subtests to a greater
degree than was true with the age effects.
Again, the diagnostic effects represented in
these adjusted profiles are associated with
psychopathology independent of age and other
factors. Whereas age effects were absent in the
Information, Arithmetic, and Vocabulary subtests,
significant diagnostic group differences
were present on those subtests, as well as on
the Performance subtests, which were also
sensitive to age. No significant diagnostic
group differences were present for Similarities,
which was the only Verbal subtest that was
affected by age.

The MANOVA between diagnostic groups is
equivalent to a multiple discriminant function
analysis calculated on WAIS profiles from
which the effects of age, race, sex, and SES
have been eliminated. The analysis confirmed
that all of the significant group differences
were represented in two major dimensions.
The sum of all roots of the error-adjusted
between-groups matrix was $\sum \lambda_i$ = [illegible],
which can be referred to a chi-square distribution
with $p(k - 1) = 121$ degrees of freedom.
The first and second roots were $\lambda_1 = 79.8$ and
$\lambda_2 = 41.4$, with $(p + k - 2)$ and $(p + k - 4)$
degrees of freedom, respectively. The sum
of all remaining roots, after the first two
were subtracted out, was $\sum \lambda_i = 96.0$, which is
approximately distributed as chi-square with
81 degrees of freedom, and was not statistically
significant. Thus, the data provide no
evidence that more than two dimensions are
required.
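
The degrees of freedom quoted above can be checked by simple arithmetic, taking p = 11 subtests and k = 12 diagnostic groups. The sketch below does so and also refers each chi-square value to its distribution using the Wilson-Hilferty normal approximation. That approximation is a convenience chosen for this sketch because it needs only the standard library; it is not the procedure used in the study, and the function name is illustrative.

```python
from statistics import NormalDist

p, k = 11, 12  # 11 WAIS subtests, 12 diagnostic groups

df_total = p * (k - 1)        # df for the sum of all roots
df_first = p + k - 2          # df for the first root
df_second = p + k - 4         # df for the second root
df_residual = df_total - df_first - df_second  # df for the remaining roots


def chi2_upper_tail_approx(x, df):
    """Approximate P(chi-square > x) with the Wilson-Hilferty normal approximation."""
    z = ((x / df) ** (1 / 3) - (1 - 2 / (9 * df))) / (2 / (9 * df)) ** 0.5
    return 1.0 - NormalDist().cdf(z)


p_first = chi2_upper_tail_approx(79.8, df_first)
p_second = chi2_upper_tail_approx(41.4, df_second)
p_residual = chi2_upper_tail_approx(96.0, df_residual)

print(df_total, df_first, df_second, df_residual)
print(f"first root:      p ~ {p_first:.2g}")
print(f"second root:     p ~ {p_second:.2g}")
print(f"remaining roots: p ~ {p_residual:.2g}")
```

Under this approximation the first two roots fall well beyond the .01 level, whereas the residual sum does not reach the .05 level, in line with the conclusion that two dimensions suffice.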

The configuration of group means in the 
plane defined by the first two discriminant 
functions is displayed graphically in Figure 1. 
The first, or horizontal, dimension contrasts 
the organic group with several of the least 
impaired of the functional psychiatric groups. 
It is defined primarily as a contrast between 
Information and Vocabulary on the one hand 
and Arithmetic and Digit Symbol on the other. 
Individuals who score toward the organic end 
of this contrast function tend to score low on 
Arithmetic and Digit Symbol relative to 
their scores on Information and Vocabulary. 
This interpretation can be verified by calcu- 
lating the simple contrast from the adjusted 
group means in Table 4. 

The second, or vertical, dimension separates
the organic group from the more accelerated
schizoaffective, manic, and manic-depressive
groups. It tends to emphasize a
general deficit in performance of individuals
who score toward the organic end of the
vertical continuum. If one considers the sum
of Vocabulary and Digit Symbol scores,
rather than their difference, the ordering of
diagnostic groups in the vertical dimension
can be reasonably well approximated. The
ordering in the vertical dimension can be
even more closely approximated by the sum
of adjusted mean scores on all 11 subtests in
Table 4.


Table 4
Adjusted Mean WAIS Subtest Profiles for 12 Diagnostic
Groups in the Psychiatric Population

Columns: Subtest; Organic brain syndrome; Drug abuse;
Alcoholism; Paranoid schizophrenia; Schizophrenia
(nonparanoid); Manic; Schizoaffective manic;
Schizoaffective depressed; Psychotic depression;
Depressive reaction; Situational reaction;
Personality disorder; F.
Rows: Information; Comprehension; Arithmetic;
Similarities; Digit Span; Vocabulary; Digit Symbol;
Picture Completion; Block Design; Picture Arrangement;
Object Assembly.

Note. WAIS = Wechsler Adult Intelligence Scale.
* p < .05.
** p < .01.


Considering the question of a general
deficit, as well as the more specific pattern
deficits, it is important to recognize that the
WAIS subtest profiles in this analysis were
adjusted statistically for the effects of age, 
race, sex, and parental social class. It is diffi- 
cult clinically to take all relevant factors into 
consideration in deciding whether a general 
or specific pattern deficit is present. The aim 
here is merely to understand what the effects 
of organic and functional psychopathology 
are on the cognitive-perceptual-motor per- 
formance of psychiatric patients. The results 
suggest that organic patients manifest a gen- 
eral impairment, which is most marked on 
tasks requiring perceptual-motor speed and 
agility. As compared with other types of 
psychopathology, moderate hypomania ap- 
pears to have a less detrimental effect on 
WAIS performance. 

It is interesting to note the proximity of 
the alcoholic group to the clinically organic 
group in the WAIS subtest space. None of the 
alcoholic patients had a clinically recognizable 
organic brain syndrome; yet, as a group they 
ranked next to the organic brain syndromes 
on both the specific (horizontal) and general 
(vertical) dimensions of deficit. Again, it
would be difficult to evaluate such trends clinically
because of the need to control for other
factors that influence WAIS performance;
however, the results do suggest that WAIS
profile patterns are sensitive to subtle pre- 
clinical changes associated with alcohol abuse 
and that the brain syndromes that occur as 
end points in alcoholism are not precipitous 
outcomes. 

Ethnicity and sex of patients and social 
class of patients’ parents were included pri- 
marily as control variables to adjust statisti- 
cally for major premorbid differences that 
should not be confused with the effects of age 
or psychopathology. As a consequence, the 
results pertaining to the effects of those fac- 
tors will be discussed only very briefly. It 
should be noted that educational achievement 
was not partialed out because it is a
result of IQ.

The total patient sample was subclassified
as white, black, or Mexican American. Two
discriminant functions were found to separate
the groups in the WAIS subtest space.
The first separated the three groups in the order white,
Mexican American, and black. It was a gen- 
eral factor that correlated highly with the 
total score on all subtests. The second dis- 
criminant function separated Mexican Ameri- 
cans from the other two groups, and it tended 
to be a contrast between Verbal and Perform- 
ance subtests, with the Mexican Americans 
scoring relatively better on the Performance 
subtests than on the Verbal subtests. 

Sex differences were apparent in this psy- 
chiatric population after adjustment was made 
for age, race, parental SES, and clinical diag- 
nosis. Males scored significantly higher on 
Information, Arithmetic, and on all Perform- 
ance subtests except Digit Symbol. Females 
did not score significantly higher on any sub- 
test. 

Social class of parental home was taken 
from a history form completed by the psy- 
chometrists who did the testing. Although it 
was routinely inquired about, there is some 
concern that inferences based on social 
achievement of the patients themselves may 
have crept into the scoring of this item. Social 
class was scored as lower, lower middle, or 
middle, and upper. After adjustment was 
made for age, race, and diagnosis, the effect of 
the SES factor was highly significant on all 
subtests. A single significant dimension sepa- 
rated the groups in the anticipated order, with 
Information and Vocabulary being the two 
most discriminating variables. 

The multiway MANOVA provided evidence
that diagnostic groups differ in WAIS profiles
independently of age and that age groups
differ independently of diagnosis. It did not,
however, provide a basis for concluding
whether the pattern of effects produced by
aging is different from the pattern associated
with organic brain syndrome. Neither of the
two discriminant dimensions in Figure 1 can
be considered to be the organicity dimension,
since the organic brain syndrome group differed
from the other groups along both dimensions.
A 45-degree rotation of the horizontal
axis in Figure 1 was used to define a
function that tends to separate the organic
group from all other groups along a single
dimension in the age-corrected WAIS subtest
space. The MANOVA directly provided a single
age function in the diagnosis-corrected WAIS
subtest space. The weighting coefficients defining
these two discriminant functions are
presented in Table 5.
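
The effect of rotating an axis by 45 degrees can be sketched as follows. The group coordinates are hypothetical values invented for illustration (the positions plotted in Figure 1 are not given in the text), and the function name `rotate_axis` is an assumption. A group that stands apart on both original dimensions stands apart most clearly on the rotated axis, whose score is an equally weighted combination of the two original scores.

```python
import math


def rotate_axis(d1, d2, degrees):
    """Project the point (d1, d2) onto an axis rotated `degrees` from the first dimension."""
    theta = math.radians(degrees)
    return d1 * math.cos(theta) + d2 * math.sin(theta)


# Hypothetical group centroids (dimension 1, dimension 2); not the study's values.
centroids = {
    "organic": (2.0, 2.0),
    "alcoholic": (1.2, 1.0),
    "depressive reaction": (-0.5, 0.4),
    "manic": (0.3, -1.0),
}

rotated = {name: rotate_axis(d1, d2, 45.0) for name, (d1, d2) in centroids.items()}

for name, score in sorted(rotated.items(), key=lambda kv: -kv[1]):
    print(f"{name:20s} {score:6.3f}")
```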

Further attention can now be directed to 
whether mental aging and organicity involve 
similar or different profile pattern effects in 
the psychiatric population. It has already 
been noted in Table 4 that diagnostic groups 
differed significantly on certain Verbal sub- 
tests that were not significantly affected by 
age as shown in Table 3. Examination of the 
weighting coefficients in Table 5 reveals that 
the aging function is more clearly a contrast 
between Verbal and Performance subtest 
scores, whereas the organicity function con- 
trasts subtests within the two major domains. 
The largest effect of both aging and organicity 
was on the Digit Symbol subtest; however, 
low Vocabulary and Digit Span scores were 
positive indicators for organicity but negative
indicators for mental aging independent of
diagnosis. The variability in profile pattern,
and therefore the variability of weighting co- 
efficients in the discriminant function, is 
greater for mental aging effects than for the 
effects of organic brain syndrome. 
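
How such weighting coefficients are applied can be shown with the article's own numbers. The sketch below forms the weighted sum of subtest scores using the aging coefficients of Table 5 and the adjusted age-group means of Table 3. It assumes that the coefficients are decimals (e.g., .1652) with the signs as printed, and it omits any constant term or scaling, so only the ordering of the resulting scores is meaningful. Variable and function names are illustrative.

```python
SUBTESTS = [
    "Information", "Comprehension", "Arithmetic", "Similarities",
    "Digit Span", "Vocabulary", "Digit Symbol", "Picture Completion",
    "Block Design", "Picture Arrangement", "Object Assembly",
]

# Aging coefficients as read from Table 5 (leading decimal points assumed).
AGING_WEIGHTS = [
    0.1652, 0.0663, 0.1141, -0.1936, 0.0453, 0.0499,
    -0.3399, -0.1035, -0.0400, -0.0526, 0.0005,
]

# Adjusted mean subtest scores for the four age groups, from Table 3.
AGE_GROUP_MEANS = {
    "<30":   [7.84, 8.01, 7.07, 9.50, 7.12, 8.32, 7.80, 8.41, 7.69, 7.95, 7.90],
    "30-39": [8.22, 8.71, 7.48, 8.62, 8.36, 8.54, 6.97, 8.20, 7.48, 7.75, 7.63],
    "40-49": [7.84, 7.85, 7.03, 7.17, 7.44, 7.86, 5.56, 7.00, 6.23, 6.54, 6.68],
    "50+":   [7.98, 7.49, 6.61, 7.20, 7.25, 8.03, 4.47, 6.36, 5.58, 6.19, 6.05],
}


def discriminant_score(profile, weights):
    """Weighted sum of subtest scores (no constant term or rescaling)."""
    return sum(w * x for w, x in zip(weights, profile))


scores = {
    group: discriminant_score(means, AGING_WEIGHTS)
    for group, means in AGE_GROUP_MEANS.items()
}

for group, score in scores.items():
    print(f"{group:6s} {score:7.3f}")
```

Under these assumptions the score rises steadily from the youngest to the oldest group, driven mainly by the large negative weight on Digit Symbol.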

From a conceptual/interpretative point of
view, the results can be understood by considering
the age effect to increase pattern variability
through reduction in “fluid intelligence”
scores and the organic brain syndrome
effect to represent a more general global deficit.
The age groups order in a similar manner
in both dimensions because the “fluid intelligence”
deficit, which increases pattern variability,
also reduces total scores. The organic
brain syndrome effect is almost entirely in the
dimension characterized by overall deficit and
only slightly in the direction of increased
pattern variability.


Discussion 


Table 5
Discriminant Function Coefficients Defining
Independent Mental Aging and Organicity
Effects in WAIS Subtest Profiles

Subtest                 Aging     Organicity
Information             .1652       .0693
Comprehension           .0663       .0830
Arithmetic              .1141       .0326
Similarities           −.1936      −.0331
Digit Span              .0453      −.0687
Vocabulary              .0499      −.0566
Digit Symbol           −.3399      −.3069
Picture Completion     −.1035       .0626
Block Design           −.0400       .0660
Picture Arrangement    −.0526      −.1237
Object Assembly         .0005      −.1743

Note. WAIS = Wechsler Adult Intelligence Scale.


The present study supports earlier work
that indicated that the effects of organic
involvement on the cognitive, perceptual, and
motor functions in patients diagnosed as
suffering from organic brain syndrome are distinct
from the effects of the normal aging process.
Differences attributable to aging in this
psychiatric sample appeared to fit well into
the crystalized versus fluid intelligence concept
proposed by Cattell (1963), in that older
patients tended to show lower scores on those 
subtests that required mental flexibility and 
perceptual-motor speed relative to those sub- 
tests that tap established or practiced cogni- 
tive abilities. In contrast, organic patients 
manifested a more general deficit in intel- 
lectual functioning as measured by the WAIS, 
as well as a different pattern of relative score 
deficits. 

The thrust of this study has been the in- 
vestigation of WAIS subtest scores in an 
attempt to provide some understanding of the 
nature of defects found in psychiatric pa- 
tients. As such, it was not intended to directly 
provide aspects of clinical utility or to sup- 
port specific clinical approaches. The statisti- 
cal adjustments made in these analyses are 
not readily applicable for clinical use. Also, 
significant results in group data do not nec- 
essarily imply sufficient discriminative power 
for application to individual cases. Yet, the 
results reported here would appear to have 
some implications for further work along both 
theoretical and clinical lines.

One such approach might be the replication 
of this study on a sample of patients with 
selected organic and functional disorders in 
conjunction with more neuropsychological 
data on the patients. This would facilitate the 
consideration of specific types of organic in- 
volvement and the assessment of apparent 
cognitive deficits in specific diagnostic groups. 
As was noted in this study, patients with the 
diagnosis of alcoholism appeared most similar
to the organic group, even though alco-
holics with any definite or possible organic 
diagnoses were excluded from the sample. 
Similarly, schizophrenics were also found to 
be more similar to the organic group than 
were those with affective psychoses and the 
less disturbed psychiatric patients. This 
would perhaps suggest confirmation of the 
work of Johnstone, Crow, Frith, Husband, 
and Kreel (1976), who found evidence of 
enlarged ventricles in the in vivo brains of 
schizophrenic patients. 

The current results showing distinct effects 
of age support the use of age-scaled standard 
scores as provided in the WAIS manual or 
other standardized scores adjusted for age 
when the WAIS is to be used in the assess- 
ment of organicity. 

One can also infer differential utility of the 
11 subtests in the evaluation of organicity. 
Some of the WAIS subtests seem more effec- 
tive in tapping those abilities or capabilities 
that are diminished by organic impairment. 
Perhaps a few subtests could be used to 
answer specific questions relative to specific 
types of organic involvement. On the other 
hand, further work might indicate that the 
WAIS is not readily adaptable to such a task, 
but other tests or instruments that provide 
better measurement of the subtle distinctions 
suggested here can be devised to determine the
degree of organic impairment.

Research containing multivariate tech- 
niques such as those used in this study may 
provide increased understanding of cognitive, 
perceptual, and motor deficits in psychiatric 
patients. Some efforts may be primarily of 
theoretical import, but others could make 
contributions to clinical practices. 
Locus of Control in Mexicans and Chicanos:
The Case of the Missing Fatalist 


David Cole and Jacqueline Rodriguez 
Occidental College 


Shirley Cole 


University of Arizona 


We studied the extent to which a stereotype of Mexican or Chicano students 
as fatalistic is supported by their locus of control scores. Data came from locus 
of control scores from male university students in four nations: United States 
(86), Mexico (57), Ireland (47), and West Germany (54). These data show 
the Mexican university students to be more internally oriented than students 
from each of the other nations (p <.001). The study also compares locus of 
control scores for 151 Anglo and 95 Chicano senior high school students from 
three Southern California high schools. Scores for Chicanos are nearly identical 
to those obtained from Anglo students. Only Chicano male high school students 
not planning to enter college showed any tendency toward a more external 
locus of control (p < .05). The article concludes that to the extent a perceived 
external locus of control would be indicative of a fatalistic outlook, such
perception is lacking in most data on Mexican and Chicano respondents.


Recent years have seen several studies
allowing transnational comparisons, usually
among students, in locus of control measures
(Hsieh, Shybut, & Lotsof, 1969; Mahler,
1974; McGinnies, Nordholm, Ward, &
Bhanthumnavin, 1974; Parsons & Schneider,
1974; Schneider & Parsons, 1970; Reitz &
Groff, 1974). Resulting differences between
persons of different nations are then related
to different societal structures, socialization
patterns, or national traditions or customs.
Thus, for example, Hsieh et al. (1969) ascribed
the greater externality found among
Hong Kong Chinese respondents, when contrasted to
American-born Chinese or Anglo Americans,
to differences in outlook that are traditional
for Chinese as contrasted with Americans.
McGinnies et al. (1974) provided a series of
explanations based on Japanese culture and
society for their finding of greater externality
among Japanese respondents as contrasted to
those from three nations with an Anglo-Saxon
tradition. They suggested that the Swedish
social system of maximizing personal security
may account for the relatively external scores
of their Swedish respondents, but they also
noted that the Swedish respondents were
younger than those from other national
groups in their study.


This article was adapted from a paper presented at
the Rocky Mountain Psychological Association,
Phoenix, Arizona, May 1976.
Requests for reprints should be sent to David Cole,
Department of Psychology, Occidental College, Los
Angeles, California 90041.

Among the studies cited above, only that 
by Reitz and Groff (1974) used other than 
student respondents. Reitz and Groff fo- 
cused on locus of control as a function of the 
degree of economic and industrial develop- 
ment in a nation and used nonsupervisory 
laborers from the United States, Mexico, 
Japan, and Thailand as respondents. There 
were no significant main effects of the state 
of economic development, although differ- 
ences did appear on the subscales of the 
Rotter test (Rotter, 1966). There was a 
significant main effect when the western na- 
tions were compared to the Asian nations, 
with the western nations being overall more 
internally oriented. Comparing the United 
States with Mexico, the Americans were more 
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Table 1 
Mean Locus of Control (I-E) Scores 

Group                           n            M 

U.S. business administration    38           10.03 
U.S. liberal arts               48           10.15 
Germany                         [illegible]  [illegible] 
Ireland                         [illegible]  10.23 
Mexico                          [illegible]  5.88 
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type exists among American students similar 
to those studied in this investigation, a sam- 
ple of 53 students at the liberal arts college 
(cited in Table 1) were asked to complete 
the Rotter scale as they believed they would 
answer if they were Mexican students attending 
a Mexican university. In contrast to 
the mean of 5.88 actually obtained from 
Mexican students, or the mean of 10.15 obtained 
from American liberal arts students 
answering for themselves, these “pseudo-Mexican” 
students produced a mean of 15.55. 
This is significantly more external than the 
mean of actual Mexicans (p < .001) or 
Americans answering for themselves (p < 
.01). Combining these data with the observations 
of the several authors cited above, we 
contend that the presence of a stereotype of 
Mexicans and/or Mexican Americans as passive 
and fatalistic is justified. In many geographic 
areas of the United States, and especially 
those from which we come, counselor 
and teacher interaction with Chicano and 
Mexican students is widespread and daily. 
There is ample reason to investigate the validity 
of a stereotype that is likely to influence 
counselor and teacher prediction of a 
student's academic behavior. 

The contention that perception of an external 
locus of control would indicate a fatalistic 
outlook seems warranted by what is implied 
by such a perception. If one believes, as 
a person with a perceived external locus of 
control is presumed to believe, that positive 
and/or negative events in one’s life are 
beyond one’s personal control, this seems 
clearly to invite, if not define, a fatalistic 
outlook. 


Method 
Subjects 
In the study reported here, the Rotter Internal-External 
Locus of Control (I-E) Scale (Rotter, 
1966) was administered to male, Catholic, 
business administration [...] 
first translated into the appropriate foreign language 
(Spanish or German) by a bilingual psychologist, 
whose first language is English. The translation was 
then given to a bilingual colleague, whose first 
language is Spanish or German, respectively. This 
colleague translated the scale back into English, 
without seeing the original scale. Translation was 
considered accurate when this translation back into 
English corresponded with the original English version. 
When results from Mexico proved striking, the 
translation process was repeated, as a double check 
against possible errors. No changes were made as a 
result of this duplication of the initial work. By 
using business administration majors, we obtained 
students in the four nations who were pursuing 
highly similar curricula, with similar career 
goals. In limiting the respondents to those who were 
at least nominally Catholic, we attempted to control 
for the potentially relevant variable of religious 
frame of reference as a factor influencing one’s 
perceived locus of control. When data analysis 
showed these two variables to be irrelevant, we 
added a second group of U.S. students, from a liberal 
arts college, to expand the base of the American 
sample, but we kept this second group separated 
from the American business administration students 
throughout the data analysis. Because sex has been 
found to be a variable in I-E scores (McGinnies et 
al., 1974) and because no females were available 
from Ireland and very few from West Germany, we 
confined the sample to males. All scales were administered 
in the classroom, by native speakers. The 
Irish sample came from a university in Dublin, the 
German from a university in the western portion of 
West Germany, and the Mexican sample from a 
university in south-central Mexico. The American 
business administration students came from a private 
Catholic university on the west coast, and 
the liberal arts students came from a private liberal 
arts college also on the west coast. West German 
and Irish students were included because they were 
available to us, and they provided, along with the 
American data, a much broader base of national 
groups against which to compare and contrast the 
Mexican data. 

At the outset of the study, the Levenson scale 
(Levenson, 1974) was not available. It was available 
by the time we started gathering data from 
Chicanos, however, and its ready analysis into three 
separate scales made it worthwhile to use, particularly 
since direct score comparison with the university 
student groups was not needed. For the 
high school portion of the study, the Levenson scale 
was administered in classrooms to 12th-grade students 
in three Los Angeles area high schools, one 
predominantly Anglo and two almost entirely Chicano 
in makeup of student population. In the Chicano 
high schools, three different schedules of 
administration were used: In some classes a Chicano 
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Anglo classrooms were administered by an Anglo. 
In all three schools, data were collected from stu- 
dents in college preparatory curricula and from 
students not in such curricula. 


Results 


The results from the testing of university 
students in the four nations are presented in 
Table 1. 

The mean score for the Mexican respond- 
ents was significantly lower than each of the 
other means (p < .001, in each instance). No 
significant differences were found between the 
means of any of the other national groups. 

Because Reitz and Groff (1974) presented 
their data in terms of the percentage of 
externally oriented answers given on each of 
the five Rotter subscales developed by 
Schneider and Parsons (1970), we made the 
same breakdown in analyzing the present 
data. Our figures are presented in Table 2. 
Differences in proportions were tested for 
significance by z scores, again following 
the method of Reitz and Groff. Beyond those 
cited in Table 2, only one other comparison 
reached significance. The U.S. liberal arts 
students scored more external on the Respect 
subscale than did the West Germans. Except 
for this, all significant differences attest to 
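the greater internality of the Mexican respondents. 

Table 3 presents the data of Reitz and Groff (1974) 
for factory workers alongside our own data for students 
in the United States and Mexico. On 
each subscale the Mexican students scored more internal than 
the Mexican laborers. American students were more 
internal than American laborers on the Politics 
subscale and more external on the Respect subscale. 
The liberal arts students were also more external than 
American laborers on the Leadership subscale, but this 
did not hold for the other student group. 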
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Table 2 

Comparisons of Percentage of External Responses by Category 

                                                       Category 
Group                                 Luck     Politics  Respect      Academic  Leadership 

M % external 
U.S. business administration          41       44        57           49        30 
U.S. liberal arts                     40       40        62           46        40 
Germany                               38       39        39           45        43 
Ireland                               45       54        46           28        43 
Mexico                                19       38        26           20        26 

z scores of differences in % external 
Mexico: U.S. business administration  2.44**   ns        3.10***      3.02***   ns 
Mexico: U.S. liberal arts             2.50**   ns        [illegible]  2.88***   [illegible] 
Mexico: Germany                       2.23*    ns        ns           2.87***   2.21* 
Mexico: Ireland                       2.89***  ns        2.15*        ns        2.17* 


Note. Only one other comparison reached significance. U.S. liberal arts students scored in a more external 
direction on the Respect scale than Germans (z = 2.34, p < .05). 


* p < .05. 
** p < .02. 
*** p < .01. 
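
The z scores in Table 2 compare two independent proportions of externally oriented answers. A minimal sketch of one common form of that test (the pooled two-proportion z test) is given here. The article does not report how many responses underlie each percentage, so the counts in the example are hypothetical and the output is not meant to reproduce any entry in Table 2; the function names are likewise illustrative only.

```python
import math


def two_proportion_z(p1, n1, p2, n2):
    """Pooled z statistic for the difference between two independent proportions.

    p1, p2 are proportions (0..1); n1, n2 are the numbers of responses
    behind each proportion (assumed values, not taken from the article).
    """
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    return (p1 - p2) / se


def two_tailed_p(z):
    """Two-tailed p value for a standard normal z statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# Hypothetical example: 60% external answers out of 100 versus 40% out of 100.
z = two_proportion_z(0.60, 100, 0.40, 100)
p = two_tailed_p(z)
print(f"z = {z:.2f}, two-tailed p = {p:.4f}")
```

Under this convention a z of about 1.96 corresponds to p < .05, 2.33 to p < .02, and 2.58 to p < .01, which matches the ordering of the significance markers used in the table.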


students. Within ethnic groups, the college- 
bound Chicano male rejected control by 
chance more than did his non-college-bound 
counterpart (p< .05). No significant sex 
differences were found. All of the high school 
groups, whether Anglo or Chicano, tended to 
endorse the idea of internal locus of control 
and to reject the ideas of control by power- 
ful others or chance happenings. 


Table 3 

Comparison of Students and Factory Workers in the United States and Mexico 

                                                        Category 
Group                                        Luck         Politics     Respect      Leadership 

M % external 
Mexican factory workers                      34           47           [illegible]  [illegible] 
Mexican students                             19           38           26           26 
U.S. factory workers                         42           58           43           30 
U.S. business administration students        41           44           57           30 
U.S. liberal arts students                   40           40           62           40 

z scores of differences in % external 
Mexican students: Mexican factory workers    [illegible; one legible entry: 6.77*] 
U.S. factory: business administration        [illegible] 
U.S. factory: liberal arts                   .62          5.53*        5.81*        3.31* 

Note. Factory worker data are taken from Reitz and Groff (1974), who did not report data on the Academic 
subscale. 

* p < .001. 


Discussion 


This study began as an attempt to evaluate 
the validity of the stereotype of Mexicans 
and Chicanos as fatalistic, as that attitude 
would be expressed through perception of an 
external locus of control. Clearly, the Mexican 
university students were not more external; 
indeed, they were significantly more 

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Table 4 
Locus of Control Scores: High School Seniors 

                         College                                  Noncollege 
Group       n    Internal  Powerful others  Chance    n    Internal  Powerful others  Chance 

Males 
  Anglo     47   9.86      -3.88            -5.68     28   11.29     -1.11            -3.46 
  Chicano   23   11.00     -3.74            -5.74*    24   11.04     -1.86            -1.29* 
Females 
  Anglo     38   10.52     -3.24            -5.94     38   10.11     -3.29            -3.11 
  Chicana   29   11.41     -3.24            -3.52     29   11.33     -4.58            -3.52 

* p < .05. 


internal in locus of control than students 
from three other nations. The finding is not 
limited to a single time period, for the data 
were collected at two points in time, about 
7 months apart. The results are not limited 
to the particular instrument, for two of us 
(Cole & Cole, 1977) have reported a similarly 
strong emphasis on internal locus of 
control using the Levenson scale with Mexican 
students. The data from Reitz and Groff 
(1974) allow us to note that this is not an 
artifact of testing students. Their Mexican 
laborers, although not more internal than 
American laborers, were not more external 
either. Indeed, on the Luck subscale, surely 
the psychological heart of fatalism, both 
Mexican students and Mexican laborers recorded 
their least externally oriented scores. 
Thus we feel that these data lend no support 
to, and instead challenge, the stereotype of 
Mexicans as fatalistic. 

As to Chicanos, we know of only one 
other study that addressed this issue. 
Garza and Ames (1974) reported that Chicano 
university students scored in a significantly 
more internal direction than Anglo 
students at the same university. Our data for 
Chicano students do not show a group 
difference, but they are striking in the high 
degree of similarity between the scores from 
Anglos and from Chicanos. Only the male 
Chicanos not planning to enter college offered 
even tenuous support for a fatalistic 
stereotype, as in their significantly lower rejection 
of control by chance. Even here, however, 
it is important to note that they did 
not embrace chance; they merely rejected it 
less than other students. 

The findings regarding Mexicans and Chi- 
canos seem so at odds with Anglo folk wisdom 
that a further item analysis seemed indicated. 
Gurin, Gurin, Lao, and Beattie (1969) have 
distinguished between perceived locus of control 
as it applies directly to the self and as a 
perception of a general, but not necessarily 
personal, social condition. Consequently, we 
did an item analysis that contrasted responses 
to the personally worded items of the Rotter 
scale with those describing general social 
conditions. The Mexican students’ responses 
were the same to both types of items. The 
distinction was not a useful one in this in- 
stance. 

Secondary findings within the study seem 
to warrant only brief comment. It does not 
seem surprising that the Mexican university 
students scored in a more internal direction 
than their labor worker compatriots. As 
young, upwardly mobile students, it follows 
that they would be more convinced of the 
efficacy of their actions than their less ad- 
vantaged compatriots. That this difference 
would be marked in Mexico and not in the 
United States may be accounted for by the 
fact that being a university student in 
Mexico is a considerably rarer 
achievement than in this country. Two of us 
have argued on the basis of other data (Cole 
& Cole, 1977) that an internal locus of con- 
trol will be particularly marked in an indi- 
vidual who has taken a counternormative 
step toward self-improvement. Pursuit of a 
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university degree in business administration 
is probably considerably more unusual in 
Mexico than in the United States. 

The greater internality of the American 
students on the Politics subscale, when com- 
pared to scores on that scale for the American 
laborers, probably can be explained by the 
same rationale as is offered for the differences 
between the Mexican students and the Mexi- 
can laborers. (The American sample was a 
pre-Watergate sample.) On the other hand, 
both American student groups were signifi- 
cantly more externally oriented on the Re- 
spect subscale than the American laborers. 
This difference becomes more easily under- 
stood, however, if attention is directed to 
the content of the four items comprising this 
scale. Two of them deal directly with the 
question of how much assurance one can have 
that one is liked and accepted by one’s peers. 
It is on these two items that the American 
students showed their strongest external ori- 
entation, and thus it seems likely that the 
external orientation on the Respect subscale 
obtained from these two American student 
groups reflects their status as young people 
still unsure of themselves in peer relation- 
ships, an insecurity much less evident in the 
Mexican students. 

Results on the Leadership subscale were 
inconsistent, with one American group differ- 
ing from the laborers and the other not. No 
explanation is offered. 

It is of interest to note that the Irish and 
Mexican student groups share the common 
property of scoring most externally on the 
Politics subscale, something not found in the 
U.S. and West German samples. The political 
situations facing the Mexican and Irish stu- 
dents at the time of the testing may be of 
considerable relevance to this finding. The 
students at the Mexican university were at a 
school where there was a high degree of po- 
litical unrest and opposition to the national 
government and where the data collection 
was held up several weeks while the school 
was closed by government order due to fear 
of student uprising. The Irish students were 
tested during a time of great tension between 
[...] Ireland, where acts of [...] 
part of their daily 
experience. In light of these observations, it 
is hardly surprising that these two groups 
showed more externality on the Politics subscale 
[...] were. 
Returning, however, to the chief thrust of 
this study, evidence to support a stereotype 
of the Mexican university student or the Chicano high school 
senior as fatalistic, believing that his own 
actions are irrelevant to personal outcomes, 
is almost totally lacking. Instead, these 
groups appear equally or more internal in 
perceived locus of control than their American 
counterparts or other groups with whom 
they have been compared. 
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A Multiple-Component Treatment Approach to 
Smoking Reduction 


Charles H. Elliott and Douglas R. Denney 
University of Kansas 


A package treatment program was designed to reduce cigarette smoking, and its 
effectiveness was compared with a single treatment condition (rapid smoking), 
a nonspecific treatment condition, and an untreated control condition. Follow- 
ing the treatment and posttesting sessions, another factor was introduced. One 
third of the subjects in each of the three treatment conditions were randomly 
assigned to specific booster (i.e., additional rapid-smoking) sessions, nonspecific 
booster sessions, or no booster sessions. Since the principal issue in the treat- 
ment of smoking is the maintenance rather than the induction of change, em- 
phasis was placed on follow-up smoking levels 3 months and 6 months after 
the termination of treatment. The package condition was shown to produce 
substantially higher abstinence rates (45%) and lower percentages of baseline 
smoking (41%) after 6 months than the other treatment and control conditions. 


No reliable effects due to booster sessions were found. 


Several reviews of the smoking reduction 
literature (Bernstein, 1969; Bernstein & Mc- 
Alister, 1976; Hunt & Bespalec, 1974; Hunt 
& Matarazzo, 1973; Lichtenstein & Danaher, 
1976; McFall & Hammen, 1971) have 
reached the following conclusions: (a) Vir- 
tually any treatment program is capable of 
reducing smoking levels to 30% or 40% of 
baseline; (b) a return to about 75% of base- 
line is commonly observed from 3 to 6 months 
after treatment; (c) seldom more than 13% 
of the subjects in any treatment program are 
completely abstinent after a 3- to 6-month 
follow-up period; and (d) of those subjects 
who are abstinent at the end of treatment, 
less than one third manage to maintain non- 
smoking 3-6 months later. High relapse rates 
have been observed for a wide variety of be- 
havioral techniques, including systematic de- 
sensitization (Pyke, Agnew, & Kopperud, 
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1966), rapid smoking (Lando, 1975, 1976; 
Lichtenstein, Harris, Birchler, Wahl, & 
Schmahl, 1973; Schmahl, Lichtenstein, & 
Harris, 1972; Sutherland, Amit, Golden, & 
Rosenberger, 1975), covert sensitization 
(Sachs, Bean, & Morrow, 1970), aversive conditioning 
(Berecz, 1972; Whitman, 1972), 
stimulus control (Bernard & Efran, 1972), 
contingency contracting (Lawson & May, 
1970), and behavioral rehearsal (Steffy, Mei- 
chenbaum, & Best, 1970). 
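
The two outcome measures that recur in these reviews, percentage of baseline smoking and abstinence rate, can be sketched as follows. This is only an illustration under stated assumptions: the subject data are invented, the function names are hypothetical, and the group figure is taken here as the mean of each subject's own percentage of baseline, which the reviews cited above do not specify.

```python
def percent_of_baseline(followup_per_day, baseline_per_day):
    """Follow-up smoking expressed as a percentage of the baseline rate."""
    if baseline_per_day <= 0:
        raise ValueError("baseline rate must be positive")
    return 100.0 * followup_per_day / baseline_per_day


def summarize(subjects):
    """Summarize a group of (baseline, follow-up) daily cigarette counts.

    Assumption: the group percentage of baseline is the mean of the
    individual percentages; a subject is abstinent when the follow-up
    count is zero.
    """
    percentages = [percent_of_baseline(f, b) for b, f in subjects]
    abstinent = sum(1 for _, f in subjects if f == 0)
    return {
        "mean_percent_of_baseline": sum(percentages) / len(percentages),
        "abstinence_rate": 100.0 * abstinent / len(subjects),
    }


# Invented data: four subjects as (baseline, follow-up) cigarettes per day.
example = [(20, 0), (30, 15), (10, 10), (40, 0)]
summary = summarize(example)
print(summary)
```

Note that the two measures can diverge: a group can show a high abstinence rate while the remaining subjects have returned fully to baseline, which is why both figures are usually reported together.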

The striking uniformity in these results 
generated by diverse techniques led McFall 
and Hammen (1971) to hypothesize that nonspecific 
factors were responsible for most of 
the reported changes in smoking rates. They 
designed a treatment procedure that incorporated 
only nonspecific factors such as motivated 
volunteering, self-monitoring, expectancy, 
demand, and mild encouragement. The 
procedure resulted in reduction rates and abstinence 
figures comparable to the figures 
cited above. Lichtenstein and Keutzer (1971) 
have argued that since smoking responds so 
readily to nonspecific factors, at least temporarily, 
minimal or nonspecific treatment 
groups constitute more adequate controls 
than do untreated groups in smoking reduction 
outcome studies. Unfortunately, most of 
the above studies have failed to include non- 
specific treatment groups in their design. 
Although initial studies focused on the effec- 
tiveness of single treatment procedures, at- 
tention now appears to be shifting to broad- 
based package therapies for treating maladap- 
tive behaviors such as smoking (Lichtenstein 
& Danaher, 1976), alcoholism (Denney, 
1976), and obesity (Mahoney, Note 1). Two 
basic arguments can be cited in favor of pack- 
age approaches to treatment. First, the multi- 
determined nature of these maladaptive be- 
haviors is becoming increasingly apparent. As 
investigators learn to appreciate the complex- 
ity of these behaviors and the variety of 
functions that cigarettes, alcohol, and food 
can serve for each individual, the necessity of 
package approaches that use a variety of 
techniques becomes clear. Second, from a 
strategic point of view, it would seem ad- 
visable to devise a complex treatment pro- 
gram that achieves the desired results in terms 
of bringing about persistent changes in these 
maladaptive behaviors and then subsequently 
to perform analytical studies to discover the 
effective components operating within the 
package. The current research on the reduction 
of smoking lacks clear demonstrations of 
long-term effectiveness. Only after such effects 
have been produced can the isolation of effec- 
tive component procedures be undertaken. 
Lichtenstein and Danaher (1976) surveyed 
the initial package approaches to smoking 
reduction and concluded that their effectiveness 
was no greater than single approaches. 
Five additional package approaches have since 
appeared in the literature and have generally 
indicated more encouraging results. Best 
(1975) evaluated a treatment package that 
was tailored to subjects’ locus of control scores 
and included rapid smoking, aversive conditioning, 
attitude change techniques, and a 
[...] of environmental determinants of 
smoking. Best reported smoking levels at 30% 
of baseline and abstinence rates of 50% 6 
months after treatment. Pederson, Scrimgeour, 
and Lefcoe’s (1975) treatment package included 
hypnosis, relaxation training, self-monitoring, 
rehearsal of alternative behaviors, 
[...] management training, and discussion. 
The investigators found that 50% of their 
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subjects were abstinent 6 months after treat- 
ment. Delahunt and Curran (1976) devel- 
oped a package combining self-control and 
negative practice procedures. A 6-month ab- 
stinence rate of 56% for the nine subjects in 
the package condition was reported. Lando 
(1977) reported a 76% abstinence rate 6 
months after administering a treatment pack- 
age that included aversion, contractual man- 
agement, booster sessions, group contact, and 
support. The one exception to these more en- 
couraging results that adheres to package 
approaches is a study by Danaher (1977). 
His package approach, which combined rapid 
smoking with a variety of self-control pro- 
cedures, resulted in abstinence rates of 21% 
and smoking levels of 52% of baseline after 
a 13-week follow-up. Furthermore, the pack- 
age treatment in this final study was actually 
less effective than a procedure involving rapid 
smoking alone. 

Danaher’s (1977) pessimistic findings re- 
garding package approaches are especially 
disturbing, since his study corrected for a 
number of methodological weaknesses in the 
preceding studies. All four of the preceding 
studies used single therapists who were aware 
of the hypotheses under investigation, thus 
augmenting the chances of experimenter bias. 
Three of these studies (Best, 1975; Lando, 
1977; Pederson et al., 1975) failed to in- 
clude nonspecific treatment groups, and, as 
Lichtenstein and Keutzer (1971) have ar- 
gued, these groups represent the preferable 
form of control for smoking reduction studies. 
The data collection procedures used in three 
of the studies (Best, 1975; Delahunt & Cur- 
ran, 1976; Pederson et al., 1975) relied ex- 
clusively on subjects’ own counting of the 
number of cigarettes consumed or on their 
estimated rates of smoking, with no systematic 
attempts to check on the accuracy of these 
data. Finally, Lando (1977) required subjects 
to attend additional sessions of aversive treat- 
ment if they returned to smoking after treat- 
ment, a practice that clearly could lead to 
inflated results at follow-up. 

The present study was designed to evalu- 
ate a broad-based treatment package for the 
reduction of smoking and to compare its 
effectiveness with a single treatment procedure 
(rapid smoking), which constituted a major 
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component within the treatment package. 
Thus, the results of this study are directly 
comparable to those reported by Danaher 
(1977). Like Danaher’s investigation, the 
present study incorporated several methodo- 
logical refinements including (a) multiple 
therapists; (b) a nonspecific treatment pro- 
cedure, including information and group dis- 
cussion along with the usual nonspecific fac- 
tors accompanying any treatment; and (c) 
checks on the accuracy of subjects’ cigarette 
consumption data. The selection of components 
for the current treatment package was based 
on research examining the effectiveness of each 
component as a single treatment procedure for 
reducing smoking. In particular, the treatment 
package was designed to enhance the mainte- 
nance and generalization of treatment effects 
by emphasizing the use of general, self-initi- 
ated coping strategies. 

A second purpose of the present study was 
to evaluate the use of booster sessions for 
improving the maintenance of smoking re- 
duction. Although several investigators have 
recommended the use of booster sessions and 
have incorporated such sessions in their treat- 
ment procedures (Chapman, Smith, & Lay- 
den, 1971; Hunt & Matarazzo, 1973; Lando, 
1975, 1977; Lichtenstein et al., 1973; 
Schmahl et al., 1972), only one controlled 
investigation of the effectiveness of booster 
sessions has been reported. Kopel (1974) 
found no differences between subjects who 
received rapid-smoking booster sessions fol- 
lowing treatment procedures involving rapid 
smoking and subjects who had received no 
such booster sessions. The design of the pres- 
ent study allowed us to compare subjects who 
had received specific booster sessions con- 
sisting of additional rapid-smoking trials, sub- 
jects who had received nonspecific booster 
sessions consisting of additional information 
and encouragement, and subjects who had 
received no booster sessions. 


Method 
Subjects 


[...] of Lawrence, Kansas. 
To be accepted into the study, subjects had to be 
smoking at least 10 cigarettes per day and be willing 
[...] would be returned to them 
gradually over the course of the study, contingent 
only on their attendance at treatment and follow-up 
sessions. Of the 108 persons who attended an intro- 
ductory session, 69 complied with these requirements 
and agreed to participate in the study. 

Because of possible health hazards associated with 
rapid smoking (Hauser, 1974; Lichtenstein, 1974), 
subjects were required to complete a medical history 
questionnaire and to have their weight and blood 
pressure measured. Five subjects were dropped from 
the study because of possible health complications. 
One additional subject was dropped later in the study 
because of failure to faithfully use the cigarette 
collection method. 

Pretest and posttest data were therefore available 
for 63 subjects (29 males, 34 females). This sample 
had the following characteristics: average age = 29.4 
years; average education = 15.7 years; average length 
of time smoking = 12.4 years; average estimate of 
daily smoking level = 27.0 cigarettes; and average 
daily number of cigarettes smoked during a 7-day 
baseline period = 19.6 cigarettes. The number of 
subjects in the package treatment, rapid-smoking 
treatment, nonspecific treatment, and untreated control 
conditions were 20, 19, 18, and 6, respectively.¹ 
Three additional subjects were lost during the 6-month 
follow-up period, 1 each from the rapid-smoking 
treatment, nonspecific treatment, and control 
conditions. 


Pretesting 


A pretest session was conducted 1 week prior to 
treatment. During this session, all subjects completed 
two questionnaires. The smoking history 
questionnaire was designed to provide information 
descriptive of the subject sample and to obtain the 
name and phone number of a close friend to help 
check on the accuracy of the cigarette collection 
method. The semantic differential scale was designed 
to assess subjects’ attitudes toward smoking. The 
concept cigarette smoking was rated on ten 7-point 
scales composed of highly evaluative bipolar adjectives 
(e.g., healthy-unhealthy; attractive-ugly; 
soothing-irritating; fragrant-rank). 

After completing these questionnaires, subjects 
were given a stack of cloth pouches, similar in 
design to tobacco pouches, in which to store their 
cigarette butts. Subjects were explicitly told not to 
count their cigarettes but simply to drop their butts 
into the pouches, using a new pouch each day.² The 
next 7 days prior to the first treatment session constituted 
the baseline period during which the first 
count of cigarette consumption was completed. 


¹ Since the subjects in the three treatment conditions 
were later divided into subgroups receiving specific 
booster sessions, nonspecific booster sessions, and no 
booster sessions, the number of subjects in the untreated 
control group was similar to those of the 
other nine cells in the complete design. 

² Pilot testing had shown that subjects could 
follow these instructions without counting the number 
of butts in the pouch. 


Treatment 


Following the pretest session, male and female sub- 
jects were assigned separately to four conditions: 
package treatment, rapid-smoking treatment, non- 
specific treatment, and untreated control. Each of 
the three treatment conditions was administered to 
groups composed of 6-9 subjects. The subjects in 
these three treatment conditions attended three treat- 
ment sessions per week for 3 weeks. All subjects 
continued to collect their cigarette butts and to turn 
them in periodically during the 3 weeks of treatment. 

Five advanced undergraduate psychology majors 
(three males, two females) served as therapists. One 
female therapist was responsible for the nonspecific 
procedure, which included giving brief informative 
lectures about smoking, handing out educational ma- 
terials, collecting cigarette pouches, and giving 
mild encouragement. This nonspecific procedure was 
conducted for all subjects at the start of each treat- 
ment session, after which the subjects went to other 
rooms to receive their specific treatment procedures. 
Accordingly, throughout the study this therapist was 
unaware of the treatment conditions to which subjects 
had been assigned. The remaining four therapists 
were unaware of the hypotheses under investigation, 
and the treatment procedures they dispensed 
were highly standardized, with many components 
being presented by tape recorders. 

At the first treatment session, subjects were told 
that they could attempt to quit smoking either immediately 
or gradually. The importance of accurate 
[...] collection was emphasized, and subjects were 
told that lie detector tests would be used at the end 
of the study to corroborate their conscientious use 
of the pouches. A brief description of each condition 
follows.³ 

Package treatment. In addition to the nonspecific 
procedure, eight component procedures were included 
in the package treatment. A rapid-smoking procedure 
(Lichtenstein et al., 1973) required subjects to smoke 
[...] sec until they were unable to continue. Two 
rapid-smoking trials were presented during each 
treatment session. In the applied relaxation procedure 
(Chang-Liang & Denney, 1976), subjects practiced 
relaxation exercises while seated in a private 
room listening to tape-recorded instructions. Subjects 
were also instructed in the application of relaxation 
as a coping skill during times when they felt a 
[...]. The covert sensitization procedure 
([...], 1972) encompassed both aversive 
scenes and relief scenes. In the former, subjects 
visualized themselves starting to smoke a cigarette in 
a particular setting, becoming nauseous, and vomiting. 
In the relief scene, subjects imagined themselves 
separated from cigarettes and making some 
covert [...] (“I can control this habit”), 
whereupon the imagined feelings of nausea subsided. 
In the systematic desensitization procedure (Gerson 
& Lanyon, 1972), subjects were instructed to relax 
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and to imagine scenes involving situations that com- 
monly elicited smoking. Six hierarchically ordered 
scenes were used in this procedure. The scenes had 
been selected on the basis of responses from a pre- 
vious sample of smokers who responded to a survey 
concerning settings that commonly elicit smoking 
behavior. In the self-reward and punishment pro- 
cedure (Thoreson & Mahoney, 1974), subjects com- 
pleted contractual forms describing rewards and 
punishers that they would deliver to themselves con- 
tingent on their reaching or not reaching a self- 
selected smoking rate. In cognitive restructuring 
(Reed & Janis, 1974), subjects listed typical ration- 
alizations as to why it was difficult for them to stop 
smoking. Later they were taught to formulate alter- 
native ways of thinking about each rationalization 
and to use new self-verbalizations whenever tempted 
to smoke. Behavioral rehearsal (Chapman et al., 
1971) involved practicing both verbal responses for 
turning down cigarettes in a variety of social settings 
and nonverbal behaviors (e.g., rubbing two coins
together) as substitute activities during times when 
one was tempted to smoke. In emotional role playing 
(Janis & Mann, 1965), subjects prepared and then 
enacted scenes in which they learned that they had 
lung cancer and had to inform loved ones of their 
impending death. 

In general, the package treatment introduced a 
wide variety of procedures rather than exhaustively 
dealing with each one. Table 1 illustrates the order 
of presentation and the approximate time devoted to 
each component procedure in the package condition 
during the nine treatment sessions. 

Rapid smoking. This treatment was patterned 
after the rapid-smoking treatments described by 
Lichtenstein (Lichtenstein et al., 1973). Following the 
nonspecific procedure, subjects in this condition re- 
ceived two rapid-smoking trials conducted like those 
described for the package treatment. This sequence 
was repeated during each of the nine treatment 
sessions. 

Nonspecific treatment. Subjects in this condition 
received the standard nonspecific procedure, includ- 
ing lectures, educational materials, mild encourage- 
ment, and data collection. To equate this condition 
with the preceding two conditions in terms of both 
time and plausibility, the subjects engaged in non- 
directive discussion for about 45 minutes following 
the standard nonspecific procedure. 

Untreated control. Subjects in this group collected
their cigarette butts and engaged in the pretest,
posttest, and follow-up sessions, but they received no
intervening treatment. They were told that they
could use any of their own efforts to quit smoking
during the data collection period.


Posttesting 


The posttest session was conducted 1 week after
the last treatment session. Subjects turned in the
cigarette butts collected over the 7 days subsequent
to their last treatment and once again completed the
semantic differential scale. Pouches for the week prior
to the 3-month follow-up session were distributed.


³ More complete descriptions of each treatment
condition are available from the second author on
request.


Table 1
Package Treatment Group

                                              Approximate
Session  Treatment component                  duration (min)

1        Nonspecific factors                       10
         Treatment rationale and ratings           10
         Applied relaxation                        20
         Rapid smoking                             15
2        Nonspecific factors                       10
         Applied relaxation                        15
         Rapid smoking                             15
         Self-reward training                      10
3        Nonspecific factors                       10
         Applied relaxation                        15
         Rapid smoking                             15
         Self-punishment training                  10
4        Nonspecific factors                       10
         Covert sensitization                      20
         Instructions for emotional role playing   15
         Rapid smoking                             15
5        Nonspecific factors                       10
         Covert sensitization                      20
         Emotional role playing                    30
         Rapid smoking                             15
6        Nonspecific factors                       10
         Covert sensitization                      15
         Emotional role playing                    30
         Rapid smoking                             15
7        Nonspecific factors                       10
         Desensitization                           20
         Lecture and distribution of cognitive
           restructuring questionnaire             10
         Rapid smoking                             15
8        Nonspecific factors                       10
         Desensitization                           15
         Cognitive restructuring                   25
         Rapid smoking                             15
         Behavior rehearsal                        10
9        Nonspecific factors                       10
         Desensitization                           15
         Behavior rehearsal                        30
         Rapid smoking                             15


Booster Sessions 
Following the posttest session, subjects in each
of the three treatment conditions were randomly
divided into three booster conditions. Those
assigned to the specific booster condition received
three booster sessions, each of which included a 
brief refresher lecture, mild encouragement, and 
two additional rapid-smoking trials. Those in the 
nonspecific booster condition received three booster 
sessions consisting simply of the refresher lectures 
and mild encouragement. Those in the no-booster 
condition received no contact between the posttest 
and the first follow-up sessions. Booster sessions 
were conducted during the 1st, 3rd, and 5th weeks 
after the posttest session. 


3-Month and 6-Month Follow-Up 


Three months after the posttest session, subjects 
were contacted by phone and were reminded to 
begin using their collection pouches for the next 
7 days. They were also asked to estimate their
daily smoking level. One week later the subjects 
returned to the laboratory to turn in their pouches 
and complete the semantic differential scale. 

The second follow-up, conducted 6 months after 
the posttest, was identical to the first. After the
pouches had been collected and the questionnaire 
was completed, all subjects were debriefed and their 
deposits were returned. Untreated subjects and those 
who were not satisfied with their progress in the 
treatment conditions were offered additional in- 
dividual treatment. 


Data Accuracy Checks 


It was possible to obtain the help of a friend
for 56 of the subjects. The friend agreed to observe
whether the pouches were being used conscientiously
by the subject. These friends were contacted once
during the treatment sessions and once during each
follow-up session. Their reports resulted in the
exclusion of only 1 subject’s data.

As an additional check on the data, 55 subjects
were contacted by phone 4 days after the second
follow-up session. Using a nonreactive measure
inspired by Sushinsky (1972), a confederate attempted
to conduct a marketing survey in which the
subject was asked to report on the quantity of several
nonfood grocery items consumed. One item in the
survey was cigarettes. Forty-six subjects agreed to
take part in the survey. Two were later eliminated
because they suspected a connection to the smoking
study. In no instance did the estimates given by
the remaining subjects vary more than 20% from
either the smoking level estimates or the actual
pouch count collected during the second follow-up
session. Most importantly, not one of the subjects
who claimed to be abstinent admitted to smoking
during the nonreactive call.


Results 
Preliminary Analyses 


A number of preliminary analyses of variance
were performed on the data, producing
the following results: (a) No significant
differences were found among subjects in the
four conditions (i.e., package, rapid smoking,
nonspecific, untreated control) in terms of
age, years of education, years of smoking,
estimated motivation to quit, estimated daily
smoking level, or locus of control; (b) no
significant initial differences existed among
subjects in the four conditions in terms of
either of the repeated measures, that is,
baseline smoking level or attitudes toward
smoking; (c) no significant differences existed
between subjects in the three treatment
conditions (i.e., package, rapid smoking,
nonspecific) in terms of expectancy of success as
measured by a five-item expectancy questionnaire
administered during the first treatment
session; (d) no significant main effects for
sex and no significant Sex × Condition
interactions existed on any of the measures
described above; and (e) no significant main
effects for sex and no significant Sex × Condition
interactions were found in terms of
outcome on the two repeated measures.
Accordingly, the four conditions were considered
equal at the start of treatment, and sex
was eliminated as a factor in all subsequent
analyses.


Figure 1. Cigarette consumption as a percentage of the baseline level of smoking.


Cigarette Consumption 


Percentage of each subject’s baseline smok- 
ing was used to reflect changes in cigarette 
consumption. These percentages were con- 
sidered a more appropriate gauge of treat- 
ment outcome than actual numbers of cigar- 
ettes smoked and could easily be compared 
with other treatment studies (McFall & 
Hammen, 1971).⁴ Figure 1 depicts the
changes that occurred in percentage of base- 
line smoking over the course of treatment, 
posttest, and follow-up sessions. 
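
The percentage-of-baseline score described above is a simple ratio. The following is a minimal sketch of that arithmetic, not the authors' scoring procedure; the function name and the three example subjects are hypothetical and are not data from the study.

```python
def percent_of_baseline(baseline_daily, current_daily):
    """Express current daily smoking as a percentage of the baseline rate."""
    if baseline_daily <= 0:
        raise ValueError("baseline rate must be positive")
    return 100.0 * current_daily / baseline_daily


# Hypothetical subjects: (baseline cigarettes/day, follow-up cigarettes/day).
# These values are illustrative only.
subjects = [(30, 0), (20, 10), (25, 25)]

scores = [percent_of_baseline(b, c) for b, c in subjects]
group_mean = sum(scores) / len(scores)
```

An abstinent subject scores 0 and an unchanged subject scores 100, so a group mean can be pulled down sharply by a few abstainers.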

A 4 (treatment) × 4 (trials) analysis of
covariance was performed on the percentage
of baseline smoking scores for the 3 treatment
weeks and the posttest week. The average
daily number of cigarettes smoked during
the baseline week served as the covariate.
A significant Treatment × Trials interaction
was found, F(9, 177) = 5.96, p < .001. A
series of one-way analyses of covariance and
Duncan pairwise comparison tests were
performed to further analyze this interaction.
Each of the treatment conditions resulted in
significant decreases in smoking across trials
(all ps < .001), whereas the untreated
control condition resulted in no change (F < 1).
At the time of the posttest, the subjects in
the three treatment conditions were smoking
significantly less than those in the untreated
control condition (all ps < .005). Furthermore,
the package condition resulted in
significantly less smoking than the rapid-smoking
condition. The difference between the
package condition and the nonspecific
condition was not significant, but it did favor the
package condition.


⁴ Number of cigarettes smoked was also analyzed
and yielded results that were equivalent to those for
the percentage of baseline smoking.

A 3 (treatment) × 3 (booster) × 3 (trials)
analysis of covariance was performed on the
percentage of baseline smoking scores for
the three treatment conditions during the
posttest, first follow-up, and second follow-up
weeks. Again, the average daily number of
cigarettes smoked during baseline served as
the covariate. This analysis revealed no
significant main effect or interactions involving
booster as a factor. Accordingly, the booster
factor was dropped, and a 4 (treatment) ×
3 (trials) analysis of covariance was
performed on posttest, first follow-up, and
second follow-up data, with the untreated
control condition being added to the design.
Significant main effects for treatments, F(3,
55) = 6.59, p < .001, and for trials, F(3,
55) = 14.85, p < .001, were found. Specific
comparisons revealed that subjects in each
of the three treatment conditions showed a
significant return in the direction of baseline
from the posttest to the subsequent follow-up
weeks (all ps < .005). Subjects in the
untreated control condition showed no change
during this period (F < 1). At the time of
the first follow-up session, subjects in the
package condition were smoking less than
subjects in the rapid-smoking (p < .07),
nonspecific (p < .05), and untreated control
conditions (p < .005). At the time of the second
follow-up session, subjects in the package
condition were still smoking less than subjects
in the rapid-smoking (p < .07) or untreated
control conditions (p < .005). Although the
difference between the package condition and
the nonspecific condition was 28.6%, this
difference failed to attain significance.
At the time of both the first and second
follow-up sessions, the package condition
was the only treatment condition that
differed significantly from the untreated
control condition.

In addition to the percentage of baseline
smoking data, the proportions of completely
abstinent subjects in each condition were
examined. Chi-square analyses demonstrated
significant differences in the proportion of
abstainers in the various conditions at the
posttest session, χ²(3) = 11.1, p < .02, and
at the second follow-up session, χ²(3) = 8.5,
p < .04. At posttest, the package condition
contained a substantially larger proportion
of abstainers (65%) than the control condition
(0%, p < .005), the rapid-smoking condition
(26%, p < .02), or the nonspecific condition
(33%, p < .06). Similarly, at the time of the
second follow-up, the package condition
contained a larger proportion of abstainers
(45%) than the control condition (0%, p <
.06), the rapid-smoking condition (17%, p <
.06), or the nonspecific condition (12%, p <
.03). It is clear that most of the differences
between conditions that occurred in the
percentage of baseline smoking data can be
accounted for in terms of the proportion of
abstainers found in each of the conditions.
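
A chi-square test of the kind reported above compares observed counts of abstainers and nonabstainers in each condition with the counts expected if abstinence were unrelated to condition. The sketch below shows the standard Pearson statistic for a 2 × 4 table. It is an illustration only: the report gives percentages rather than cell counts, so the counts used here are hypothetical, and the function name is an assumption.

```python
def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for a
    contingency table given as a list of rows of observed counts.
    Assumes every row total and column total is greater than zero."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(col_totals) - 1)
    return stat, df


# Hypothetical counts (NOT the study's data): four conditions of 20 subjects.
# Columns: package, rapid smoking, nonspecific, untreated control.
observed = [
    [10, 5, 5, 0],     # abstinent
    [10, 15, 15, 20],  # not abstinent
]
stat, df = chi_square_independence(observed)
```

With two outcome categories and four conditions the test has (2 - 1)(4 - 1) = 3 degrees of freedom, which matches the χ²(3) values reported in the text.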


Attitudes Toward Smoking 


A 3 (treatment) × 3 (booster) × 3 (trials)
analysis of covariance was performed on the
posttest, first follow-up, and second follow-up
scores of the semantic differential scale, with
the corresponding pretest scores serving as
the covariate. This analysis failed to reveal
any significant main effects or interactions
involving booster sessions. Accordingly, the
booster factor was dropped, the untreated
control condition was added to the treatment
factor, and a 4 (treatment) × 3 (trials)
analysis of covariance was performed on the
scores from the semantic differential scale.
This analysis revealed a significant main
effect for treatment, F(3, 55) = 12.06, p <
.001. Duncan pairwise comparisons revealed
that the subjects in the package condition
held significantly more negative attitudes
toward cigarette smoking than subjects in the
other conditions at the time of the posttest
(all ps < .001), with no differences occurring
among the other three conditions. At the
first follow-up session, subjects in the package
condition continued to express more negative
attitudes than subjects in the control
condition (p < .001), the nonspecific condition
(p < .005), or the rapid-smoking condition
(p < .07). At the time of the second
follow-up session, subjects in the package
condition were still evaluating cigarette smoking
somewhat more negatively than those in
the control condition (p < .005), the nonspecific
condition (p < .08), and the rapid-smoking
condition (p < .11). All other comparisons
failed to approach significance.


Discussion 


Six months after treatment, the package
treatment had produced a percentage of
baseline smoking of 41% and an abstinence
rate of 45%. These results are substantially
better than figures reported for single treatment
studies (McFall & Hammen, 1971) and
also better than figures reported by Danaher
(1977) in his package treatment study.
The results are somewhat lower than those
of recent investigations using combinations
of techniques (Best, 1975; Delahunt &
Curran, 1976; Lando, 1977; Pederson et al.,
1975). However, as indicated earlier, these
studies suffered from several methodological
problems that may have inflated
their results. The present study contained a
number of safeguards to improve on the validity
of its results. Experimenter bias was minimized
by using tape-recorded treatment components
and multiple therapists who were blind to the
hypotheses of the study and the conditions
being compared. In addition, a number of
procedures were used to help assure the accuracy
of the cigarette consumption data. Subjects
were informed that they would receive
lie detector tests, informants were used to
verify subjects’ smoking status, and a disguised
market survey was used at the end
of the study.

The package condition also demonstrated
favorable results when compared with
the other conditions in the present study.
At 6 months, the package treatment was
the only condition with a percentage of
baseline smoking significantly lower than the
untreated control condition. In addition, this
percentage figure was 38% below that of the 
rapid-smoking condition and 29% below that 
of the nonspecific condition. Finally, the ab- 
stinence rate for the package condition at 6 
months was substantially greater than the 
abstinence rate for each of the other condi- 
tions (all ps < .06). 

Clearly, the package treatment demon- 
strated effectiveness beyond what could be 
attributed to nonspecific factors such as mo- 
tivated volunteering, structure, self-monitor- 
ing, information, and encouragement. Bor- 
kovec (1973) has argued that placebo groups 
or nonspecific treatment groups seldom con- 
stitute adequate controls for expectancy, 
since the subjects in these groups rarely hold 
as favorable a set of expectancies as do those 
in the valid treatment groups. In the present
study, however, an expectancy questionnaire 
was administered during the first treatment 
sessions, after subjects had heard descrip- 
tions of the treatments that they were about 
to receive. Subjects in all three treatment 
conditions held comparable initial expect- 
ancies, suggesting that differential expectancy 
at the start of treatment does not account for 
the differences between the package and the 
nonspecific conditions.

The present design did not permit the 
identification of the effects attributable to 
each of the component procedures within the 
package treatment. However, reports on a 
follow-up questionnaire by a number of the 
subjects indicated that cognitive restructur- 
ing and emotional role playing were particu- 
larly useful in aiding them in making the 
“real” decision to quit and that relaxation 
training and behavioral rehearsal were useful 
as coping strategies once that decision was 
reached. In contrast, the rapid-smoking
component received almost no endorsement by
subjects. In addition, the highly aversive
nature of this procedure was clearly
demonstrated. Six subjects actually vomited during
rapid-smoking trials. Most other subjects
reported extreme physical discomfort, including
dizziness, headaches, sweating, and nausea,
with effects lasting well over an hour after
rapid-smoking trials.

In spite of its aversiveness, rapid smoking
did not prove to be a very effective proce- 
dure. Subjects in the rapid-smoking condition 
achieved a percentage of baseline smoking of 
73% and an abstinence rate of 17% after 
6 months, figures that are almost identical 
to the results attained by McFall and Ham- 
men (1971) using nonspecific treatment. 
Clearly, the results of our study stand in 
sharp contrast to those of Danaher (1977), 
both in terms of the effectiveness of our 
package treatment and the lack of effective- 
ness of rapid smoking. Further research is 
needed to explain these differences, but for 
the present, two points are noteworthy. First, 
the results obtained for our treatment pack- 
age are much more comparable to those of 
other treatment packages (Best, 1975; Dela- 
hunt & Curran, 1976; Pederson et al., 1975) 
than is the case for Danaher’s study. Sec- 
ond, other recent investigations (Lando, 
1975, 1976; Sutherland et al., 1975) have 
begun to cast doubt on the effectiveness of 
rapid smoking. 

The package treatment might be improved 
by strengthening the cognitive restructuring, 
emotional role-playing, relaxation, and be- 
havioral rehearsal components and eliminat- 
ing the rapid-smoking trials. It would also 
seem advisable to include more procedures 
aimed explicitly at teaching controlled
smoking. Almost all of the success in the package
condition was due to the relatively large
proportion of abstinent subjects in that condition.
Little was accomplished by way of teaching
nonabstinent subjects to smoke at reduced,
medically safer levels. Controlled-smoking
procedures addressed explicitly to those who 
are unwilling to quit might represent an im- 
provement over the current package treat- 
ment. The issues regarding abstinence and 
controlled smoking are comparable to those 
currently being debated in the area of
alcoholic treatment (e.g., Vogler, Compton, &
Weissbach, 1975).

Finally, the results of the present study
replicate those of Kopel (1974) in
demonstrating the lack of effectiveness of booster
sessions. In spite of several investigators’
(e.g., Hunt & Matarazzo, 1973; Lando,
1975) remarks that booster sessions might
insure continued “immunization” among
nonsmokers while rescuing backsliders, the
present study revealed no effects for booster
sessions in terms of cigarette consumption or
self-report measures of attitudes toward
smoking. In considering these results, it
should be noted that the booster sessions in
the current study did not represent
complete replications of the original treatments,
which may have attenuated their potential
effectiveness.


Reference Note 


1. Mahoney, M. J. Clinical issues in self-control
training. Paper presented at the 81st annual
meeting of the American Psychological Association,
Montreal, August 1973.
Geriatric Patients Who Improve in Token Economy and
General Milieu Treatment Programs: A Multivariate Analysis


Brian L. Mishara


University of Massachusetts at Boston


Chronic geriatric mental hospital patients were randomly assigned to token
economy and general milieu programs. In each program, staff received the same 
amount of training, and the physical environments were identical. Results 
showed significant decreases over 6 months on both wards in frequency of 
bizarre and unusual behaviors. Incontinence decreased immediately after trans- 
fer to the treatment wards, and there were changes over 6 months in the 
amount of staff care given to patients. Multiple discriminant analyses indicated 
that in each ward different constellations of pretreatment characteristics dis- 
criminated between patients who improved and those who did not improve in 
the frequency of bizarre and unusual behaviors. In the token economy, im- 
proved patients can be characterized as less “institutionalized,” in better phys- 
ical condition, and actively exhibiting their troubles. In the general milieu, 
improved patients can be discriminated by less responsiveness to an interviewer. 
Results are interpreted in view of the different characteristics of each program. 


Most research on the rehabilitation of
chronic elderly mental hospital patients has
focused on finding and evaluating one potentially
useful treatment method rather than
comparing techniques or trying to determine
which elders benefit from which intervention
programs. Two promising intervention techniques,
token economy and general milieu
programs, have been shown to be effective,
but not all clients improve under treatment.
This poses a dilemma for the practitioner
who must choose an appropriate intervention
method for particular clients. Often the
choice is made on the basis of ideological
preferences rather than empirical data regarding
which client characteristics relate to
improvement under each condition. This article
offers preliminary indications based on
multiple discriminant analyses of which
pretreatment client characteristics discriminate
between those who improved and those who
did not improve in a token economy and a
general milieu treatment program.

There have been few attempts to discriminate
between geriatric clients who improve
under treatment and those who do not. Kleban,
Lawton, Brody, and Moss (1976) used
multivariate techniques to determine which
characteristics of their subjects best predicted
[...] planned treatment program or a standardized
[...]


This study was conducted at Northville State
Hospital, Northville, Michigan. It is based in part
on [...] Association, San Francisco, August 1977.
GERIATRIC PATIENTS IN TOKEN ECONOMY AND MILIEU TREATMENTS


Sherwood, Morris, and Barnhart (1975) 
used the results of a discriminant function 
analysis to develop a system for assigning 
elderly people to appropriate residential set-
tings. Their system was designed to differen-
tiate elders who need the greater care offered
by institutionalization from those who re- 
quire less supervised settings or who can live 
independently or semi-independently. After 
further study and cross-validations, their ap- 
proach may prove useful in deciding who 
should be institutionalized and who would 
be better off living independently. 

The Sherwood et al. (1975) and Kleban 
et al. (1976) studies assessed characteristics 
of people who would benefit from more or 
less individual care. The present study is 
based on treatment conditions in which the 
amount of individual care was designed to be 
equivalent under both conditions. The major 
difference between the programs was their 
derivations from quite different theoretical 
perspectives. The token economy program, in 
the behaviorist tradition, involved careful ap- 
plication of operant reinforcement principles. 
On the other hand, the general milieu pro- 
gram provided the same reinforcements freely 
in the tradition of humanistic psychology 
and current research in environmental psy-
chology.

Milieu treatment methods have recently
gained attention for their potential to
rehabilitate long-term institutionalized geriatric
patients (e.g., Bok, 1971; Gottesman, 1973;
Gottesman, Quarterman, & Cohn, 1973).
Their findings have been generally accepted:
An improved milieu is more conducive to
therapeutic change than impoverished
custodial care settings. Although these
conclusions seem almost self-evident, findings
from the research on milieu treatment provided
the impetus for practitioners to rehabilitate
chronic geriatric mental patients who
were previously denied treatment, because it
was thought that treatment attempts would
be a waste of time.

Token economy programs that utilize
operant conditioning principles have been
successful with chronic mental patients
(Ayllon & Azrin, 1968; Ayllon & Michael,
1959; Krasner, 1971). A number of
the token economy studies included elderly
subjects (Atthowe & Krasner, 1968; Lee, 
1969; Mueller & Atlas, 1972). Although 
token economies for just the elderly have 
been rare, behavior modification techniques 
have been increasingly accepted as a useful 
method for geriatric intervention (see Hoyer, 
Mishara, & Riedel, 1975). 

In general, available data suggest that 
token economy programs are better than 
control situations in which no treatment is 
given (Gripp & Magaro, 1971; Maley, Feld- 
man, & Ruskin, 1973; Shean & Zeidberg, 
1971). Greenberg, Scott, Pisa, and Friesen 
(1975) found that a milieu treatment pro- 
gram combined with token economy was more 
effective than a token economy alone. How-
ever, the review by Gripp and Magaro
(1974) suggested that it is still unclear which 
variables affected the outcome of token econ- 
omy programs. 

In this study discriminant function analy- 
sis was chosen, since the discrimination made 
parallels the real-life decisions faced in re- 
habilitation settings. With the development 
of different kinds of treatment for geriatric 
patients, hospital staff are faced with the 
problem of choosing between the various re- 
habilitation approaches that are currently 
available. The discriminant function ap- 
proach may help to identify variables that 
may affect the outcome of such choices. Also, 
because of their diagnoses as suffering from 
incurable degenerative brain disease, it is 
frequently expected that no such patients 
would benefit from rehabilitation efforts. 
Their medical prognosis was continued de-
cline. Given this expectation, any improve-
ment sustained over 6 months could be an
important indication of rehabilitation po- 
tential, hence the choice of two discrete pop- 
ulations, the “improved” and “unimproved.” 
It was decided a priori that since these were 
mental hospital patients, the main indication 
of improvement would be decreases in the 
frequency of bizarre and unusual behaviors. 
Secondary measures of improvement were re- 
duced frequencies of incontinence, increased 
interpersonal communication, and reduced 
staff care given for personal hygiene and 


self-care. 
Table 1
Examples of Ways of Earning Tokens

Behavior                                        No. tokens

Social
  Helping another resident                        6
  Talking to another resident                     4a
  Starting an activity by yourself                4
  Starting an activity with others                8
Self-care
  Bathe self (including washing hair)             3
  Dress self                                      [illegible]
  Comb hair and brush teeth                       [illegible]
  Shave self or shine shoes                       2
  Especially neatly dressed in morning            1
  Help do own hair                                2
Ward work (partial list)
  Make bed                                        1
  Mop
    Whole hall                                    7
    One day room                                  3
    Bathroom                                      2
    After incontinence                            3
  Carry out trash                                 1
  Dust one day room                               1
  Help distribute trays                           1
  Iron dresses                                    1
  Work off ward or make handicrafts               5e
  Wash ashtrays or medication cups                2
  Stack chairs after meals                        4

a Per valid conversation.
b For each person who participates.
c 2 = without help; 1 = with a little help.
d 2 = good job; 1 = sloppy job.
e Per hour.


Method 
Subjects 


Eighty elderly people (40 men and 40 women)
were selected randomly from the “medical” unit of
a large state mental hospital to populate two new
coed treatment wards, each with 20 males and 20
females. [...] presumably asso[...] Approximately
half of the [...], and the remaining [...] were
randomly divided between the wards.

The mean age was 68.8 years (SD = 5.1 years).
The mean length of current hospitalization was 21.4 
years (SD = 14.7 years). Most of the participants
were previously involved in active rehabilitation 
programs on other units. These patients were the 
“failures.” After failing to meet criteria of rehabili- 
tation set up by these programs, people were trans- 
ferred to the medical unit. 

On the medical unit, only custodial care was 
available. A medical doctor made daily rounds, but 
there was no psychosocial treatment available— 
there were no psychologists, psychiatrists, or social 
workers involved actively in their treatment. The 
patients mostly wore hospital gowns in place of 
regular clothing, few ever left the wards, and the 
only available recreation was television. 


Procedure 


Identical numbers of nursing staff with equivalent
training were assigned to each program. Nursing
staff on all three shifts on each ward first
participated in 13 weeks of training meetings for 3-4 hours
a week. Training included discussions of how it
feels to work with chronic elderly patients,
demonstrations of how it feels to be disabled, and what
it is like to work in a custodial hospital ward. The
staff was instructed that the programs would
involve (a) more activities and social stimulation,
including better foods, radio-phonographs, social
events, environmental decorations, better clothes,
and so forth; (b) increased opportunity for
patients to choose how to run their daily lives; and
(c) greater awareness among staff of for whose
benefit they were doing things: to satisfy their own
needs or the needs of the patients.

During the last 7 weeks of the training period,
staff on the general milieu ward participated in 
open-ended discussion, whereas staff on the token 
ward learned the operant conditioning procedures 
and theory involved in a token economy program. 

Basically, the token economy program was
modeled after that designed by Ayllon and Azrin (1968).
Individuals were rewarded for desirable behaviors
including social interaction with other patients, ward
work, personal hygiene, and self-care activities.
Table 1 lists some of the ways tokens could be
earned. A detailed but simple bookkeeping system
was used to keep track of all tokens given out and
taken in, and the midnight shift audited the
tallies and reported any discrepancies. Tokens could
be exchanged for secondary reinforcements such as
cigarettes, wine, permission to leave the ward, extra
food, and other supplements. The staff met
regularly to revise the token system and determine which
behaviors should be rewarded.
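
The bookkeeping described above (tokens given out, tokens taken in, and a nightly audit for discrepancies) can be pictured as a simple ledger. The sketch below is an illustration under stated assumptions, not a description of the hospital's actual record system: the class and method names are hypothetical, the three earning rates are taken from Table 1, and the purchase price is invented for the example.

```python
# Earning rates for three behaviors, taken from Table 1.
EARN_RATES = {
    "helping another resident": 6,
    "make bed": 1,
    "carry out trash": 1,
}


class TokenLedger:
    """Hypothetical ledger of tokens given out and taken in."""

    def __init__(self):
        self.entries = []  # (resident, signed amount, reason)

    def earn(self, resident, behavior):
        """Record tokens given out for a rewarded behavior."""
        self.entries.append((resident, EARN_RATES[behavior], behavior))

    def spend(self, resident, cost, item):
        """Record tokens taken in for a purchase; refuse overdrafts."""
        if cost > self.balance(resident):
            raise ValueError("not enough tokens")
        self.entries.append((resident, -cost, item))

    def balance(self, resident):
        return sum(amount for who, amount, _ in self.entries if who == resident)

    def audit(self, counted):
        """Return residents whose counted tokens differ from the ledger."""
        return sorted(r for r, n in counted.items() if n != self.balance(r))


ledger = TokenLedger()
ledger.earn("resident A", "helping another resident")
ledger.earn("resident A", "make bed")
ledger.spend("resident A", 3, "extra food")  # price of 3 is hypothetical
discrepancies = ledger.audit({"resident A": 4})
```

Keeping every transaction as a signed entry, rather than storing only a running balance, is what makes a later audit possible.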

The same secondary reinforcements that could be
purchased on the token economy ward were
available free to anyone on the general milieu ward who
wanted them. Staff were instructed to increase
environmental opportunities without any systematic
reinforcement of behaviors. For example, on the
general milieu ward, food supplements such as
coffee, wine, ice cream, and sandwiches were
available free on a daily basis to anyone who wanted
them, whereas on the token economy ward these
same foods were sold for tokens. On the general
milieu ward, activities such as making handicrafts 
were available to anyone who wanted to partici- 
pate, but on the token economy ward patients 
were paid by the hour for their work on these 
projects. 

The physical environments on both wards were 
identical. The wards had duplicate floor plans and 
duplicate furniture at the start of the programs. 
On both wards efforts were made to acquire decora- 
tions and some more comfortable furniture, though 
the environments could still be regarded as quite 
“institutional” in character. 


Prediction Measures 


Before participants were transferred to the new
research wards, data on all of them were gathered on
the following measures, which were later entered into
the discriminant analyses as the independent variables:

Psychotic Inpatient Profile. All participants were 
observed for 3 days by nursing staff using Lorr’s 
Psychotic Inpatient Profile (PIP; Lorr & Vestre,

VIRO (Vigor Intactness Relationship Orientation). 
Research assistants rated subjects’ interpersonal be- 
havior using the VIRO scale developed for use with 
geriatric patients by Kastenbaum and Sherwood
(1972). The VIRO scale consists of three scores:
Presentation (based on the initial interpersonal
behavior toward the interviewer), Interaction (based
on interpersonal behavior during 4 hour), and
Orientation (which assesses orientation as to time,
place, and person).

Staff evaluations of patient abilities. Staff who
were familiar with the patients filled out
questionnaires concerning observed dressing abilities and
habits, personal hygiene level, physical condition 
(difficulties and illnesses), and behavioral disabilities. 


Prediction Measure Reliabilities 


As a check of the reliability of the PIP observations,
on 10 occasions a research assistant spent 3 days on
the ward in the course of other duties and filled out
profiles on participants who were being observed by
staff. There were few discrepancies between ratings by
staff and observers, and what little there was
reflected a seemingly random pattern of errors of
omission on the part of staff. To test the reliability
of the VIRO, 20 elderly patients were interviewed twice
within 2 weeks by research assistants. The correlations
between the two interviewers on the sections of VIRO
were [illegible], .98, and .99.


Treatment Outcome Measures


The treatment outcome measures were based on
observations by nursing staff, who recorded
the frequencies of occurrence of specific target
behaviors. Observations were conducted in a random
sequence, so that for 1 day every 2 weeks each 
patient was observed by a randomly assigned staff 
member on both the day and afternoon shifts. Staff 
conducted the observations during the course of their
usual daily activities. Observations were from the 
month before the patients were transferred to re- 
search wards until 6 mo. after the programs began. 

The main outcome measure was the frequency of 
bizarre and unusual behaviors. The behaviors in- 
cluded in this variable were defined on an individual 
basis after extensive observations. Typical types of 
behaviors included confused verbalizations (e.g.,
“My mother is coming today to take me to the
zoo.”), conversations or reports of conversations
with people not actually present (e.g., “Jesus Christ
just told me to throw peas on the ceiling.”), self-
injurious behaviors (e.g., scratching one’s arm until
it bleeds), unusual movements (e.g., “sweeping” the
floor without a broom), and so forth. Secondary out- 
come measures were the frequency of occurrence of 
incontinence, the frequency of conversations with 
other patients, and the number of times nursing staff 
gave care or help in personal hygiene and patient 
self-care activities. 


Treatment Outcome Measure Reliabilities 


As a check of the reliability of these staff observa- 
tions, outside experimenters occasionally spent 1 day 
on the ward observing the same behaviors in specific 
patients. Disagreements between staff and observer 
ratings were few, and they seemed to follow a ran- 
dom pattern. As a second check on the rating relia- 
bilities, identical independent observations were 
conducted on the same day for the same patients by 
members of the day and afternoon shifts and were 
compared. As was expected, the day shift staff, who 
spent more waking-time contact with patients in 
general, reported more behaviors. The Spearman 
rank-order correlations (rho) between observations 
on the two shifts ranged from 1 to 92 (n=37 in 
the general milieu program, and n= 36 in the token 
economy), with the median correlation being .77. 
(All correlations were significant at p < .01.)


Statistical Analyses 


General improvement according to staff observa- 
tions was analyzed by sign tests on changes from 
before treatment to the end of 6 months on the re- 
habilitation wards. (See Siegel, 1956, pp. 68-75.) 
Sign tests were chosen for the primary analysis, since 
the data were not interval data. 
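
The one-tailed sign test described above can be sketched as follows. This is a minimal illustration, assuming the exact binomial tail is used and that "no change" cases are dropped as ties; the function name is hypothetical, and the counts are the improved and worse frequencies printed in Table 2 for bizarre and unusual behaviors.

```python
from math import comb


def sign_test_one_tailed(improved: int, worse: int) -> float:
    """One-tailed sign test in the predicted direction of improvement.

    Returns P(X <= worse) for X ~ Binomial(improved + worse, 0.5).
    Cases showing no change are ties and must be excluded by the caller.
    """
    n = improved + worse
    if n == 0:
        raise ValueError("sign test needs at least one non-tied case")
    return sum(comb(n, k) for k in range(worse + 1)) / 2 ** n


# Improved and worse counts for bizarre and unusual behaviors (Table 2).
token_p = sign_test_one_tailed(11, 4)
milieu_p = sign_test_one_tailed(11, 1)
print(f"token economy:  p = {token_p:.3f}")   # 0.059
print(f"general milieu: p = {milieu_p:.3f}")  # 0.003
```

Under these assumptions the two tail probabilities agree with the p values reported for the primary outcome measure.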

For the multiple discriminant analyses, patients
were classified as either having improved or as not
having improved from before the start of the
treatment program until 6 months after being
transferred to the new treatment units. For each separate
program a stepwise linear discriminant analysis was
performed using the nine PIP scales, the three VIRO
scales, the four staff evaluations of areas of patient
ability, and age, sex, and total years of hospitalization
to determine which of these factors discriminated
between the population of individuals who improved
and those who did not improve. Supplementary
multiple regression analyses were performed in order to
gain an indication of the amount of variance
accounted for by the measures.


Results 


The first issue is whether or not either or 
both of the treatment programs had an effect 
on the primary target variable or any of the 
three secondary outcome measures. Given that 
these patients had a prognosis of further de- 
terioration, it is not surprising that the ma- 
jority showed no change from before the 
start of the programs to 6 months after the 
programs began. Most patients who changed 
decreased the frequency of bizarre and un- 
usual behaviors (see Table 2). These im- 
provements occurred in both the token econ- 
omy (sign test, p = .059) and the general 
milieu (sign test, p = .003) programs, show- 
ing treatment effects. 

Regarding the secondary outcome measures
(see Table 2), there were significant improvements
on frequency of incontinence in the general milieu
program (sign test, p < .001) and improvements in
the token economy program (sign test, p = .062).
Care given by nursing staff was significantly
reduced after 6 months only on the token economy
ward (sign test, p < .001), and there was no
significant improvement on frequencies of
conversations with other patients on either ward.

Inspection of the data indicated that the
declines in frequency of incontinence occurred
entirely between the pretest and the first
observation immediately after transfer to the two
treatment units. Although these improvements were
sustained over the 6-month treatment, no further
improvement occurred later on. Data on frequency
of care given by nursing staff indicated that
there was a significant decrease in care given in
the general milieu program immediately after the
programs began (sign test, p < .022), followed by
a gradual increase over 6 months to the pretest
levels.

Table 3 summarizes the stepwise linear
discriminant analyses for the populations of
people who improved and did not improve on
frequency of bizarre and unusual behaviors. The
order of entry column indicates the rank order of
the ability of the variables to distinguish
between the two groups of patients when factors
already entered in the discriminant function are
controlled. Table 4 shows the results of multiple
regression analyses on the same data, in which the
order of entry of variables was the same as for
the linear discriminant analyses. Since the
regression analyses were included in order to
assess the amount of variance accounted for by
each variable, the table shows the multiple
correlations at each step, R² (amount of variance
accounted for) at each step, and the simple
correlation for each variable. Table 5 indicates
the frequencies and percentages of
misclassifications that would occur in these
samples if the discriminant function were used to
assign individuals to the improved or unimproved
groups.


Table 2
Changes in Staff Observations From Before the Start of Programs
to 6 Months After the Programs Began

                                              Frequencyᵃ
Variable                             Improved  No change  Worse   nᵇ

Token economy
  Bizarre and unusual behaviors**       11        19         4    15
  Conversations with other patients     16         8        16    32
  Care given by nursing staff****       16        16         2    18
  Incontinence*                          6        28         1     7

General milieu
  Bizarre and unusual behaviors***      11        24         1    12
  Conversations with other patients     17         8        13    30
  Care given by nursing staff           17         5        14    31
  Incontinence****                      16        18         2    18

Note. Because all hypotheses were directional, one-tailed tests were
used (by sign test, from Siegel, 1956, p. 250).
ᵃ It was specified a priori that improvement would be indicated by
increased frequencies of conversation, decreased care given by nursing
staff, and decreased frequencies of bizarre and unusual behaviors.
ᵇ n in sign test.
* p = .062.
** p = .059.
*** p = .003.
**** p < .001.


Table 3
Multivariate Stepwise Discriminant Analyses for Improved and Unimproved Patients
on Frequency of Bizarre and Unusual Behaviors for Token Economy and
General Milieu Rehabilitation Programs

                                       Token economy                General milieu
                                              Order of                      Order of
Variable                                      entry                         entry

Sex                                                                .45385     2    (.5835)
Staff evaluations
  Dressing habits and abilities      .46640     6    (.3290)
  Physical condition                -.38735     4    (.4653)
VIRO interaction                                                  9.86923     1    (.6597)
Age
Total years of hospitalization      4.09605     3    (.5601)
Anxious depression                 -2.50197     5    (.4055)
Care needed                          .98419     2    (.6979)
Psychotic disorganization          -8.16205     1    (.7960)

Note. Linear discriminant coefficients appear in parentheses. VIRO = Vigor
Intactness Relationship Orientation.


Table 4
Summary of Results from Multiple Regression Analysis With Order of Entry
in Equation the Same as for the Discriminant Analyses

Variable                          R      R²     Simple r

Token economy program
  Psychotic disorganization      .199   .040    -.199
  Care needed                    .428   .183     .227
  Total years hospitalization    .558   .312     .151
  Physical condition             .609   .371    -.079
  Anxious depression             .639   .408    -.163
  Dressing habits                .646   .417    -.085

General milieu program
  VIRO interaction               .430   .184     .430
  Sex                            .632   .399     .554

Note. VIRO = Vigor Intactness Relationship Orientation.
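
The relation between the R and R² columns of Table 4 can be sketched as follows: squaring the multiple correlation at each step gives the cumulative variance accounted for, and differencing successive squares gives the variance added by each newly entered variable. This is an illustrative sketch only; the function name is hypothetical, and squares of the printed three-decimal R values can differ from the printed R² by about .001 because of rounding.

```python
def variance_steps(multiple_r):
    """Return (R, R squared, gain in R squared) for each step of entry."""
    steps = []
    previous = 0.0
    for r in multiple_r:
        r_squared = r * r
        steps.append((r, r_squared, r_squared - previous))
        previous = r_squared
    return steps


# Multiple R at each step for the token economy program, as printed in Table 4.
token_r = [0.199, 0.428, 0.558, 0.609, 0.639, 0.646]
for r, r2, gain in variance_steps(token_r):
    print(f"R = {r:.3f}  R^2 = {r2:.3f}  added variance = {gain:.3f}")
```

Read this way, the largest single gains in the token economy analysis come from the second and third variables entered rather than the first.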

In the general milieu program, populations 
that did and did not improve in bizarre and 
unusual behaviors could be discriminated best 
by the VIRO interaction score, followed by 
sex. Improved patients were generally less 
responsive on VIRO interaction ratings, and 
they tended more often to be males. On the 
token economy ward, PIP psychotic disor- 
ganization best discriminated between im- 
proved and unimproved patients, followed by 
PIP care needed, length of total hospitaliza- 
tion, staff evaluation of physical condition, 
PIP anxious depression, and staff evaluations 
of dressing habits. The improved patients 
generally had greater PIP psychotic disorganization,
less PIP care needed, less total hospitalization,
better staff ratings of physical condition, more
anxious depression, and worse dressing habits.




Table 5
Numbers of Cases Misclassified in Groups
by the Stepwise Discriminant Analyses

                                     Program
Group                       Token economy   General milieu

Improved as unimproved      1/11 (9.1)      4/11 (36.4)
Unimproved as improved      1/23 (4.3)      3/25 (12.0)
Total                       2/34 (5.9)      7/36 (19.4)

Note. Numbers in parentheses are percentages.


Discussion 


The token economy and general milieu pro- 
grams were both successful in bringing about 
positive behavioral changes in a number of 
these chronic elderly mental hospital patients 
with organic diagnoses. Given that their diagnoses
of irreversible chronic organic brain damage made
improvement unlikely, it is not surprising that
the majority showed no change on the main
outcome measure, frequency of bizarre and
unusual behaviors, though significantly more
people improved than declined. Still, the
proportion of people who did improve (almost one
third) supports current research in clinical
gerontology, which indicates that many chronic
elderly patients who were previously considered
hopeless have rehabilitative potential.
These findings are bolstered by the significant 
numbers who decreased in the frequency of 
incontinence, which had been considered by 
many staff members as a purely medical de- 
generative condition. 

The lack of significant numbers of people 
increasing in their frequency of conversations 
with other patients is interesting. On both 
wards, although many conversed more frequently,
almost equal numbers had less frequent
conversations. I would [...] that those who
increased [...] to explain this lack, so one must
assume that any changes on this dimension are due
to chance.

It is also of interest that care given by 
nursing staff for personal hygiene and self- 
care declined in the token economy program, 
but it declined at the start of the general 
milieu programs only to gradually increase to 
the pretreatment level. In the token economy 
it was obvious that the decline in staff care 
was a direct result of reinforcing patient be- 
haviors, which lessened the need for staff help. 
In fact, nursing staff seemed particularly care- 
ful to reward behaviors that made their job 
easier. Many staff meetings were spent modi- 
fying the program so that other therapeutic 
behaviors would be the primary targets. 

In the general milieu program, patients did 
more for themselves, which resulted in the 
initial drop in staff care. They required less 
help in basic care such as putting on one’s 
clothes or taking a shower. However, as the 
program progressed, patients began to increase 
their demands on staff for additional forms of 
care and new services that were not given 
before. They demanded more meticulous 
grooming, such as careful hairstyling. In this 
instance the outcome variable that was de- 
fined a priori changed its meaning as the 
program progressed. 

One possible explanation of these results
is that the improvements were unrelated to
the treatment methods and resulted simply
from a general “Hawthorne” effect of putting
staff effort into patient care. If this
explanation were correct, the results would
indicate that any treatment is effective for this
population when compared to no treatment at
all. This seems to be the case with regard to
the decreases in the incidence of incontinence.
These improvements occurred immediately on
transfer to the new treatment environments,
before the activities in the treatment programs
had an opportunity to get under way.

An alternative hypothesis is that different
types of patients are more likely to respond to
each of the two programs. Support for this
latter hypothesis can be seen in the results
from the multiple discriminant analyses.
Patients who improved in the token economy
program could best be discriminated by their
active psychotic disorganized behaviors, less
need for staff care, more recent admission to
a mental hospital, better physical condition,
active discontent, and sloppy dressing habits.
This constellation of variables generally
indicates the less “institutionalized” patients who
are in better physical condition and actively
exhibit their troubles. However, in the general
milieu program, patients who improved
could be best discriminated on the basis of
their lesser responsiveness to an interviewer.

These characterizations seem reasonable in
view of the demands of each program. The
token economy required patients to be
sufficiently alert and motivated to engage in
target behaviors that would result in
reinforcement. Generally, more active involvement
was necessary. However, the general milieu
program did not demand active involvement;
it only provided an enriched environment
that was more supportive to individual
needs.

One additional observation may be particularly
relevant in deciding which type of program
to conduct: Staff in the token economy
program often reported that they were
succeeding in their efforts, whereas staff in the
general milieu program more often reported
that they felt that they were getting nowhere.
The reports of dismay in the milieu
program occurred even when progress was
being made. This difference in staff reactions
may reflect the token economy staff’s greater
comfort with the highly structured token
system. In the token economy, staff behaviors
were clearly specified in most situations,
and daily feedback was provided by charting
of reinforcement frequencies for each patient.
In the general milieu program, staff had to
rely more on their intuition about how to
act. They were not given daily tallies
that could provide feedback about the
effectiveness of their actions.

Overall, this study suggests that different
patient characteristics may discriminate
between patients who improve and those who do
not in different types of rehabilitation
programs. Further research in this area, and
particularly cross-validations of these findings,
will help clarify which patients benefit from
which types of treatment. Perhaps further
research will allow hospitals to rehabilitate a
greater proportion of their patients by offering
alternative programs to meet the special needs
of different patient populations.


Veridicality of Self-Report:


Replicated Correlates of the Wiggins MMPI Content Scales 


David Lachar and Richard S. Alexander 
Lafayette Clinic, Detroit, Michigan, and Wayne State University 


Correlates of the 13 Minnesota Multiphasic Personality Inventory Wiggins con- 
tent scales were identified using both single-sample and cross-validation par- 
adigms in a sample of 384 male clients who received a mental health evaluation 
at a military medical facility. Characteristics of both high-scoring and low- 
scoring clients were obtained for each scale, and interpretations of high scale 
elevations are proposed that reflected scale content, psychometric properties, 
and correlate characteristics. The 48 replicated and 28 single-sample correlates 
descriptive of high scorers significant at .01 reflected either the substantive 
nature of each scale or suggested primary characteristics expected of high 
scorers on each scale, whereas the additional 69 single-sample correlates sig- 
nificant at .05 often provided the descriptions needed to “round out” each scale 
interpretation. Evaluation of the correlates of low scorers essentially supported 
the position that low scores reflect the absence of descriptors characteristic of 
high elevations on the same scale. A factor analysis of the content, clinical 
profile, A, and R scales supported the interpretive intent of the content scales
and suggested their relative vulnerability to a defensive response set.


The 13 content scales of the Minnesota 
Multiphasic Personality Inventory (MMPI) 
were constructed by Wiggins (1966) to study 
the relation between item content and scale 
validity. They were developed by applying 
both psychometric and intuitive procedures to 
the original content classifications of Hath- 
away and McKinley (1951). Each scale item 
was assigned exclusively to its respective scale 
only if it obtained a point-biserial correlation 
with the total scale score in excess of .30, 
which also exceeded its correlation with the
remaining content scales. 

Initial study of the content scales revealed 
meaningful differences between various normal
and psychiatric samples and between samples
defined by traditional diagnostic classifications
(e.g., schizophrenic psychoses, brain disorders,
etc.). The factor structure of the scales was
also supportive of their interpretive intent
(Wiggins, 1969). Additional construct validity
has been demonstrated in a study that
related content scale elevation to clinical pro- 
file code-type correlates (Payne & Wiggins, 
1972), in studies that compared scale scores 
of groups differing in composition (Cohler, 
Grunebaum, Weiss, Hartman, & Gallant, 
1975; Jarnecke & Chambers, 1977; Mezzich, 
Damarin, & Erickson, 1974; O’Neil, Teague, 
Lushene, & Davenport, 1975), as well as in 
studies that correlated content scale scores 
with other MMPI scales and other personality 
measures (Derogatis, Rickels, & Rock, 1976; 
Hoffmann & Jackson, 1976; Taylor, Ptacek, 
Carithers, Griffin, & Coyne, 1972; Wiggins, 
Goldberg, & Appelbaum, 1971). 

Wiggins (1966) suggested that the content 
scales may serve as a supplementary source 
of information to interpretations derived from 
the empirically keyed MMPI profile scales.
The content scales were constructed to measure
the substance of the client’s communication
that is directed at the examiner. The
interpretation of these scales represents a view
of test response midway between the
naive-rational and the radical-empirical
perspectives (Dahlstrom, 1969). Suggested
interpretations of the content scales (Wiggins, 1966)
reflect this interpretive intent by emphasizing
that scale elevations reflect admission of
various symptom clusters.

The present study seeks to evaluate the 
extent of agreement between self-report and 
clinical impression among clients by investiga- 
tion of the external correlates of the Wiggins 
content scales. The study design and character
of these scales also allow an examination
of the relation between method of correlate
selection and degree of conceptual fit
with manifest scale content. A final goal of
this investigation is the construction of
interpretive statements for each scale to
describe high-scoring clients.


Method 
Subjects 


The 384 male subjects were United States Air 
Force personnel and their dependents who had re- 
ceived a psychiatric evaluation between July 1972 
and March 1973 at Wilford Hall Medical Center, 
Lackland Air Force Base, San Antonio, Texas. Mean
age of this sample was 26.3 years (range = 18-57),
and 35.4% were under 24 years of age. The mean
educational level was 12.5 years, with 24% having
more than a high school education. Marital status
was about evenly divided between single (49%) and
married (45%), and 6% were either divorced or
widowed. These subjects were seen by different
diagnostic components of the Department of Mental
Health: 35% were inpatients on a psychiatric service,
18% were inpatients on general medical or
neurology/neurosurgery services, 39% were seen at the
outpatient clinic, and the remaining 8% were
evaluated by the mental hygiene clinic during
participation in basic training.


Measures 


Each subject completed the standard booklet form
[...] part of his evaluation. [...] were obtained as
part of a previously [...] accuracy of an automated
interpretation system for the [...] (Lachar, 1974).
T scores for the content scales were obtained using
norms reported by [...] (1971), and raw scores for
profile scales [...] A and R (Welsh, 1956) were
converted to T scores using the Hathaway [...] (1957)
norms. An evaluation form permitted the description
of each client by checking 81 possible descriptive
adjectives. These adjectives were subjectively placed
under the headings affect (n = 18), interpersonal
relations (n = 10), motor behavior (n = 9),
psychological efficiency (n = 4), patient-therapist
relationship (n = 3), history (n = 7), thought process
(n = 8), thought content (n = 7), and physical
complaints (n = 15). Since many of the physical
complaint descriptors were selected infrequently,
these 15 adjectives were also collapsed into a
composite variable labeled physical complaint present.


Procedure 


MMPI protocols of dubious validity (F > 25 raw
score) or substantial missing data (Q > 30 raw score)
were omitted from this analysis, reducing the study
sample by 21 subjects to 363. All subjects were then
designated as either high scorers, low scorers, or
“other” for each of the 13 content scales. Though
score distributions allowed designation of T < 41 as
the criterion inclusion rule for low scorers, the initial
criterion inclusion rule of T > 69 for high scorers
was modified to T > 59 for the Manifest Hostility,
Authority Conflict, Feminine Interests, Hypomania,
and Religious Fundamentalism scales to form
sufficiently large criterion groups for these scales.
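
The screening and grouping rules just described can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, scales are keyed by the full names used in the text, and the cutoffs are the ones given in the paragraph above.

```python
# Scales whose high-scorer cutoff was lowered from T > 69 to T > 59.
LOWERED_CUTOFF_SCALES = {
    "Manifest Hostility",
    "Authority Conflict",
    "Feminine Interests",
    "Hypomania",
    "Religious Fundamentalism",
}


def protocol_is_usable(f_raw: int, q_raw: int) -> bool:
    """Keep a protocol unless F > 25 or Q > 30 (raw scores)."""
    return f_raw <= 25 and q_raw <= 30


def criterion_group(scale: str, t_score: float) -> str:
    """Label one T score as 'high', 'low', or 'other' for a content scale."""
    high_cutoff = 59 if scale in LOWERED_CUTOFF_SCALES else 69
    if t_score > high_cutoff:
        return "high"
    if t_score < 41:
        return "low"
    return "other"


print(criterion_group("Hypomania", 62))   # high (lowered cutoff applies)
print(criterion_group("Depression", 62))  # other
print(criterion_group("Depression", 38))  # low
```

Because the inequalities are strict, a T score of exactly 69 (or 59 on the five modified scales) or exactly 41 falls in the "other" group.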

Correlates were selected for the total sample
(criterion group vs. all remaining protocols) using the
chi-square statistic at the .05 and .01 levels. In
addition, each criterion group and the remaining
sample were randomly split, and the adjective
correlates significant at least at the .10 level in both
samples were designated as replicated, with a
resultant joint probability of .01 or less (.10 × .10).
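
The split-sample replication rule can be sketched as follows. This is an illustration only: it assumes an uncorrected chi-square on a 2 × 2 table (criterion group vs. remaining sample, adjective checked vs. not checked), which the text does not specify, and the function names and counts are hypothetical. The 1-df p value uses the identity p = erfc(sqrt(χ²/2)).

```python
from math import erfc, sqrt


def chi_square_p(a: int, b: int, c: int, d: int) -> float:
    """P value (1 df, no continuity correction) for the table [[a, b], [c, d]].

    Rows: criterion group, remaining sample.
    Columns: adjective checked, adjective not checked.
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 1.0  # a degenerate table carries no evidence of association
    chi2 = n * (a * d - b * c) ** 2 / denom
    return erfc(sqrt(chi2 / 2))


def is_replicated(half_one, half_two, alpha: float = 0.10) -> bool:
    """True when the adjective reaches alpha in both random halves."""
    return chi_square_p(*half_one) <= alpha and chi_square_p(*half_two) <= alpha


# Hypothetical counts for one adjective in the two random halves.
first_half = (12, 28, 20, 120)
second_half = (11, 29, 22, 118)
print(is_replicated(first_half, second_half))
```

A fuller version would also confirm that the association runs in the same direction in both halves; that check is omitted here for brevity.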

To assist in definition of the construct dimensions
related to each content scale, T values for the clinical
profile scales, A, R, and the 13 content scales were
intercorrelated. This correlation matrix was then
submitted to a principal components factor analysis,
and the resulting factor matrix was rotated to a
varimax criterion.


Results 


Criterion sample size, correlates replicated
at .01, and correlates not replicated but
significant at .01 or .05 for the total sample are
presented below. Forty-eight correlates
replicated for high-elevation criterion groups,
whereas only 18 correlates cross-validated in
the analysis of low-score criterion groups.
Approximately twice the number of correlates
were obtained by these three selection methods
for the criterion groups defined by high
elevations as for criterion groups defined
by low elevations (145 vs. 88).

* Adjective base rates and criterion group
frequencies for significant correlates and scale
intercorrelations are available on request from the
first author.


Organic Symptoms (ORG) 


T > 69 (n = 79). Replicated at .01:
impotent/decreased libido, fatigue, insomnia,
poor memory, anorexia, headache, joint pains.
Single sample significant at .01: depressed,
difficulty in concentration, paucity of ideation,
back pain, loss of consciousness, physical
complaint present. Single sample significant at
.05: less hyperactive/hypomanic, chest pain.

T < 41 (n = 24). Replicated at .01: less
depressed. Single sample significant at .05:
shallow affect, talkative, less fatigue.


Poor Health (HEA) 


T > 69 (n = 56). Replicated at .01:
chest pain. Single sample significant at .01:
anorexia, diarrhea, shortness of breath. Single
sample significant at .05: worrisome,
constipation.

T < 41 (n = 19). Replicated at .01: less
depressed, shallow affect. Single sample
significant at .05: less anxious, less worrisome, less
withdrawn.


Depression (DEP) 


T > 69 (n = 87). Replicated at .01:
impotent/decreased libido, retarded (motor),
sense of inadequacy/inferiority. Single sample
significant at .01: depressed, withdrawn. Single
sample significant at .05: guilty, less
inappropriate affect, combative when intoxicated,
paucity of ideation.

T < 41 (n = 32). Replicated at .01: less
depressed, less withdrawn, less sense of
inadequacy/inferiority. Single sample significant at
[...]: less anxious, less guilty, less worrisome,
[...]tive, less insomnia, less financial problems,
less ideas of reference, physical complaint
present.


Poor Morale (MOR) 


T > 69 (n = 65). Replicated at .01:
depressed, combative when intoxicated, paucity
of ideation, sense of inadequacy/inferiority,
anorexia. Single sample significant at .01:
fearful/phobic, tremulous. Single sample
significant at .05: guilty, less inappropriate
affect, less labile, suspicious, tearful, withdrawn,
retarded (motor), insomnia, less religiosity.

T < 41 (n = 64). Replicated at .01: less
depressed, less fearful/phobic. Single sample
significant at .01: less sense of
inadequacy/inferiority. Single sample significant at .05:
less anxious, less guilty, less fearful, less
worrisome, less impotent/decreased libido, less
withdrawn, less agitated/restless, less
indecisive, less difficulty in concentration, less
insomnia, defensive, less ideas of reference,
nausea/vomiting.


Social Maladjustment (SOC) 


T > 69 (n = 62). Replicated at .01:
apathetic, depressed, fearful/phobic, worrisome,
withdrawn, compulsive, retarded (motor),
sense of inadequacy/inferiority, suicidal
thoughts. Single sample significant at .01:
guilty, insomnia. Single sample significant at
.05: fatigue, less poor judgment, constipation.

T < 41 (n = 69). Replicated at .01: less
depressed, less fearful/phobic, less perplexed,
less withdrawn. Single sample significant at
.01: less poor memory, malingering. Single
sample significant at .05: less apathetic, less
guilty, less suspicious, less tearful, less
worrisome, amoral, less dependent, less passive, less
indecisive, less retarded (motor), less difficulty
in concentration, less confused, less sense of
inadequacy/inferiority, less suicidal thoughts,
headaches.


Manifest Hostility (HOS) 


T > 59 (n = 24). Replicated at .01:
hostile, less suspicious, assaultive. Single
sample significant at .01: chest pain. Single
sample significant at .05: moody, amoral,
impulsive, combative when intoxicated.

T < 41 (n = 48). Single sample
significant at .05: less depressed, less moody, less
impotent/decreased libido, less agitated/restless,
less insomnia, less ideas of reference, less
sense of inadequacy/inferiority.


Family Problems (FAM)


T > 69 (n = 86). Replicated at .01:
assaultive, impotent/decreased libido, insomnia,
drug usage, marital conflict. Single sample
significant at .01: agitated/restless, impulsive.
Single sample significant at .05: excitable, less
passive, combative when intoxicated, sense of
inadequacy/inferiority, unrealistic feelings,
less loss of consciousness.

T < 41 (n = 34). Replicated at .01: less
depressed, homicidal, less marital conflict.
Single sample significant at .01: abdominal
pain. Single sample significant at .05:
inappropriate affect, less sense of
inadequacy/inferiority.


Authority Conflict (AUT) 


T > 59 (n= 94). Replicated at .01: ma- 
lingering. Single sample significant at .01: 
marital conflict, less autistic thought, less 
perfectionistic. Single sample significant at 
.05: assaultive, less withdrawn, destructive 
gestures, combative when intoxicated. 

T < 41 (n = 49). Single sample signifi- 
cant at .01: less immature. Single sample sig- 
nificant at .05: less moody. 


Feminine Interests (FEM) 


T > 59 (n = 75). Replicated at .01: per- 
plexed, difficulty in concentration, unrealistic 
feelings. Single sample significant at .01: sui- 
cide attempts, religiosity. Single sample sig- 
nificant at .05: homosexual, less impulsive, 
indecisive, insomnia, less malingering, less 
combative when intoxicated, confused, de- 
lusions, hallucinations. 

T< 41 (n= 39). Single sample signifi- 
cant at .05: agitated/restless, malingering. 


Phobias (PHO) 


T > 69 (n = 47). Replicated at .01: fear- 
ful/phobic, worrisome, [illegible], 
anorexia. Single sample significant at .01: de- 
pressed. Single sample significant at .05: 
anxious, withdrawn, tremulous, poor memory, 
paucity of ideation, chest pain, joint pain. 

T < 41 (n = 42). Single sample signifi- 
cant at .01: physical complaint present. Single 
sample significant at .05: abdominal pain, 
back pain, visual problems. 


Psychoticism (PSY) 


T > 69 (n = 54). Replicated at .01: re- 
tarded (motor), autistic thought, paucity of 
ideation. Single sample significant at .01: 
anorexia. Single sample significant at .05: 
perplexed, suspicious, worrisome, less depen- 
dent, impotent/decreased libido, disorganized 
thought, incoherent, less perfectionistic, hal- 
lucinations, ideas of reference. 

T<41 (n=21). Single sample signifi- 
cant at .05: less suspicious, less insomnia. 


Hypomania (HYP) 


T > 59 (n = 104). Replicated at .01: ex- 
citable, immature, hyperactive/hypomanic. 
Single sample significant at .01: agitated/ 
restless. Single sample significant at .05: less 
compulsive, destructive gestures, less retarded 
(motor), malingering, less suicidal thoughts. 

T < 41 (n = 28). Single sample signifi- 
cant at .01: less immature. Single sample sig- 
nificant at .05: nausea/vomiting. 


Religious Fundamentalism (REL) 


T > 59 (n = 40). Replicated at .01: less 
alcohol excess. Single sample significant at 
.01: delusions, religiosity. Single sample sig- 
nificant at .05: less impulsive, less drug usage, 
less marital conflict, autistic thought. 

T < 41 (n = 384). Replicated at .01: 
homicidal, impulsive, drug usage. Single sam- 
ple significant at .05: less moody, destruc- 
tive gestures, confused. 

The results of the factor analysis are pre- 
sented in Table 1. Five factors were obtained 
that accounted for 96% of the common vari- 
ance among these 28 scales. The first factor 
accounted for 57.2% of the variance and re- 
flected somatic complaints (ORG, HEA, Hs, 
Hy) and a secondary emphasis of psychologi- 
cal discomfort (D, Pt, Sc). The second factor 
accounted for 16.3% of the variance and ap- 
peared to be organized around informant re- 
sponse style. Measures of symptom denial (L, 
K, R) characterized one end of this dimen- 
sion, whereas the other was defined by ad- 
Table 1 
Rotated Factor Matrix of the Content, Validity, Clinical, A, and R Scales 

Scale                  1        2        3        4        5       h² 

Wiggins content 
  ORG             [illeg.]                                        .76 
  HEA                .75                                          .76 
  DEP                       -.42      .73                         .88 
  MOR                       -.40      .76                         .86 
  SOC                                 .91                         .84 
  HOS                       -.77                                  .73 
  FAM                       -.42                        .41       .49 
  AUT                       -.73                                  .59 
  FEM                                          .78                .60 
  PHO                                 .54                         .48 
  PSY                       -.59                                  .74 
  HYP                       -.80                                  .68 
  REL                                                             .09 
Clinical profile 
  L                          .54                                  .37 
  F                         -.45                        .51   [illeg.] 
  K                          .61                                  .83 
  Hs                 .93                                          .87 
  D                  .49              .65                         .75 
  Hy                 .85                                          .79 
  Pd                                                    .66       .59 
  Mf                                           .72                .60 
  Pa                                                    .40       .61 
  Pt                 .48              .55               .42       .77 
  Sc                 .50              .43               .50       .81 
  Ma                        -.63                                  .38 
  Si                                  .95                         .93 
Welsh factor 
  A                         -.47      .75                         .92 
  R                          .66                                  .54 
% of variance       57.2     16.3     11.1      7.0      4.4 


Note. ORG = Organic Symptoms; HEA = Poor Health; DEP = Depression; MOR = Poor Morale; 
SOC = Social Maladjustment; HOS = Manifest Hostility; FAM = Family Problems; AUT = Authority 
Conflict; FEM = Feminine Interests; PHO = Phobias; PSY = Psychoticism; HYP = Hypomania; 
REL = Religious Fundamentalism. 
Factor loadings less than .40 have been omitted. 


mission of pathology (F, A). Although only 1 
of the 10 clinical profile scales loaded on Fac- 
tor 2, 7 of the content scales obtained such 
loadings. The especially high and primary 
loadings of scales HOS, AUT, PSY, and HYP 
on Factor 2 suggested their susceptibility to 
this response set. The third factor ac- 
counted for 11.1% of the variance and clearly 
reflected a dimension characterized by de- 
pression (DEP, D), anxiety (PHO, Pt, A), 
and social withdrawal and alienation (SOC, 
Si). The fourth factor accounted for 7% 
of the variance and uniquely represented a tra- 
ditional sex role interest pattern (FEM, Mf). 
The final factor represented 4.4% of the vari- 
ance and was characterized by Scales FAM, 
F, Pd, Pa, Pt, and Sc. This factor appeared to 
reflect both poor interpersonal relationships 
and the causal and resultant traits associated 
with interpersonal conflict. 
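
As a quick arithmetic check on the factor solution, the five percentages of variance quoted in the text should add up to the 96% of common variance attributed to the five factors. The sketch below uses only the figures stated in the text; the variable names are illustrative and not part of the original analysis.

```python
# Percentages of common variance reported in the text for Factors 1-5.
factor_variance = {1: 57.2, 2: 16.3, 3: 11.1, 4: 7.0, 5: 4.4}

# Round to one decimal place to avoid floating-point noise in the sum.
total = round(sum(factor_variance.values()), 1)
print(total)  # 96.0
```

The total agrees with the 96% figure given for the five-factor solution.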

Proposed interpretations of high content 
scale scores are presented in the appendix. 
These interpretations reflect a combination of 
scale item content, empirically supported scale 
correlates, and their frequency, as well as 
scale intercorrelations. 


Discussion 


The correlates obtained for the Wiggins 
content scales demonstrated substantial agree- 
ment between client self-report and clinician 
evaluation. Evaluation of correlates of high 
scale elevations by method of correlate selec- 
tion suggested that both the 48 replicated and 
28 nonreplicated correlates selected at .01 
directly reflected either the substantive na- 
ture of their corresponding content scale or 
primary characteristics expected of clients who 
score highly on each scale. This substantive 
agreement raises doubts as to the frequently 
stated need for the cross-validation of corre- 
lates (Boerger, Graham, & Lilly, 1974; Gyn- 
ther, Altman, & Sletten, 1973; Lewandowski 
& Graham, 1972; Marks, Seeman, & Haller, 
1974) if the level of significance for correlate 
selection is sufficiently conservative in analysis 
of sufficiently large samples. Inspection of the 
additional 69 correlates selected for the total 
sample at .05 reveals that they often provide 
the descriptors needed to “round out” each 
scale interpretation. The correlates of HOS 
(T > 59), for example, include “hostile, less 
suspicious, assaultive, chest pain” at .01; 
whereas inclusion of .05 descriptors adds 
“moody, amoral, impulsive, and combative 
when intoxicated.” 

Evaluation of the 25 correlates selected at 
.01 and the 69 correlates selected at .05 for 
low-elevation criterion samples supports the 
position that low scores reflect the absence 
of descriptors characteristic of high eleva- 
tions on the same scale. This analysis does 
not exclude the possibility that some of the 
content scales reflect bipolar dimensions, as 
the potential correlate pool for this study 
did not include any favorably worded de- 
scriptors. An artifact of this analysis appears 
to be several physical complaint correlates of 
low elevations [e.g., “abdominal pain” for 
PHO (T < 41) and FAM (T < 41)] result- 
ing from sample inclusion of physically ill 
patients referred for evaluation who evi- 
denced no symptoms of psychological dis- 
turbance. 

The susceptibility of personality scales 
with high content saturation to response 
biases such as defensiveness is illustrated by 
[...] negative relation- [...] correction applied to 
[...] scales of the MMPI. 
The second factor obtained in the current 
factor analysis highlighted the susceptibility 
of the majority of the content scales to such 
a defensive response set. A correlate of MOR 
(T < 41), “defensive,” also suggested that 
lowered content scale scores are likely to 
result from intentional client distortion. 
Clinicians applying the content scales should 
be aware that high elevations represent sig- 
nificant probability correlates, whereas low 
elevations are less likely to suggest the ab- 
sence of these symptoms, especially if Scales 
L or K obtain some elevation. 

In spite of our hope, inspired by adherence 
to dustbowl empiricism, of discovering a 
number of unexpected correlates, few were 
obtained. One interesting finding was that 
REL, and to some extent FEM, elevations 
represent inhibition of acting out in male 
psychiatric populations. That is, the admis- 
sion of religious beliefs is strongly associated 
with decreased substance abuse, marital con- 
flict, and impulsive behavior, whereas low 
REL elevations relate to the replicated cor- 
relates “homicidal, impulsive, drug usage.” 
Also, it appears that PSY (T > 69), in a pre- 
dominantly nonpsychotic sample, reflects not 
only low base-rate psychotic behaviors but 
also perplexity, suspiciousness, and worry. 


Appendix 
Proposed Interpretation of High MMPI Content Scale Scores 


Organic Symptoms (T > 69) 


This individual has admitted to a variety of 
sensory, motor, or general somatic concerns that 
may be related to psychological discomfort and 
general malaise as well as to reduced effective- 
ness in completing daily tasks. Clients who ob- 
tain high Organic Symptoms elevations may 
complain of lack of stamina and strength and 
may present physical symptoms that often in- 
dicate emotional conflict, such as problematic 
headache or back pain. 


Poor Health (T > 69) 


A significant number of physical complaints are 
indicated by item endorsement centering mainly 
on the digestive system. Individuals who 
obtain high Poor Health elevations are often 
considerably worried about their health. Cardiac 
and pulmonary complaints are also occasionally 
reported. 


Depression (T > 69) 


This individual has admitted to symptoms asso- 
ciated with problematic depression, such as lack 
of interest in the environment, pessimism, self- 
criticism, and brooding. In client populations, 
social withdrawal, a negative self-concept, guilt 
feelings, and a reduced activity level may be 
suggested. 


Poor Morale (T > 69) 


Inventory responses reflect a pervasive lack of 
confidence in one’s abilities and a history of 
failure, which is related to these perceived 
limitations. Clients who obtain high Poor Morale 
elevations may be insecure, despondent, with- 
drawn, intropunitive, and oversensitive, and may 
become easily upset by the actions of others. 


Social Maladjustment (T > 69) 


Endorsed item content reflects a lack of social 
skill and poise, discomfort in social interaction, 
and resultant inhibition and social isolation. In 
client populations, this lack of social supports 
may be associated with a negative self-image, 
feelings of despair or fearfulness, thoughts of 
suicide, or a defensive orientation characterized 
by apathy and limited activity or compulsive 
attention to detail. 


Manifest Hostility (T > 59) 


This individual admits to problems in adjust- 
ment related to unmodulated expression of anger, 
resentment of perceived injustices, need for in- 
terpersonal dominance, and limited self-control. 
In client populations, the combination of hos- 
tility, moodiness, and impulsivity may be asso- 
ciated with assaultive or other antisocial or 
violent behavior. 


Family Problems (T > 69) 


Inventory responses include admission of path- 
ology in and among family members. A history 
of poor relationships with parents is suggested, 
as well as the absence of positive supports in 
current family interactions, whether with par- 
ents, spouse, or extended family. Patient male: 
In adult male clients, admission of family path- 
ology may reflect not only marital conflict but 
may also suggest intolerant, overreactive in- 
dividuals and a negative self-concept. Drug 
abuse and other destructive behavior may be 
associated. 


Authority Conflict (T > 59) 


Endorsed item content reflects the belief that 
interpersonal relations are often exploitive in 
nature. Disregard for principles of ethical con- 
duct and truthfulness is suggested, as well as a 
tendency to minimize the negative impact of 
antisocial behavior. In client populations, these 
attitudes may be associated with problematic 
overassertive and manipulative social relations. 
Conflict with relatives may result. 


Feminine Interests (T > 59) 


Inventory responses suggest an interest in pur- 
suits traditionally labeled as feminine and/or 
dislike of activities stereotyped as masculine. 
Patient male: In male clients, this interest pat- 
tern may be associated with an indecisive, 
passive orientation that has proven to be prob- 
lematic. Conflict may lead to confusion or self- 
blame. Evaluation for suicidal ideation or pre- 
vious attempts is suggested. 


Phobias (T > 69) 


This individual admits to a variety of fears and 
appears to be significantly uncomfortable in 
many situations. Clients who obtain high PHO 
elevations are viewed as more anxious, tremu- 
lous, worrisome, and phobic than most patients. 
Depression and social withdrawal may also be 
indicated. 


Psychoticism (T > 69) 


Inventory responses include admission of un- 
usual experiences and beliefs, many of which 
may include a clearly paranoid component. In 
client populations, this response pattern often 
suggests an individual who finds comprehension 
of human motives and behavior difficult and is 
consequently suspicious of and worried about 
others. Symptoms associated with a psychotic 
adjustment, such as ideas of reference, halluci- 
nations, and autistic or disorganized thought, 
may be present. 


Hypomania (T > 59) 


This individual’s self-description suggests a fast 
personal tempo characterized by enthusiasm, 
cheerfulness, and perhaps irritability or emo- 
tional lability. Clients who obtain high Hypo- 
mania elevations are often described as immature, 
hyperactive, excitable, agitated, and restless. 
They are unlikely to respond intropunitively to 
conflict and may manipulate others to reach 
their goals. 


Religious Fundamentalism (T > 59) 


Endorsed item content reflects strong religious 
beliefs and religiously motivated behavior. In 
client populations, this orientation suggests a 
reduced probability of substance abuse, impul- 
sive behaviors, and conflict with family mem- 
bers. Expression of strong religious beliefs may, 
at times, reflect a delusional system and asso- 
ciated thought disorder. 


Effects of Film Modeling on the Reduction of 
Anxiety-Related Behaviors in Individuals Varying in 
Level of Previous Experience in the Stress Situation 


Barbara G. Melamed 
University of Florida 


Richard Yurcheson, E. Louis Fleece, Steven Hutcherson, and 
Roland Hawes 
Case Western Reserve University 


The influence of film preparation on 80 children undergoing three dental ses- 
sions (prophylaxis, dental examination, and dental restorative treatment) was 
evaluated with respect to (a) peer modeling versus demonstration of procedures 
and (b) amount of information. It was found by evaluating self-report and 
behavioral and visceral-arousal indices in a 2 × 2 factorial design that children 
exposed to a peer-model videotape presentation immediately preceding their 
own restorative treatment exhibited fewer disruptive behaviors and reported 
less apprehension than those watching a videotaped demonstration without a 
peer model. The modeling film elicited less heart rate activity in the subjects 
than the demonstration. The younger children (4-6 years) had lower self- 
reports of fear after viewing a more complete synopsis of what to expect, 
whereas the older children (8-11 years) had the lowest report of fears after 
viewing the peer model receiving a local anesthetic and brief intraoral examina- 
tion. Children with previous treatment experience benefitted most from viewing 
the peer model undergoing the entire restorative procedure or a demonstration 
of the administration of local anesthetic in the absence of a peer model. Chil- 
dren with no prior experience were sensitized by being shown this demonstra- 
tion. Thus, it was concluded that the age and previous experience of the viewer 
were important factors in determining children’s fear-related behaviors after 
exposure to preparatory stimuli. 


A large literature indicates that modeling, 
either live or symbolic (filmed), is effective in 
reducing fearful avoidance behaviors and in 
increasing adaptive behaviors in a wide vari- 
ety of situations (Bandura, 1969; Bandura 
& Menlove, 1968). The effectiveness of mod- 
eling cannot be assumed without understand- 
ing what information needs to be presented, 
how this information can best get across to 
the observer, and how the previous experi- 
ence of the individual in the situation modi- 
fies these considerations. 

The current investigation addressed these 
issues in children facing a real-life stress: 
dental treatment. Early attempts to introduce 
children to dental treatment by use of mod- 
eling films were successful (Adelson & Gold- 
fried, 1970; Machen & Johnson, 1974). 
Melamed and her colleagues (Melamed, 
Hawes, Heiby, & Glick, 1975; Melamed, 
Weinstein, Hawes, & Katin-Borland, 1975) 
demonstrated reduced disruptive behavior 
and lower ratings of anxiety in children who 
had had no previous dental treatment experi- 
ence after the children viewed a cooperative 
peer model as compared with children in a 
control condition who observed either no 
film or a film unrelated to dentistry. 

However, when film modeling was com- 
pared with systematic desensitization or a 
placebo (warm interaction with the hygien- 
ist), its effectiveness became less certain. 
Machen and Johnson (1974) and Sawtell, 
Simon, and Simeonsson (1974) found con- 
tradictory results. Much of the discrepancy 
between the findings of different investigators 
was most likely due to a lack of attention to 
the subject’s prior experience with dental 
treatment and the presentation of informa- 
tion in the modeling situation. 

This study compared peer modeling with a 
demonstration of the same dental personnel 
and procedures in the absence of a child 
model. Both of these videotapes provided the 
subjects with information about impending 
procedures, but in addition the peer model 
provided a sample of adaptive behaviors that 
should be emitted during treatment. 

The second variable, amount of informa- 
tion presented, was used to evaluate whether 
the child observer needs to be exposed to 
each step of the procedure or whether ex- 
posure to the most feared event, the anes- 
thetic injection, would facilitate modeling. 
Kleinknecht, Klepac, and Alexander (1973) 
suggested that dental fears can best be con- 
ceptualized as learned responses to painful 
stimuli during dental treatment. 

The evaluation of the effectiveness of film 
modeling in relation to previous experience 
of the observer allowed us to look at the in- 
fluence of information presented on avoidance 
behaviors that were generated by vicarious 
learning (no actual prior experience) or by 
some combination of classical, operant, and 
vicarious conditioning. 


Method 
Subjects 


Eighty children between the ages of 4 and 11 were 
selected from the pedodontic clinic of University 
Hospitals, Cleveland, Ohio. Mentally handicapped 
[...] one of five film groups [...] race, previous 
dental experience, [...] fear on the Children’s 
Fear Survey Schedule (Scherer & Nakamura, [...]) 
[...] related fears. There were 48 children who had had 
one or more cavities restored and 32 without prior 
treatment experience. 


Procedure 


Subjects were observed during two clinic visits. 
At the first visit the hygienist performed a standard 
prophylaxis (Session 1). Immediately afterward, a 
dentist examined the child’s mouth and prepared 
bitewing radiographs (Session 2). The same dentist 
restored one or two cavities in a 30-minute appoint- 
ment 7-10 days later (Session 3). The experimental 
manipulation involved exposure of children sepa- 
rately to one of five videotapes according to group 
assignment immediately prior to their restorative 
treatment session. Dentists were matched for ex- 
perience, and both they and the independent ob- 
servers were unaware of group assignment. 


Groups 


Long model. A 7-year-old black child is viewed 
undergoing a dental restorative treatment procedure. 
This 10-minute tape included the examination, in- 
jection, cavity preparation, and placement of the 
restoration. The boy remains cooperative and fear- 
less throughout. The dentist and assistant are neu- 
tral. They instruct the model but do not use posi- 
tive or negative reinforcement. 

Long demonstration (demo). The 10-minute 
videotape was matched for the auditory track, with 
the same dentist and assistant demonstrating the 
identical procedures without a child model in the 
chair. The dentist does say what behavior he ex- 
pects in the child. 

Short model. The same child is shown receiving 
the anesthetic injection followed by an oral ex- 
amination. He remains cooperative throughout. 
This videotape runs approximately 4 minutes. 

Short demonstration (demo). The dentist and 
assistant demonstrate the anesthetic injection and 
the oral examination without a child in the chair. 
This videotape runs approximately 4 minutes. 

Unrelated control film. A videotape of Evan’s 
Corner (Bostustow, 1969) pictures a 7-year-old 
black boy fixing a corner of the living room as his 
special place. This served as a control for subjects’ 
exposure to a film. 

Assessment battery. Previous research reporting 
low intercorrelations between different anxiety 
measures (Lang, 1968, 1978; Venham, Bengston, & 
Cipes, 1977; Bernstein & Kleinknecht, Note 1) led 
us to examine different aspects of a child’s sub- 
jective, behavioral, and physiological anxiety. Table 
1 indicates the variety of measures used and the 
times at which they were assessed. On the first 
visit each mother (or guardian) completed a ques- 
tionnaire about the child’s behavior problems 
(Peterson, 1961) and one concerning her anxiety 
about dentistry and how she perceived her young- 
ster would respond to dental treatment (Johnson 
& Baldwin, 1969). The Children’s Fear Survey 
Schedule (CFSS; including 15 specific dental items) 
was administered orally to the child at this visit 
and was repeated immediately prior to treatment 
Table 1 
Summary of Times of Measurement 

Column headings: Session 1, dental hygiene (Pre, During, Post); Session 2, dental exam (Pre, During, Post); Behavior manipulation (Pre, During, Post); Session 3, dental treatment (During, Post). 
[Row labels and cell entries not recoverable.] 


after viewing the appropriate videotape. The chil- 
dren rated their degree of fear (on a 5-point fear 
thermometer) prior to the hygienist’s treatment, 
following the dentist’s examination, before and after 
the videotape, and following the restorative session. The 
Palmar Sweat Index (PSI; Thomson & Sutarman, 
[...]) was obtained at these same times. Observer 
ratings of anxiety were obtained during each ses- 
sion by an independent observer on the Behavior 
Profile Rating Scale (BPRS), a quantified measure 
of disruptive behavior. This instrument has demon- 
strated reliability and validity (Klorman, Ratner, 
[...] Sven, 1977; Melamed, Hawes, Heiby, & 
Glick, 1975; Melamed, Weinstein, Hawes, & Katin- 
Borland, 1975). In addition, the dentist and the 
observer rated each subject on a 10-point scale of 
cooperation (1 = good, 10 = poor) and anxiety (1 
= low to 10 = high). 

Galvanic skin response and cardiac responses were 
recorded during film viewing and treatment by 
means of a Grass polygraph. Data were stored on 
an FM tape recorder, and the heart rate data 
were available for analysis by means of a PDP-12 
computer program. Five 30-sec samples from each 
videotape were selected for analysis. To compare video- 
tapes of differing lengths, the total time of each 
videotape was divided into sixths. The five samples 
were taken at the first, second, third, 
fourth, and fifth one sixth of the videotapes, start- 
ing with the third sample, which was chosen as 
the temporal middle of all films. Thus, the 
first two samples and the last two samples were 
a proportional distance from the beginning 
and end of each film. Heart rate difference 
scores were computed using median heart period in 


Note. MAQ = maternal anxiety questionnaire; BPCL = Behavior Problem Checklist; CFSS = Children’s 
Fear Survey Schedule; FT = fear thermometers; PSI = Palmar Sweat Index; BPRS = Behavior Profile 
Rating Scale; GSR = galvanic skin response; HR = heart rate; DRA = dental rating of anxiety; DRC 
= dental rating of cooperation; ORA = observer rating of anxiety; ORC = observer rating of cooperation. 


a one-min prefilm resting baseline and subtracting 
from that the median heart rate period in milli- 
seconds for each of the five samples for each in- 
dividual. 
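
The sampling scheme and the baseline difference score described above can be sketched as follows. This is only an illustration: the function names are hypothetical, it is assumed that each 30-sec sample begins at the corresponding one-sixth point of the videotape (the text does not say whether samples start at or are centered on these points), and the heart period values are invented for the example.

```python
from statistics import median


def sample_onsets(tape_seconds):
    """Onset times (sec) at the first through fifth sixths of a videotape."""
    return [tape_seconds * k / 6 for k in range(1, 6)]


def difference_scores(baseline_periods_ms, sample_periods_ms):
    """Baseline median heart period minus each sample's median heart period.

    A positive score means a shorter heart period (faster heart rate)
    during the sample than during the resting baseline.
    """
    baseline = median(baseline_periods_ms)
    return [baseline - median(sample) for sample in sample_periods_ms]


# A 10-minute (600-sec) and a 4-minute (240-sec) videotape.
print(sample_onsets(600))  # [100.0, 200.0, 300.0, 400.0, 500.0]
print(sample_onsets(240))  # [40.0, 80.0, 120.0, 160.0, 200.0]

# Hypothetical heart periods in milliseconds (not data from the study).
baseline = [800, 810, 790]
samples = [[780, 790, 770], [800, 805, 795]]
print(difference_scores(baseline, samples))  # [20, 0]
```

Under this scheme the third sample always falls at the temporal middle of the tape, whatever its length, which is what allows the long and short videotapes to be compared interval by interval.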

In addition, the median heart rate was sampled 
for 10-sec intervals before the injection scene, during 
the injection, and after the injection. Difference scores were 
calculated by subtracting each from the preinjec- 
tion value. These heart rate difference scores were 
analyzed to assess increments and decrements in 
heart rate in response to the specific content of 
the local anesthetic injection, regardless of the time 
into the videotape. 
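
A minimal sketch of the injection-scene difference scores, assuming each 10-sec interval is summarized by its median heart rate and each later interval is subtracted from the preinjection value as described above. The function name and the heart rate values are hypothetical and serve only to show the direction of the scores.

```python
from statistics import median


def injection_difference_scores(pre, during, post):
    """Preinjection median minus the during- and postinjection medians."""
    pre_median = median(pre)
    return pre_median - median(during), pre_median - median(post)


# Hypothetical heart rates (beats per minute) within each 10-sec interval.
pre = [96, 98, 97]
during = [90, 88, 89]
post = [86, 85, 87]
print(injection_difference_scores(pre, during, post))  # (8, 11)
```

Under this convention a positive score indicates a lower heart rate than before the injection scene, and a negative score indicates an increase.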


Results 


There were no significant pretreatment 
group differences, except on the PSI. Each 
dependent variable was analyzed in a 2 × 2 
factorial design to evaluate the effect of 
modeling versus demonstration, long versus 
short, and all possible interactions involving 
these factors and the repeated measure of 
time of assessment. Separate analyses were 
repeated with age, sex, race, and experience 
as separate main effects. Age was defined in 
three age blocks: 4-6 years; 6-8 years; and 
8-11 years. Experience was dichotomized as 
no previous treatment versus previous resto- 
rations. A one-way analysis of variance was 
computed on all dependent measures to de- 
termine whether an unrelated film control 
group, run after the main experiment, was 
significantly different from any of the treat- 
ment groups.¹ Appropriate subsequent t tests 
were used. All p values are for two-tailed 
tests.² An intercorrelation matrix of all mea- 
sures at all assessment periods was obtained. 


Modeling Versus Demonstration 


The greater effectiveness of the modeling 
over the demonstration videotape was sub- 
stantiated across most measures. Children 
viewing the modeling tapes reported fewer 
overall fears than children viewing the dem- 
onstration videotapes, F(1, 56) = 4.26, p < 
.04. The race of the subject did affect the 
subjective report. White children reported 
fewer fears than black children after viewing 
the modeling tape, t(29) = 2.13, p < .05, 
despite the fact that the child model was 
black. In fact, the black children reported 
fewer fears on dental items than whites after 
seeing the demonstration tapes, t(31) = 2.6, 
p < .02. 

The behavioral measures were concordant 
with self-report findings. There was a border- 
line significance (p < .06), with the children 
viewing the modeling tape exhibiting less dis- 
ruptive behavior than those viewing the dem- 
onstration of the same procedures. This in- 
teraction with sessions achieved significance, 
F(1, 56) = 7.06, p < .01, as illustrated in 
Figure 1. Immediately after viewing the peer 
model, children were less disruptive during 
actual restorative treatment than those 
shown the demonstration version. 

The type of videotape observed also 
yielded significant differences in the observ- 
er’s ratings of anxiety and cooperation when 
age was taken into account. The 6- to 8- 
year-old children were rated as less anxious 
if they had seen the modeling film than if 
they had viewed the demonstration, t(23) = 
2.11, p < .05, as illustrated in Figure 2. The 
same age group was also rated as more co- 
operative during restorative treatment if 
they had had the opportunity to view the 
peer model than if they had viewed the 
demonstration version for both observer rat- 
ings, t(23) = 2.3, p < .05, and dentist rat- 
ings, t(23) = 3.3, p < .01. 

The Palmar Sweat Index data analyzed by 
covariance analysis showed a significant in- 
crease prefilm to postfilm, regardless of which 
videotape was seen. There was a significant 
correlation of the initial PSI with children’s 
general fears (r = .30, p < .02) and specific 
dental fears (r = .31, p < .01), as reported 
on the Children’s Fear Survey Schedule ad- 
ministered immediately preceding restorative 
treatment. The initial scores on the CFSS 
also correlated positively with the PSI taken 
immediately before the videotape was shown 
(r = .31, p < .01). 

The heart rate data for the injection scenes 
also reflected a difference due to the type of 
film seen, F(1, 51) = 4.69, p< .035. There 
was a greater increase in children’s heart 


1 Because the subjects in the unrelated film group 
were all run after the completion of the experimental 
conditions, the results are suspect. The failure to 
replicate high levels of disruptive behavior found in 
our previous studies (Melamed, Hawes, Heiby, & 
Glick, 1975; Melamed, Weinstein, Hawes, & Katin- 
Borland, 1975) may be accounted for by the fact 
that all of these subjects were hooked up to the 
polygraph during treatment, thus limiting movement 
and likelihood of disruption. Therefore, only sig- 
nificant differences between the treatments and 
this condition were reported in this study. Appro- 
priate replication is needed prior to interpreting the 
lack of group differences when one-way analyses of 
variance, including the control group, were per- 
formed. 

2 Appendix A, which has a table of all compari- 
sons made, is available on request from the 
author. 

Figure 2. Observer’s rating of children’s anxiety by 
age and type of videotape preparation. 


rates from the preinjection period to the 
during- and after-injection segments of the 
demonstration videotapes than in the heart 
rates of children who viewed the modeling 
versions. In fact, the group means in heart 
rate difference scores for the modeling con- 
ditions (M = 8.69 for during and M = 11.55 
for postinjection) indicated a further de- 
crease in heart rate as compared with the 
preinjection values. 


Amount of Information 


Although a main effect for length of the 
film was only apparent in the heart rate data, 
the self-report data did reflect an effect of 
film length when the age of the child was 
evaluated. No differences were found for the 
amount of information on any of the ob- 
servational measures. 
Children in the youngest age group, 4-6 
years, reported fewer general fears after see- 
ing the longer version as compared with the 
short videotapes, t(16) = 2.35, p < .05. 
However, it should be noted that there was 
a main effect of age on the fear thermometer, 
another self-report measure. The youngest 
children gave higher ratings of self-reported 
fear than other age groups regardless of the 
type of film, F(2, 65) = 4.95, p < .01. The 
oldest children, 8-11 years, had the lowest 
report of dental-related fears after viewing 
the short model as compared with the short 
demonstration, t(8) = 2.44, p < .05. 

Heart rate data did yield a significant ef- 
fect of length of the film, F(4, 204) = 3.07, 
p < .02, with long films eliciting higher heart 
rate difference scores. In the interaction be- 
tween length and interval, F(4, 212) = 2.64, 
p < .04, shown in Figure 3, it is clear that 
there exists a linear positive relationship 
between heart rate acceleration and time into 
videotape for the longer, more informative 
videotape. The children viewing the shorter 
tapes, on the other hand, showed an initial 
peak during the second, third, and fourth 
sampling points (film con[...] 
part with the injection segment) and a return 
[...] long version (film [...] direction of increased 
heart period). 

Figure 3. Heart rate difference scores for long and 
short videotapes across the five sampling intervals. 

Figure 4. Degree of disruptiveness during dental 
treatment as affected by previous experience and 
type of videotape preparation. 


Previous Experience 


The effectiveness of the different types of 
preparatory films on the behavior of the child 
was influenced by previous experience, F(1, 
56) = 7.06, p < .01. The difference caused 
by previous experience was not noted in any 
of the self-report measures. 
Group differences are illustrated in Figure
4. The Behavior Profile Rating scores of
Session 2 were used as a covariate in
determining the significant differences between
film types in groups defined as no prior
experience versus previous experience because
of the high correlations between these scores
and the BPRS during Session 3 (p < .001).
The children with no prior experience showed
significantly less disruptive behavior after
viewing a short model or a long demonstration
prior to their first restorative treatment, as
compared with viewing the short demonstration
(p < .01). In fact, the short-demonstration
group showed higher degrees of disruption when
compared with children who viewed the
unrelated film (p < .02). The children with
previous treatment showed less disruptive
behavior after viewing the long modeling film
as compared with the long demonstration,
t(1…) = -2.75, p < .007. However, there were no
significant differences between the long model
and other videotapes, including the unrelated
film. Overall, the dentists rated the
inexperienced children as decreasing in
cooperation from the dental examination to
the dental treatment, t(48) = 2.0, p < .05,
whereas the change in cooperation for
experienced subjects was not significant. The
correlations between BPRS scores for Session 2
and Session 3 were highly positive for
experienced subjects (r = .44, p < .002), as
were the dentists’ ratings of their anxiety
between Session 1 and Session 3 (r = .63,
p < .001).


Intercorrelation Matrices 


The data revealed good test-retest relia- 
bility within measures. Table 2 shows the 
high positive correlations that resulted be- 
tween comparisons on the self-report mea- 
sures. Table 3 indicates that the degree of 
disruptiveness (BPRS) during the first two 
Sessions was highly related to the children’s 
behavior during restorative sessions across 
all subjects. It was interesting to note that 
the greatest concordance between dentist and 
observer ratings of anxiety and cooperation 
occurred during the actual restorative
treatment session. Table 4, which illustrates
the comparison between measures tapping
different systems (i.e., self-report and
behavioral), is not unexpected in its sparse
number of significant correlations. The most
noteworthy is the fact that the children’s
self-reported dental fears just prior to the
restoration correlated significantly with the
degree of disruptiveness during that session
(r = .29, p < .01). This measure of anticipatory
anxiety also correlated with the observer and 
dentist ratings of anxiety and cooperation 
during the treatment session. 

The Table 5 measures of age, maternal 
anxiety, and general problem behaviors 
(BPCL) yielded some interesting correlations 
with the other dependent variables. Age cor- 
related consistently with the child’s self-re- 
port of anxiety, with younger children re- 
porting more anxiety. Dentists and observers 
also tended to rate the younger child as more 
anxious. It is interesting that there was no 
significant correlation between the Behavior 
Problem Checklist and any of the BPRS 
measures. This would support the contention 
that this scale is measuring something other 
than just the general level of disruptiveness, 
hyperactivity, or management problems of 
the children. The relationship reported earlier 
in the literature between maternal anxiety 
and children’s anxiety is not supported. In 
fact, the mothers’ concern about their chil- 
dren is negatively associated with the chil- 
dren’s self-report of anxiety prior to actual 
dental treatment. 


Discussion and Conclusions 
Peer Modeling Versus Demonstration 


Does peer modeling reduce anxiety more 
than a demonstration of the same proce- 


Table 2 
Pearson Product-Moment Correlations Between Self-Report Measures of Anxiety 
Variable 2 3 4 5 6 7 8 9 

1, CFSSA (1 adios .82*** Poorts .26* -20 23% 18 30** 
2. CESSA (2) ‘ogee ayes 27e. iA Laamanen aor 
3. CFSSB (1) ‘gase ‘ase o2 028" 318 "23" 
4. CFSSB (2) 22" ‘09 «3g 5583" 
5. FT (1) ‘2a, | 11324 ata 26" 
6. FT (2) 45) ps 33"* 
T. FT (3) 4aee* 19 
8. FT (4) "22" 
9. FT (5) 

Note. N = 80. CFSSA = Children’s Fear Survey Schedule, full scale score; CFSSB = Children’s Fear
Survey Schedule, dental items only; FT = fear thermometer. Arabic numerals in parentheses indicate
measurement time.
Table 4
Pearson Product-Moment Correlations Between Self-Report and Observational Measures


FT (2) FT (3) FT (4) FT (5) 


| BPRS (1) Bel 15 .25* 
BPRS (2) 16 18 .11 
BPRS (3) 13 .20 AT 
DRA (1) 27" 31** 37" 
DRA (2) 18 19 27% 
DRA (3) 12 .20 18 
DRC (1) 15 .29** .34** 
DRC (2) —.04 —.11 —.02 
DRC (3) .12 .15 .08 
ORA (1) 04 18 20 
ORA (2) 202 5 7 
ORA (3) 17 .18 21 
ORC (1) 18 .28* .33** 
ORC (2) 05 07 AT 
ORC (3) 04 12 .09 


19 09 —.08 Al 15 01 
ll —05 14 28% —03 09 
.29%* —.16 —.03 ee 19 .00 
ine 228° = »—.09 —.09 05 01 
23% 7 04 AGN 2238 12 
.24* —.05 —.09 19 47 -=.03 
.32** 31% 13 06 12 01 
—.09 -00 ole 03 —.03 —.07 
18 —.04 04 14 “20 yee. 09) 
18 .30** 04 —.03 -02 00 
.19 13 —.07 Bul 04 —.05 
220 —.05 —.03 .09 13 13 
.28* 14 —.06 05 -20 .02 
.13 —A5 .05 .02 DS aU) 
-20 —.16 —.05 04 04 .00 


Note. N = 80. BPRS = Behavior Profile Rating Scale; DRA = dental rating of anxiety; DRC = dental
rating of cooperation; ORA = observer rating of anxiety; ORC = observer rating of cooperation; CFSSA
= Children’s Fear Survey Schedule, full scale score; CFSSB = Children’s Fear Survey Schedule, dental
items only; FT = fear thermometer. Arabic numerals in parentheses indicate measurement time.

* p < .05.
** p < .01.
*** p < .001.


dures? There is evidence that this is true. 
Children who observed a peer-model film
reported fewer dental and general fears than
children who observed the demonstration film.
In fact, the observational data are congruent
with this in that children observing the peer
model cooperated with the dentist and
exhibited fewer disruptive behaviors. The
children in the 6- to 8-year-old group were
rated less anxious and more cooperative if they
had seen the peer-model film as opposed to
the demonstration. Dentists also rated these
children as more cooperative. These data
support the general finding that children
closest in age to the filmed model benefited
the most. (It is interesting that black
children did not reduce their self-reported fear
after seeing a black child model as much as
white children did.) The physiological data
are also congruent. Higher sympathetic
arousal (heart rate) was produced in response
to the demonstration tape, whereas
deceleration occurred in response to peer
modeling. Thus, modeling films that show
peers of similar age cooperating with dentists
have a more favorable effect on the
self-reported apprehension, actual behaviors of
the observing children, and autonomic indices
of observers than do mere desensitization or
exposure to impending events.


Amount of Information


The length of the film affected the self- 
reported apprehension, but not the behavior, 
of the children. The youngest children, ages 
4-6, had the lowest reports of fear with 
longer versions of the film regardless of type 
of presentation (model vs. demo). The more 
information imparted, the more the heart 
rate difference scores increased. Therefore, 
it is not sufficient to give more information, 
as this can have a sensitizing effect, if the 
format of the presentation (model or demo) 
is ignored. 


Effects of Previous Experience 


The effects of having had prior dental 
experience were seen in terms of the chil- 
dren’s behavior during dental restorative 
treatments. The dentists rated the children 
with no prior experience as being less co- 
operative during treatment than during the 
dental examination. The child who had al- 
Table 5

Pearson Product-Moment Correlations Among 
Age, Maternal Anxiety Questionnaire (MAQ) 
and Behavior Problem Checklist (BPCL) 
Scores, Measures of Self-Report, and 
Observational Ratings 


Age MAQ BPCL 
Age .00 —.12 
CFSSA (1)  —.07 —.19 .14 
CFSSA (2) —.28* —.33** 3a 
CFSSB (1) —.11 —.23* 14 
CFSSB (2) = —.25* —.26* .28* 
FT (1) =.29* [15 —.09 
FT (2) —.25* —.03 —.13 
FT (3) —.15 —.07 AL 
FT (4) —.16 —.07 .27* 
FT (5) —.16 .15 —.04 
MAQ .00 =.34** 
BPCL —.12 —.34"* 
BPRS (1) —.14 10 —AL 
BPRS (2) —.07 4 07 
BPRS (3) —.03 04 22 
DRA (1) —.20 —.28* 22° 
DRA (2) —.22* 05 07 
DRA (3) —.10 03 21 
DRC (1) AS =.19 AS 
DRC (2) =.19 .00 00 
DRC (3) —.08 —.08 133 
ORA (1) —.27* -11 01 
ORA (2) =.14 —.01 04 
ORA (3) 06 O01 2 
ORAC (1) -17 01 Ot 
ORC (2) —.13 04 01 
ORC (3) —.05 Al .23* 


Note. N = 80. BPCL is a measure of the mother’s
rating of child’s general behavior problems; CFSSA
= Children’s Fear Survey Schedule, full scale score;
CFSSB = Children’s Fear Survey Schedule, dental
items only; FT = fear thermometer; BPRS =
Behavior Profile Rating Scale; DRA = dental rating
of anxiety; DRC = dental rating of cooperation;
ORA = observer rating of anxiety; ORC =
observer rating of cooperation. Arabic numerals in
parentheses indicate measurement time.

* p < .05.
** p < .01.
*** p < .001.


ready been to a dentist did not signifi- 
cantly more anxious or daie pee 
f different types 
short demonstration. On the other hand,
children who have already experienced dental 
restorations behave much better if they view 
a long modeling version. Thus, when these 
children received maximum information re- 
garding what to expect and how to behave, 
they were most cooperative during treatment.
However, the lack of difference between the 
long model and the short demonstration at- 
tests to the ability of some children to use 
their past experience in knowing how to be- 
have, even in the absence of a model. 

This study has raised important issues re- 
garding the blanket assumption that filmed 
modeling reduces fear-related behaviors. It 
was found that there is a greater reduction 
of both self-reported and actual fear behavior 
during treatment when information is pre- 
sented through a peer model. Whether this 
was due to greater information regarding 
behavioral expectation or whether viewing 
an identification figure increased modeling 
cannot be definitively stated. The
physiological data would support the view that
peer modeling produced a catharsis for the
model condition, since observers’ heart rates
decreased with exposure.

The importance of age and previous
experience in choosing appropriate preparatory
material was borne out. The child most
similar in age to the peer model being portrayed
benefitted the most in his or her degree of
cooperation. Race had a paradoxical effect,
with white children showing lower
self-reports of fear than blacks after seeing a
black model.


In terms of previous experience, some
important issues regarding the influence of prior
knowledge on selection of type and format
of information were raised. If the child has
never actually undergone dental treatment
but has formed a conception through
vicarious processes, such as being told that a
shot will be given, reminding him or her of
the event in the absence of a model showing
how to handle it (short demonstration) will
sensitize him or her and increase the degree of
disruption. On the other hand, the more
experienced child will benefit most from being
prepared for this noxious event briefly or by
seeing another child going through each step
of the procedure in a cooperative manner.
Thus, the dental setting allows the
researcher to investigate many questions
regarding prevention and modification of stress
responses, taking into account developmental
and learning differences in
information-processing abilities of the coping
individuals.


Reference Note 


1. Bernstein, D., & Kleinknecht, R. Assessment of
fear of dentistry. Paper presented at the 10th
annual convention of the Association for the
Advancement of Behavior Therapy, Washington,
D.C., December 1976.
A and B Undergraduate Interviewers of Schizophrenic and
Neurotic Inpatients: A Test of the Interaction Hypothesis 


Daniel F. Barnes 
Counseling Center 
Loyola University of Chicago 


Juris I. Berzins 
University of Kentucky 


In an attempt to elucidate further the personological basis of the differential
compatibility of A and B therapists with schizophrenic and neurotic patients,
this study required A and B undergraduate volunteers (20 males, 20 females)
to conduct 20-minute interviews with male state hospital inpatients (40
schizophrenics, 40 neurotics) in a 2 (interviewer A-B status) X 2 (interviewer sex)
X 2 (patient type) factorial design. As expected from studies of the personality
correlates of A-B status, many more B than A interviewers “looked forward”
to conducting the interviews. Once in the interview situation, however, A-type
interviewers elicited better self-disclosure from schizophrenic patients than did
Bs, whereas the latter outperformed As with neurotic patients. The results are
discussed in terms of a personological formulation that considers interviewer
effectiveness to be a joint function of interviewer personality characteristics
and the situational context.


The A-B variable emerged from a series
of studies conducted by Whitehorn and Betz
(1954) in the early 1950s. In these studies,
therapists (arbitrarily labeled A and B) were
found to be differentially effective with
schizophrenic patients (As more effective than
Bs). However, with neurotic patients,
McNair, Callahan, and Lorr (1962) found Bs
to be more effective than As. These findings
suggested an “interaction hypothesis”: As
are more effective with schizophrenics than
are Bs, whereas the latter are more effective
with neurotics than are As. Subsequent
analogue and clinical studies have generally
supported the interaction hypothesis (e.g.,
Berzins, Ross, & Friedman, 1972; King &
Blaney, 1977; Matthews & Burkhart, 1977).

Several explanatory formulations of the
A-B interaction effects (e.g., the
“complementarity” hypothesis, perceptual-cognitive
style compatibility) have been proposed.
One such formulation can be called the
personological hypothesis, which evolved in
a study by Berzins, Barnes, Cohen, and Ross
(1971), who showed that A-B differences
can be explained in personality terms. Using
the Personality Research Form (PRF; Jackson,
1967), they found that A-type persons
were characterized by cautious
self-expression, social ineptness, and a restricted
cognitive scope, whereas B subjects appeared
socially ascendant and “open” to complex
experiences. The single most differentiating
personality dimension was “harmavoidance”
(As > Bs), denoting cautious avoidance of
risk taking.

Berzins, Dove, and Ross (1972)
cross-validated the Berzins et al. (1971) study with
very large samples of college males and
females, male therapists, and male college clinic
patients. In spite of striking intergroup
differences in PRF scale scores, A and B
persons, across samples, showed distinctive
personality profiles. In terms of five “com…”
scales, B-type persons were less harmavoidant
but more dominant, variety seeking,
sentient, and “counterdependent” than were
A-type persons.

The personality correlates of A-B status,
in our view, should be regarded as depicting
the everyday behaviors of A and B persons.
Asking A and B college students to interview
a schizophrenic state hospital inpatient,
however, does not require the student to display
everyday, accustomed behaviors. Rather, he
or she is required to interact in a presumably
“helpful” fashion with someone who is
severely maladjusted. These situational
demands, furthermore, need not be experienced
similarly by As and Bs because of their
differing levels of adaptation, self-confidence,
and personal adjustment. Therefore, one
would expect A- and B-type persons to
display their “everyday” behaviors before they
conduct an interview with a “state hospital
patient” but not while interviewing.
Accordingly, the following hypotheses were
advanced.


1. Prior to meeting any patient, B-type
persons should look forward to interviewing
state hospital patients to a greater extent
than should A-type persons.

2. Once in the situation, A-type persons
should elicit better self-disclosure from
schizophrenic patients and display greater
nonverbal immediacy with them, whereas B-type
persons should perform similarly with
neurotic patients, than should be the case under
the opposite or “incompatible” pairing
conditions.

3. After the interview, more positive
reactions about the interview should be elicited
from members of optimal pairings, as
compared with members of nonoptimal pairings,
as a consequence of greater interview success.


Method 


Interviewers 


A total of 131 student volunteers (56 males, 75
females) enrolled in the summer school session in
psychology filled out the PRF (Jackson, 1967) and
the A-B scale (19 scored items). From this group,
the first 40 (the first 10 male and female As and
the first 10 male and female Bs) were selected as
interviewers. Using the Berzins, Dove, and Ross
(1972) normative data, A status was defined by
scores between 0 and 7 for males and between 0
and 4 for females; B status, by scores 13 to 19
for males and 10 to 19 for females.
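
The score cutoffs above can be expressed as a small lookup. The following is only an illustrative sketch in Python: the function name and data layout are hypothetical, and the assumption that scores between the A and B ranges are left unclassified is inferred from the cutoffs rather than stated by the authors.

```python
from typing import Optional

# Inclusive score ranges on the 19-item A-B scale, as given in the text.
# Assumption: scores between the A and B ranges are left unclassified.
AB_SCORE_RANGES = {
    "male": {"A": (0, 7), "B": (13, 19)},
    "female": {"A": (0, 4), "B": (10, 19)},
}


def classify_ab_status(score: int, sex: str) -> Optional[str]:
    """Return "A", "B", or None (middle range) for an A-B scale score."""
    if sex not in AB_SCORE_RANGES:
        raise ValueError(f"unknown sex: {sex!r}")
    if not 0 <= score <= 19:
        raise ValueError("A-B scale scores run from 0 to 19")
    for status, (low, high) in AB_SCORE_RANGES[sex].items():
        if low <= score <= high:
            return status
    return None
```

Under these cutoffs a male volunteer scoring 7 would be classified A, whereas a female volunteer with the same score would fall in the unclassified middle range.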


Patients 


Schizophrenic (n=40) and neurotic (n= 40) 
male inpatients at the Eastern State Hospital (Lex- 
ington, Kentucky) were selected on the basis of 
symptom clusters rather than formal diagnosis. 
Phillips and Rabinovitch (1958) distinguished three 
symptom cluster categories: (a) self-deprivation, 
turning against the self (TAS); (b) avoidance of 
others (AVOS); and (c) self-indulgence, turning 
against others (TAO). The first two categories, TAS 
and AVOS, were considered prototypic of neurotic 
and schizophrenic status, respectively. Using a pro- 
cedure developed by Green (1971) and Welch 
(1971), a patient was classified as AVOS if the 
number of AVOS symptoms noted was equal to 
at least two and if the number of TAS plus TAO 
symptoms was less than the number of AVOS symp- 
toms. Similarly, a patient was classified as TAS if 
he had at least two TAS symptoms and fewer (than 
TAS) AVOS plus TAO symptoms noted in the 
clinical folder. 
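
The classification rule just described can be sketched as follows. This is a hedged illustration, not the Green (1971) and Welch (1971) procedure itself: the function name is hypothetical, and the inputs are assumed to be simple counts of the symptoms noted in each category.

```python
from typing import Optional


def classify_symptom_cluster(avos: int, tas: int, tao: int) -> Optional[str]:
    """Classify a patient from counts of noted symptoms in each category.

    AVOS: at least two AVOS symptoms, and fewer TAS plus TAO symptoms
          than AVOS symptoms.
    TAS:  at least two TAS symptoms, and fewer AVOS plus TAO symptoms
          than TAS symptoms.
    Returns None when neither rule is met.
    """
    if min(avos, tas, tao) < 0:
        raise ValueError("symptom counts cannot be negative")
    if avos >= 2 and (tas + tao) < avos:
        return "AVOS"
    if tas >= 2 and (avos + tao) < tas:
        return "TAS"
    return None
```

The two rules cannot both hold for the same patient, since the first requires more AVOS than TAS symptoms and the second requires the reverse.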


Procedure 


Each interviewer was to meet one AVOS and 
one TAS patient, with each interview lasting 20 
minutes. The interviews were held in a moderately 
large one-way vision room that contained no furni- 
ture except for 15 chairs located on the perimeter 
of the room. 

Before the first interview, each interviewer was 
instructed to “interview each patient as completely 
as you can by getting as much information as you 
can.” To help the interviewer gain some degree of 
comfort and structure, the experimenter provided 
a deck of 12 cards, each with a possible interview 
topic on it. Of the 12 topics, 6 had been prerated 
as personal in content (e.g., “What aspects of your 
personality do you dislike?”), and 6 were neutral 
in content (e.g., “What is your favorite reading
matter?”). After the interviewers had familiarized
themselves with the topics (no directions were given 
to restrict the interview to these topics, however), 
each interviewer was asked casually if he/she was 
“looking forward to the interview,” and was intro- 
duced to the patient and left to choose seats and 
conduct the interview. Following the first inter- 
view, the patient and the interviewer were asked 
(independently) how he or she felt about the inter- 
view. The same procedure was followed with the 
second patient. Both interviews were terminated by 
the experimenter’s entering the room. 


Dependent Measures 


During the interviews, two advanced graduate
students (one clinical, one nonclinical) observed
the interaction through the one-way mirror and
rated selected aspects of the participants’ behavior.

The main dependent measure was depth of pa- 
tient self-disclosure, rated by both judges on a 
0-3 scale (Berzins, Ross, & Cohen, 1970). Depth 
of self-disclosure ratings were assigned to every 
interview topic perceived as distinct by each judge; 
the topics were later subgrouped according to the 
personal-neutral distinction. In addition, at the 1st, 
5th, 10th, 15th, and 20th minutes of the inter-
view, the judges used multipoint scales to rate the
interviewers’ nonverbal behavior in 11 categories. 
These were taken from Mehrabian (1968) and in- 
cluded measures of “immediacy” or liking (physical 
proximity, eye contact, orientation of torso), re- 
laxation or status (asymmetrical placement of arms 
and legs, sideways lean, reclining position, hands 
relaxed), and activity (facial expressiveness, rate and 
flow of speech). 

The experimenter (the first author) also unob- 
trusively recorded interviewers’ responses to the 
preinterview question (“Are you looking forward 
to the interview?”), and, after each interview, he 
asked the participants independently (a) “Did you 
like the interview?” (b) “Did you like the person 
you talked to?” and (c) “Throughout the interview, 
were you basically tense or relaxed?” Responses to 
these questions were coded as 0 or 1. Since the
experimenter could not help being aware of the
group membership of some patients and some
interviewers, he attempted to ask all questions as
impartially and equivalently as possible.


Design 


The experimental design was a 2 (interviewer
A-B status) X 2 (interviewer sex) X 2 (patient
type), with the order of presentation of the last
factor counterbalanced to control for order effects.
There were 10 dyads per cell, with each interviewer
having seen one AVOS and one TAS patient. Of
the 40 interviewers, 20 saw the AVOS patient first,
and 20 saw the TAS patient first.
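
The cell counts implied by this design can be checked with a short enumeration. The sketch below is illustrative only: the names are hypothetical, and alternating the patient order within each interviewer group is an assumption made here to reach the stated 20/20 split, not the authors' documented assignment scheme.

```python
from collections import Counter
from itertools import product

AB_STATUS = ("A", "B")
SEX = ("male", "female")
PATIENT_TYPE = ("AVOS", "TAS")
INTERVIEWERS_PER_GROUP = 10  # 10 interviewers per A-B status x sex group


def build_dyads():
    """Enumerate every interviewer-patient dyad in the 2 x 2 x 2 design."""
    dyads = []
    for status, sex in product(AB_STATUS, SEX):
        for i in range(INTERVIEWERS_PER_GROUP):
            # Assumed counterbalancing: alternate which patient is seen first.
            order = PATIENT_TYPE if i % 2 == 0 else PATIENT_TYPE[::-1]
            for position, patient in enumerate(order, start=1):
                dyads.append({
                    "interviewer": (status, sex, i),
                    "ab_status": status,
                    "sex": sex,
                    "patient_type": patient,
                    "position": position,  # 1 = first interview, 2 = second
                })
    return dyads


def cell_counts(dyads):
    """Count dyads in each A-B status x sex x patient type cell."""
    return Counter((d["ab_status"], d["sex"], d["patient_type"]) for d in dyads)
```

With 40 interviewers each seeing two patients, the enumeration yields 80 dyads spread evenly over the eight cells.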


Results 


A correlational analysis was conducted to 
determine whether the A-B scale correlates 
in the PRF, originally demonstrated on much 
larger samples (Berzins, Dove, & Ross,
1972), were replicable within the summer 
school classes from which the interviewers 


the A-B scale 
the Harmavoid- 


of the PRF, thereby replicating prior results.
Interjudge Reliability


Since the judges had strong agreement
regarding the number of different topics
covered by each interviewer, r(78) = .93,
p < .0001, only those topics that were rated by
both judges were considered. For individual
topics (n = 1,118 across the 80 dyads), the
interjudge coefficient for patient
self-disclosure ratings was .69 (p < .0001), but the
coefficient for mean self-disclosure per dyad,
r(78) = .89, p < .0001, indicated that the
judges used the self-disclosure rating scale
in a consensual manner. Of the 11
nonverbal measures of interviewer behavior,
three (eye contact, verbal fluency, and rate
of speech) were not rated reliably (rs less
than .27), but the remainder showed
interjudge coefficients ranging from .64 to .98 (all
ps < .0001). Only the latter were analyzed.
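
For readers checking the reported coefficients, the degrees of freedom in r(78) follow from the 80 dyads (df = n - 2), and a correlation can be converted to a t statistic with the standard formula t = r * sqrt(df / (1 - r^2)). The sketch below is an illustration added here under those textbook assumptions; the function name is hypothetical and no such computation is described in the article.

```python
import math
from typing import Tuple


def t_from_r(r: float, n: int) -> Tuple[float, int]:
    """Convert a Pearson r based on n pairs into a t statistic and its df."""
    if n < 3:
        raise ValueError("at least three pairs are needed")
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    df = n - 2
    t = r * math.sqrt(df / (1.0 - r * r))
    return t, df


# Interjudge agreement on number of topics, as reported: r(78) = .93.
t_topics, df_topics = t_from_r(0.93, 80)
```

A t value this large on 78 degrees of freedom is consistent with the very small p values reported for the reliably rated measures.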


Preinterview Measures 


We had hypothesized that prior to
meeting any state hospital inpatient, the B-type
interviewers would look forward to the
interviews to a greater extent than A-type
interviewers. This hypothesis was supported by
the data, χ²(1) = 16.94, p < .001, with the
effect especially pronounced among male
interviewers, χ²(1) = 16.36, p < .001.


Intrainterview Measures 


A four-factorial analysis of variance
(incorporating the patient sequence as a
separate factor) conducted across all
intrainterview measures showed eight significant
interaction effects, seven of which involved the
patient sequence variable. That is, A-type
interviewers in the AVOS-TAS sequence
(AVOS patient seen first) and Bs in the
TAS-AVOS sequence obtained more
disclosure from patients than did interviewers
paired oppositely. In general, although
results with the “first patient seen” conformed
to the interaction hypothesis, these trends
transferred themselves to the second
interview, even though another (compatible or
incompatible) patient now comprised the
stimulus. These order effects replicate those
of Berzins and Seidman (1968, 1969).

Since the interviewer’s performance with
the “first” patient cannot be affected by the
impending second interview, analyzing the
first interviews should afford a clearer
appraisal of the interaction hypothesis in this
experiment, although at a cost of statistical
power (cf. Berzins & Seidman, 1968, 1969).
Table 1 presents the results that were
significant in three-factorial (interviewer A-B
status, interviewer sex, patient type) analyses
of variance of all intrainterview measures.

Of the four significant patient-type main 
effects, three involved the self-disclosure vari- 
able. The TAS patients clearly were more 
self-disclosing than were AVOS patients, with
regard to the personal, neutral, and total
topics discussed within the interview. (The
total topic category also includes the mean
self-disclosures on topics introduced by
interviewers, i.e., topics not on the cards.)¹
Since the AVOS patients may be regarded
as more regressed than TAS patients, this
result is not at all surprising and replicates
the results of Green (1971) and Welch
(1971) with Veterans Administration
patients.

The other main effect involving the
AVOS-TAS distinction (asymmetrical placement of
arms by the interviewer) indicates that
AVOS patients elicited higher levels of this
index of relaxation (Mehrabian, 1968) than
did TAS patients. This finding seems
consistent with the notion that more
schizophrenics exude lower status than do neurotics.

The main effects involving interviewer sex
concerned one measure of nonverbal
immediacy (proximity) and one measure of
activity (facial expressiveness). Males sat
closer to the patients than did females, but
females were markedly more expressive than
males. These results might best be
understood in terms of sex role stereotypic
behaviors.

The finding that A-type interviewers sat
closer to the patients than did Bs is
consistent with the results of the Berzins et al.
(1970) study and of the Green (1971) and
Welch (1971) studies, in which the A
interviewer was rated as “warmer” than the B
interviewer. Note, however, that this index
of behavioral “approach” strikingly contra-
Table 1

Significant Main and Two-way Interaction
Effects in the Three-Factorial Analyses of
Variance of Intrainterview Measures


Variable Type F(1, 2) 
Main effects 
Patient type AVOS TAS 
Self-disclosure, total 2.03 2.33 7.28** 
Self-disclosure, 
personal topics 1.81 IAS y AERES 0:7 08 
Self-disclosure, 
neutral topics 1.96 2.28 8.88*** 
Asymmetry of arms 1.77 -80 5.00* 
Interviewer sex M F 
Physical proximity 6.50 4.35 5:124 
Facial expressiveness 1.54 31525:4008 
Interviewer A-B status A B 
Physical proximity 6.60 4.25 6.12** 


Interviewer A-B Status X Patient Type Interaction 


AVOS TAS 
Self-disclosure 
Total p 
A 2.12 2.1 
B 195. 02.48 | t2 
Personal topics 
2.05 2.12 
B 136), 2.47 eat 
Neutral topics $ La 
A Zii „11 
B EE VEESI, 


Note. M = male, F = female; AVOS = avoidant of 
others (schizophrenic prototype), TAS = turning 
against self (neurotic prototype). 


dicts expectations based on these interview- 
ers’ responses to the preinterview question, 
“Are you looking forward to the interview?” 

Turning now to the two-way interaction 
effects in Table 1, it is clear that the main 
effects denoting the greater self-disclosure of 
TAS patients is qualified prominently by 
interviewers’ A-B status. That is, A-type 
interviewers obtained equivalent self-disclo- 
sures from both types of patients, whereas Bs 


1 There were no main or interaction effects
involving interviewer A-B status for the number of
personal, neutral, extraneous, or total topics
discussed. The average numbers of topics per dyad
were 4.3 personal, 4.6 neutral, and 4.9 extraneous.
obtained considerably higher self-disclosures
from TAS than AVOS patients. (Orthogonal
comparisons between patient types within
Bs were significant at p < .005 for total,
personal, and neutral topics.) However, on
personal and neutral (but not total) topics,
A-type interviewers obtained significantly
higher self-disclosures from AVOS than TAS
patients (personal topics, p < .01; neutral
topics, p < .05). Also, with TAS patients,
B-type interviewers outperformed As on total
(p < .05) and neutral (p < .02) but not on
personal topics. Taken together, these results
offer substantial support to the interaction
hypothesis and are also consistent with the
findings of Berzins et al. (1970) with addict
patients, and those of Green (1971) and
Welch (1971) with Veterans Administration
inpatients.

Two additional interaction effects
(Interviewer A-B Status X Interviewer Sex X
Patient Type), not shown in Table 1, involved
one measure of nonverbal immediacy
(orientation of torso) and one measure of
relaxation (asymmetry of arms). Close
examination of these effects, considered jointly,
suggests that under compatible pairing
conditions, male interviewers manifested greater
immediacy but less relaxation than did male
interviewers paired incompatibly. This
attitude of poised attention appears consistent
with the self-disclosure data and the
interaction hypothesis. Female interviewers,
however, did not follow this pattern; rather,
they appeared to exhibit greater immediacy
and relaxation with the AVOS patient than
with the TAS patient. These differences
invite clarification in further research.


Postinterview Measures 


Our third hypothesis stated that members
of compatibly paired dyads would show
more positive postinterview reactions than
members of incompatibly paired dyads. Since
the variability of participants’ reactions was
constrained by ceiling effects (e.g., in only
eight instances did an interviewer report
having disliked the patient), the hypothesis
was not supported statistically. However,
when negative reactions were given, these
were consistent with our hypothesis and with
the intrainterview self-disclosure data.


Discussion 


The preinterview data clearly supported 
the first hypothesis and also offered behavi- 
oral validation of the interviewer’s everyday 
personality characteristics, as reflected by 
PRF scale scores. However, since responses
to the “looking forward to” question were
solicited by the experimenter, these differ-
ences, their magnitude notwithstanding, re- 
quire replication with better control over 
possible experimenter effects. 

Before discussing the intrainterview results,
the patient sequence factor bears examina-
tion. In the present study as in earlier ana-
logues (Berzins & Seidman, 1968, 1969),
undergraduate As and Bs paired compatibly 
and incompatibly “transferred” their differ- 
ential performances with the first patient to 
the second patient. This “transfer” of per- 
formance differences associated with initial 
pairings to later ones, contrary to Chartier 
(1971), in no way cancels out the initial dif- 
ferences. Rather, these sequence effects, 
demonstrated in both analogue and in vivo
settings, deserve recognition in the design of 
further research. 

Turning now to the interviews themselves, 
the degree of support accorded the interac-
tion hypothesis by the self-disclosure data
seems important particularly because this 
study paired untrained, unsophisticated in- 
terviewers with markedly disturbed inpatients 
in a realistic interview situation. In spite of
this arrangement, one that should work
against ready, let alone differential, elicita-
tion of self-disclosures, the A-type interview-
ers were able to elicit as much self-dis-
closure from AVOS (schizophrenic proto-
type) as TAS (neurotic prototype) patients,
and they outperformed Bs with the AVOS
patients.

Recalling that both in terms of PRF scores
and responses to the “looking forward to” ques-
tion, A-type interviewers appeared more reti-
cent than did Bs before encountering the
patient, the finding that once in the inter-
view, they in fact sat closer to their patients
than did Bs is intriguing also. Since increased
physical proximity seems an unlikely corre- 
late of cautious and reticent attitudes, it 
seems plausible that the A-type interviewer’s 
“transformation” may have been eventuated 
by some aspect of the actual interview 
situation. 

Our personological formulation assumes 
that the interviewer’s behavior (effective or 
ineffective) is a function of the interviewer’s 
own everyday personality characteristics and 
the situational context, the latter comprised 
at least of the demand that one function as 
a “helping agent,” and the perceived char- 
acteristics of the patient, for example, degree 
of psychopathology. In everyday life, the A- 
type interviewer indeed may be cautious and 
submissive, and the B-type interviewer may 
be risk oriented and dominant. Upon encoun- 
tering a markedly disturbed patient, however, 
the cautious and submissive A-type inter- 
viewer may perceive the regressed patient as 
someone whom he/she can help. In this 
sense, the regressed patient “liberates” A- 
type interviewers to behave effectively, 
whereas less disturbed patients may render 
these interviewers uncertain about the out- 
comes and may “inhibit” their effectiveness. 
On the other hand, with less disturbed pa- 
tients, the dominant, risk-taking, self-assured 
B interviewer may perceive the situation as 
a reasonable challenge; contrariwise, the 
more disturbed patient may be experienced
by the B-type interviewer as boring or even
hopeless. The results of this study generally
support this formulation.


Drug Abuse Patterns, Personality Characteristics, and
Relationships With Sex, Race, and Sensation Seeking 


Patricia B. Sutker, Robert P. Archer, and Albert N. Allain
Medical University of South Carolina


Interrelationships among sex, race, drug use patterns, and personality variables 
were examined in a sample of 84 chronic users of illicit drugs. Subjects were 
administered the Minnesota Multiphasic Personality Inventory, the Sensation 
Seeking Scale, and the Shipley Institute of Living Scale and were interviewed 
using the Background Information Questionnaire. Comparisons were made be- 
tween sex and ethnic subgroups on personality and drug use variables using 
analysis of covariance and chi-square procedures for subjects classified into 
high-, medium-, and low-sensation-seeking groups. Blacks were characterized by 
lower levels of sensation seeking, less psychopathology, use of fewer drug cat- 
egories, and later drug use than whites. Use and personality patterns among 
women differed little from those of men. Levels of sensation seeking were re- 
lated to specific personality constellations, number of drug categories used, and
motive for first alcohol use.


Research on the psychological character- 
istics of drug abusers has developed from 
attempts to describe and differentiate addicts 
from representatives of other clinically de- 
viant categories. More recently, investigators 
have compared drug abuse subgroups (de- 
fined by race and sex) on personality dimen- 
sions or drug use patterns. Female and white 
drug abusers have been shown to demonstrate 
greater psychopathology than males and non- 
whites (DeLeon, 1974; Olson, 1964), and 
ethnicity was found to be related to choice 
of drug type and variety used, particularly 
among men (Kaestner, Rosen, & Appel,
1977). Suffet and Brotman (1976) reported
lower rates of illicit drug use among women,
but sex differences in drug use patterns have 
not been adequately specified. 

Studies have also attempted to describe the 
association between personality characteris- 
tics and chronic drug use often without sex 
or race comparisons. Relationships have been
demonstrated between sensation seeking and 
drug use patterns in college students (Zuck- 
erman, Bone, Neary, Mangelsdorff, & Brust- 
man, 1972) and hospitalized male veterans 
(Kilpatrick, Sutker, Roitzsch, & Miller, 
1976). Combined elevations on Minnesota 
Multiphasic Personality Inventory (MMPI) 
Scales Pd and Ma, suggestive of exaggerated 
tendencies toward social nonconformity, have 
also been associated with chronic illicit drug 
use in men (Sutker & Allain, 1973; Zucker- 
man, Sola, Masterson, & Angelone, 1975) 
and women (Sutker & Moan, 1972); and 
relationships between drug choice and scores 
on sensation-seeking and MMPI dimensions 
have been described (Carrol & Zuckerman,
1977). For the most part, however, sex and
race comparisons have been made indepen-
dently, and their potential interactions in in-
fluencing drug use patterns or associated per-
sonality characteristics have not been fully
explored. The present study was designed to 
address this area of limited investigation and 
to examine relationships between levels of 
sensation seeking, drug use patterns, and per- 
sonality characteristics among chronic users 
of illicit drugs. 


Method 


Subjects were 84 drug abusers in residential treat- 
ment at Odyssey House Louisiana and included 38 
white men, 18 white women, 22 black men, and 6 
black women, a breakdown representative of pro- 
gram composition. Roughly 57% were addicted to 
opiates at program entry, and the remaining sub- 
jects were continuous users of stimulants (19%), 
depressants (14%), psychedelics (5%), or other 
drugs (5%). Treatment admission was nonvoluntary 
in over 90% of the cases. Subjects were selected 
from residents referred for psychological assess- 
ment at program promotion to Level 2, one of 
six hierarchical treatment stages. This selection pro- 
cedure was used to minimize variation in length of 
time in current treatment (average of 3 months). 
Criteria for subject inclusion in data analyses were 
signed informed consent, demonstrated ability to 
read test items, treatment residence for 2 months, 
and history of drug abuse exceeding 2 years. Means 
for combined race and sex groups on age, education, 
and Shipley Institute of Living Scale scores were 
24.29 years, 11.30 years, and 115.55, respectively. 
Preliminary Race × Sex analyses of variance showed
no differences between subgroups in age, education,
length of current treatment, or length of continuous
drug use. Differences were found on the Shipley,
with blacks producing lower scores than whites,
F(1, 80) = 26.68, p < .01.

Instruments used for data collection were (a) 
the Sensation Seeking Scale (SSS), a forced-choice 
questionnaire that measures individual differences 
in preferred optimal level of stimulation and yields 
five subscale and total scores; (b) the MMPI, 
scored for the 3 validity, 10 clinical, and Special
Scales A, R, and Es (K-corrected T scores); (c)
the Shipley Institute of Living Scale, a measure of
verbal comprehension and problem-solving skills;
and (d) the Background Information Questionnaire
(BIQ), a structured interview developed by us to
acquire information about personal history and pat-
terns of drug use (e.g., age at first drug use; num-
ber of drug categories ever used; reason for first
drug, alcohol, or opiate use; first drug used; and
drug of choice). First responses to reason for drug,
alcohol, and opiate use were each classified in one
of three categories defined by Naditch (1975):
use from social pressure, use for therapeutic
intent, and use for pleasure or curiosity.
Sex and race subgroups were compared on 7 quan-
tifiable BIQ measures using analysis of variance and
on 6 SSS and 16 MMPI variables using analysis
of covariance procedures with the Shipley score as
the covariate; F tests for simple effects were per- 
formed where significant interactions were identified. 
Chi-square analyses were performed to assess rela- 
tionships between sex and race and five categories 
of BIQ responses, including reason for first drug, 
alcohol, and/or opiate use; first drug used; and 
drug of preference. Total SSS scores were used to 
divide subjects into three SSS groups: (a) low
(n = 14), with scores 1 SD or more below the mean SSS
score of 43; (b) medium (n = 57), with scores within
±1 SD of the mean; and (c) high (n = 13),
with scores 1 SD or more above the SSS mean.
Preliminary analyses of variance indicated Shipley
score differences between SSS groups, F(2, 81) =
3.70, p < .05, and a greater frequency of blacks
(64%) in the low SSS group, χ²(2) = 9.94, p < .01.
There were no significant differences in education 
or sex distribution. Thus, race and Shipley scores 
were used as covariates in analysis of covariance 
comparisons of SSS groups on MMPI variables, 
whereas analyses of variance were performed to 
compare groups on BIQ dimensions. Fisher's least 
significant difference tests were used to evaluate 
significant between-groups differences. Chi-square 
analyses were performed to test relationships be- 
tween SSS groups and drug use patterns. 
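
The three-way grouping on total SSS score described above can be sketched as follows. This is a minimal illustration only: the function name, the use of the sample standard deviation, and the example scores are assumptions and are not taken from the study.

```python
from statistics import mean, stdev

def classify_sss(scores):
    """Label each total score 'low', 'medium', or 'high' using mean +/- 1 SD cutoffs."""
    m = mean(scores)
    sd = stdev(scores)  # sample SD; the report does not say which SD formula was used
    labels = []
    for s in scores:
        if s <= m - sd:
            labels.append("low")      # 1 SD or more below the mean
        elif s >= m + sd:
            labels.append("high")     # 1 SD or more above the mean
        else:
            labels.append("medium")   # within 1 SD of the mean
    return labels

# Hypothetical scores for illustration, not data from the study
example_scores = [30, 35, 40, 43, 43, 45, 50, 56]
example_labels = classify_sss(example_scores)
```

With cutoffs defined this way, most subjects fall in the medium group, as in the reported split of 14, 57, and 13 subjects.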


Results


Drug abuse subgroups defined by race dif- 
fered significantly on SSS variables, and 
whites scored higher than blacks on Thrill 
and Adventure Seeking, F(1, 79) = 18.79, 
p < .01, General Sensation Seeking, F(1, 79)
= 9.18, p < .01, and Total SSS, F(1, 79) =
5.53, p < .05. Sex differences in sensation
seeking were limited, with men scoring higher 
than women only on Thrill and Adventure 
Seeking, F(1, 79) = 5.78, p < .05. Race and 
sex MMPI comparisons showed no differ- 
ences between men and women, but whites 
produced higher scores on Scales F, F(1, 79) 
= 4.03, p < .05, D, F(1, 79) = 4.61, p <
.05, Pa, F(1, 79) = 4.32, p < .05, and Pt,
F(1, 79) = 5.04, p < .05, than blacks. 

White drug abusers were younger than 
blacks at time of first drug, F(1, 80) = 
11.18, p < .01, and first opiate use, F(1, 61) 
= 6.51, p < .05, and had used drugs from a 
greater variety of categories, F(1, 80) = 
6.06, p < .05. Blacks and whites differed in 
drug of preference, χ²(1) = 14.36, p < .01.
Opiates and depressants were endorsed by
100% of blacks and 62% of whites, whereas
stimulants and hallucinogens were preferred
by 38% of whites and 0% of blacks. In
male/female comparisons, men reported use
of more drug categories, F(1, 80) = 4.82, p
< .05, but women and men did not differ in
reported drug preference. Race or sex dif-
ferences were not significantly associated
with reason for first drug, alcohol, or opiate
use.

Figure 1. Mean Minnesota Multiphasic Personality Inventory profile patterns for low-, medium-,
and high-sensation-seeking groups.

High, medium, and low SSS groups were 
characterized by variations in mean MMPI 
profile configurations (see Figure 1). Differ- 
ences were found between groups on Scales L,
F(2, 79) = 4.22, p < .05; F, F(2, 79) =
3.19, p < .05; Hs, F(2, 79) = 3.49, p < .05;
Hy, F(2, 79) = 4.50, p < .05; Pd, F(2, 79)
= 4.23, p < .05; Ma, F(2, 79) = 10.00, p <
.01; Si, F(2, 79) = 3.90, p < .05; and R, F(2,
79) = 8.93, p < .01. Low sensation seekers
scored higher than middle sensation seekers on
Scales L and Hs (p < .05), Hy and R (p <
.01) and high sensation seekers on Hs (p <
.05) and L, Hy, and R (p < .01). High sensa-
tion seekers produced lower scores on Scale Si
than those in low (p < .05) and middle (p
< .01) groups and higher scores on Scales F
and Pd than other groups (p < .05). High
sensation seekers also produced more elevated
scores on Scale Ma than medium sensation
seekers (p < .01), who in turn produced
higher scores on Ma than those in the low
SSS group (p < .01).

Sensation-seeking levels were significantly 
related to drug use patterns. High and mid- 
dle sensation seekers reported earlier, F(2, 
81) = 3.15, p < .05, and more varied, F(2, 
81) = 9.10, p < .01, use of drugs than low 
sensation seekers. Although reason for first 
drug or opiate use and drug of choice did not 
vary as a function of SSS classification, reason 
for first alcohol use differed across groups, 
χ²(4) = 12.96, p < .05. Among low sensation
seekers, 62% remembered their first use of
alcohol as motivated by the influence of oth-
ers, whereas 67% of high sensation seekers
attributed initial use of alcohol to pleasure
and curiosity.


Discussion 


Findings indicate that race is an important 
factor to consider in understanding drug
abuse phenomena, but gender may be of
limited value in prediction of personality and
drug use patterns for illicit drug users. Con-
sistent with Kaestner et al. (1977), blacks
demonstrated lower levels of sensation seeking
and less psychopathology, reported use of
fewer drug categories, showed preference for
depressants such as opiates over stimulants,
and engaged in drug use later than whites. 
Less elevated scores on MMPI measures oc- 
curred among blacks despite the possibility 
that current MMPI norms, derived from 
white reference groups, exaggerate the T- 
score estimates of psychopathology for blacks 
(Gynther, 1972). In contrast to earlier re- 
search (DeLeon, 1974; Olson, 1964), results
suggest that female drug abusers, in ref-
erence to their normative sex group, are no
more psychologically deviant than men. Thus, 
the issue of sex-specific personality differences 
cannot be resolved without further compari- 
sons across treatment and nontreatment con- 
ditions. Women and men also showed few
dissimilarities on sensation-seeking measures 
with the exception of Thrill and Adventure 
Seeking, a cluster of items reflecting desire to 
engage in outdoor sports or activities involv-
ing speed or danger. Although women re-
ported use of drugs from a smaller number of
categories, no sex differences were found in 
age at first drug use, frequency of drug use, 
or drug preference. 

Results support the hypothesis that there
is a close relationship between sensation seek-
ing, other personality dimensions such as
sociopathy and neurotic involvement, and
drug use patterns. High sensation seeking
was related to use of more drug categories,
earlier age at first drug use, and curiosity as
a motive for initial alcohol use. Drug abusers
classified as high sensation seekers scored
higher on scales reflecting sociopathy, atti-
tudinal deviance, and heightened activity and
lower on measures indicating denial, hypo-
chondriacal preoccupation, hysteria, and so-
cial introversion. Such individuals, relatively
uninhibited by neurotic defenses, seem to be
strongly motivated to increase external stim-
ulation. In contrast, low sensation seekers
revealed higher elevations on measures of
neurotic involvement, repression, and denial.
Similar relationships among SSS and MMPI
variables have been reported in correlational
studies of prisoners (Blackburn, 1969) and
alcoholics (Kish & Busse, 1969).

Present findings suggest that motives for
drug use vary depending on such critical
variables as race, sensation seeking, neurotic
involvement, and sociopathy. It might be
hypothesized that chronic drug use is associ-
ated with exaggerated needs to attenuate 
unpleasant internal states or, conversely, to 
seek out external sources of stimulation. These 
assumptions provide a basis on which to match 
specific therapeutic packages to client per- 
sonality characteristics and drug use patterns 
as well as a reasonable framework for in- 
vestigating treatment outcome. For example, 
treatment of low sensation seekers might in- 
corporate relaxation and social skills training 
to provide alternatives to drug use for re- 
ducing unpleasant internal states; high sensa- 
tion seekers could be encouraged to identify 
activities and goals that provide gratifying 
and stimulating alternatives to the pharma- 
cologic effects and concomitant life-styles of 
illicit drug use. Finally, motives for drug use 
and their relationships with such variables as 
race, sensation seeking, neurotic defenses, 
social introversion, and sociopathy should be 
explored systematically among drug experi- 
menter, chronic user, treatment, and post- 
treatment populations.


Marital Satisfaction and Depression as
Predictors of Physical Health Status 


Robert L. Weiss and Barbara M. Aved 
University of Oregon 


This study investigated the multiple correlation between physical health status 
and a set of marriage-related “predictor” variables. Family practice physicians 
provided a sample of 104 married couples. Marital satisfaction, depression, 
number of visits to physician, and educational level were among the set of 
cross-validated “predictors” of reported physical health status. The correlation 
between physical health status and depression was significantly greater for wives 
than husbands. For wives, marital satisfaction and depression were related pri- 
marily through the uncontrolled variance in physical health status, whereas for
husbands a significant relationship between marital satisfaction and depression
remained when physical health status was partialed out. These
findings support similar conclusions drawn by others. 


Marital relationships provide a particu- 
larly attractive point of entry for the study 
of the relationship between behavior and 
health. Socially oriented conceptions of men- 
tal health focus either on interpersonal learn- 
ing or on much broader psychosocial systems, 
such as collectives and communities. The 
marital unit provides the investigator with
a vantage point within a minisystem from 
which attention can be directed downward 
toward individual or upward toward com- 
munity determinants of behavior and adjust- 
ment. Nonetheless, empirical relationships 
between marital status and various concep- 
tions of “health” have been studied only im- 
perfectly (e.g., Crago, 1972; Vincent, 1973). 

Some recently reported empirical relation- 
ships are as follows: (a) Marriage-related 
requests for professional mental health ser- 
vices run high; 58% of 2,000 consecutive
outpatient requests at a large university hos-
pital were marriage related (Overall, Henry, 
& Woodward, 1974), whereas for psycho- 
logical and social work settings, estimates run 
from 58% to 76% of all presenting prob- 
lems. (b) Persons who have had multiple 
marriages, relative to the never married, 
show significantly less rated psychopathology 
(Overall, 1971). (c) Marital dissatisfaction 
may be part of a broader depressive spec- 
trum disorder (Overall et al., 1974). The 
direction of causality between psychosocial 
variables and marital satisfaction has not 
been determined by these and the many 
other studies available in the literature. It is 
just as reasonable to assume that marital 
dissatisfaction causes depressive symptoms, 
poor job performance, or poor health as it 
is to assume that marital dissatisfaction 
results from these maladies. 

The aims of this investigation were mod- 
est: We sought to establish the multiple co- 
variations between reported physical health 
status and at least two face-valid person 
variables: marital satisfaction and depres- 
sion. A similar approach, reported by Cole- 
man and Miller (1975), determined that 
both self-reported and therapist-rated de- 
pression were significantly related to marital 
satisfaction, accounting for 14% and 23%
of the variance in satisfaction ratings, re-
spectively. Since physical health status and 
depression tend to covary (depression fre- 
quently involves somatic complaints), as do 
depression and marital satisfaction, the rela- 
tionship between physical health status and 
marital satisfaction might be a spurious one. 
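
The concern about a spurious relationship can be made concrete with a first-order partial correlation, which removes the variance that two measures share with a third. The sketch below uses the standard textbook formula; the function name and the three input correlations are hypothetical values chosen for illustration and are not results of this study.

```python
import math

def partial_corr(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y with z held constant."""
    denom = math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    return (r_xy - r_xz * r_yz) / denom

# Hypothetical correlations, for illustration only
r_health_marital = -0.45   # physical health complaints with marital satisfaction
r_health_dep = 0.65        # physical health complaints with depression
r_marital_dep = -0.50      # marital satisfaction with depression

# Health-marital relationship with depression partialed out
r_partial = partial_corr(r_health_marital, r_health_dep, r_marital_dep)
```

With these assumed inputs the partial correlation shrinks to roughly -.19, which shows how much of a zero-order relationship can be carried by a shared third variable.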

The present study sought to establish the 
multiple correlation between a set of “pre- 
dictor” variables and reported physical health 
status. Active patients of family practice 
physicians were selected for study, because 
physicians are often first-line providers of 
health services. They are an integral part of 
the psychosocial context of health status, 
since they frequently suspect that marital 
adjustment and depression, among other 
sources of distress, are concomitants (or 
even causal agents) of impaired health status. 
The multiple correlational analysis of health 
status and relationship variables was cen- 
tered in this broader consumer-provider con- 
text. Physical health status has been a vari- 
able of psychological interest for those con- 
cerned with psychophysiological reactions 
and, more recently, life stress and somatic 
illness (cf. Holmes & Masuda, 1974; Rabkin 
& Struening, 1976; Rahe, 1974). Since the 
longitudinal study of changes in health status
that occur during marriage is impractical,
the present study sought to provide cross-
sectional data.


Method 
Subjects 


Family practice physicians at a regular meeting 
of the Lane County Academy of Family Practice 
were invited to participate in a study of the rela- 
tionship between physical health status and vari- 
ables related to marital satisfaction by providing 
names of couples who met the following criteria: 
(a) married for at least 2 years, (b) in a first 
marriage for both spouses, (c) between the ages of 
22 and 48 years, and (d) wife not now pregnant. 
These criteria assured that couples were beyond the 
initial stages of marital accommodation and that 
medical problems associated with advancing age
and/or pregnancy were excluded. The physicians
were also asked to indicate whether the persons
they nominated were “overusers” or “normal
users” of their services. The aim was to include
patients having psychosomatic involvements.
The physicians participated by contributing a total
of 280 persons (140 couples). (In almost all
instances, husbands and wives were patients of the
same physician.) Each physician drafted a letter to 
his own patients, explaining that the study was 
“surveying adults about their health patterns,” 
indicating an interest in identifying factors of 
family health potentially related to prevention, and 
assuring them that refusal to participate would not 
influence their relationship with their physician. 

From the initial pool of 140 couples who had 
been contacted by letter, 135 were subsequently 
contacted by phone; 104 couples completed usable 
forms. Participation was refused in 23 cases; 5 
couples had separated or divorced, 1 wife was 
pregnant, and 2 couples provided incomplete data. 
In the final sample, 18% of the subjects were 
designated by their own physicians as overusers 
(22% of the refusals had this designation). In- 
dividual physicians contributed from 17% to 38% 
of the final sample of 104 couples. 

The typical subject was 30 years old, had had 
2 years of college, and had been married for 8 
years. Fifty-four percent of husbands and 24% of
wives held white-collar and professional positions; 
4% of husbands and 61% of wives were unem- 
ployed; 66% of the sample had two or more chil- 
dren. Couples were predominantly Protestant (57%), 
with 19% Catholics, 6% other, and 18% none. 
The resulting sample was a well-educated, possibly 
upper-middle class, group of couples who were 
rearing preadolescent children. 


Measures 


Physical health status. The Cornell Medical In-
dex (CMI) was used as the measure of physical
health status (Brodman, Erdman, Lorge, & Wolff,
1952); it consists of 195 items typically covered
in an intensive medical history interview, ranging
from systemic (sensory, respiratory, urinogenital,
etc.) to vague complaints associated with psycho-
somatic disorders. The questions sample bodily
symptoms, past illnesses, family history, and af-
fective states. The score is the total number of
affirmative responses to the questions. In a sample
of women scheduled for cancer surgery, both the
preoperative and postoperative CMI scores were
related to staff ratings of postsurgery invalidism
(r = .49 and .52, respectively, p < .05); the pre-
surgery to postsurgery CMI scores were also re-
lated (r = .67, p < .01; Bard & Waxenburg, 1957).

Self-Rating Depression Scale. The Zung
Self-Rating Depression Scale (SDS) was used to
assess depression. It consists of 20 self-statements,
half of which are worded affirmatively and half
negatively. The SDS has been used widely in de-
pression research for detecting the so-called “hidden
depressions.” Individual scores can range from 20
to 80 on this scale. Zung (1974) reviewed the
literature on the reliability and concurrent and pre-
dictive validity of this measure. A number of stud-
ies have shown that for age-matched depressed
patients and normals, 88% of the patients were
detected by the SDS, whereas 12% of the normals
were detected as “false positives.”

Marital satisfaction. The Locke-Wallace Mari-
tal Adjustment Test was used to assess marital
satisfaction and has been used widely in the
literature as a reliable measure of marital satisfac-
tion (Coleman & Miller, 1975; Locke & Wallace,
1959; Weiss & Margolin, 1977). Each spouse an-
swers individually; scores of 100 or more generally
indicate marital satisfaction, and scores below 100
indicate marital distress.

Activity level inventory. A 28-item version of 
the Inventory of Rewarding Activities, developed 
by the Marital Studies Program, University of 
Oregon, was used to assess how couples spend 
their leisure time. The scale reflects activities in 
home and community equally (e.g., “watching TV
for more than 1 hour” and “go to a dance or 
party”). Respondents indicate whether during the 
last week each activity was engaged in alone or 
with spouse. Four activity scores were obtained:
home with spouse, home without spouse, com-
munity with spouse, and community without spouse.

Questionnaire. In addition to the six measures
described above, information on five demographic 
variables (age, number of annual visits to physi- 
cian, years married, years residence in county, and 
education) was obtained. 


Procedure 


Persons to whom individual physicians had sent 
letters introducing the study were contacted by 
telephone and were given additional information. 
If they agreed to participate, an in-home meeting 
was then scheduled. 

All data were collected in this meeting, which 
lasted approximately 45 minutes. The same inter- 
viewer (the second author), a registered nurse, con-
ducted all interviews with respondents within a 
3-month period. 

When all interviews had been completed, the 
master lists of patients disclosing the initial “user”
status designation were made available to the inter-
viewer; the authors did not inform physicians 
which of their patients had or had not participated. 


Results 
Husband-Wife Comparisons 


Wives, compared to husbands, on the av-
erage made significantly (p < .01 for all
comparisons) more visits (annually) to
physician, had more medical complaints
(higher CMI scores), and reported engaging
in more home-alone activities. Husbands and
wives did not differ significantly in mean
marital satisfaction scores (… and 113.1,
respectively). The correlation between
spouses’ marital satisfaction scores was
.65 (p < .001).

As an internal validity check, the 19 cou-
ples identified by their own physicians as 
“overusers” of medical services were com- 
pared to the 85 designated as “normal user” 
couples. In all comparisons, the females in 
the overuser group were significantly differ- 
ent from their counterparts in the normal 
group, whereas the males in both groups were 
comparable. As defined by their individual 
physicians, overuser relative to normal user 
females, on the average, scored 10 points 
higher on the CMI, 4 points higher on the 
SDS, and 7 points lower on marital satisfac- 
tion, and they were reported as having made 
twice as many visits to their physicians. 

These comparisons between males and 
females are generally consistent with previous 
findings: Females reported more physical 
health concerns, made more visits to physi- 
cian, and tended to score as somewhat more 
depressed on the SDS. 


Predictors of Physical Health Status (CMI)


The 11 predictor variables were combined 
through multiple linear regression equations, 
with CMI as the dependent variable. Al-
though the combination of self-report and 
demographic variables was selected for pre- 
sumed relevance to marital interaction, mul- 
tiple regression is particularly sensitive to 
fluctuations among correlations within suc- 
cessive samples. Consequently, the sample of 
104 couples was divided at random into 
equal subsets of NA = NB = 52 couples. The
neutral designation, Sample A and B, is pre- 
ferred, since the dichotomy was formed after 
all data had been collected. 
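
The split-half procedure can be sketched as below, assuming the usual meaning of double cross-validation: regression weights are estimated in one random half and applied to the other half, and then the roles are reversed. The helper names, the two-predictor setup, and the synthetic data are all assumptions made for illustration; they are not the authors' variables or computations.

```python
import random
from statistics import mean

def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit_ols(rows, y):
    """Least-squares weights (intercept first) from the normal equations."""
    design = [[1.0] + list(r) for r in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * t for d, t in zip(design, y)) for i in range(k)]
    return solve(xtx, xty)

def predict(weights, rows):
    """Apply intercept-first regression weights to each row of predictors."""
    return [weights[0] + sum(w * v for w, v in zip(weights[1:], r)) for r in rows]

def pearson(u, v):
    """Pearson correlation between two equal-length sequences."""
    mu, mv = mean(u), mean(v)
    num = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    den = (sum((a - mu) ** 2 for a in u) * sum((b - mv) ** 2 for b in v)) ** 0.5
    return num / den

def double_cross_validate(rows, y, seed=0):
    """Split cases at random into halves A and B; return both validity coefficients."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    half = len(idx) // 2
    a, b = idx[:half], idx[half:]

    def validity(train, test):
        # Weights come from the training half only; the correlation is computed in the other half
        w = fit_ols([rows[i] for i in train], [y[i] for i in train])
        return pearson(predict(w, [rows[i] for i in test]), [y[i] for i in test])

    return validity(a, b), validity(b, a)

# Synthetic data standing in for two predictors and a criterion (not study data)
rng = random.Random(1)
X = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(104)]
Y = [2.0 * x1 - 1.0 * x2 + rng.gauss(0, 1) for x1, x2 in X]
r_ab, r_ba = double_cross_validate(X, Y)
```

Each returned value is the correlation between criterion scores predicted from the other half's weights and the scores actually observed, which guards against weights that capitalize on chance within a single sample.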

The results of major interest are pre- 
sented separately in Table 1 for sex within 
each A and B subsample, as well as for sub- 
sample totals for each sex (total N = 104). 

The zero-order correlations between each 
of the 11 predictors and CMI are listed for 
husbands and wives separately for sub- 
samples and for totals, thereby allowing 
comparisons of within- and between-sub- 
sample fluctuations. 
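
One common way to compare a husbands' correlation with a wives' correlation is Fisher's r-to-z transformation. The sketch below applies it to the total-sample SDS correlations with CMI listed in Table 1 (.379 for husbands, .673 for wives, N = 104 each). Treating the two groups as independent samples is an assumption of this sketch, since the subjects are married pairs, and the function name is hypothetical.

```python
import math

def compare_independent_correlations(r1, n1, r2, n2):
    """Return (standard error, z statistic) for the difference between two correlations."""
    z1, z2 = math.atanh(r1), math.atanh(r2)           # Fisher r-to-z transformation
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))   # SE of the difference between the z values
    return se, (z2 - z1) / se

# SDS-CMI correlations for total husbands and total wives, as listed in Table 1
se, z = compare_independent_correlations(0.379, 104, 0.673, 104)
```

Under these assumptions the standard error comes to about .141, the value cited in the text, and the z statistic is about 2.97.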

The multiple regression coefficients that resulted in each instance
for the 11 predictors are listed below the zero-order correlations.
The R² values indicate the amount of CMI variance accounted for by
the set of predictors.


Table 1

Zero-Order, Multiple Correlation, and Validity Coefficients for
Regression of CMI on Predictor Variables

                                Husbands                  Wives
Predictor                    A      B    Total       A      B    Total
SDS                         395    374    379ᵃ      650    705    673ᵃ
Marital satisfaction       −470   −464   −468      −569   −371   −459
AHS (with spouse)          −166   −148   −137      −335   −171   −258
AHS (without spouse)       −017    093   −034      −035   −208   −124
ACS (with spouse)           025    071   −022       033   −194   −071
ACS (without spouse)       −013   −047   −028      −045   −238   −139
Age                         063    106    082      −225   −273   −245
Years married               267    012    147      −165   −109   −143
No. visits                  392    375    384       278    512    346
Years in county             146    023    083       392    018    223
Education                  −279   −143   −216      −263   −534   −394
Rᵇ                          664    659    629       823    840    777
R²                           44     43    395        68     70    604
Validity coefficient r      484    487   (485)      668    568   (620)

Note. N_A = N_B = 52. Decimals have been omitted. Validity
coefficients are all significant at the .001 level. Averaged values
appear in parentheses. For 50 df, r.95 = .273; for 102 df,
r.95 = .195. CMI = Cornell Medical Index; SDS = Self-Rating
Depression Scale; AHS = activities at home with spouse; AHS =
activities at home without spouse; ACS = activities in the community
with spouse; ACS = activities in the community without spouse.
ᵃ Difference between these zero-order correlations is significant at
the .002 level.
ᵇ For Subsample A = B df = 11, 40; Total(s) df = 11, 92. F values
for Husbands A, B, and Total: F = 3.23, 2.79, 5.47. F values for
Wives A, B, and Total: F = 7.72, 8.72, 12.79. For all F values
> 3.23, p < [illegible]; F = 2.79, p < .05.


Each of the multiple correlations was significant; the range of CMI
variance varied from a low of 40% to a high of 70%. From the
zero-order correlations in Table 1 it can be seen that the predictive
ability of SDS, the depression variable, was most different for
husbands and wives. (The significance of the difference between these
correlations, based on σ(z1 − z2) = .141, [illegible].)

For the two husband subsamples, the “best” replicated predictors
were marital satisfaction, visits, and SDS; those for the wives
included SDS and marital satisfaction, similar to husbands.
Nonreplicating predictors for the wives included education, visits,
[illegible], and activity in community [illegible].

Finally, the last row of Table 1 lists the validity coefficients
determined by “double cross-validation”; the beta regression weights,
derived from each within-sex subsample, were applied reciprocally to
the other. For example, the betas for Sample A wives were applied to
Sample B wives, and those for Sample B were applied to Sample A. The
Pearson product-moment correlation coefficients between the predicted
CMI and observed CMI scores thus yield a validity coefficient based
on the adequacy of one set of beta weights to predict from the raw
data of a different sample. This particular choice of
cross-validation underestimates the goodness of fit, since only one
half of the data is used to determine the weights.

From Table 1 it can be seen that all validity coefficients were
highly significant, although those for husband samples were
substantially lower than the estimates for wives.
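
The double cross-validation procedure described above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the data are synthetic, only three predictors are used instead of 11, raw-score least-squares weights with an intercept stand in for the beta weights, and the function names (`fit`, `double_cross_validate`, `make_synthetic`) are hypothetical. It is not the authors' computation.

```python
import random
from statistics import mean


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Partial pivoting keeps the elimination numerically stable.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit(xs, ys):
    """Least-squares weights (intercept first) from the normal equations."""
    rows = [[1.0] + list(x) for x in xs]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    return solve(xtx, xty)


def predict(w, xs):
    """Apply weights w (intercept first) to each row of predictors."""
    return [w[0] + sum(wi * xi for wi, xi in zip(w[1:], x)) for x in xs]


def pearson(u, v):
    """Pearson product-moment correlation between two score lists."""
    mu, mv = mean(u), mean(v)
    num = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    den = (sum((a - mu) ** 2 for a in u) * sum((b - mv) ** 2 for b in v)) ** 0.5
    return num / den


def double_cross_validate(xs, ys, seed=0):
    """Return (r for A-weights applied to B, r for B-weights applied to A)."""
    idx = list(range(len(ys)))
    random.Random(seed).shuffle(idx)  # random split into equal halves
    half = len(idx) // 2
    a, b = idx[:half], idx[half:]
    xa, ya = [xs[i] for i in a], [ys[i] for i in a]
    xb, yb = [xs[i] for i in b], [ys[i] for i in b]
    wa, wb = fit(xa, ya), fit(xb, yb)
    return pearson(predict(wa, xb), yb), pearson(predict(wb, xa), ya)


def make_synthetic(n=104, seed=1):
    """Hypothetical data: three predictors and a noisy linear criterion."""
    rng = random.Random(seed)
    xs = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(n)]
    ys = [2 * x[0] - x[1] + 0.5 * x[2] + rng.gauss(0, 1) for x in xs]
    return xs, ys
```

Calling `double_cross_validate(*make_synthetic())` returns two validity coefficients, one per direction, which could then be averaged in the way the parenthesized Table 1 entries suggest.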

To further specify the relationship between CMI (1), marital
satisfaction (2), and SDS (3), the partial correlation (r12.3) was
calculated for husbands and wives separately. The respective partial
correlations were [value illegible] and −.28 (ps < .01). (From
Table 1, the comparable zero-order correlations are −.468 and −.459.)
The relationship between reported physical health status and marital
satisfaction was mediated, in part, by variations in depressiveness
scores; the partial correlations remained statistically significant.
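
As a worked check of the first-order partial correlation, the sketch below applies the standard formula to the total-sample zero-order correlations given in Table 1 and in the text (CMI with marital satisfaction, CMI with SDS, and marital satisfaction with SDS). The function name `partial_r` is hypothetical. For wives the formula gives about −.28, in agreement with the reported value; for husbands it gives about −.35, which is only an arithmetic consequence of the formula and these inputs, since the printed figure is not legible in this copy.

```python
from math import sqrt


def partial_r(r12, r13, r23):
    """Correlation between variables 1 and 2 with variable 3 held constant."""
    return (r12 - r13 * r23) / sqrt((1 - r13 ** 2) * (1 - r23 ** 2))


# 1 = CMI, 2 = marital satisfaction, 3 = SDS (total-sample values).
wives = partial_r(r12=-0.459, r13=0.673, r23=-0.401)
husbands = partial_r(r12=-0.468, r13=0.379, r23=-0.486)

print(f"wives r12.3 = {wives:.3f}")
print(f"husbands r12.3 = {husbands:.3f}")
```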

When the relationship between marital satisfaction and depressiveness
was statistically controlled for differences in physical health
status scores, the partial correlations dropped from their respective
zero-order correlations of −.486 and −.401 (ps < .01) to
r23.1 = −.334 (p < .01) and −.159 (ns), for husbands and wives,
respectively. The CMI and SDS share many physical items in common
(e.g., “I have trouble with constipation.”). The significant residual
correlation between SDS and marital satisfaction suggests that
marital (dis)satisfaction and depressive affect are likely to be
found together for husbands, although the magnitude of the
correlation was small. For wives it appears that common variance from
the physical symptom items accounts for the CMI × MS relationship,
which is understandable given the size of the zero-order correlation
between CMI and SDS in Table 1.

Discussion 


This has been an investigation of the network of covariations between
physical health status on the one hand and marriage-related variables
on the other. It differed from others (e.g., Coleman & Miller, 1975)
by (a) drawing nonclinical spouses from the context of an ongoing
consumer-provider relationship, namely, patients and their family
practice physician providers, and (b) providing a cross-validation of
the correlates of health status. Well-educated couples in first
marriages who reported above average marital satisfaction served as
respondents. Physicians identified a relatively small number of their
patients as overusers of their services to ensure inclusion of
typical “psychosomatic” cases. (Descriptively, these cases were
[illegible] who differed significantly from their counterparts on all
relevant variables, e.g., annual visits, CMI, and SDS.)

The “best predictors” of CMI among the
zero-order correlations that replicated within 
same-sex subsamples were SDS, marital satis- 
faction, visits, and to a lesser extent educa- 
tion. The amount of physical health status 
variance accounted for by SDS variance dif- 
fered significantly for husbands and wives 
(14% and 45%, respectively). 
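
The comparison of the husbands' and wives' SDS correlations can be sketched with Fisher's r-to-z transformation. This is a minimal illustration under stated assumptions: the two correlations are treated as coming from independent samples of 104 each (the respondents were couples, so this is a simplification), and the function name `fisher_z_difference` is hypothetical. With these inputs the standard error of the difference between the two z values is about .141, and squaring the correlations gives the 14% and 45% shares of variance.

```python
from math import atanh, erfc, sqrt


def fisher_z_difference(r1, n1, r2, n2):
    """Test the difference between two independent correlations.

    Returns the standard error of z1 - z2, the test statistic,
    and the two-tailed normal-curve probability.
    """
    se = sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    z = (atanh(r1) - atanh(r2)) / se
    p_two_tailed = erfc(abs(z) / sqrt(2))
    return se, z, p_two_tailed


# Total-sample correlations of SDS with CMI: wives .673, husbands .379.
se, z, p = fisher_z_difference(0.673, 104, 0.379, 104)
shared_variance = [round(100 * r ** 2) for r in (0.379, 0.673)]

print(f"SE = {se:.3f}, z = {z:.2f}, two-tailed p = {p:.4f}")
print(f"variance accounted for (%): {shared_variance}")
```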

When all 11 predictors were combined into 
multiple regression equations, the amount of 
CMI variance accounted for increased sub- 
stantially, ranging from 40% to 60% for 
husbands and wives, respectively. Overall, 
CMI scores were better predicted for wives 
than for husbands. 

Among the best predictors of physical 
health status, depression and marital satis- 
faction are the most interesting. (Persons 
reporting greater numbers of physical symp- 
toms tended to see their physicians more 
than others, and those with more years of 
education reported fewer physical symp- 
toms.) 

Whereas the relationship between health 
status and marital satisfaction was only 
slightly reduced when unplanned variations 
in depression were statistically controlled, the 
correlation between marital satisfaction and 
depression for wives became insignificant
when health status variance was controlled. 

Coleman and Miller (1975) reported similar results for an older,
less-educated sample of mental health clinic outpatients and their
spouses. (The same pattern of mean differences between husbands and
wives on depression and satisfaction ratings was observed in the two
investigations.) For combined samples of husbands and wives, the
relationship between marital satisfaction and depression was
r = −.38 (p < .01) and r = −.43 (p < .01) in the Coleman and Miller
and the present studies, respectively. They also found a similar
pattern of difference between marital satisfaction and depression
relationships for husbands and wives (i.e., rs = −.66, p < .01, and
−.25, p = .10, respectively). In the present results the Marital
Satisfaction × SDS relationship became insignificant for wives when
the uncontrolled variations in physical health status were accounted
for. (In both studies a significant inverse relationship was found
between husbands’ marital satisfaction and wives’ depression scores,
but in neither case were wives’ depression and husbands’ marital
satisfaction scores significantly related.)
Coleman and Miller (1975) concluded that 
marital satisfaction and depression for fe- 
males were essentially unrelated, and the 
present results lend some support to this 
view. Bernard (1972), on the other hand, 
concluded that wives suffer far greater mental 
health hazards and that they present a far 
worse clinical picture than do husbands. 
Drawing mainly on sociological studies (e.g., 
Knupfer, Clark, & Room, 1966), she stressed 
the double standard of mental health that 
essentially builds into “our standards of 
mental health for women the defects neces- 
sary for successful adjustment in marriage” 
(p. 52). More recently Overall et al. (1974) 
have presented statistical evidence that both 
a family history of marital discord and what 
might be labeled a “depressive spectrum dis- 
order” independently predict the likelihood 
of outpatients’ complaints of marital discord. 
All of the available data are based on cor- 
relational analyses. This approach may be 
helpful in suggesting which aspects of rela- 
tionship living have significance for physical 
health status. A direct test is needed whereby 
marital therapy can be shown to improve 
physical health status of the partners; de- 
pending on one’s point of view, this may 
occur more for husbands than for wives. The 
present investigation is merely another step 
toward specifying the probable contribution 
of separate variables.


Sexual Fantasies of Females as a Function of Sex Guilt
and Experimental Response Cues


Denise Moreault and Diane R. Follingstad 
University of South Carolina 


This study investigated the effects of response cues (erotic, romantic, or neutral)
and level of sex guilt on the self-reported sexual fantasies of females.
Undergraduates completed a sex guilt inventory, a mood adjective checklist, a 
fantasy theme checklist, and ratings of their affective responses and physiologi- 
cal arousal associated with the writing of the fantasies. High sex guilt females 
preferred fantasy themes indicating a lack of responsibility for engaging in 
sexual interaction. Subjects in the erotic fantasy condition wrote more explicit 
fantasies and described more varied content. Arousal seemed to be affected by 
the response cuing in the predicted direction but not by the subjects’ guilt 
levels. Sex guilt level seemed to be a better predictor of affective responses, 
such as guilt and embarrassment, than the response cuing. The results suggest 
that sexual fantasy behavior may be part of a cluster of sexual behaviors 
governed by an individual’s level of sex guilt. The demonstration that fantasy 
production seemed to be influenced by situational demands has implications for
collection and use of fantasy information by both clinicians and researchers.


Until recently, sexual fantasy behavior in
females has remained a relatively unexplored
area except among clinicians studying cases
of sexual deviation (e.g., Abraham, 1922;
Eidelberger, 1945; Sterba, 1921; Wulff,
1942). It has been viewed as pathological or
at least symptomatic of poor heterosexual
relations, especially when the fantasies included
other than conventional heterosexual
practices (Hollender, 1970). Only one experimental
study (Hariton & Singer, 1974)
in this area established the normalcy of sex- 
ual fantasies, substantiating Kinsey, Pom- 
eroy, Martin, and Gebhard’s (1953) norma- 
tive demographic data. This study, however, 
only correlated a few personality variables 
with fantasy, and predictability from their 
data was limited. 

The limited information regarding females’ 
sexual fantasies has been complicated by a
treatment of sexual fantasy as a homogenous
entity in which a sample of individuals are
simply asked whether or not they engage in
fantasy during sexual experiences. No
investigation to date has selected subjects along a
personality trait dimension to determine 
whether certain individuals are more likely to 
fantasize or are affected differently by fan- 
tasizing. In addition, specific operationalized 
parameters of fantasy behavior (e.g., themes, 
vividness, number of acts and organs por- 
trayed) that potentially could yield more 
precise information have been sacrificed for 
simple assessment of the presence or absence 
of fantasies. Researchers attempting to col- 
lect data regarding sexual fantasies have also 
neglected to study the circumstances or con- 
ditions under which fantasies are produced 
and disclosed to others, thus ignoring the 
demand characteristics of the situation. The
purpose of this study in assessing females’
sexual fantasy behavior was to determine
whether fantasies were tied to a personality
trait and/or were under situational stimulus
control.

A personality dimension that has been 
shown to predict a variety of sexual behavior 
patterns is the construct of sex guilt as de- 
veloped, defined, and measured by Mosher 
(1966). Sex guilt has been defined as “a
generalized expectancy for self-mediated punishment
for violating or for anticipating vio-
lating standards of proper sexual conduct” 
(Mosher & Cross, 1971, p. 27). Since sexual 
fantasies are likely to be related to other 
sexual behaviors, the level of sex guilt in 
individuals was expected to influence the con- 
tent and explicitness of fantasies. Subjects 
scoring high in sex guilt compared to sub- 
jects low in sex guilt have been found to 
produce or recall less information relating to 
sexuality (Galbraith & Mosher, 1968; Lang- 
ston, 1973; Schwartz, 1973), to evaluate less 
conventional and explicit sexual activity 
negatively (Mosher, 1973; Ray & Walker, 
1973), and to report more guilt and embar- 
rassment following exposure to sexual stim- 
uli (Mosher & Greenberg, 1969; Schill & 
Chapin, 1972). Based on the literature re- 
garding the effects of levels of sex guilt, the 
fantasies of high sex guilt (HSG) subjects 
were expected to be fewer in number, shorter, 
and less vividly remembered; to exhibit less 
thematic variety; to be more restricted in 
terms of specific sexual acts and organs de- 
scribed; and to contain romantic heterosexual 
themes. Following the writing of their own 
fantasies, subjects with high levels of sex 
guilt were expected to report more guilt and 
embarrassment than low sex guilt (LSG) sub- 
jects. 

There have been conflicting data regarding 
how level of sex guilt affects sexual arousal. 
Mosher (1973) suggested that most females 
experience arousal regardless of their sex 
guilt level, whereas Schill (1972) suggested 
that subjects with little sex guilt report more 
arousal. It was predicted that subjects would
report arousal to sexual fantasy stimuli inde- 
pendent of their level of sex guilt. In light of 
Schmidt’s (1975) findings that females report 
a greater degree of arousal when rating geni- 
tal sensations than when rating general 
arousal, sexual arousal needed to be mea- 
sured by a variety of ratings. 

Some investigators have been cognizant of 
the fact that the atmosphere and context in 
which self-disclosure takes place has an effect 
on the data that subjects produce (Galbraith 
& Mosher, 1968; Mosher, 1965). The stimu- 
lus conditions under which sexual fantasies 
are reported have important implications for
research and clinical practice, which relies on
self-reported fantasy of subjects. No study, 
however, has attempted to manipulate the 
conditions under which fantasies are pro- 
duced to determine which conditions are more 
conducive for, and differentially predictive 
of, the quality and quantity of subjects’ fan- 
tasy production. Routh, Warehime, Gresen, 
and Roger (1973) found that giving subjects 
explicit instructions to write sexy stories as 
opposed to simply writing stories produced 
sexier stories. This would imply that fantasy 
content is under the influence of instructional 
set and experimental demand characteristics.
Also, no researchers have investigated the 
effect of a romantic stimulus devoid of ex- 
plicit sexual content on the sexual respon- 
siveness of subjects (Schmidt, 1975). 

Based on the research regarding instructional set, the following
hypotheses were tested in relation to sexual fantasy production:
(a) Erotic (i.e., sexually explicit) examples of fantasies would
result in more and longer fantasies, more explicit imagery, and more
variation in thematic content than romantic examples; (b) sexual
fantasies would be longer, more varied, and more explicit for those
females given sexual fantasy examples than those given a nonsexual
fantasy stimulus; (c) individuals in a neutral fantasy condition
would report less guilt and embarrassment than those in the erotic
and romantic stimulus conditions, although erotic fantasy cues would
produce a higher degree of sexual arousal in females than either
romantic or neutral cues; and (d) a purely romantic stimulus would be
less threatening than an erotic sexual stimulus for females with high
sex guilt, resulting in more fantasy production with a greater
variety of themes but less sexually explicit content.

The current literature on sexual fantasizing in clinical settings has
been concerned with using fantasy to change deviant sexual behaviors
(e.g., Evans, 1968; Marquis, 1970; Marshall, 1973) and assessing its
effect in increasing sexual pleasure and treating sexual [illegible]
1974; Klin[illegible]. [Illegible] behavior as well as the conditions
most conducive for producing sexual fantasies [illegible] contribute
to the effective application of fantasy techniques in clinical
settings.


Method 
Subjects 


Undergraduate females, ranging in age from 17 to 
25, volunteered from a variety of psychology classes 
in a large southern university. One hundred fifty-three
subjects were randomly assigned initially to
three experimental conditions, receiving either erotic, 
romantic, or nonsexual fantasy examples. The final 
sample of 90 heterosexual subjects consisted of the 
30 females in each condition who had the highest 
and lowest levels of sex guilt as measured by an 
inventory tapping self-reported guilt feelings. Ethi- 
cal guidelines with regard to use of human subjects 
were followed. 


Experimenter 


The experimenter delivered the instructions and 
absented herself while the subjects completed the 
measures. To decrease the fear of external censure,
the experimenter attempted to present herself as a
nonevaluative and scientific individual (Galbraith &
Mosher, 1968; Mussen & Scodel, 1955). 


Measures 


Mosher Forced-Choice Guilt Scale—Female Form. 
The Sex Guilt subscale consists of 20 Likert-type 
items with four response options. Split-half reliabil- 
ity of the scale produced a correlation of .95. Con- 
vergent and discriminative validity of the Mosher 
Guilt scales have been demonstrated (Mosher, 1966; 
Mosher & Cross, 1971) with the Sex Guilt subscale 
being significantly different from the other guilt 
measures.

Fantasy essays. Subjects were asked to write down all the personal
sexual fantasies that they could recall having experienced.

Fantasy Theme Checklist. This checklist was adapted from the list
developed by Hariton and Singer (1974) and was expanded to include
themes found in a compilation of women’s sexual fantasies by Friday
(1973). Subjects checked any of the [number illegible] fantasy themes
that they had experienced as part of their own fantasies.

Multiple Adjective Check List (MACL). This measure of 49 adjectives
was originally developed by [author illegible] (1965) to prompt
subjects to label their moods and was modified by Mosher and
Greenberg (1969) to include 7 adjectives measuring guilt and 7
adjectives measuring sexual arousal. Previous studies have
demonstrated that guilt and arousal as measured by adjective
checklists are a viable measure of change due to experimental
manipulation (Mosher & Greenberg, 1969; Okel & Mosher, 1968).

Sexual arousal, vividness, and embarrassment ratings. Subjects rated
whether or not they experienced arousal, their subjective degree of
sexual arousal on a 7-point scale, and the strength of genital and/or
breast sensations in response to the writing of their fantasies.
Embarrassment as a result of the fantasy reporting and the vividness
(i.e., ability to remember facts and details clearly) of the
fantasies while recording them were rated on a 5-point scale.


Experimental Conditions 


Subjects in the three stimulus conditions were 
presented either with three erotic, three romantic, 
or three neutral examples of fantasies prior to 
writing their own fantasies. Heterosexual fantasies 
for the erotic (i.e., sexually explicit) condition were
adapted from Friday’s (1973) book, which reports 
fantasies solicited from the general population. Ro- 
mantic fantasies, focusing on an emotional hetero- 
sexual relationship without mentioning sexual acts 
or organs, were solicited from undergraduate fe- 
males. Nonsexual fantasies were solicited from other 
undergraduate females who wrote fantasies contain- 
ing no sexual content, such as career aspirations or 
adventures. 

Several fantasies of each type were given to a 
panel of five female judges (ranging in age from 23 
to 31) prior to the experimental sessions and were 
rated by the judges along the dimensions of erotic, 
romantic, or personally pleasing for the three fan- 
tasy conditions mentioned above. The three fan- 
tasies rated highest in each condition were used as 
the experimental stimuli. 


Procedure 


Subjects in each experimental group were given 
the measures in three phases during a 1½-hour experimental
session. Approximately 25 subjects were
present at each session. In the first phase, subjects 
were administered the Mosher Forced-Choice Guilt 
Scale and the MACL. Following this, subjects were 
informed that the study involved reading and 
writing sexual materials. Subjects were informed that 
they could leave at any time. 

To promote a permissive atmosphere, the second 
phase began with instructions informing subjects that 
sexual fantasies and thoughts were normal, usual 
experiences. The experimental stimuli were introduced 
with an explanation that reading examples of fan- 
tasies might put subjects in the mood for writing 
their own. Subjects were given the set of examples 
appropriate to the condition that they were in, and 
they returned the stimulus materials to the experi- 
menter before proceeding. Explicit instructions of 
what constituted sexual fantasies were given in order 
that subjects would understand that either detailed 
or vague mental scenes, pictures, or stories that 
they had experienced would be appropriate to 
report. They were also told that these fantasies 
were to be written down in as much detail as possible.


Table 2

Summary Table of F Values by Fantasy Example Condition and Level of
Sex Guilt

Dependent variable        Guilt (A)      Condition (B)    A × B
Length of fantasy         [illegible]     4.01*            .29
Total words                9.75**         2.48             .36
No. fantasies              4.91*          1.71             .64
No. sex organs             2.56           9.48***         2.15
No. sex acts               3.85*         21.32***         2.63
Variety of content         3.57*          3.57*           2.70
No. themes checked         6.39**         1.19             .48
Embarrassment             23.57***         .28            2.49
Vividness                 13.94***        2.66            2.01
Degree of arousal         [illegible]     9.09***          .02
Presence of sensations     3.58          [illegible]       .05
  Genital                  2.24          12.61***          .92
  Breast                    .59          11.96***         1.17
Guiltᵃ (A)                 5.12*           .10             .28
Arousalᵃ (B)               2.64           3.40*            .40
A × Time                   1.13            .86            1.62
B × Time                   3.86*          3.53*           1.08

ᵃ Measured by the Multiple Adjective Check List.
* p < .05.
** p < .01.
*** p < .001.


During the third phase, the following measures 
were administered: the Fantasy Checklist; the sex- 
ual arousal, vividness, and embarrassment ratings; 
and a second administration of the MACL. Sub- 
jects were debriefed after the experiment concluded. 


Scoring of Subjects’ Fantasies 


Subjects’ fantasies were scored according to (a) number of sexual
fantasies; (b) length of longest sexual fantasy; (c) total number of
words written; (d) explicitness of the sexual fantasy (i.e., the
total number out of 7 possible sex organs and 12 possible sex acts
that were mentioned either in physiological, medical, or slang
terminology in the fantasy containing the highest frequency of acts
and organs mentioned); and (e) total of 4 possible content categories
of sexual fantasies (i.e., typical heterosexual activities;
heterosexual oral-genital contact; group sex; rare or unusual sexual
acts).


Results 


Sixteen subjects were dropped from the original sample; 8 subjects
reported having engaged in homosexual activities, 6 test records were
incomplete, 1 subject was over the maximum age limit, and 3 subjects
left the experiment after the nature of the study was revealed. The
15 subjects scoring highest and 15 scoring lowest on sex guilt in
each of three experimental conditions comprised the
final sample (N = 90). Ninety-eight percent 
of the sample wrote at least one sexual fan- 
tasy, and all subjects checked at least two 
fantasy themes. 

In this 2 × 3 factorial design, the means of the HSG and LSG subjects
across fantasy stimulus conditions were as follows: LSG in the erotic
condition = −52.6; LSG in the romantic condition = −58.5; LSG in the
neutral condition = −53.2; HSG in the erotic condition = −1.2; HSG in
the romantic condition = −1.5; and HSG in the neutral condition =
−.4. One-way analyses of variance (ANOVAs) performed on the means of
sex guilt scores for LSG and HSG subjects across experimental
conditions revealed no significant differences, supporting the random
assignment of subjects to conditions and lending confidence to the
assumption that any differences in dependent variables can be viewed
as a result of the experimental manipulation and subjects’ sex guilt
level.

The hypothesis that HSG and LSG subjects would differ in the
characteristics of their sexual fantasies was largely supported (see
Tables 1 and 2 for means, standard deviations, and F values). One-way
ANOVAs indicated that LSG subjects demonstrated less sexual
inhibition than HSG subjects in terms of sexual explicitness,
expressiveness, and responsiveness to the stimuli: for total words,
F(2, 84) = 9.75, p < .01; for total number of fantasies,
F(2, 84) = 4.91, p < .05; for number of sex acts, F(2, 84) = 3.85,
p < .05; for variety of content, F(2, 84) = 3.57, p < .05; and for
number of fantasy themes checked, F(2, 84) = 6.39, p < .01. As
predicted, there were no significant differences between HSG and LSG
subjects on level of arousal or breast and genital sensations, even
though LSG subjects increased the amount of arousal experienced
significantly more from pre-MACL to post-MACL than HSG subjects,
F(2, 84) = 3.86, p < .05. More embarrassment was reported by HSG
subjects, F(2, 84) = 23.57, p < .001, and they also reported
experiencing their fantasies less vividly, F(2, 84) = 13.94,
p < .001 (see Table 2). LSG subjects reported experiencing less guilt
following the sexual stimulation (postguilt) than HSG subjects,
F(2, 84) = 5.12, p < .05 (see Tables 1 and 2), although HSG subjects
in comparison to the LSG subjects did not significantly increase the
amount of guilt (Guilt × Time) experienced due to the experimental
manipulation (see Table 2).


Table 3

Chi-Square Results of Differences Between Sex Guilt Groups in
Descriptions of Sex Organs and Sex Acts in Their Sexual Fantasies

Sex organ     χ²         Sex act                               χ²
Breast         .05       Kissing                              1.73
Penis         6.69**     Manipulation of female breast        2.19
Clitoris      5.94**     Manipulation of female genitals       .14
Vagina         .54       Intercourse, ventral-ventral         4.84*
                         Oral contact with male genitals      3.04
                         Oral contact with female genitals    3.42
                         Homosexuality                        3.60

Note. For all significant differences, low sex guilt subjects were
more explicit than high sex guilt subjects. Cell frequencies for the
remaining three sex organs and five sex acts were too small to test.
df = 1 in all cases.
* p < .05.
** p < .01.

A chi-square analysis was performed on 
each sex organ and sex act mentioned in the 
fantasies of LSG and HSG subjects. Results 
indicated that LSG subjects mentioned the 
words penis, clitoris, and sexual intercourse 
significantly more often than HSG subjects. 
No other significant differences between 
groups were found for sex organs and sex acts 
that were stated in the fantasies (see Table 
3). 
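
A chi-square test of this kind, with df = 1, compares how many subjects in each sex guilt group mentioned a given organ or act. The sketch below shows the uncorrected Pearson statistic for a 2 × 2 table. The cell counts are hypothetical, chosen only to illustrate the arithmetic, and the source does not say whether a continuity correction was applied; the function name `chi_square_2x2` is also an assumption.

```python
def chi_square_2x2(a, b, c, d):
    """Uncorrected Pearson chi-square for the 2 x 2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Hypothetical counts: rows are LSG and HSG (45 subjects each),
# columns are "mentioned the term" and "did not mention it".
chi2 = chi_square_2x2(30, 15, 18, 27)

# Conventional critical values of chi-square with df = 1.
CRITICAL_05, CRITICAL_01 = 3.84, 6.63

print(f"chi-square = {chi2:.2f}")
print("p < .05" if chi2 > CRITICAL_05 else "ns")
```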

HSG subjects checked 30% fewer themes 
on the Fantasy Checklist as compared to LSG 
subjects. For each sex guilt group, the number 
of times a theme was checked was compared 
to the total number of themes checked. A 
nonparametric test of the significance between 
two proportions was performed comparing the 
percentage of each theme to the total themes 
for the LSG and HSG groups. LSG subjects 
responded significantly more often to two 
themes: “I relive a previous sexual experi- 
ence” (z = 2.03, p < .05), and “I have pre- 
tended that I am making love to a man that 
I am acquainted with other than my current 
lover” (z= 2.65, p< .01). HSG subjects 
responded more frequently to two themes 
related to being dominated sexually: “I 
imagine that I am being overpowered or
forced to surrender,” and “I enjoy imagining
that I am being dominated sexually and that
I am helpless” (z = 2.78, p < .01, and z =
2.25, p < .05, respectively). HSG subjects
also checked “Thoughts of an imaginary
lover or a stranger enter my mind” (z =
2.17, p < .05), and “I imagine that I am so
beautiful that men cannot resist me” (z =
2.23, p < .05) significantly more often than
LSG subjects.
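
The test of the difference between two proportions used for the theme comparisons can be sketched as follows. This is a minimal illustration using the common pooled-variance z test; the counts are hypothetical (a theme checked 45 times out of 600 total checks in one group and 15 times out of 420 in the other), and the function name `two_proportion_z` is an assumption. The source does not give the underlying frequencies or the exact formula used.

```python
from math import erfc, sqrt


def two_proportion_z(x1, n1, x2, n2):
    """Pooled z test for the difference between two proportions.

    x1 / n1 and x2 / n2 are the two observed proportions.
    Returns the z statistic and its two-tailed probability.
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    return z, erfc(abs(z) / sqrt(2))


# Hypothetical counts for one theme in two groups.
z, p = two_proportion_z(45, 600, 15, 420)
print(f"z = {z:.2f}, two-tailed p = {p:.4f}")
```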

As predicted, ANOVAs on the characteristics of subjects’ fantasies
resulted in significant differences in response to the conditions of
erotic, romantic, or neutral response cues (see Tables 1 and 2). The
length of fantasies differed between groups, F(2, 84) = 4.01,
p < .05, and post hoc analyses (see Table 4) revealed that subjects
in the erotic fantasy condition wrote longer fantasies than subjects
in the romantic condition, t(58) = 2.68, p < .01. Both the number of
sex acts and sex organs mentioned varied across groups, with subjects
in the erotic condition writing more explicit fantasies than those in
both the romantic condition and the neutral condition. The variety of
content described in each experimental condition produced significant
differences, F(2, 84) = 3.57, p < .05, with post hoc analyses
indicating that the differences were a result of the erotic group
subjects, who wrote fantasies with more varied content than those in
the romantic fantasy group, t(58) = 2.25, p < .05. Contrary to
predictions, the number of fantasies, total number of words written,
and number of themes checked were not significantly affected by
experimental conditions.


Table 4

t-Test Results by Condition for Significant F Values

                           Erotic vs.    Erotic vs.     Romantic vs.
Dependent variable         romantic      neutral        neutral
Length of fantasy          2.68**        1.69           1.23
No. sex organs             3.97**        [illegible]     .74
No. sex acts               5.87**        4.14**         1.78
Variety of content         2.25*         1.35           1.16
Sexual arousal             1.95          4.31**         2.38*
Presence of sensations     2.32*         [illegible]    2.43*
  Genital                  2.05*         5.56**         2.78**
  Breast                   2.82**        4.46**         2.09*
Arousalᵃ                   1.56          2.95*          1.26

Note. df = 58 in all cases.
ᵃ Measured by Multiple Adjective Check List.
* p < .05.
** p < .01.


Presence in different experimental groups did not produce differences
in the degree of reported guilt, embarrassment, or vividness of the
fantasy experience. However, subjects in the erotic and romantic
fantasy groups reported more sexual arousal and the presence of
sexual sensations more often than those in the neutral fantasy group.
A greater degree of genital and breast sensations was reported by
subjects in the erotic group than those in either the romantic or
neutral groups, and subjects in the romantic fantasy group reported
more genital and breast sensations than those in the neutral group.

As seen in Table 2, none of the predicted interactions between levels
of sex guilt and experimental condition on sexual fantasy were
significant.


Discussion 


Both sex guilt level in subjects and experi- 
mental fantasy cues demonstrated strong 
effects on fantasy parameters. Sex guilt level 
had a demonstrable effect on the quantity of 
fantasy production and females’ reports of 
embarrassment and vividness, whereas the 
type of fantasy example had more of an 
effect on the explicitness of fantasies and the 
report of sexual arousal. The two indepen- 
dent variables did not interact, but they af- 
fected different aspects of the sexual fantasy 
experience for subjects. 

Specifically, sex guilt level in females re- 
sulted in HSG subjects reporting fewer, 
shorter, and less explicit fantasies with less 
variety of content and fewer themes than 
LSG subjects. These results supported pre- 
vious research that HSG subjects were more 
conservative in sexual matters. The findings 
clearly showed that in explicitness and quan- 
tity, sexual fantasy production was under the 
influence of the sex guilt trait, suggesting that 
it may be part of a cluster of sexual behaviors 
governed by this trait. 

Experimental fantasy cues exhibited a 
stronger influence than sex guilt level on ex- 
plicitness and variety of content of females’ 
fantasies. It is interesting to note that both 
HSG and LSG subjects were responsive to
the experimental conditions and that the pre- 
dicted interactions of the effects of the stim- 
uli conditions on HSG and LSG subjects’ 
sexual fantasies did not occur. The specula- 
tion that HSG subjects would be more con- 
stricted and less explicit in their reported 
fantasies under the sexually explicit erotic 
condition as compared to the neutral and 
romantic conditions was not supported. The 
strong influence of situational demands on 
self-reported sexual fantasies for both HSG 
and LSG subjects suggests that clinicians and 
researchers need to be sensitive to the condi- 
tions under which sexual fantasy production 
is elicited. Sexual fantasy may be more ac- 
curately viewed as being under stimulus con- 
trol rather than as a stable trait. 

Response cuing may, however, place con- 
straints on subjects’ responses. The erotic 
fantasies that the subjects read may not have 
been explicit enough or varied enough in 
content to elicit the greatest detail or the 
maximum number of sexual fantasies that 
women have experienced. On the other hand, 
there may be a level of explicitness beyond 
which subjects would begin to censor their 
responses. Future research could vary the
levels of sexual explicitness in an effort to 
determine the optimal response facilitator for 
fantasy production. 

As proportionally more HSG females 
checked themes concerning being sexually 
dominated and being irresistible to men, it is 
possible that these themes indicated a reduc- 
tion in responsibility for the sexual interac- 
tion, thus reducing the guilt experienced by 
these females. The fantasy themes preferred 
by LSG subjects were more often concerned 
with real individuals, whereas HSG subjects 
more often responded to themes concerning 
an imaginary lover. 

Supporting Mosher’s (1973) data, sex guilt 
groups did not differ significantly in degree 
of self-reported arousal. It is clear from these 
data that both groups did get at least mod- 
erately sexually aroused by sexual fantasy 
stimulation, and sex guilt did not seem to 
interfere with reporting this level of arousal. 
The experimental conditions did produce dif- 
ferences in reported arousal in the predicted 
direction, with more sexually explicit fan- 
tasies (i.e., erotic examples) resulting in 
higher levels of arousal. This finding sup-
ports Mosher’s (1973) results and suggests
that sexual arousal in fantasy production
seems to be a function of the degree of sex-
ual explicitness of the stimuli rather than a
function of the sex guilt trait of individuals.
A study assessing actual physiological arousal 
rather than self-report is needed to more 
clearly assess the effects of sexual fantasy 
stimuli on arousal. 
HSG subjects reported experiencing more 
guilt and embarrassment in response to taking 
part in the experiment and writing sexual 
fantasies than did LSG subjects, thus sup- 
porting previous findings that HSG subjects 
reported feeling more ashamed after exposure 
to sexual stimulation. They also indicated 
that they experienced their fantasies less 
vividly than did LSG subjects. The question 
remains whether HSG subjects, in writing
fantasies, did not report as many or as ex-
plicit fantasies because they were too em-
barrassed to do so rather than because they
did not experience fantasies in the same way
as LSG subjects. However, the more passive
response of checking themes also showed the
same sex guilt difference. If embarrassment
were the important variable affecting fan-
tasy production, one would have expected
HSG subjects to write more under the ro-
mantic or neutral example condition, in
which sexual demand characteristics were
minimal. This does not appear to be the case,
as HSG subjects, like LSG subjects, wrote
more under the sexually explicit condition.
Regarding the lower vividness ratings, a rea-
sonable explanation is that HSG subjects,
with their postulated censoring mechanisms,
probably do not remember their fantasies
clearly. As these ratings were initial attempts
at assessing parameters of sexual fantasy
via self-report, there is a definite need for
research that will partial out possible [illegible]
factors and their predictive weights in the
differential responding of HSG and LSG sub-
jects.
Subjects’ ratings in the three experimental
conditions, in terms of guilt feelings, embar-
rassment, and vividness of their fantasies, in-
dicated no differential reactions to the fan-
tasy stimuli. These results contradict those
of Mosher and Greenberg (1969), who found
that subjects responded with more guilt feel-
ings after sexual stimulation in the form of
erotic literary passages. Since the most ex- 
plicit sexual stimuli described only hetero- 
sexual intercourse, it is possible that the con- 
tent of the fantasy examples was not varied 
enough to elicit feelings of guilt or embar-
rassment in the subjects. The lack of con-
gruence of these results with the Mosher and
Greenberg study suggests the need for further 
research to manipulate the specific param- 
eters of explicitness and variety of content of 
experimental stimuli to determine the effects 
on self-reported guilt levels and ratings of 
embarrassment. 
Most young adult females in this college 
population related at least one fantasy when
asked to write their own sexual fantasies and 
when prompted with fantasy stimuli. Whether 
this stimulus situation accurately assessed the 
incidence of fantasies in females still needs to 
be determined. Future studies could ask for
fantasy incidence information to be collected 
over a period of time in addition to the self- 
reported fantasies in a laboratory setting. 
On the External Validity of Two Psychotherapy Analogues


Kenneth Kushner 
University of Michigan 


The present study investigated the similarity between two live psychotherapy 
analogues and real psychotherapeutic interviews. Therapists were asked to par- 
ticipate in two different types of analogue situations and in initial intake ses- 
sions with real clients. In both of the analogues, a recruited subject presented 
a real personal problem to the therapists in helping interactions. Audiotapes of 
the real and analogue interviews were rated on 10 dependent variables, which 
were different dimensions of therapist and client behaviors. Different results 
were obtained for each analogue. The major findings concern mean differences 
between the analogue and real interviews and the linear relationships between 
the real and analogue interviews. Additional findings, including significant inter- 
actions between the type of interview and the experience level of the therapist, 
are also discussed. The results are interpreted as indicating that the generaliz- 
ability of the analogues is contingent on the dependent variables in question, 
the type of relationship to be predicted, and the experience level of the ther-
apists. The implications that these results have for future research involving
psychotherapy analogues are discussed.


The present research was inspired by the 
importance of analogue methodologies for
process studies of psychotherapy. Since the 
time of the first study that used an analogue 
of the psychotherapeutic process (Keet, 
1948), the use of psychotherapy analogues 
(also referred to as simplifications or simula- 
tions) has become a very popular and ac- 
cepted research strategy. The types of ana- 
logues that researchers have used over the 
years have been both diverse and creative. 
They have ranged from highly artificial, non- 
live situations (i.e., Porter, 1950) to highly 
realistic live situations (i.e., Russell & Sny-
der, 1963). Today, the sheer number of ana-
logue studies is large enough to warrant ex-
tensive literature reviews (Heller, 1971; Mun-
ley, 1974), as well as substantive theoretical
critiques (Bordin, 1965; Strong, 1971;
Strupp, 1961; Thomas, 1962). It is easy to
understand why analogues are often prefera-
ble to the naturalistic study of psychother-
apy; they are more convenient, they afford
more experimental control, and they avoid
many of the ethical problems inherent in
field study methodologies that require the use
of real clients and therapists.

There is an underlying assumption involved
in the use of analogues in psychotherapy re-
search; namely, that the results found in the
simplified situations are generalizable to the
real psychotherapeutic setting that they are
supposed to simulate. As Heller and Marlatt
(1969) have pointed out, this is a question
of the external validity of analogues. The
utility of an analogue depends on the amount
of confidence one can place in one’s ability to
extrapolate from results found in the ana-
logue situation to the real psychotherapeutic
situation to which it refers. The degree of
similarity between results found in analogue
and naturalistic settings is an empirical ques-
tion. Bordin (1965) advocated validational
studies that “bridge” results found in the one
setting to the other. Other authors have also
suggested such empirical comparisons (e.g.,
Strupp, 1961). Nevertheless, as of today, 
validational studies can be counted on one 
hand (Hopke, 1955; Matarazzo, Wiens, Mat- 
arazzo, & Saslow, 1968; Roark, 1969; Sigal, 
Guttman, Chagoya, & Lasry, 1975; Sigal, 
Lasry, Guttman, Chagoya, & Pilon, 1977). In 
each of these five studies, the investigators 
found some degree of similarity between the 
analogue and real interviews for some of the 
dependent variables that they examined. How- 
ever, for other dependent variables, the in- 
vestigators found important differences be- 
tween real and analogue situations. These re- 
sults indicate that there is good reason not 
to accept on faith the generalizability of all 
results based on analogues.

The present study was undertaken to apply
the principle of “empirical bridging” to two
similar psychotherapy analogues. In both ana-
logues, real therapists interviewed nonreal, or
“quasi,” clients in simulated initial intake
sessions. In one type of analogue, the non-
standard client analogue (similar to Danish,
D’Augelli, & Brock, 1976), each therapist in-
terviewed a different female volunteer drawn
from a subject pool. The subject was in-
structed to genuinely discuss a personal prob-
lem with the therapist. In the second ana-
logue, the standard client analogue (similar
to Carkhuff, Kratochvil, & Friel, 1968), all
therapists in the sample interviewed the same
female quasi-client. She was instructed to
present the same real problem to each thera-
pist and to discuss it genuinely. The thera-
pists were instructed to conduct both types
of analogues as if they were real initial intake
interviews. Thus, the two types of analogues
were identical, except for the fact that one
interviewee was used in the standard client
analogue and multiple interviewees were used
in the nonstandard client analogue.


Method


The present study contained a within-subjects
design in that each therapist participated in the
standard client analogue, the nonstandard client
analogue, and the real initial intake interviews. All
interviewees were female, and the order of presenta-
tion of the two analogues was balanced in the sam-
ple. This design allowed for direct comparison of 
the process of each analogue to the real therapy 
interviews. 


Subjects and Procedure 


Therapists were recruited from four campus mental 
health agencies. Thirty-one therapists volunteered to
participate in the study. The minimum qualification
for therapists to be included in the data analysis
was having completed at least one real and one
analogue interview. As a result, 23 therapists were 
included in the data. Of these, 15 therapists partici- 
pated in both analogues and two real interviews, 7 
therapists participated in both analogues but were 
able to record only one real intake interview, and 1 
therapist participated in only the nonstandard client 
analogue but recorded two real initial intake inter- 
views. The remaining 8 therapists, those who volun- 
teered to participate but who did not record at 
least one real intake session, all reported that they 
were unable to obtain appropriate intake interviews. 
The therapists who participated in the study were 
a diversified group, as shown by an even sex distri- 
bution and wide ranges of age, experience level, and 
theoretical orientation. 

The clients for the real therapy interviews were 
selected by the therapists. Each therapist was asked 
to tape record the first two real intake sessions that 
met the following criteria: (a) The client had to be 
female; (b) the presenting problem had to be emo- 
tional rather than solely educational or vocational 
in nature; (c) it had to last at least 30 minutes; and 
(d) the client had to agree to participate in the 
research.

The subjects in the nonstandard client analogue 
interviews were females recruited by phone from a 
paid subject pool. They were told that the study 
would entail having an interview with a therapist in
which they would talk genuinely about a real per-
sonal concern or problem. The only prerequisite was
that they were not currently involved in counseling
or psychotherapy.

The client for the standard client analogue inter-
view was a 21-year-old senior undergraduate psy-
chology major. She was instructed to choose a real
personal problem, and other than presenting it in 
the same words to each therapist, she was asked to 
discuss it genuinely and spontaneously in each 
interview. Thus, her instructional set was essentially 
the same as that of the nonstandard clients. The 
standard presenting problem that she chose con- 
cerned her ambivalence about her relationship with 
her boyfriend, especially whether she should move 
to the east coast when she graduated instead of 
staying in the midwest to be near him. She described 
the problem on the information sheets as follows for
each interview: “Confusion about various things—
i.e., boyfriend and where I go at the end of the
year—how can I make some important decisions
related to above—do I think too much about it,
etc.”

In designing the two analogues, the attempt was
made to create situations that differed from the real 
therapy interviews only by virtue of the fact that 
the analogue subjects were actively recruited to 
discuss real problems, whereas the real therapy clients 
were self-motivated to seek help for their problems.
For this reason, all interviews (analogue and real) 
were conducted in the therapists’ own offices. In 
addition, the analogue subjects, as well as the real 
clients, presented to the therapist a face sheet pro- 
viding both demographic information and their 
presenting problem(s). In both analogue interviews, 
the instructional set was designed to parallel the 
demand characteristics of a real therapy interview; 
the subjects were requested to present real prob- 
lems, and the therapists were asked to conduct the 
interviews as if they were actual initial intakes. The 
therapists were aware of which real clients were being 
taped for research purposes. They were also informed 
that the analogue subjects were experimental sub-
jects. However, they were not told that there were
any differences between the two analogue interviews. 
Specifically, an attempt was made to keep the thera- 
pists blind to the fact that one of the subjects was 
being seen by other therapists. During debriefing, 
eight therapists reported knowledge or suspicion that 
a standard client was used. 

It is worth noting that I judged the real and 
analogue situations to be equivalent in terms of one 
more important characteristic—presenting problem. 
The presenting problems of the real and nonstandard 
clients mostly centered on interpersonal and/or aca- 
demic concerns. Difficulties with parents, boyfriends, 
or schoolwork were typical examples. In light of this 
fact, the standard client’s presenting complaint was 
representative of the type of problems presented by 
the other subjects. It is also worth noting that the 
mean age of the real clients was 22.3 and that of the 
nonstandard clients was 19.9. This difference was 
significant, t(59) = 3.33, p < .01. Although the at-
tempt was made to keep the real clients, standard
client, and nonstandard client interviews equivalent
in length by asking the participants to keep them
between ½ and 1 hour, their mean lengths were 49
min 3 sec, 40 min 22 sec, and 42 min 49 sec, respec-
tively. Only the contrast between the real client and
standard client analogue interviews was significant
(p < .05).


Dependent Variables 


The audiotape recordings of the real client, stan- 
dard client, and nonstandard client analogue inter- 
views were rated for the following 10 dependent 
variables that were selected due to their relevance to 
psychotherapy research in general and analogue re- 
search specifically: accurate empathy, nonpossessive 
warmth, genuineness (all after Truax & Carkhuff, 
1967), ambiguity (after Osburn, 1951), question-to- 
statement ratio (Q/S; after Ornston, Cicchetti, Le- 
vine, & Fierman, 1968), mean duration of therapist 
utterance (TUL), mean duration of client utterance 
(CUL), percentage of client talk time (C%), percent-
age of therapist talk time (T%), and percentage of
silence (S%; all after Matarazzo et al., 1968). 

For accurate empathy, nonpossessive warmth, gen- 
uineness, and ambiguity, the ratings were done on 
four excerpts per tape, each approximately 4 minutes 
in duration. This yielded four scores per interview. A
mean score for each variable was then calculated
for each interview. Unless stated otherwise, it was
these mean scores that were used in the data analy-
sis. The remaining dependent variables (Q/S, TUL,
CUL, C%, T%, and S%) were rated for the entire
tape by a hand-held stopwatch, yielding one overall
score per interview per variable. For one therapist
whose data were included in the analysis, the excerpts
were inaudible, and the raters were unable to rate 
her interviews for empathy, warmth, genuineness, and 
ambiguity. However, they were able to rate the 
original tapes for the remaining variables. 
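
The timing and count variables named above can be illustrated with a short sketch. It is offered only as one plausible reading of the definitions: the function name, the input format, and the example numbers are hypothetical, and treating silence as total session time minus all talk time is an assumption, since the text does not spell out the stopwatch procedure.

```python
def interview_measures(therapist_utterances, client_utterances,
                       total_seconds, n_questions, n_statements):
    """Compute the six timing/count measures for one interview.

    Utterance arguments are lists of durations in seconds.
    Assumption: silence is whatever session time is not talk time.
    """
    therapist_talk = sum(therapist_utterances)
    client_talk = sum(client_utterances)
    silence = total_seconds - therapist_talk - client_talk
    return {
        "TUL": therapist_talk / len(therapist_utterances),  # mean therapist utterance
        "CUL": client_talk / len(client_utterances),        # mean client utterance
        "Q/S": n_questions / n_statements,                  # question-to-statement ratio
        "T%": 100.0 * therapist_talk / total_seconds,
        "C%": 100.0 * client_talk / total_seconds,
        "S%": 100.0 * silence / total_seconds,
    }


# Hypothetical 100-second excerpt, not data from the study.
example = interview_measures(
    therapist_utterances=[5.0, 10.0, 5.0],
    client_utterances=[20.0, 30.0, 10.0],
    total_seconds=100.0,
    n_questions=6,
    n_statements=4,
)
```

Under this reading the three percentages always sum to 100, which is why they are treated as bounded ratios in the analyses.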


Raters and Rater Reliability 


The raters were six undergraduate psychology 
majors. Three of them rated the interviews for ac- 
curate empathy, nonpossessive warmth, and genuine- 
ness. The other three rated them for the remaining 
dependent variables. Both groups of raters were
given extensive training and practice in the use of 
the scales by me, and each group had to demonstrate 
a high degree of interrater reliability before the ac- 
tual rating began. For both groups of raters, pre- 
rating reliability was determined on the basis of 
their scores on 11 4-minute excerpts taken from 
therapy sessions that were not in the present sample. 
In addition to the prerating reliabilities, each 
group of raters rated the same 11 excerpts spread
over the actual ratings. These scores allowed me to 
determine the “drift” in reliability that might have 
occurred in the process of rating. In determining 
reliability, correlation coefficients were computed for 
each of the three pairs of raters within each group, 
and a mean of these three coefficients was then calcu- 
lated. Instead of calculating the coefficient for the 
Q/S ratio, separate coefficients were calculated for 
the number of questions and the number of statements.
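
The reliability index described above (a correlation for each of the three rater pairs, then the mean of the three) can be sketched as follows. This is a minimal illustration that assumes Pearson product-moment correlations, which the text does not name explicitly; the rater labels and scores are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mean_pairwise_reliability(ratings):
    """Average the correlations over every pair of raters.

    `ratings` maps a rater label to that rater's scores on the same excerpts.
    """
    pairs = list(combinations(sorted(ratings), 2))
    coefficients = [pearson_r(ratings[a], ratings[b]) for a, b in pairs]
    return sum(coefficients) / len(coefficients)


# Hypothetical scores from three raters on 11 excerpts.
example_ratings = {
    "rater_a": [3, 4, 2, 5, 4, 3, 1, 2, 4, 5, 3],
    "rater_b": [3, 5, 2, 4, 4, 3, 2, 2, 4, 5, 2],
    "rater_c": [2, 4, 3, 5, 3, 3, 1, 2, 5, 4, 3],
}
reliability = mean_pairwise_reliability(example_ratings)
```

Computing the same index again on excerpts interspersed among the actual ratings gives the "drift" coefficient referred to in the text.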

The prerating reliabilities were uniformly high, 
ranging from correlations of .75 for warmth to 3 
for CUL. The drift rating coefficients were also high
for all dependent variables except empathy and
warmth, which were .01 and .53, respectively. Since
this raises questions about the value of the ratings
for empathy and warmth, the results pertaining to
these two variables have been omitted from the
following presentation. Correlations for all prerating
and drift rating reliabilities can be found in Kushner
(1977).


Results 


Mean Differences Between Analogue and 
Real Interviews and Effects of Therapist 
Experience Level 

To investigate whether there were mean
differences between the real and analogue

Table 1
Cell Means for Conditions Main Effect for Both Analogues Versus Real Interviews

                          Condition
Variable        Real     Standard   Nonstandard   F         df
Genuineness     2.31     2.36       2.19          1.23      2, 28
Ambiguityᵃ      13.98    19.22      15.75         2.61      2, 28
CUL (in sec)    20.51    10.64      19.78         5.67*     2, 32
TUL (in sec)    6.97     8.20       7.63          3.22      2, 32
Q/S             1.72     1.87       1.38          2.04      2, 32
C%ᵇ             69.73    56.93      69.17         9.98***   2, 32
T%ᵇ             22.06    29.84      23.08         3.72*     2, 32
S%ᵇ             8.47     13.22      7.88          3.10      2, 32

Note. CUL = mean duration of client utterance; TUL = mean duration of
therapist utterance; Q/S = question to statement ratio; C% = percentage
of client talk time; T% = percentage of therapist talk time; S% =
percentage of silence. For Dunnett's test, real vs. standard analogue on
CUL and C%, p < .001.
ᵃ Low score indicates more ambiguity.
ᵇ Analysis of variance performed on arc-sin transformations of raw
scores. Cell means are expressed as means of raw scores before
transformation.
* p < .05.
** p < .01.
*** p < .001.


interviews, an analysis of variance of the form
Conditions × Experience × Therapists was
calculated for each dependent variable. There
were three levels of the conditions factor
(real, standard, and nonstandard).¹ Experi-
ence was included as a factor due to the large
range of clinical experience in the sample of
therapists (0-20 years) and the question of
whether the analogues would
reflect differences between less experienced
and more experienced therapists. Accordingly,
the sample was split at the median into a
group of less experienced (0-3 years) and
more experienced (5+ years) therapists. This
yielded two levels of the experience factor.
Thus therapists were nested within experi-
ence, and conditions were crossed with both
experience and therapists. Several therapists
had to be randomly discarded in order to
balance the cells, resulting in 16 therapists in
the analyses conducted on genuineness and
ambiguity,² and 18 in those conducted on the
remaining variables. For those variables that
are bounded ratios (C%, T%, and S%),
the analyses were conducted on the arc-sin
transformations of the raw scores.
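
As an illustration of the transformation applied to the percentage variables, the sketch below uses the common form arcsin(sqrt(p)), with p the proportion between 0 and 1. The text does not say which variant of the arc-sin transformation was used, so this form is an assumption, and the function name is hypothetical. In the study the transformation was applied to raw scores; the cell means from Table 1 are used here only as convenient inputs.

```python
from math import asin, sqrt


def arcsine_transform(percent):
    """Return arcsin(sqrt(p)) in radians for a percentage between 0 and 100.

    Assumed variant of the arc-sin transformation; it stretches values
    near 0 and 100, where the variance of a proportion is compressed.
    """
    if not 0.0 <= percent <= 100.0:
        raise ValueError("percent must lie between 0 and 100")
    return asin(sqrt(percent / 100.0))


# Illustrative inputs: C% cell means from Table 1 (not raw scores).
c_percent_means = {"real": 69.73, "standard": 56.93, "nonstandard": 69.17}
transformed = {
    condition: arcsine_transform(value)
    for condition, value in c_percent_means.items()
}
```

The transformation is monotonic, so the ordering of the conditions is unchanged; only the spacing between values is altered.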

The cell means and F ratios of the condi-
tions main effect are displayed in Table 1.
To determine whether the real interviews dif-
fered from either analogue, Dunnett’s method

of testing all cell means against one control 
(Winer, 1971) was used to test the signifi- 
cance of the two pairwise contrasts of in- 
terest: between the real and standard client 
interview conditions and between the real and 
nonstandard client interview conditions. The 
results of Dunnett’s test for those variables
that had significant overall F ratios appear
in Table 1. As can be seen, the conditions
main effect was significant for CUL, C%, and T%.
For the two client variables, the contrasts
between the real client and standard client 
conditions were significant, revealing that the 
standard client had shorter mean durations of 
utterance length and talked for smaller per- 
centages of the sessions than did the real 


1 In all analyses of variance, the scores of the first
real interviews were used for those therapists who
had recorded two real initial intake sessions. Prior
to conducting the analyses, t tests had shown that
there were no significant differences between the
first and second real interviews.

2 Actually the analyses for genuineness and am-
biguity were conducted as repeated measures analy-
ses of variance. This was because four ratings of
those variables were made from each tape, one from
each excerpt. Thus, the design for these variables
was Conditions × Experience × Therapists × Repli-
cates, with 3, 2, 8, and 4 levels of each factor, re-
spectively. (For a further treatment of this, see
Kushner, 1977.)
Table 2
Cell Means for the Experience Main Effect

                Less           More
                experienced    experienced
Variable        (≤3 yr)        (≥5 yr)        F        df
Genuineness     2.15           2.43           3.45     1, 14
Ambiguityᵃ      13.33          19.30          7.93**   1, 14
CUL (in sec)    22.02          11.94          7.39**   1, 16
TUL (in sec)    6.51           8.69           3.22     1, 16
Q/S             2.20           1.11           6.21*    1, 16
C%ᵇ             70.26          60.29          9.06**   1, 16
T%ᵇ             20.14          29.85          7.56**   1, 16
S%ᵇ             9.59           10.13          .12      1, 16

Note. CUL = mean duration of client utterance; TUL = mean duration of
therapist utterance; Q/S = question to statement ratio; C% = percentage
of client talk time; T% = percentage of therapist talk time; S% =
percentage of silence.
ᵃ Low score indicates more ambiguity.
ᵇ Analysis of variance performed on arc-sin transformations of raw
scores. Cell means presented are means of raw scores before
transformation.
* p < .05.
** p < .01.

clients. For the third variable, T%, the con-
trast between the real client and standard cli-
ent conditions approached but did not attain

Table 3
Cell Means for Experience × Conditions Interaction for Ambiguity, C%, and T%

                      Experience
Condition          Less       More
Ambiguityᵃ
  Real             7.97       20.00
  Standard         19.20      19.25
  Nonstandard      12.83      18.67
C%ᵇ
  Real             78.26      61.20
  Standard         57.30      56.57
  Nonstandard      75.24      63.09
T%ᶜ
  Real             13.18      30.95
  Standard         29.99      29.70
  Nonstandard      17.27      28.90

Note. C% = percentage of client talk time; T% = percentage of therapist
talk time. Analysis of variance was performed on arc-sin transformation
of raw scores. Cell means are means of raw scores before transformation.
ᵃ Low score indicates more ambiguity; F(2, 28) = 3.29, p < .05.
ᵇ F(2, 32) = 3.48, p < .03.
ᶜ F(2, 32) = 4.14, p < .03.


significance.³ In no instance was the con-
trast between the real client and the non-
standard client interviews significant.

Table 2 displays the cell means and F ra-
tios for the experience main effects. As can be
seen, there were significant differences between
the less experienced and more experienced
therapists for the following dependent varia-
bles: ambiguity, CUL, Q/S, C%, and T%.
Significant Experience × Conditions interac-
tions were found for three dependent varia-
bles: ambiguity, C%, and T%. The cell means
and F ratios of these significant interactions
are displayed in Table 3. The same two pat-
terns are reflected in the cell means of all
three of those interactions. The first pattern
is that the differences between the less and

3 A second set of analyses of variance, based on
a larger sample size, did in fact reveal a significant
difference between the real and standard client con-
ditions for T%. This was when the analyses were
calculated without the experience factor (Conditions
× Experience). This enabled the inclusion of the
data of three more therapists. (They had
been discarded in order to balance the cells for the
previous series of analyses.) Similarly, this second
set of analyses revealed a significant conditions main
effect for S%, F(2, 42) = 3.38, p < .05, and a sig-
nificant contrast between the real and standard
analogue conditions for that variable (p < .05). These
results, and the summaries of all the analyses, are
discussed in Kushner (1977).




more experienced therapists that were evident 
in the experience main effect were found in 
the real client and nonstandard client inter- 
views but not in the standard client inter- 
views. Rather, the cell means for the standard 
client interviews were almost identical for the 
less experienced and more experienced thera- 
pists. 

The second pattern that was reflected in 
the cell means of the significant Experience X 
Conditions interactions for ambiguity, C%, 
and T% is that there was a much greater
range in the cell means across conditions for
the interviews conducted by the less experi-
enced therapists than there was for the more
experienced therapists. For the less experi- 
enced therapists, the disparities between the 
means of the real client and standard client 
conditions were the most pronounced. For 
the more experienced therapists, the cell 
means of all three conditions seemed roughly 
equivalent. Apparently, the former therapists 
were more influenced by the differences in the 
type of subjects that they were interviewing 
than were the latter therapists. 
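
Both patterns can be checked directly against the cell means reported in Table 3. The sketch below is only an arithmetic illustration on those means (the helper names are hypothetical); it is not a reanalysis and carries no significance test.

```python
# Cell means from Table 3, stored as (less experienced, more experienced).
TABLE_3 = {
    "ambiguity": {"real": (7.97, 20.00), "standard": (19.20, 19.25),
                  "nonstandard": (12.83, 18.67)},
    "C%": {"real": (78.26, 61.20), "standard": (57.30, 56.57),
           "nonstandard": (75.24, 63.09)},
    "T%": {"real": (13.18, 30.95), "standard": (29.99, 29.70),
           "nonstandard": (17.27, 28.90)},
}


def experience_gap(cells):
    """First pattern: more-minus-less experienced difference per condition."""
    return {cond: round(more - less, 2) for cond, (less, more) in cells.items()}


def range_across_conditions(cells):
    """Second pattern: spread of cell means across conditions per group."""
    less = [pair[0] for pair in cells.values()]
    more = [pair[1] for pair in cells.values()]
    return {
        "less": round(max(less) - min(less), 2),
        "more": round(max(more) - min(more), 2),
    }


gaps = {var: experience_gap(cells) for var, cells in TABLE_3.items()}
ranges = {var: range_across_conditions(cells) for var, cells in TABLE_3.items()}
```

For ambiguity, for example, the experience gap is about 12 points in the real interviews and about 0.05 in the standard client interviews, and the range across conditions is about 11 points for the less experienced therapists against about 1.3 for the more experienced ones.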


Linear Relationships Between the Real and 
Analogue Interviews 


The linear relationships between the real
and analogue interviews address the ability to
make predictions regarding the relative per-
formance of a specific therapist in real therapy
on the basis of his or her performance in an
analogue. This was investigated via two sets
of correlations. One was based on the same
therapists’ scores in the real interviews and
in the standard client analogue; the other
was based on the same therapists’ scores in
the real interviews and in the nonstandard
client analogues. The correlations were calcu-
lated using the mean of the first and second
real interviews for those therapists who par-
ticipated in two initial intake sessions. The
results of the correlations are summarized in
Table 4. An analogue can be said to be a
predictor of real therapy behavior if the co-
efficient is positive and significant. The
standard client analogue was a predictor for
three dependent variables: genuineness, CUL,
and Q/S. The nonstandard client analogue
Table 4
Correlations of Scores of Same Therapist's Real Interviews Versus Each
Analogue Interview

                            Comparison
                Realᵃ vs. standard     Realᵃ vs. nonstandard
Variable        r             df       r          df
Genuineness     [illegible]   19       .20        20
Ambiguity       .27           19       .10        20
CUL             .56**         20       .30        21
TUL             .12           20       .21        21
Q/S             .50*          20       .54**      21
C%              .19           20       .23        21
T%              [illegible]   20       .35        21
S%              .23           20       −.42*      21

Note. CUL = mean duration of client utterance;
TUL = mean duration of therapist utterance;
Q/S = question to statement ratio; C% = per-
centage of client talk time; T% = percentage of
therapist talk time; S% = percentage of silence.
ᵃ Means of raw scores of first and second initial
interviews were used for those therapists who sub-
mitted two real interviews.
* p < .05.
** p < .01.


was a predictor of real therapy behavior only 
for Q/S. 
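
The correlational analysis described above can be illustrated with a minimal sketch. It assumes paired scores for the same therapists in a real interview and in an analogue, computes the Pearson coefficient, and converts it to a t statistic on n - 2 degrees of freedom. The function names and the sample scores are hypothetical and are not taken from the study's data.

```python
import math

def pearson_r(real_scores, analogue_scores):
    """Pearson correlation between paired real and analogue scores."""
    n = len(real_scores)
    if n != len(analogue_scores) or n < 3:
        raise ValueError("need at least three paired scores")
    mean_x = sum(real_scores) / n
    mean_y = sum(analogue_scores) / n
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(real_scores, analogue_scores))
    sxx = sum((x - mean_x) ** 2 for x in real_scores)
    syy = sum((y - mean_y) ** 2 for y in analogue_scores)
    return sxy / math.sqrt(sxx * syy)

def t_for_r(r, n):
    """t statistic and degrees of freedom for testing r against zero."""
    df = n - 2
    if abs(r) >= 1.0:
        return math.copysign(math.inf, r), df
    return r * math.sqrt(df / (1.0 - r * r)), df

# Hypothetical scores for five therapists (illustration only).
real = [1, 2, 3, 4, 5]
analogue = [2, 1, 4, 3, 5]
r = pearson_r(real, analogue)
t, df = t_for_r(r, len(real))
```

Under the criterion stated in the text, an analogue would count as a predictor only when the coefficient is both positive and significant, so the sign of r has to be checked in addition to the size of t.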

A supplemental set of analyses revealed an 
interesting and somewhat surprising result. 
The correlations were calculated between the 
first and second real interviews for those 
therapists who recorded two initial intake 
sessions. The coefficients were quite low, with 
none of them attaining statistical significance. 
(See Kushner, 1977, for coefficients.) This 
indicates that behavior in one initial intake 
session did not adequately predict behavior in 
a second initial intake that the same therapist 
conducted with another client. It should be 
noted that the correlations between the two 
real interviews did not differ statistically 
from either the real and standard client ana- 
logues or the real and nonstandard client 
analogues. Thus it cannot be concluded that 
a second real interview is a better or worse 
predictor of a therapist’s real interview than 
either the standard client or nonstandard 


client analogue. 
Table 5
Variances of Dependent Variables with F Tests for Pairwise Comparisons Against Standard Client Condition

                                                   Standard vs.     Standard vs.
                          Variance                    real          nonstandard
Variable       Standard     Real    Nonstandard     F      df        F      df
Genuineness       .195      .161       .091        1.17   20,21     2.14   20,21
Ambiguity       40.208    66.171     80.534        1.92   21,20     2.00   21,20
CUL             15.234   276.590    264.323        7.38*  22,21    17.34*  22,21
TUL              8.951     8.861     20.917        1.15   21,22     2.34   22,21
Q/S              1.100     1.442       .806        1.88   22,21     1.36   21,22
C%                .009      .033       458         3.94*  22,21     5.06*  22,24
T%                .015      .014       .018        1.15   21,22     1.94   22,21
S%                .007      .004       .006        1.46   21,22     1.16   22,21

Note. CUL = mean duration of client utterance; TUL = mean duration of therapist utterance; Q/S = question
to statement ratio; C% = percentage of client talk time; T% = percentage of therapist talk time;
S% = percentage of silence.
a Calculations of variance conducted on arc sin transformations of raw scores.
* p < .01.


Differences in Variance Between the Standard 
Client and Other Interview Conditions 


The last set of analyses does not relate to 
the generalizability of the analogues, but 
rather to the rationale behind the use of a 
standard client. The single interviewee should 
theoretically provide a more constant stim-
ulus than multiple interviewees. Therefore,
there should be less variance for the depen-
dent measures in the standard client condition
than in the real client or nonstandard client
conditions. This issue was addressed by cal-
culating the variance in each condition and
then conducting two F tests—one contrasting 
the dependent measures for the standard 
client analogue and real interviews and the 
other contrasting the standard client and the 
nonstandard client interviews. The variables 
and F tests are shown in Table 5. The Fs were 
significant only for CUL and C%, showing 
that there was considerably less variance in 
client behavior for these variables in the 
standard client interviews than in either the 
real client or nonstandard client interviews. 
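
A variance-ratio F test of the kind described above can be sketched as follows. The sketch assumes the common convention of placing the larger sample variance in the numerator, which appears consistent with the ordering of the degrees of freedom in Table 5 but is not stated in the text. The function name and the scores are hypothetical.

```python
from statistics import variance

def variance_ratio(condition_a, condition_b):
    """Return (F, (df_numerator, df_denominator)) with the larger
    sample variance in the numerator."""
    var_a = variance(condition_a)
    var_b = variance(condition_b)
    if min(var_a, var_b) == 0:
        raise ValueError("a condition with zero variance gives no finite F")
    if var_a >= var_b:
        return var_a / var_b, (len(condition_a) - 1, len(condition_b) - 1)
    return var_b / var_a, (len(condition_b) - 1, len(condition_a) - 1)

# Hypothetical scores on one dependent measure (illustration only).
standard_client = [1, 2, 3, 4, 5]
real_clients = [2, 4, 6, 8, 10]
f_value, dfs = variance_ratio(standard_client, real_clients)
```

The resulting F would then be compared with a tabled critical value for the two degrees of freedom; that lookup is left out here so the sketch needs only the standard library.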


Discussion 


Several main findings regarding the simi- 
larity between the analogue and real inter- 
views were obtained in the present study. 
These will be discussed with reference to 


their implications regarding the future use
of the analogues. First, there were no mean
differences between the real client interviews
and the nonstandard client analogues for
any of the eight dependent variables. However,
there were significant mean differences
between the real client interviews and the
standard client analogues for mean duration
of client utterance and mean duration of
therapist utterance. These mean differences
indicate that the levels of functioning of the
standard client would not be predictive of
the magnitude of the means of the real therapy
clients. The significant Conditions × Experience
interactions for ambiguity, C%, and
T%, which showed greater ranges in the cell
means across conditions for the less, but not
the more, experienced therapists, indicate that
there were mean differences between the real
interviews and the standard client analogue
for the former therapists only. These significant
Conditions × Experience interactions
also indicate that the standard client analogue
did not reflect the mean differences in
ambiguity, C%, and T% found in the real
interviews between the less and more
experienced therapists, but the nonstandard
client analogue would have predicted those
differences. For the remaining dependent
variables, the lack of significant Conditions
× Experience interactions indicates the
ability of both analogues to reflect true
differences (or lack of differences) between
interviews conducted by less experienced and
more experienced therapists.

The correlations between the real and ana- 
logue interviews addressed the ability to 
predict the relative performance of individual 
therapists in real therapy on the basis of 
their behavior in the analogues. The ana- 
logues were rather disappointing as predic- 
tors. The standard client analogue was found 
to be a predictor of genuineness, CUL, and 
Q/S. The nonstandard client analogue was 
found to be a predictor for Q/S only. Al- 
though the low correlations between the two 
real interviews might be attributable to the 
small sample size on which they were based (ns =
14 and 16), other authors (Beutler, Neville,
& Workman, 1973) have reported low corre-
lations between real interviews of the same
therapists. This implies that therapists might
be highly variable from session to session. 

It should be clear from the above sum- 
mary that the external validity of an ana- 
logue is not an all-or-nothing property; 
rather, the utility of an analogue as a pre- 
dictor of real therapy behavior may be de- 
pendent on several factors, such as (a) the 
dependent variables in question, (b) the 
type of relationship that one would want to 
predict (i.e., mean differences between groups 
of therapists or the relative performance of 
individual therapists), and (c) the experience 
of the therapists participating in the ana-
logue. It should be noted that the present
study cannot be seen as a blanket validation
of the use of the two analogues. Rather, con-
clusions about the validity of any analogue
should be specific, in that they should define
the variables and conditions for which it is
concluded to be predictive of real therapy
behaviors. The onus is on the researcher to
establish continuity of style between
the analogue and real therapy, rather
than assuming it to be there for all variables.

It is obvious that the results obtained for
the standard client in the present study may
have been idiosyncratic of her particular
style (characterized by relatively less
verbal behavior and shorter interview
lengths) and thus may not be generalizable
to other standard clients. A study
that contrasted the interviews of several
standard clients to real interviews with the
same therapists would address this issue.

In addition to the study proposed above, 
the present study has several other implica- 
tions for future research: (a) It adds more 
evidence that results found in analogues may 
not be generalizable to real therapy behav-
iors. Therefore, more validational studies
should be undertaken for other analogue
situations. (b) Interpretation of results based
on nonvalidated analogues should be tenta-
tive in nature. (c) Naturalistic studies ap-
pear to be largely preferable to analogue
studies from the standpoint of rigor. (d) 
The consistency of therapists in real inter- 
views should continue to be studied. If it is 
true that therapists are highly variable from 
interview to interview, then sampling several 
interviews by the same therapist would be a 
more valid procedure for future research than 
merely sampling one interview. 


References 


Beutler, D. T., Neville, C. W., & Workman, S. N.
Some sources of variance in “accurate empathy”
ratings. Journal of Consulting and Clinical Psy-
chology, 1973, 40, 167-169.

Bordin, E. S. Simplification as a research strategy 
in psychotherapy. Journal of Consulting Psy- 
chology, 1965, 29, 493-503. 

Carkhuff, R. R., Kratochvil, D., & Friel, T. The 
effects of professional training: The communica- 
tion and discrimination of facilitative conditions. 
Journal of Counseling Psychology, 1968, 15, 68-74. 

Danish, S. J., D’Augelli, A. R., & Brock, G. W.
An evaluation of helping skills training effects on

Journal of Counseling


havior change. New
Heller, K., & Marlatt,
behavior therapy and behavior
problems in extrapolation. In C. M.
Behavior therapy: Appraisal and status. New
York: McGraw-Hill, 1969.
Hopke, W. E. The measurement of counselor atti-
tudes. Journal of Counseling Psychology, 1955, 2,


techniques in a miniature
counseling situation. Psychological Monographs,
1948, 62(7, Whole No. 294).
Kushner, K. P. On the external validity of two psy-
chotherapy analogues (Doctoral dissertation, Uni-
versity of Michigan, 1977). Dissertation Abstracts
International, 1977, 38, 2867B. (University Mi-
crofilms No. 77-26,286)

Matarazzo, J. D., Wiens, A. N., Matarazzo, R. G., 
& Saslow, G. Speech and silence behavior in 
clinical psychotherapy and laboratory correlates. 
In J. M. Shlien (Ed.), Research in psychother-
apy (Vol. 3). Washington, D.C.: American Psy-
chological Association, 1968.

Munley, P. H. A review of counseling analogue 
research methods. Journal of Counseling Psy- 
chology, 1974, 21, 320-331. 

Osburn, H. G. An investigation of the ambiguity of 
counselor behavior (Doctoral dissertation, Uni-
versity of Michigan, 1951). Dissertation Abstracts,
1951, 221. (University Microfilms No. 00-03,544) 

Ornston, P. S., Cicchetti, D. V., Levine, J., & Fier- 
man, L. B. Some parameters of verbal behavior 
that reliably differentiate novice from experienced 
psychotherapists. Journal of Abnormal Psychol- 
ogy, 1968, 73, 240-244. 

Porter, E. H. An introduction to therapeutic coun- 
seling. Cambridge, Mass.: Riverside Press, 1950. 

Roark, A. E. The influences of training on coun- 
selor responses in actual and role-playing inter- 
views. Counselor Education and Supervision, 
1969, 8, 289-295. 

Russell, P. D., & Snyder, W. U. Counselor anxiety
in relation to amount of clinical experiences and
quality of affect demonstrated by clients. Jour-
nal of Consulting Psychology, 1963, 27, 358-363.

Sigal, J. J., Guttman, H., Chagoya, L., & Lasry,
J. C. Predictability of family therapists’ behavior.
Canadian Psychiatric Association Journal, 1975,
18, 199-202.

Sigal, J. J., Lasry, J. C., Guttman, H., Chagoya,
L., & Pilon, R. Some stable characteristics of
family therapists’ intervention in real and simu-
lated therapy. Journal of Consulting and Clinical
Psychology, 1977, 45, 23-27.

Strong, S. R. Experimental laboratory research in
counseling. Journal of Counseling Psychology,
1971, 18, 106-110.

Strupp, H. H. The therapist’s contribution to the 
research process: Beginnings and vagaries of a 
research program. In H. H. Strupp & L. Lubor- 
sky (Eds.), Research in psychotherapy. (Vol. 2). 
Washington, D.C.: American Psychological As- 
sociation, 1961. 

Thomas, E. J. Experimental analogues of the case- 
work interview. Social Work, 1962, 7, 24-30. 

Truax, C. B., & Carkhuff, R. R. Toward effective 
counseling and psychotherapy: Training and prac- 
tice. Chicago: Aldine, 1967.

Winer, B. J. Statistical principles in experimental 
design. New York: McGraw-Hill, 1971. 


Received November 28, 1977


Journal of Consulting and Clinical Psychology
1978, Vol. 46, No. 6, 1403-1408


Are Special Norms for Minorities Needed?
Development of an MMPI F Scale for Blacks


Malcolm D. Gynther
Auburn University

David Lachar
Lafayette Clinic
Detroit, Michigan

W. Grant Dahlstrom
University of North Carolina


As part of a large-scale investigation of the need for special Minnesota Multi-
phasic Personality Inventory (MMPI) norms for black adult American test
subjects, 882 normal black adults from Alabama, Michigan, and North Carolina
were tested. Item analyses of these protocols revealed 33 items that met the
10% or less endorsement criterion used to develop the MMPI F scale. Com-
parisons were made between white and black endorsements of the items on
this new scale, items of the standard F scale, and additional items in the
MMPI pool that met the 10% criterion but were not included in the original
F scale. White endorsement patterns agreed with the blacks’ F scale, but black
endorsement patterns agreed with only one third of the standard F-scale items.
Further, black adults showed comparable levels of infrequency on only 6 of
the 38 supplementary F items. Although these results do not necessarily indi-
cate that special norms for the clinical scales are necessary, the amount of
difference between responses of blacks and whites to rarely endorsed items sug-
gests that for blacks, this new scale may be a more accurate measure of corre-
lates associated with endorsement of deviant items than the standard F scale.


The F scale of the Minnesota Multiphasic
Personality Inventory (MMPI) was pro-
posed as one of the validity indicators by the
test’s authors, Hathaway and McKinley
(1951). The 64 items comprising this scale
were selected in part on the basis of the fre-
quency of endorsement and in part on the
basis of diversity of item content. That is,
the items chosen were answered in the scored
direction by 10% or less of the subjects in
the Minnesota normative adult samples, and
they covered a wide range of topics, with
only a few items referring to any one area of
behavior or experience. The item tallies used
to identify potential F-scale items came from
an early subsample of the Minnesota normal
subjects (111 men and 118 women); subse-
quent analyses reported in Hathaway and
Briggs (1957) and Dahlstrom, Welsh, and
Dahlstrom (1975) indicate that on the more
complete data of the Minnesota samples, 3
of the 64 items do not meet the 10% or
below criterion, and 1 additional item satis-
fies this criterion only for females. In addi-
tion, there are 38 items that could have been
included in the F scale, but they were ex-
cluded by the test’s authors even though they
were endorsed by 10% or less of the revised
Minnesota normal adult group.

High scores on this scale (typically F > 
16) signified that the profile was invalid due 
to the subject’s carelessness or lack of com-
prehension. Other reasons for obtaining ele-
vated scores were recognized early in the de-
velopment of the inventory. Meehl and Hathaway [...]
and subjects who apparently wished to put
themselves in a bad light also obtained high 
scores on this scale. Dahlstrom, Welsh, and 
Dahlstrom (1972) have described several 
other possible reasons for high scores, such 
as random responding, answering true to all 
items, cyclical alternations of true and false 
answers, a cry for help or pleading for spe- 
cial attention, a bilingual background with 
English as a second language, visual impair- 
ment, and acute psychotic disorganization. 

Research has shown personological and 
psychopathological correlates of elevated 
MMPI F scores obtained by whites. Among 
persons relatively free of serious problems in 
living, the Institute of Personality Assess- 
ment and Research studies (Gough, McKee, 
& Yandell, Note 1) characterized those with 
moderately high F scale scores as moody, 
restless, dissatisfied, and opinionated. Car- 
son (1969) has suggested that T scores in 
the range of 65 to 80 frequently appear in 
sullen, rebellious personalities of the schizoid, 
antisocial, or Bohemian type. High MMPI F 
scores obtained from court cases referred for 
diagnostic evaluation have been shown to be 
related to a diagnosis of psychopathy (Gyn- 
ther, 1961; Gynther & Shimkunas, 1965), 
as well as to the commission of serious sex 
crimes (Gynther, 1962). Elevated F scores 
obtained from psychiatric patients appear to 
be associated with withdrawal, poor judg- 
ment, short attention span, delusions, and 
hallucinations (Gynther, Altman, & Warbin, 
1973). Other correlates of moderate to high 
MMPI F scores can be found in Lachar 
(1974), Duckworth and Duckworth (1975), 
and Graham (1977). Dahlstrom et al. (1975)
have summarized the research (provided that 
the sources of protocol invalidity mentioned 
earlier can be dismissed) as indicating that 
“the degree of emotional disturbance in the 
individual can be judged with reasonable ac- 
curacy from the elevation of the F scale” 
(p. 31). 

It is interesting to note, in this context, 
that black subjects typically obtain higher 
scores on the F scale of the MMPI than 
white subjects. There have been 10 studies 
to date comparing noninstitutionalized blacks 
and whites on the MMPI. In all but 1 of 
these studies (several of which attempted to
control for social class), blacks obtained sig-
nificantly higher scores than whites on this
scale. Fifteen studies have compared black
and white institutionalized subjects (i.e., psy-
chiatric inpatients and outpatients, prisoners,
etc.) on the validity and clinical scales. In
8 of the 15 comparisons, blacks got signifi-
cantly higher F scores than whites. Although
evidence of racial bias is most consistent in 
normal samples, there was no instance in 
any of the 25 studies (cf. Gynther, 1972; 
Gynther, Note 2) in which the scores of 
whites on this scale significantly exceeded 
the scores of blacks. 

1 Two additional black-white comparisons have
been located. One used male misdemeanor offenders
as subjects, and the other used male drug abusers.
Blacks obtained nonsignificantly higher F scores
than whites in the former study (McCreary & Pa-
dilla, 1977), but they obtained significantly lower
scores than whites in the other study (Penk &
Robinowitz, 1974). This latter finding is unique
among the 27 studies examined.

Should these results be taken as indicating
that black subjects have more difficulties in
completing the MMPI validly due to poor
reading skills, inability to follow directions,
confusion, or other sources of profile invalid-
ity, or that black subjects are more rebelli-
ous, nonconforming, or emotionally disturbed
than whites? That is, should the white-de-
rived descriptors be applied to blacks? Only
two studies have addressed this question.
Gynther et al. (1973) found no correlates
associated with F = 26 MMPIs produced by
black psychiatric patients, although a mean-
ingful cluster of correlates for this code type
was established for white psychiatric patients.
More recently, Hedlund (1977) showed that
MMPI F scores of black psychiatric patients
are positively associated with scores of items
assessing “disorientation,” “confusion,” “de-
lusions,” and other behaviors suggesting psy-
chosis. Hedlund (1977) pointed out that as
far as Scale D and perhaps Scale F corre-
lates were concerned, “most of the relation-
ships were similar for blacks and whites but
that smaller Ns for the black samples pre-
cluded cross-validated statistical signifi-
cances” (p. 744). Hedlund, however, con-
cluded that “very few of the significant white
relationships were validated for blacks” (p.
744). In what appears to be a highly perti-
nent observation, Dahlstrom et al. (1972)
stated that

an important part of interpreting test items seems 
to be an ability to share the common cultural 
framework of the derivational samples studied by 
the test authors. Many subjects lacking this essen- 
tial experiential core will show their atypical at- 
tributions and self-labeling by endorsing F-scale 
items in an unusual way. (p. 117) 


This study proposes to determine which of 
the 566 MMPI items were rarely (i.e., 10% 
or less) endorsed by a large sample of nor- 
mal black adults. These items are then com- 
pared with F-scale items (and the 38 other 
rarely endorsed items mentioned earlier) that 
were derived from Minnesota whites using 
the same criterion. If there is a large amount 
of overlap between the two sets of items, 
support would be provided for the continued 
use (and general interpretation) of the 
MMPI F scale for blacks as well as for 
whites. If, on the other hand, the two F 
scales are quite different, it would be sug- 
gested that use of the original F scale with 
blacks leads to inaccurate and inappropriate 
representations of such subjects. These anal- 
yses were carried out as part of a large-scale 
investigation of the need for separate norms 
for ethnic minorities on the standard MMPI 
scales. This project is briefly described below, 
but a more complete report is in preparation 
(Lachar, Gynther, & Dahlstrom, Note 3). 


Method 


Subjects were obtained from Alabama, Michigan,
and North Carolina to derive norms for blacks on
the validity, clinical, and other frequently used
scales of the MMPI (Lachar et al., Note 3). Major
sources were church groups and social clubs, which
were paid $5 per participant. Black faculty and
students administered the inventory, usually
to 5-10 persons at a time, and also asked subjects
to supply face-sheet information on a form
for that purpose. Confidentiality was assured
by the use of code numbers rather than
names. Total sample size was 882, 321 males and
561 females. Age varied from 18 to 65, with means
of [...] for males and 34.1 for females. Approximately
15% of the subjects had not completed
high school, 27% were high school graduates, 26%
had some college, 18% had graduated from college,
and 14% had some postgraduate work. Approxi-
mately two thirds of the subjects had attended
segregated schools. Slightly over 50% of the sample
was married, 30% were single, and the remainder
were widowed, divorced, or separated. In terms of
employment status, about 15% were unskilled, un-
employed, or on welfare; about 21% were semi-
skilled; approximately 13% were classified as skilled
manual; nearly 18% were clerical workers, sales-
persons, or technicians; about 8% were adminis-
trative or minor professionals; 22% were managers
or lesser professionals; and 3% were executives or
major professionals. Mean annual family income
for the males was $14,756 and for the females was
$12,296. In response to questions concerning history
of treatment for emotional (nervous) condition or
time served in prison, 3.7% of the males responded
affirmatively to the former question and 21% to
the latter question. The comparable figures for fe-
males were 6.2% and .7%.

Comparison of sample demographic character-
istics with contemporary census data (U.S. Bureau
of the Census, 1976, 1977) suggests that the dis-
tribution of age, marital status, and percentage of
unemployment is comparable to national estimates
of blacks 18 years and older. Examination of other
variables indicates that our sample had less fre-
quently failed to complete high school (17% vs.
50%) and had attained more college experience
(50% vs. 19%). A correlate of this greater educa-
tion attainment in our sample was, in compari-
son to census data, less unskilled and more mana-
gerial and professional vocations, as well as a higher
mean yearly family income. This substantially mid-
dle-class sample was the result of sample selection
methodology that for the most part included so-
cially active and community-oriented individuals.
These biases in our sample were generally of the
kind that would make our estimates of the presence
of race-related differences at the item level con-
servative rather than extreme; that is, a more
representative, less educated group of subjects
would presumably respond in an even more dis-
crepant manner.


Endorsement rates of all MMPI items were ex-
amined to locate those items endorsed, either posi-
tively or negatively, by 10% or less of the black
sample. One could have analyzed male and female
data separately or could have applied the 10% cri-
terion to all 882 subjects regardless of sex. The
former procedure was used to develop the MMPI
F scale with the 111 males and 118 female proto-
cols then available. If one looks at the endorsement
data for the larger, revised Minnesota adult group
(Dahlstrom et al., 1975, Appendix A), it will be
noted that both males and females met the 10%
or less criterion for 47 of the 64 F items. With re-
gard to the remaining items, males exceeded the
criterion on 12 items, females on 4 items, and males
and females on 1 item. However, if the responses
of both sexes are combined, the criterion was met
for 60 of the 64 F items. Since the MMPI F scale
can best be characterized as consisting of items
rarely endorsed in the scored direction by the whole
sample (despite the procedure originally used), we
felt that a black F scale to be comparable should
also be determined by responses of the entire sam-
ple. However, if marked differences between the
sexes on scale means (and standard deviations)
were found, different raw-score to T-score conver-
sions would be considered.
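
The item-selection rule described above can be sketched in a few lines. The sketch assumes that each protocol is stored as a mapping from item number to a true/false answer and that the 10% criterion is applied to the whole sample regardless of sex. The function name, the data layout, and the toy protocols are hypothetical.

```python
def infrequent_items(protocols, criterion=0.10):
    """Map each rarely endorsed item to its rare (scored) answer.

    An item is keyed True when 'true' is given by no more than
    `criterion` of the sample, and keyed False when 'false' is that rare.
    """
    if not protocols:
        raise ValueError("at least one protocol is required")
    n = len(protocols)
    cutoff = criterion * n  # largest count that still meets the criterion
    keyed = {}
    for item in protocols[0]:
        true_count = sum(1 for answers in protocols if answers[item])
        if true_count <= cutoff:
            keyed[item] = True
        elif n - true_count <= cutoff:
            keyed[item] = False
    return keyed

# Ten hypothetical protocols answering three hypothetical items.
toy_protocols = [{1: i == 0, 2: i != 0, 3: i < 5} for i in range(10)]
toy_scale = infrequent_items(toy_protocols)
```

Working with counts rather than proportions keeps the comparison with the cutoff exact at the boundary, which matters for items endorsed by exactly 10% of a sample.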

Comparisons were made between blacks and 
whites on the items that met the 10% or less cri- 
terion for blacks, items on the F scale that met the 
criterion, items on the F scale that did not meet 
the criterion, and items not on the F scale that met 
the criterion for the revised white adult group. 


Results and Discussion 


Examination of the endorsement rates of 
the black sample disclosed 33 items that met 
the 10% or less criterion. Twenty of these 
items are keyed true (10, 14, 23, 48, 49, 85, 
104, 123, 151, 197, 210, 211, 227, 246, 291, 
324, 339, 365, 393, 565), and 13 are keyed 
false (2, 75, 88, 90, 113, 177, 196, 220, 257, 
258, 272, 276, 285). Although a number of 
statements that would be rated “obvious” for 
presence of psychopathology if answered true 
can still be found in this F scale for blacks 
(e.g., “I believe I am being followed,”
“Someone has been trying to poison me,” and 
“Everything tastes the same.”), it is note- 
worthy that a number of old standbys have 
dropped out (e.g., “My soul sometimes leaves 
my body,” “I see things and animals around 
me that others do not see,” and “I com- 
monly hear voices without knowing where 
they come from.”). 

The scores of the black normative group 
on this scale can be summarized as follows: 
for males, M = 2.97, SD = 4.02, and range 
= 0-20; for females, M = 2.72, SD = 3.52,
and range = 0-22. These values for males 
and females are sufficiently similar to make 
separate raw-score to T-score conversions un- 
necessary. A raw score of 3 can be con- 
sidered equivalent to a T score of 50; a raw 
score of 7, equivalent to a T score of 60; 
and a raw score of 11, equivalent to a T 
score of 70. An additional cohort of subjects, 
black psychiatric patients evaluated at La- 
fayette Clinic, obtained the following scores: 
for 197 males (M age = 29.04; M education
= 11.93), M = 5.09, SD = 5.16, and range =
0-33; for 280 females (M age = 31.17; M
education = 12.31), M = 5.92, SD = 4.95,
and range = 0-33. The distributions obtained
by normal and psychiatric black subjects
range more widely over this scale than do
distributions of whites’ scores over the stan-
dard F scale. It seems clear that some test-
taking differences may still be reflected in
the way that black subjects describe them-
selves, even on this selected set of items.
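
The raw-score to T-score equivalents quoted above (3, 7, and 11 corresponding to 50, 60, and 70) are what a linear T transformation yields when the mean is taken as 3 and the standard deviation as 4. Those rounded values are an assumption inferred from the three anchor points, not parameters reported by the authors, and the function name is hypothetical.

```python
def linear_t_score(raw, mean, sd):
    """Linear T score: mean 50, standard deviation 10."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return 50 + 10 * (raw - mean) / sd

# Assumed rounded normative values (mean = 3, SD = 4); illustration only.
anchors = {raw: linear_t_score(raw, mean=3, sd=4) for raw in (3, 7, 11)}
```

The same function could be applied with the separately reported male and female means and standard deviations to see how little the conversions would differ between the sexes.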
Relations among selected demographic 
variables, standard F-scale, and black F- 
scale scores were examined. Mean black F- 
scale T scores for various demographic group- 
ings obtained a limited range of 47 to 56, 
whereas standard F-scale T scores ranged 
from 53 to 75. Analyses of variance revealed 
that scores on both scales were significantly 
related to age, education, and job classifica- 
tion of head of household. Standard F-scale 
T scores, for example, averaged less than 60 
for those of our normal black subjects who 
were either 35 or older, had attained a col- 
lege education, had managerial or profes- 
sional status, or lived in a household headed 
by a managerial or professional person. If, 
on the other hand, the subject was 18-24
years of age, had less than 12 years of edu-
cation, was unskilled or unemployed, or lived
in a household with an unskilled or unem-
ployed breadwinner, mean F-scale T scores
ranged from 67 to 75.


Table 1 summarizes the endorsement a l 
of blacks and whites on the F items for | 


blacks that were enumerated earlier in E 
section, the MMPI F items, and the 38 itam 
that met the criterion for inclusion in the 


scale but were not so classified by the test | 


authors, Examination of these data shows 
that whites and blacks responded quite simi- 
larly to the 33 items that comprise the 

scale for blacks, Although these figures sug- 
gest a high degree of overlap between 1° 
sponses to the F scale of blacks and responses 
to a substantial proportion of what appe 
to be the MMPI F scale, it should be not 

that 6 of the 28 items do not appear ui 
the original F scale. Comparison of the a 
sponses of blacks and whites to the standar 
64-item scale shows considerable disagte” 
ment. Blacks exceeded the cutoff on 65.6% 


| 
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Table 1 
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Number of Blacks’ F Scale, MMPI F Scale, and Supplementa: 
and Above the 10% Level by Minnesota White parent Adis ar ieee a ie 
Black Adult Subjects in the New Normative Sample 


Blacks 

(n = 33) 
Group <10% >10% 
Blacks 33 0 
Whites* 28 4 


F items 
MMPI Supplementary 
(n = 64) (n = 38) 
eS aa 
<10% >10% <10% >10% 
22 42 6 32 
60 4 38 0 


g MMPI = Minnesota Multiphasic Personality Inventory. 
ne of the 33 items (i.e., 565) cannot be assigned to either category, 


since the exact percentage of true 


responses is not available for the revised Minnesota white adult group. 


of the items. Excluding the 4 items that did not meet the criterion
for whites, there was agreement on 36.7% of the items. Performance on
the supplementary F items was even more disparate; blacks and whites
agreed, in the sense of both meeting the 10% or less criterion, on
only 15.8% of the items. Although whites were in agreement with 85.7%
of the items designated as F for blacks, blacks agreed with only 28
out of the 98 items that meet the 10% criterion for whites. The
percentage differences between blacks and whites on the 70 items
responded differently to ranged up to 30. The mean percentage of
difference for these items was 10.

There are many reasons for these differences. One explanation appears
to be intrinsic to the derivation of the MMPI F scale. The rarity of
the deviant responses ranged from 0% (every white female responded
true to Item 177, “My mother was a good woman”) to 10% (and in four
cases higher, as we have indicated). One might ask if the endorsement
rates of blacks disagreed with those of whites as a function of degree
of rarity. Perhaps the responses of blacks to these items only
differed on those that were “marginal” members of the category. To
test this hypothesis, the items were divided into two categories, one
with a 6% or more endorsement rate (n = 34) and the other with a 5% or
less endorsement rate (n = 30). Of the 22 items that blacks endorsed
10% or less, 18 fell in the 5% or less white endorsement category,
whereas 30 of the 42 items that blacks en- 
dorsed more than 10% fell in the 6% or 
more white endorsement category, χ²(1) =
16.5, p < .01. It would be simplistic (and
not true) to say that blacks only disagreed 
with marginal F items, but the trend is defi- 
nitely in that direction. 
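
The 2 × 2 comparison described above can be checked with a short calculation. This is a minimal sketch, not the authors' procedure: it assumes a Pearson chi-square without continuity correction, and the two cell counts not stated in the text (4 and 12) are obtained by subtraction from the reported totals (22 and 42 items). The function name is illustrative.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


# Rows: items blacks endorsed at 10% or less (22) vs. more than 10% (42).
# Columns: white endorsement of 5% or less (30) vs. 6% or more (34).
# 18 and 30 are stated in the text; 4 = 22 - 18 and 12 = 42 - 30 are derived.
cells = (18, 4, 12, 30)
chi2 = chi_square_2x2(*cells)
print(f"chi-square(1) = {chi2:.2f}")
```

Under these assumptions the statistic comes out near 16.4, which is close to the reported value of 16.5; the small gap may reflect rounding in the original.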

These findings can be viewed as what one might expect from a validity
generalization study, that is, the typical reduction in number of
significant items, descriptors, or correlates when original
relationships are examined via a new sample. Would any new sample
yield similar results? Might not a comparison of whites currently
residing in Alabama, Michigan, and North Carolina with the vintage
Minnesota whites lead to results similar to those found here? This is
an empirical question, and we hope someone will collect the data to
answer it. Our speculation, however, is that a present-day group of
whites would not differ as much from the Minnesota norms as blacks
did, because mean MMPI F scores of white normals have not changed much
from the established norms over the years, whereas blacks’ F scores
have repeatedly been found to be at least as high and often
substantially higher than the norms.

Since the degree of overlap between blacks’ and whites’ endorsement
rates of MMPI F-scale items was only about 35%, it would ... derived
F scale may ... deviant responding (and perhaps nonconformity?) than
the standard scale. Whether blacks who obtain elevated scores on this
new F scale are as moody, opinionated, rebellious, and emotionally
disturbed as whites who obtain elevated scores on the standard F
scale have been shown to be must await further research.
Parental Personality Factors in Child Abuse


John J. Spinetta 
San Diego State University 


In an attempt to demonstrate that abusing parents differ from
nonabusing parents in personality variables, the Michigan Screening
Profile of Parenting was administered to six groups of mothers: (a)
adjudicated abusers, (b) spouses of adjudicated abusers, (c) mothers
convicted of child neglect, (d) nonabusing mothers from a college
student population, (e) nonabusing mothers from a middle socioeconomic
level, and (f) nonabusing mothers from a lower socioeconomic level.
Major differences occurred when comparison was made of one or more of
the first three groups with one of the latter three groups. The groups
differed significantly on six factor-analyzed cluster categories: (a)
relationship to one’s own parents, (b) tendency to becoming upset and
angry, (c) tendency toward isolation and loneliness, (d) expectations
of one’s own children, (e) inability to separate parental and child
feelings, and (f) fear of external threat and control. In all of the
cases, the first three groups scored at levels of higher risk than did
the latter three groups, whereas the abusers scored at the highest
risk levels throughout. It is suggested that a therapist who helps a
parent develop the ability to maintain equanimity under stress, by
helping reduce deviations from the norm in characteristics related to
abuse potential, is ultimately helping to reduce actual abusive
behavior.


With the growing emphasis in the litera- 
ture on the fact that the causes of child 
abuse are multiple and interactive, many 
therapists who deal with parental personality 
and attitudinal variables are made to feel 
as if they are engaging in a futile effort 
(D’Agostino, 1975; Smith, 1975). Although 
many new and exciting identification and 
treatment programs for child abuse abound 
throughout the country (National Center on 
Child Abuse and Neglect, 1975, 1976), very 
little encouragement has been given to the 
therapist who does not have easy access to 
the new interdisciplinary treatment programs 
and who, in many instances, remains the 
sole therapeutic agent for a particular set
of families (Steele, 1975). The problem is 
viewed as sufficiently complex that an in- 
dividual therapist who deals solely with 
parental attitudes is often discouraged. It is 
the purpose of this study to demonstrate that 
parental personality and attitude are impor-
tant factors in the etiology of child abuse.
Such a demonstration can give hope to the 
therapist that efforts in dealing with the 
parental personality are aimed in a profitable 
direction and that he or she can be effective 
in reducing potential for abuse. 

It is not my intent to suggest that factors 
of parental background or inadequacy are 
the sole determinants of child abuse. The
fact is that the causes of child abuse are 
multiple and interactive; there is no single 
type of child abuser or a single causative 
factor as sufficient explanation of abuse 
(Spinetta & Rigler, 1972). Emphasis on 
parental personality is in no way meant to 
detract from these other factors. Rather, it 
is suggested that helping the parent to de- 
velop the ability to maintain equanimity 
under stress is directly related to situational
variables, and it can be of central value in 
the rehabilitative or preventive process. 

It is in the broader context of situational 
variables that I ask the question, Why is it 
that the majority of parents do not abuse 
their children? Although in the socially and 
economically deprived segments of the popu- 
lation there is generally a higher degree of 
the kinds of stress factors found in abusing 
families, the great majority of deprived fam- 
ilies do not abuse their children. Why is it 
that most deprived families do not engage in 
child abuse, though they are subject to the 
same economic and social stresses as those 
families who do abuse their children? Is 
there an actual difference between the types 
of stresses encountered by abusing parents 
and nonabusing parents within the same 
socioeconomic level (Gil, 1970, 1976), or is 
the difference in the parents’ manner of 
approaching the stress situation (Kent, 1976; 
Smith, 1975; Spinetta & Rigler, 1972; 
Young, 1976)? I hold the latter position. 
When one takes into account the fact that 
some well-to-do and middle-class families 
also engage in child abuse, then one must 
look for the causes of child abuse beyond 
mere socioeconomic stress. The problem of 
etiology remains insoluble at the demo- 
graphic level alone. 

The present study is an attempt to dem- 
onstrate that however one might explain the 
particular circumstances that helped shape 
the parents’ personality, abusing parents dif- 
fer from nonabusing parents in attitudinal 
and personality variables. 


Method 


Instrument 


In 1972, Schneider, Helfer, and Pollack disclosed 
efforts under way to design and validate a question- 
naire with the goal of uncovering parents who have 
a potential to abuse their small children. They based
their questions on their clinical experience, which 
suggested that parents who abuse their small chil- 
dren reported more severe physical punishment in 
their own childhood, more anxiety about dealing 
with their children’s problems, more concern about 
being alone and isolated, more concern with crit- 
icism, and higher expectations for performance in 
their children than did nonabusers. After several 
years of analysis and validation, they published
first a 74-item and then a 50-item instrument, orig-
inally entitled Survey on Bringing Up Children
(Schneider, Hoffmeister, & Helfer, 1976). The instru-
ment has since been renamed the Michigan Screen-
ing Profile on Parenting (Helfer, Schneider, & Hoff-
meister, 1977).

Although the questionnaire has not yet been suf-
ficiently validated to be of use as a legally valid
criterion in decisions regarding child placement or
parental readiness to resume parenting functions, it
has been shown to be capable of differentiating be-
tween attitudes regarding child rearing and regard-
ing self-awareness and self-control functions in the
parents.

With the permission of Helfer, I administered the 
questionnaire to several groups of parents, as dis-
cussed below, to see (a) whether abuse-potential
cluster categories similar to those found by Helfer 
and his associates could be validated in a local 
sample and (b) whether scores based on the locally 
factor-analyzed categories could sort out abusing 
from nonabusing parents. 


Subjects 


As is typical of parents who come to the attention of public agencies
(National Center on Child Abuse and Neglect, 1975), the parents
referred to the participating agencies were from low socioeconomic
levels. The use of such parents in the present study is not meant to
suggest that abuse takes place only at low socioeconomic levels,
because it does not (Spinetta & Rigler, 1972). Similarly, although
more women than men have been found to abuse their children (Gelles,
1973; Gil, 1970; Smith, 1975), child abuse is not an act solely of the
mother. However, the questionnaire was administered only to women to
ensure nonconfounding by differences in child-rearing attitudes
between men and women.

Subjects were chosen in the following manner. The participating
agencies agreed to administer the questionnaire to all of the mothers
currently under their jurisdiction as active cases. The questionnaire
was administered to (a) adjudicated abusers, (b) spouses of
adjudicated abusers, and (c) parents convicted of child neglect. The
parents in these categories were chosen by the following criteria: (a)
The child was under 5 years of age, and (b) the adjudication had been
finalized, so that parents would not feel that their answers would
affect the placement of their child or decisions regarding their own
disposition. In this manner, workers were able to ensure that
responses to the questionnaire were given as honestly as possible.

For purposes of comparison and contrast, the questionnaire was also
administered to groups of parents who were nonabusers with children
under 5 years of age. The following groups were chosen: (d) nonabusing
mothers from a college student population whose children were in a
day-care center because one or both parents were in school, (e)
nonabusing mothers from a middle socioeconomic level whose children
were in a preschool not because of necessity but through express
parental wish, and (f) nonabusing mothers from a lower socioeconomic
level with children in a preschool because the mother was working.
Group f was chosen to match as closely as possible the educational,
occupational, and socioeconomic status of Groups a, b, and c. Group d
was chosen because it was similar to Groups a, b, and c in financial
status but not in terms of education or potential occupation. Group e,
different in terms of education, occupation, and financial status, and
the most representative of the population as a whole, was chosen to
test possible class differences in responding.

The samples consisted of the following numbers: 
(a) adjudicated abusers, 7; (b) spouses of abusers, 
9; (c) parents convicted of neglect, 13; (d) non- 
abusing mothers from a college population, 15; (e) 
nonabusing mothers from a middle socioeconomic 
level, 15; and (f) nonabusing mothers from a lower 
socioeconomic level, 41. 

The purpose of the study was explained in detail 
to the respective supervisors, the agency officials in 
Groups a-c, and the day-care administrators and 
teachers in Groups d-f. Because of the sensitive 
nature of the accusation of child abuse and neglect, 
and to prevent socially desirable responses, parents 
were not told specifically that the survey’s ultimate 
purpose was to differentiate abuse potential. Rather, 
parents were asked if they wished to take part in a 
survey on attitudes in bringing up children, con- 
ducted by the university to learn how parents 
viewed child rearing. In accord with U.S. Depart- 
ment of Health, Education, and Welfare guidelines, 
parents were promised that the results would remain 
anonymous, and that any parent who wished would 
be given the overall results on completion of the 
study.

All of the parents who were approached in Groups d and e, without
exception, filled out the survey as requested. Of the parents
approached in Group f, all but three (93%) filled out the survey. The
parents in Groups a-c were approached by assigned workers who had
established rapport with them and were told that this survey would not
only aid the university but that it might be of therapeutic aid to the
specific worker in each case. Each worker was asked to screen out
those parents who would be unduly threatened by the questionnaire,
those who might be tempted to answer with socially desirable
responses, and those whose cases were still pending court completion.
The workers did not receive any refusals from the selected cases. The
final small sample thus represents responses from parents who were
motivated to fill out the questionnaires as honestly as possible.
Comments from each worker on each case attested to the honest efforts
of the parents who made up the final samples in Groups a-c. It is my
belief that the final sample represents the cases most amenable to
treatment. There is no reason to suspect that the sample represents
the most severe of the abusers. On the contrary, workers’ case records
show that the final sample is on the conservative side of the
abuse-potential continuum in the agencies’ overall abuser population.
Thus, any differences that appear between the abuser and nonabuser
groups would appear at least as strong in the general abuser
population of the agencies in question. With the questionnaire aimed
at being of eventual use as an aid to the therapist in sorting out
areas of weakness, honest cooperation of the parents was deemed
essential. In addition, honest cooperation in each of the six groups
minimized confounding that would appear if the groups differed in
willingness to participate.


Results 


A varimax rotated factor analysis of the 
responses to the questionnaires was con- 
ducted by the experimenter. The six clusters 
of variables closely resemble the high-abuse 
potential categories of Helfer et al. (1977). 
The six resultant clusters of the present anal- 
ysis are (a) relationship to one’s own par- 
ents, (b) tendency to becoming upset and 
angry, (c) tendency toward isolation and 
loneliness, (d) expectations of one’s own 
children, (e) inability to separate parental 
and child feelings, and (f) fear of external 
threat and control. 

With these six factor-analyzed cluster 
categories as a basis, a six-column scoring 
form was devised, with direction of scoring 
set so that the higher score on each cluster 
represented abuse potential. Total raw scores 
for each subject were determined for each 
of the six cluster categories. 

A 1 × 6 analysis of variance was per-
formed for the six groups for each of the
six abuse-potential categories. Table 1 gives 
the means and standard deviations for scores 
in each of the abuse-potential categories for 
each subject group. Table 2 gives the results 
of the analysis of variance for each of the 
six categories.

Scores on each of the six abuse-potential 
categories showed that significant differences 
existed among the six groups (df = 5, 90 in 
all cases). The resultant F on the first abuse- 
potential category, relationship to one’s own 
parents, was 4.55, significant at the .001 
level. The resultant F of 6.79 on the second
abuse potential category, tendency to be- 
coming upset and angry, was significant at 
the .001 level. The resultant F on the third 
category, tendency toward isolation and lone- 
Table 1
Means and Standard Deviations in Each Abuse Potential Category

                   1: Abusers   2: Spouses   3: Neglect   4: College   5: Middle    6: Lower
Cluster             M     SD     M     SD     M     SD     M     SD     M     SD     M     SD

1 (Parents)        57.4  14.7   48.7   9.4   53.3  10.3   44.9  11.6   37.7  10.2   44.3  10.3
2 (Control)        25.4   9.6   22.2   7.6   22.8   7.3   17.7   4.0   14.1   3.8   16.7   4.7
3 (Affiliation)    31.9   8.2   26.9   5.3   25.9   4.0   22.5   4.2   19.9   3.8   22.5   4.8
4 (Expectations)   39.1  18.6   37.3  11.7   34.3  10.7   28.1   7.8   22.3   6.3   30.0   8.8
5 (Symbiosis)      17.3   ...   16.2   ...   19.2   2.9   14.9   2.1   14.5   2.7   16.2   3.3
6 (Threat)         61.3  16.5   52.6  12.9   57.4  10.8   40.7   8.7   29.3   5.7   43.9  10.5


liness, was 7.53, significant at the .001 level. 
The resultant F on the fourth category, ex- 
pectations of one’s own children, was 4.20, 
significant at the .001 level. The resultant F 
on the fifth category, inability to separate 
parental and child feelings, was 3.79, signifi- 
cant at the .01 level. The resultant F of 
13.92 on the sixth abuse-potential category, 
fear of external threat and control, was sig- 
nificant at the .001 level. 
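
The F ratios reported above can be cross-checked against the mean squares printed in Table 2, since each F in a one-way analysis of variance is the between-groups mean square divided by the within-groups mean square. The following is a rough sketch under that assumption; the variable names are illustrative. Because the within-groups mean square for Cluster 6 is hard to read in the scanned table, the sketch back-computes it from the reported F instead of taking it from the table.

```python
# (between-groups MS, within-groups MS) as printed in Table 2 for Clusters 1-5.
mean_squares = {
    1: (527.5, 116.1),
    2: (213.5, 31.5),
    3: (177.3, 23.5),
    4: (409.3, 97.5),
    5: (35.9, 9.5),
}

# One-way ANOVA: F = MS_between / MS_within.
f_ratios = {
    cluster: between / within
    for cluster, (between, within) in mean_squares.items()
}

# Cluster 6: infer the within-groups MS from the reported F of 13.92.
between_6, reported_f_6 = 1546.3, 13.92
implied_within_6 = between_6 / reported_f_6

for cluster, f_value in sorted(f_ratios.items()):
    print(f"Cluster {cluster}: F = {f_value:.2f}")
print(f"Cluster 6: implied within-groups MS = {implied_within_6:.1f}")
```

The recomputed ratios agree with the reported values to within rounding, and the Cluster 6 within-groups mean square implied by F = 13.92 is roughly 111.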

A posteriori tests using the Scheffé method 
were conducted for each of the abuse-po- 
tential clusters. Significant differences were 
found as follows: Group a (abusers) sig- 


Table 2
Analysis of Variance

Cluster           MS          F

1
  Between        527.5       4.55**
  Within         116.1

2
  Between        213.5       6.79**
  Within          31.5

3
  Between        177.3       7.53**
  Within          23.5

4
  Between        409.3       4.20**
  Within          97.5

5
  Between         35.9       3.79*
  Within           9.5

6
  Between      1,546.3      13.92**
  Within         111.1

Note. df = 5, 90.
*p < .01.
**p < .001.


nificantly differed from Group e (middle- 
class nonabusers) in Abuse-Potential Clusters 
1, 2, 3, 4, and 6. Group a significantly dif- 
fered from Groups d and f in Abuse-Po- 
tential Clusters 2, 3, and 6. 

Group b (spouses of abusers) significantly 
differed from Group e in Abuse-Potential 
Clusters 2, 3, 4, and 6. 

Group c (neglecters) significantly differed 
from Group e in Abuse-Potential Clusters 1, 
2, 5, and 6. Group c significantly differed 
from Groups d and f in Abuse-Potential Clus-
ters 2, 5, and 6.

The Scheffé a posteriori test showed that the major differences in
each of the six abuse-potential categories occurred when comparison
was made of one or more of the first three groups (abusers, abusers’
spouses, and neglecters) with one of the latter three groups
(nonabusers). The greatest differences occurred when each of the first
three groups was compared to the fifth group (middle-class
nonabusers). In each of the abuse-potential categories, Group e scored
at the lowest level. Group d (college student nonabusers) and Group f
(lower socioeconomic level nonabusers) were the next lowest in abuse
potential, scoring almost identically throughout. Although the fifth
group scored lowest on all of the categories, the other two nonabuser
groups scored at a level not significantly higher. In contrast, the
abusers scored at the highest risk level in all but one of the
abuse-potential categories.


Discussion 


The Michigan Screening Profile on Parenting was able to differentiate
between abusing and nonabusing mothers on personality and attitudinal
variables. The empirically derived set of abuse-potential categories
proved useful in significantly differentiating between abusing and
nonabusing mothers within the same socioeconomic level in three areas:
the tendency to becoming upset and angry, feelings of isolation and
loneliness, and the fear of external threat and control. The abusing
mothers differed significantly from nonabusing mothers in a middle
socioeconomic level in the same categories; in their relationship to
their own parents, both past and present; in having higher than normal
expectations for their young children’s performance; and in failing to
separate their own feelings from those of their children. Although not
at a significant level, abusing mothers differed from nonabusing
mothers in the same socioeconomic level in the latter categories as
well. Neglecting parents and spouses of abusers were also shown to be
weak in the six abuse-potential categories.

Personality and attitudinal factors do make a difference. Abusing
mothers differ from nonabusing mothers in areas of attitude and
personality that have been clinically related to potential for abuse
(Colman, 1975; Corey, Miller, & Widlack, 1975; Kent, 1976; Paulson et
al., 1974; Smith, 1975; Spinetta & Rigler, 1972; Steele, 1975; Tracy &
Clark, 1974; Walters, 1975). The fact that neglecting mothers and
spouses of abusers also scored high on the abuse-potential categories
demonstrates the power of the test in pointing to weaknesses in
parental personality and attitudes that can affect the parenting role
itself, regardless of whether the result is actual physical abuse,
neglect of the child, or passively allowing one’s spouse to abuse the
child. Intervention and direction are called for in each case.

As stated above, there is no suggestion made that factors of parental
inadequacy and personality weakness are the sole determinants of child
abuse. Certainly, those involved in the care of the abusing parent
must continue to relieve the family as much as possible of
overwhelming situational stresses. However, personality does play a
role. The therapist who helps the parent develop the ability to
maintain equanimity under stress can be of immense aid in the
rehabilitative or preventive effort.

One must caution that the questionnaire 
cannot be used as a legally valid criterion 
for sorting out abusing from nonabusing par- 
ents, since false positives have been shown 
on occasion (Schneider et al., 1976) and 
since false negatives can appear with those 
parents who refuse to answer the questions 
honestly. It is possible to fake answers by 
giving socially desirable responses. However, 
for those parents in a therapeutic situation 
who respond to the questionnaire with an 
honest desire to be helped, the responses can 
help point to weaknesses in areas that have 
been clinically shown to relate to potential 
for abuse. A therapist who directs interven- 
tional and preventive efforts toward the 
amelioration of parental attitudes, both atti- 
tudes toward the self and toward the child, 
is not, as Alby (1975) suggested, misdirect- 
ing energies, but is rather helping reduce 
deviations from the norm in characteristics 
related to abuse potential and, hopefully, is 
ultimately helping reduce the actual abusive 
behavior. 
Effects of Intestinal Bypass Surgery on Body Concept


Peter M. Silberfarb, Patricia J. Phelps,
Peter Hauri, and Charles Solow


Stability of body concept, as reflected in the Draw-a-Person Test, was
demonstrated in 14 patients undergoing bypass surgery for severe
obesity. Three Articulation-of-Body Concept scores for each patient
were obtained: at the preoperative level, within 1 year subsequent to
surgery, and at least 2 years after the operation. During the first
year after the operation, body concept was significantly poorer than
at the preoperative level. However, patients recovered from their
postoperative disruption of body concept within 2 years after the
operation and then reached at least a preoperative level. Thus, the
shock of the bypass operation seemed to cause a temporary disruption
in this otherwise extremely stable personality dimension.


Field dependency has become an important personality dimension when
attempting to explain individual patterns and styles of behavior
(Witkin, Dyk, Faterson, Goodenough, & Karp, 1962; Witkin, Goodenough,
& Karp, 1967). Field-dependent individuals, when presented with an
organized Gestalt, appear to have difficulty separating out parts of
that whole, whereas field-independent persons are more able to
perceive the discrete parts of a complex pattern as separate from each
other (Witkin et al., 1962). The degree of field dependency within a
certain individual appears to manifest itself in drawings of the human
figure. Specifically, field-dependent persons show less articulation
of body parts in their drawings than do field-independent individuals.

An Articulation-of-Body Concept (ABC) scale for children has been
developed and validated by Marlens (Faterson & Witkin, 1970;
Witkin et al., 1962). This scale basically re- 
flects three characteristics: (a) the extent of 
identity and sex differentiation in the figure 
drawings, (b) the amount of detail in the 
drawings, and (c) the form level of the fig- 
ures (i.e., general body shape and integration 
of features). Low scores indicate relatively
primitive drawings, and higher scores are 
given for more sophistication. As predicted, 
children who score higher on the Marlens 
scale generally are found to be more field in- 
dependent on other measures as well, whereas 
children who draw relatively unarticulated 
figures are more field dependent (Witkin et 
al., 1962). Studies containing adults indicate 
similar relationships (Reitman & Cleveland, 
1964; Witkin et al., 1962). Occasionally, the 
5-point Marlens ABC scale is extended to a 
9-point scale in adult studies, and this 9- 
point version was used in the current evalua- 
tion. 
An individual’s level of field dependency appears to be stable. It is
not changed by such interventions as sensory isolation (Reitman &
Cleveland, 1964), electroconvulsive shock (Pollack, Kahn, Karp, &
Fink, Note 1), and drugs, including chlorpromazine, imipramine, and
alcohol (Karp, Witkin, & Goodenough, 1965; Pollack et al., Note 1).
However, the question still remains as to how much the stability of a
person’s score on the Marlens scale is dependent on the stability of
physical body over time. Although
some of the manipulations studied so far 
have threatened body appearance, none have 
actually changed an individual’s body ap- 
pearance in any dramatic way. 

An ideal opportunity to study the sta- 
bility of body articulation scores presented 
itself when drawings were collected from pa- 
tients undergoing jejunoileostomy, an opera- 
tion that drastically alters body appearance. 
Jejunoileostomy (intestinal bypass surgery 
for massive obesity) leads to a relatively con- 
sistent, substantial, and lasting weight reduc- 
tion. Weight stabilizes after 1 or 2 years, 
usually at somewhat above ideal levels. The 
procedure, however, has distressing side ef- 
fects and complications that are sufficiently 
common to require caution and restraint in 
the use of this therapeutic approach. 

The specific question investigated in this 
study is whether the drastic and lasting 
changes in a person’s body appearance sec- 
ondary to jejunoileostomy lead to temporary 
or chronic changes in body concept as mea- 
sured by the Marlens scale. This question
seemed relevant, because field dependency 
(indirectly measured by the Marlens scale) 
has assumed a fair amount of importance in 
psychological testing and personality theory. 


Method 
Subjects 


Twenty-nine patients were tested and interviewed before and after
jejunoileostomy and at follow-up, at least 2 years later, as part of
an investigation concerning the psychological sequelae of
jejunoileostomy (Solow, Silberfarb, & Swift, 1974). Of these 29
patients, three scorable figure drawings (at preoperative,
postoperative, and follow-up levels) were collected only from 14
patients. In 10 cases, the interviewer did not ask for figure drawings
on all three occasions, because they had not been part of the test
battery as originally planned. Five other patients drew stick figures
on at least one of the three occasions. Such stick figures cannot be
rated on the Marlens ABC scale.

Of the 14 patients who produced sets of three ratable figures, 10 were
women and 4 were men. The ages ranged from 21 to 50 (M = 36 years).
Preoperative weights ranged from 102 kg to 223 kg (M = 151 kg),
representing from 39 to 160 kg over desirable weight (M overweight =
84 kg). Mean weight loss at follow-up was 60 kg, with a range of 32 kg
to 117 kg. Only 2 of the patients had become obese in adulthood; the
others had been overweight since before the age of 16. All patients
had been severely obese for years and had tried many diets without
long-lasting success.
Eight of the patients in this study had been re- 
ferred for jejunoileostomy by their physicians, and 
6 were self-referred, although motivation was mixed 
in most cases. Five sought surgery primarily because 
of somatic concerns, whereas 9 were motivated pri- 
marily by psychosocial concerns such as the need 
to improve appearance, marriage, and so forth. 
Nine patients were considered reasonably well 
adjusted before the bypass operation, and five were 
seen as distinctly impaired psychologically. Among 
the latter, four were diagnosed neurotic, and one 
was believed to have a personality disorder. De- 
ficient self-esteem, marked self-consciousness about 
appearance, and vocational impairment character- 
ized at least two thirds of the total group and ap-
pear to be related to the restriction of physical and
social activity caused by their massive obesity. 


Procedure 


Each patient was interviewed three times by a trained psychiatrist,
once at the preoperative level [within 1 week (M = 4 days) before
surgery], once at the postoperative level [from 161 to 370 days
(M = 235 days) after surgery], and once at follow-up (at least 2 years
after the bypass operation). During each of these interviews, the
patient was asked to draw a person. Interviewers did not know the
purpose of this request and were unfamiliar with the theory behind the
Marlens ABC scale.

For scoring, each drawing was randomly assigned a code number. All
other identifying marks were removed from each drawing except for
identifying the patient as either male or female. Marlens, the
developer of the 9-point ABC scale, then rated the drawings in a
random order. Marlens knew nothing about the purpose of this
experiment except that all drawings came from adults.


Statistical Analysis 


Ratings were evaluated by a two-way analysis of variance using a mixed
repeated measures design. [The 14 patients were interpreted as a
random variable, and the three conditions (preoperative,
postoperative, and follow-up levels), as a fixed variable.] When the
analysis of variance showed significance, differences among the three
conditions were further evaluated by matched t tests.

Pearson correlation coefficients were computed among the three
conditions (preoperative, postoperative, and follow-up levels) and
were adjusted for multiple comparisons by Scheffé’s method.


Results 


Table 1 indicates the mean ABC ratings. They were similar at
preoperative and follow-up levels but were somewhat lower (less well
articulated) at the postoperative period. (For overall results of
other interview data, see Solow et al., 1974.)

Table 1
Mean Articulation of Body Concept Ratings

Time of drawing                  M ABC rating     SD
Preoperative (M = 4 days)            6.29        2.70
Postoperative (M = 235 days)         5.43        2.34
Follow-up (M = 32 months)            6.43        2.50

Note. ABC = Articulation-of-Body-Concept scale.

Tables 2 and 3 (correlational analysis) 
document the consistency of ABC ratings 
over time. Most of the variance in Table 2
was due to variability from one patient to 
the other (F = 18.40, p < .001). Similarly, 
correlations from preoperative to postopera- 
tive and follow-up levels were highly signifi- 
cant, suggesting high test-retest reliability of 
the ABC scale over a fairly extended time 
period. 

Within this overall stability of the ratings, 
there was a slight but nevertheless significant 
decrease of ABC scores postoperatively. This 
is indicated by the significant “time of drawing”
effect in Table 2 (F = 3.19, p < .05)
and by the t-test analyses in Table 3: Scores
significantly decreased from preoperative to
postoperative levels but increased again at
follow-up.

Although the postoperative drawings were
scored significantly lower than either the
preoperative or the follow-up drawings,
weight loss (either in percentages or kilograms)
was not significantly correlated with
the deterioration in ABC scores from preoperative
to postoperative tests.


Table 2
Analysis of Variance for ABC Ratings

Source             SS        df    MS       F
Patients           221.62    13    17.05    18.40**
Time of drawing    5.90      2     2.95     3.19*
Residual           24.10     26    .93

Note. ABC = Articulation-of-Body-Concept scale.
* p < .05.
** p < .001.


Table 3
Comparisons Among the Three Sets of ABC Ratings

                                   Time between
Sets compared                      compared sets       t         r
Preoperative vs. postoperative     M = 239 days        3.39**    ?
Preoperative vs. follow-up         at least 2 years    .33       .81
Postoperative vs. follow-up        at least 1 year     ?         ?

Note. ABC = Articulation-of-Body-Concept scale.


Discussion


The most important result of this study
is the remarkable consistency in the patients’
scores on the ABC scale from the
preoperative level to follow-up, even though
these patients underwent marked changes in
their own body appearance. (Mean weight
loss was 60 kg!) This consistency in ABC
scores in the face of marked body alteration
supports the hypothesis that the degree of
articulation of body concept in adults is not
directly related to the patient’s physical appearance.
In addition, this finding is made
more interesting by the fact that the massive
weight reduction undergone by these patients
must have also affected their postural experience.

One rationale for using human figure drawings
in psychological testing is that they
reflect the patient’s degree of field dependence.
The present study suggests that in
adults, this concept is relatively fixed, because
marked alteration in the physical shape
and size of the body did not change the ABC
scores at follow-up. If body concept were a
more fluid entity, one might expect figure
drawings to change in proportion to the
amount of body change incurred after the
bypass surgery. There was no correlation,
however, between the change in ABC scores
and total weight lost.

A second finding in this study lies in the 
fact that the jejunoileostomy did, tempo- 
rarily, shift the patients toward less sophis- 
ticated figure drawings, possibly indicating a 
temporary shift toward more field depend- 
ency after the operation. It seems unlikely 
that reduced motivation to draw the figure 
could have played some role in causing less 
articulated drawings after surgery. High test-retest
reliability commonly found for ABC
scores suggests that body articulation is only
minimally influenced by transient factors 
such as motivation (Faterson & Witkin, 
1970). Also, it would be difficult to explain 
how reduced motivation could occur so con- 
sistently across subjects. Rather, it seems 
that the postoperative period of intestinal 
bypass surgery is a time of major readjust- 
ment, and one might speculate that this 
drastic alteration of one’s body may have 
disrupted the body image temporarily. 

In summary, jejunoileostomy in massively 
obese patients seemed to cause a temporary 
decrease but no permanent change in the 
articulation of body concept. Thus, field 
dependence seems to remain stable not only 
in relation to the effects of sensory isolation, 
drugs, and electroconvulsive therapy (as previously
demonstrated) but to dramatic
changes in actual body configuration as well.
This stability is in sharp contrast to the
changes noted in other psychological variables
following jejunoileostomy surgery such
as self-esteem and self-consciousness (Solow
et al., 1974). This lends further support to
the concept of field dependence as an enduring
personality trait.


on the Defense Mechanisms Inventory


This study investigated the effect of sex difference, social desirability instructions,
and the birth order of respondents on the Defense Mechanisms Inventory
(DMI). Using 30 male and 30 female undergraduates, half of the subjects in
each group were given regular instructions, and the other half were instructed
to respond so as to present a favorable impression. It was hypothesized that a
sex difference would be found on Turning-Against-Others (TAO) and Turning-Against-Self
(TAS) and that social desirability instructions would result in
significant differences for TAO, Projection (PRO), Principalization (PRN),
and Reversal (REV). It was further hypothesized that firstborns would report
less TAO than later borns. In contrast to previously published reports on the
DMI, a sex difference was found on PRO only. Further, social desirability
effects were found on TAO, PRO, PRN, and REV. Thus, an interpretive
caution is in order regarding the use of the DMI, but its potential clinical
utility suggests that further research is warranted.


Although interest in and use of the notion
of “defense mechanisms” is firmly ingrained
in clinical practice, research on the cognitive
and behavioral operations involved is remarkably
absent in academic psychology. In a
recent publication, Gleser and Ihilevich
(1969) reported on the Defense Mechanisms
Inventory (DMI). The DMI represents the
first attempt to derive a comprehensive, objective,
and behaviorally stated test of the
traditionally defined mechanisms of defense.
The test provides subjects with a checklist
of possible situations, and the subjects are
asked to indicate what would be their most
likely and their least likely reactions. Scores
on five defense clusters are derived from the
responders’ rankings: Turning-Against-Others
(TAO), Projection (PRO), Principalization
(PRN), Turning-Against-Self (TAS), and
Reversal (REV). (See Gleser & Ihilevich,
1969, for a detailed description of each defense
cluster.)



The DMI has found a useful role in the 
medical setting inasmuch as practitioners are 
generally interested in making a statement 
about the structure and function of psycho- 
logical defenses in evaluating surgical and 
other medical patients. Two recently reported 
studies lend some support to the construct 
validity aspect of the DMI in this area. Gur 
and Gur (1975) reported that persons who 
scored high on REV (repressive/denial de- 
fenses) had significantly more psychosomatic 
complaints than persons who used affect-ex- 
pressive defenses of TAO or PRO. Klein, 
Gonen, and Smith (1975) reported that high 
TAS and REV scores were consistent with 
the psychogenic diagnosis of a patient with 
painful ecchymosis following surgery for a 
herniated lumbar disc. Similarly, Scholz 
(1973) reported on 35 suicide attempters 
paired with 35 nonsuicidal neuropsychiatric 
patients. As hypothesized, suicide attempters 
displayed significant differences on the TAS 
dimension. 

Psychometric data on the DMI suggest 
adequate reliability, although one study 
found a sex difference on three of the five 
defense scales, with males scoring higher on 
TAO and PRO and females scoring higher
on TAS (Weissman, Ritter, & Gordon, 1971). 
Gleser and Ihilevich (1969) originally re- 
ported a sex difference for TAS only. Finally, 
in another construct validity study, Gleser 
and Sacks (1973) reported that the DMI 
adequately predicted actual behavior in a 
conflict situation for males but not for fe- 
males. One possible explanation for this find- 
ing is the influence of a social desirability 
factor on the females’ reactions to the experimental
manipulation. In fact, the social desirability
aspects of the DMI have not been
evaluated. Such an evaluation may help 
clarify the potential clinical utility of the 
DMI. Further, since published results are 
inconsistent with regard to sex differences, 
and since it is possible, for example, that the 
effects of sex and social desirability may have 
been influential in the Gleser and Sacks 
study, the DMI was evaluated to determine 
the maximum extent to which it may be in- 
fluenced by social desirability in the context 
of sex differences. It was hypothesized that
sex differences would be found for TAO 
(males higher) and TAS (females higher) 
and that main effects would obtain for social 
desirability instructions on TAO, PRO, PRN, 
and REV. Finally, firstborns were expected 
to use TAO less than later borns. This hy- 
pothesis was based on the findings of War- 
ren (1966) that firstborns tend to be more 
concerned with social desirability than later 
borns and the view that TAO would be seen 
as a less socially desirable mode of conflict 
resolution. It also served as a more subtle
indicator of the social desirability features of
the DMI.


Method 
Subjects 


The study contained 60 undergraduate students 
from an introductory psychology class. There were 
an equal number of males and females ranging in 
age from 18 to 23 years (M = 19.1). 


Procedure 


All students completed the DMI and provided 
information about birth order and age. One group 
was given regular instructions as provided with the 
DMI. The second group was given social desirability
instructions. Each group consisted of 15 males
and 15 females. The social desirability instructions
consisted of a modification of Paragraphs 2 and 4
in the regular instructions. These paragraphs were
modified to read as follows:


What we want you to do is to select the one
answer of the five which you think is most
socially appropriate. That is, select the answer
which is most likely to create a favorable impression
in the social situation, and fill in the
box labeled “T” by the number corresponding to
that answer on the attached answer sheet. Then
select the one answer which you think is least
favorable or least socially appropriate and fill in
the box by that number labeled “F.” Remember:
You are not to answer in the way that you would
necessarily react but in the way a person would
react who is trying to create a favorable social
image.
There are no right or wrong answers here; the
only thing that should guide your selections is
your knowledge about how to create a favorable
image of yourself. Allow your mind to imagine
for a moment that the event described in the
story is really happening to you, even though
you may never have experienced such an event.
Remember, we are not asking what your behavior
and responses would be but rather your opinion
of what the likely behavior would be of someone
trying to present themselves favorably in
our society.


The data were analyzed by multivariate analysis 
of variance with sex and instructions as independent 
variables. A t statistic was derived to test the
hypothesis of a birth-order difference on TAO.


Results 


Multivariate analysis of variance on the
five dependent measures resulted in a significant
overall main effect for instructions,
approximate F(5, 52) = 12.20, <0
(Dixon, 1973). The interaction was not significant.
Subsequently, each dependent measure
was evaluated in a univariate analysis
of variance that produced significant main
effects for sex on PRO, F(1, 56) = 5,
p < .05, and for instructions on TAO, PRO,
PRN, and REV, F(1, 56) = 32.28, p < .001,
F(1, 56) = 16.63, p < .001, F(1, 56) =
12.59, p < .001, and F(1, 56) = 52.28, p <
.001, respectively. Even though no interactions
were significant, there was a tendency
for males to decrease TAS under social desirability
instructions, t(58) = 1.4, P< -4”
whereas females showed no change in TAS in
the social desirability condition. Finally,
firstborns scored significantly lower on TAO
than later borns (Ms = 32.1 and 39.0, SDs
= 7.96 and 7.01, respectively) as predicted,
t(58) = 2.33, p < .05 (see Table 1).


Discussion 


Based on the results of this study, it is 
clear that the question of sex differences in 
defensive style is unresolved. In contrast to 
two earlier studies (Gleser & Ihilevich, 1969; 
Weissman et al., 1971), the present study
failed to demonstrate a sex difference for
TAS. Further, the finding that males scored
higher than females on PRO is consistent 
with Weissman et al., but their additional 
sex difference on TAO did not occur in this 
study. Some of these discrepancies may be a 
function of regional differences in the sample. 
Previous studies were done in Northern cities, 
whereas subjects in this study were primarily 
from the rural South. Specifically, the influ- 
ence of the Southern Baptist religious orien- 
tation (40% of the sample) may have mili- 
tated against the selection of TAO defenses 
in both sexes. The fact that PRO, related to 
TAO in terms of the direction of affective 
expression, was used more by males than 
females suggests that although males did not 
report a direct response against the frus- 
trating object or person, they were willing 
to attribute negative characteristics to frus- 
trating persons. This represents a more subtle 
form of aggression and externalization of 
negative affective experience. It should be
noted that the present male sample had a
mean TAO score of 33.6 compared to the
mean TAO score of 40.9 reported for a college
sample by Gleser and Ihilevich (1969).
The absence of a sex difference for TAS in
the present sample is inconsistent with all
previously published reports. Higher intropunitiveness
in males may also be a function
of the sample, however, and the effects of
socioeconomic level, religious orientation, and
demographic variables should be explored.
Alternatively, one of the most serious shortcomings
of the DMI in its present form is
that it is an ipsative measure. Thus, given
the data from this study, it is likely that one
or the other of the discrepant findings (low
TAO in males or high TAS in males) is related
to the ipsative property of the instrument.
(For a comprehensive treatment of
this psychometric issue, the reader is referred
to Block, 1957; Broverman, 1962; Cattell,
1944.)


Table 1
Means and Standard Deviations for DMI Subscales
With Regular and Social Desirability Instructions

                           Males            Females
Instructions            M       SD       M       SD
DMI
  TAO                   33.6    12.4     36.6    8.1
  PRO                   39.5    5.1      37.3    5.0
  PRN                   46.5    5.6      44.8    5.6
  TAS                   40.1    9.9      40.5    6.8
  REV                   40.7    7.9      40.9    8.5
Social desirability
  TAO                   21.5    12.4     17.9    8.3
  PRO                   35.1    6.0      30.9    4.2
  PRN                   50.5    6.7      51.5    5.5
  TAS                   36.3    8.5      39.7    5.8
  REV                   56.6    10.5     60.0    10.3

Note. DMI = Defense Mechanisms Inventory;
TAO = Turning-Against-Others; PRO = Projection;
PRN = Principalization; TAS = Turning-Against-Self;
REV = Reversal.
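
One way to see the ipsative property in the present data is to add the five subscale means within each cell of Table 1. On an ipsative instrument a respondent's scale scores sum to a fixed total, so the cell means should sum to roughly the same constant regardless of sex or instructions. The following is a minimal sketch under that assumption; the means are transcribed from Table 1 (with decimal points restored where the scan dropped them), and the dictionary layout and variable names are chosen here purely for illustration.

```python
# Subscale means transcribed from Table 1, keyed by (instructions, sex).
TABLE1_MEANS = {
    ("regular", "male"): {"TAO": 33.6, "PRO": 39.5, "PRN": 46.5, "TAS": 40.1, "REV": 40.7},
    ("regular", "female"): {"TAO": 36.6, "PRO": 37.3, "PRN": 44.8, "TAS": 40.5, "REV": 40.9},
    ("social desirability", "male"): {"TAO": 21.5, "PRO": 35.1, "PRN": 50.5, "TAS": 36.3, "REV": 56.6},
    ("social desirability", "female"): {"TAO": 17.9, "PRO": 30.9, "PRN": 51.5, "TAS": 39.7, "REV": 60.0},
}

# Sum the five subscale means within each cell.
cell_totals = {
    cell: round(sum(means.values()), 1)
    for cell, means in TABLE1_MEANS.items()
}

for (instructions, sex), total in cell_totals.items():
    print(f"{instructions:20s} {sex:7s} sum of subscale means = {total}")
```

Every cell total comes out close to 200, so a lower mean on one scale (for example, TAO) must be offset by higher means elsewhere. That dependence among the scales is what makes it hard to interpret any single sex difference in isolation.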

The findings related to social desirability
have important implications for the clinical 
utility of this instrument. The present sample 
readily identified the TAO and PRO items 
as less desirable than the PRN and REV 
items. Thus, if there is any motivation to
present a favorable picture of oneself, it 
seems clear that the instrument in its present 
form does not provide adequate safeguards. 
A more subtle indicator of the social desir- 
ability effect is provided by the finding that 
firstborns, generally considered to be more 
inclined to respond in socially desirable ways 
(Warren, 1966), in fact scored lower on TAO 
(a defensive style that was avoided by per- 
sons given social desirability instructions) 
than did later borns.

In summary, the DMI is a promising in- 
strument in research on defensive and coping 
processes in personality research. It seems 
almost certain that it will find increasing utility
in psychosomatic medicine (Gur & Gur,
1975; Klein et al., 1975). The present study 
provides data that suggest the need for cau- 
tion, however, in using the instrument clini- 
cally because of apparent high susceptibility 
to the influence of social desirability response 
biases. Finally, the ipsative nature of the in- 
strument in its present form has been noted, 
and the psychometric problems posed in this
regard should be explored in subsequent research.
Blood Nicotine and Carboxyhemoglobin Levels
After Rapid-Smoking Aversion Therapy 


M. A. H. Russell, Martin Raw, C. Taylor, 
C. Feyerabend, and Y. Saloojee 
University of London, London, England 


Blood nicotine and carboxyhemoglobin (COHb) levels after rapid smoking were 
studied in 15 smokers. Blood nicotine averaged 48.1 ng/ml after rapid smoking 
compared to 32.4 ng/ml after normal smoking (p < .002), and COHb levels 
averaged 12.1% and 8.9%, respectively (p <.001). Normal smoking levels of 
92 smokers in other studies averaged around 30 ng/ml nicotine and 8.2 to 8.5% 
COHb. There was no evidence that the degree of nicotine and carbon monoxide 
intoxication produced during rapid smoking had any relation to the reduction 
in the desire to smoke immediately after the session or to the decrease in 
cigarette consumption on the following day. The potential risks of rapid smoking
are discussed. It is suggested that these risks might be reduced by using a
beta adrenergic blocker and that the procedure could be made completely safe, 
possibly without loss of treatment effect, if subjects were instructed not to inhale. 


Among the many aversive techniques devised
for the treatment of dependent cigarette
smokers, none have so far shown a consistent
specific effect when subjected to controlled
trial or replication by other research
workers. A possible exception is a method of
rapid smoking developed by Lichtenstein and
his colleagues (Lichtenstein & Danaher,
1976; Lichtenstein, Harris, Birchler, Wahl,
& Schmahl, 1973). The procedure involves
subjects smoking their usual brand of cigar- 
ettes in a rapid and continuous manner, in- 
haling one puff every 6 sec until no further 
smoking can be tolerated. It is not clear how 
much the aversiveness and the subjects’ tol- 
erance limits are determined by nicotine in- 
toxication or by local irritation of the mouth 
and respiratory tract. 

Hauser (1974) has pointed out that excessive
intake of nicotine could induce cardiac
arrhythmia, especially in subjects with
coronary heart disease. Screening to completely
exclude coronary heart disease is
virtually impossible, and a normal resting 
(or even posteffort) electrocardiograph is no 
guarantee of freedom from such disease. De- 
spite Hauser’s plea for caution, “thousands 
of smokers have undergone rapid smoking, 
many of them in a commercial program 
operated by Schick Laboratories” (Danaher, 
Lichtenstein, & Sullivan, 1976, p. 556). 

In a therapeutic situation, a degree of 
risk is justified, provided that it is neces- 
sary to offset an even greater risk. Assess- 
ment of these risks requires consideration of 
both severity and probability of the poten- 
tial consequences of having treatment and 
going without it. Clearly, therapy should go 
ahead only if the balance is in the subjects’ 
favor. Furthermore, the assessment should be
made in consultation with the subjects, for 
it is their own evaluation of the severity of 
consequences that should count. The thera- 
pist’s role in a decision that is not of clear 
and obvious overall benefit to the subject 
should be confined to providing information 
on the nature and probability of possible
hazards.


At present, information on rapid smoking 
is deficient not only with respect to the
degree of hazard, but some doubt also exists
as to whether there is any specific benefit.
Apart from the numerous other factors in
the treatment situation (attention placebo,
self-monitoring, persuasion from the therapist,
etc.), the specific contribution of rapid
smoking has been small and in some cases
only transient (Lando, 1975). Indeed, Lichtenstein
himself in his latest appraisal modestly
concluded that “it must be acknowledged
that the magnitude of the rapid
smoking effect does not appear to be large
. . . the interpersonal/persuasive aspects of
the treatment setting are a significant source
of variance” (Lichtenstein & Danaher, 1976,
p. 99).

Of the many toxins present in tobacco 
smoke, excessive absorption of nicotine and 
carbon monoxide (CO) represent the greatest 
potential hazards of rapid smoking, for it is 
the acute toxic effects that are relevant to 
this situation rather than the long-term 
hazards such as lung cancer. Aronow (1976)
has reviewed the literature on the effects of 
nicotine and CO on coronary heart disease. 
At levels produced by normal smoking, the 
effects of CO and nicotine interact in such 
a way as to increase the risk of a sudden 
heart attack and also to increase the risk 
of sudden death in the event of such an 
attack. The interaction is complex but consists
basically of the setting up of interrelated
vicious cycles. For example, nicotine
increases the oxygen requirement of heart 
muscle, whereas carboxyhemoglobin (COHb) 
reduces its availability. COHb also impairs 
the pumping power and efficiency of the 
heart muscle, leading to further deteriora- 
tion in its oxygen supply. When the heart 
muscle is suffering from lack of oxygen, both 
nicotine and COHb reduce the ventricular 
fibrillation threshold or, in other words, in- 
crease the risk of developing a fatal loss of 
rhythm. 

Few studies have attempted to estimate
the hazards of rapid smoking. One study
showed that blood COHb, an index of CO
absorption, increased from an average of
4.2% before to 7.3% after rapid smoking
(Dawley, Ellithorpe, & Tretola, 1976). Unfortunately,
no details are given as to the
method of rapid smoking used, and crucial
data such as rate, duration, and number of
cigarettes smoked during the session are not
mentioned. Another study by Lichtenstein and
his colleagues (Danaher et al., 1976) showed
that rapid smoking produced a significantly
greater rise in heart rate than normal smoking,
but they also found that CO absorption
was not significantly greater. However,
their method of COHb estimation was somewhat
archaic, and they conceded the need
for further study using more sophisticated
measures. More recently, Horan, Hackett,
Nicholas, Linberg, Stone, & Lukaski (1977)
have reported the occurrence of electrocardiographic
abnormalities during rapid smoking,
and they suggested that these may be
due to nicotine intoxication (Horan, Linberg,
& Hackett, 1977). Indeed, they commented
that “normative data on the amount
of nicotine absorbed by subjects during rapid
smoking are an urgent research priority”
(Horan, Linberg, & Hackett, 1977, p. 346).

We are conducting an evaluative trial of
rapid smoking among subjects attending a
tobacco withdrawal clinic. Full analysis of
our findings is awaiting long-term follow-up.
However, in those subjects randomly assigned
to rapid smoking, we have measured
blood nicotine and COHb levels after normal
smoking and after their first session of rapid
smoking. We have also been able to compare
the levels produced in these subjects by
rapid smoking with the average normal
smoking levels of subjects from a number
of previous studies. Finally, to see whether
nicotine intoxication contributes to the therapeutic
effect, we have examined the relation
between the degree of excessive nicotine intake
during rapid smoking and the reduction
in cigarette consumption on the day after the
first session.

These findings are presented here, and their
implications for the possible hazards of rapid
smoking are discussed. No other study


ing, and excessive nicotine intake is the al 
potential hazard. It was indeed the f 
concern of Hauser (1974). 
Method


Subjects 


The subjects were 5 men and 10 women who
attended the Smokers Withdrawal Clinic at the
Maudsley Hospital, London, England, and who
volunteered to take part in a clinical trial of some
“new behavioral treatments.” The trial in question
was a straight comparative study of rapid
smoking, cue exposure, and simple support. (This
study is still in progress at the stage of long-term
follow-up.) Subjects were excluded if they had a
clinical or family history suggesting the possibility
of coronary heart disease. All men over 40 years
old and women over 50 were excluded even in the
absence of such evidence. Of the 17 eligible subjects
who were randomly assigned to rapid smoking,
complete blood data were missing for 2; 1 had
a needle phobia, and in the case of the other, a
specimen tube was broken in the laboratory.

Procedure 


Venous blood samples were taken on the day 
of the first session of a course of rapid-smoking 
treatment. The sessions took place in the late afternoon
(between 4 p.m. and 6 p.m.), and subjects
were instructed to smoke normally until the start 
of treatment. On arrival at the clinic, they were 
asked to smoke a cigarette in their usual way, 
and a blood sample (normal-smoking sample) was 
then taken 2 minutes after the cigarette was com- 
pleted. The blood nicotine concentration falls rapidly 
from a peak just after a cigarette. In all of our 
studies, we have used a time interval of 2 minutes 
after a cigarette to approximate to the peak level 
of nicotine. The rapid-smoking session was started 
about 15-20 minutes after the normal-smoking 
sample, and a second blood sample was taken as 
close as possible to 2 minutes after completion of 
the last trial of the session. As the sessions were 
given in small groups of four or five subjects, it 
was not always possible to take the blood within 
2 minutes of discarding the last cigarette of the
session. For one or two subjects, it was taken at
4-6 minutes. The error from such delay would tend
to make the rapid-smoking nicotine values appear 
slightly lower than their true levels. 

To have reversed the order of sampling in half 
of the subjects by taking the normal-smoking sam- 
ple after the rapid-smoking sample might have been 
methodologically more correct, but a cigarette taken 
even 30 minutes after a rapid-smoking session would 
not have been quite “normal.” Besides, the subjects
had been smoking normally all day, and the cigarette
smoked for the “normal-smoking sample” would
have little extra effect on the nicotine level of
the “rapid-smoking sample” (Russell, Feyerabend,
& Cole, 1976).

The rapid smoking was conducted as follows:
The subjects were instructed to puff and inhale
deeply every 6 sec and to continue for as long as
possible, even if it meant lighting a second or third
cigarette. This was followed by a 5-min rest, after 
which the sequence was repeated for up to three 
trials if necessary. Many subjects felt unable to 
face a third trial. 

The desire to smoke was rated on a 7-point scale 
before the first and after the last trial. The ratings 
were as follows: 0= none, 1=very slight, 2= 
slight, 3 = moderate, 4= fairly strong, 5 = strong, 
6 = very strong. 


Blood Analysis 


Blood samples were analysed for COHb using 
an IL 182 CO-oximeter (Russell, Cole, & Brown, 
1973) and for nicotine using gas chromatography 
(Feyerabend, Levitt, & Russell, 1975). The labora- 
tory workers who did the analysis were unaware 
of the design or purpose of the study. 


Results 


Individual data are shown in addition to
the means (Table 1). When individual risk
is involved, it is relevant if only 1 case in 20
shows an excessive or unusual response.


Rapid Smoking in Practice 


The amount of rapid smoking undertaken
was subjectively determined by the individ- 
ual’s own tolerance. Seven subjects had three 
trials and seven had two trials, but Subject 
15 (Table 1) could manage only one trial. 
On average, 3.3 cigarettes were smoked dur- 
ing the session at a rate of 2.5 min per cigar- 
ette. Thus, at 1 puff every 6 sec, a mean of 
25 puffs was obtained from each cigarette, 
which is at least double the number taken 
during normal smoking and by the standard- 
ized puffing of the smoking machine at rou- 
tine analysis of tar and nicotine yields 
(Rothwell & Grant, 1974). The total dura- 
tion of smoking during the session averaged 
8.3 min, but it varied widely between sub- 
jects, ranging from 2 min (Subject 8) to 
14.25 min (Subject 4). There was a tendency 
for the amount of rapid smoking tolerated to 
decrease with each successive trial. The average
durations of the trials were 4.6, 3.0,
and 2.1 min for Trials 1-3, respectively, and 
the average number of cigarettes smoked per 
trial were 1.7, 1.4, and .8, respectively. The 
tolerance limits were determined by irritation 
to the mouth and throat (six subjects) or by 
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Table 1 
Blood Nicotine and Carboxyhemoglobin (COHb) 
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Levels After Rapid Smoking 


Plasma nicotine 


COHb (%) (ng/ml) 
Nicotine No. 

Initial daily yield of cigarettes Before After After After 

cigarette cigarettes smoked in rapid rapid normal rapid 

Subjects consumption (mg) session smoking smoking smoking smoking 

Males 

1 26 (45) — 3.5 2.8 5.3 49.6 64.0 

2 34 (27) 1.0 4.3 12.1 15.8 16.2 58.3 

3» 72 (60) 1.9 3S 11.9 14,2 41.6 70.0 

4a 21 (25) a 4.5 9.0 13.1 25.7 54.5 

5 17 (20) 1.4 2.5 9.2 12.6 25.3 29.3 

M 34.0 (35.4) 1.3 3.7 9.0 12.2 31.7 55.2 

+SD 22.2 (16.7) +.5 +.8 +38 +440 +13,6 +15. 
Females 

6 28 (30) 7 4.0 8.0 11.5 29.4 62.8 

7 21 (30) 1.3 2.8 1.7 10.2 32.6 48.3 

8 14 (18) 1.2 1.0 3.9 4.8 12.9 10.8 

9e 34 (40) 7 3.3 7.7 11.0 29.8 53.6 

10° 33 (35) ad 2.6 8.0 10.4 29.8 29.1 

11° 28 (40) 1.3 4.5 10.6 14.5 37.2 44.4 

12 36 (45) 1.2 3.5 10.7 14.4 42.5 58.9 

138 30 (45) 1.2 4.5 10.2 14.4 49.2 38.2 

14" 18 (20) 9 4.0 9.4 13.8 19.2 48.1 

15 24 (23) 1.3 1.5 12.9 15.5 45.5 51.0 

M 26.6 (32.6) ti 3.2 8.9 12.1 32.8 44.5 

+SD +£7.2 (10.0) +.3 +1.2 424 43.2 +11.3 +153 

Total 29.1 (33.5) 14 3.3 8.9 12.1 32.4 48.1 

M + SD 413.7 (12.1) E3 EITE. 434 +11.6 15.1 


Note. The data for daily 


parentheses. Subject 1 rolled his own cigarettes, 


who managed only one trial, 


nausea (nine subjects), but no subject vom- 
ited. The experience of nausea was not sig- 
nificantly associated with either relative or 
absolute increase in plasma nicotine level 
after rapid smoking. 


Carbon Monoxide Intake 


COHb levels before and after rapid smok- 
ing are shown in Table 1. In view of the 
slow removal of CO from the body, especi- 
ally under sedentary conditions, the decrease 
over the 15- to 20-minute period between 
taking the “before” sample and starting rapid 
smoking would haye been small (< .5% 
COHb; Russell, Wilson, Cole, Idle, & Feyer- 
abend, 1973). However, this means that the 
full increase attributable to rapid smoking 


a 
cigarette consumption are the self-recorded levels, with the initial levels claimed in 
so that the nicotine 
ences between men and women were statistically significant. 


* These subjects tolerated three trials during the session; the others all had two trials except Subject 15, 


yield was unknown. None of the differ- 


would have been a little more iol 
average increase per subject of 3.2 (SD= 


-9) % COHb. As expected, the difference ue 
tween the mean COHb before and after by 
smoking was highly significant, eee) 
12.99, p < .0001. The increase in COHb m 
related to the number of cigarettes ae 
in the session, r(14) = .81, p < 01, an À 
the total duration of smoking during Ki 
session, r(14) = .58, p < .05, but it was a 
significantly greater in those who tolera A 
three compared to two trials during the E 
sion (3.5 vs. 2.9% COHb), #(11) = 12, 
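
The paired comparison reported in this section can be retraced from the COHb columns of Table 1. The following is a minimal sketch, not the authors' own computation: the function name `paired_t` is hypothetical, the values are transcribed from the table, and Subject 7's "before" value is assumed to be 7.7, the reading consistent with the reported female mean of 8.9.

```python
import math
import statistics

# COHb (%) before and after rapid smoking for Subjects 1-15 (Table 1).
# Assumption: Subject 7's "before" value is read as 7.7, which agrees with
# the reported female mean of 8.9.
cohb_before = [2.8, 12.1, 11.9, 9.0, 9.2,
               8.0, 7.7, 3.9, 7.7, 8.0, 10.6, 10.7, 10.2, 9.4, 12.9]
cohb_after = [5.3, 15.8, 14.2, 13.1, 12.6,
              11.5, 10.2, 4.8, 11.0, 10.4, 14.5, 14.4, 14.4, 13.8, 15.5]


def paired_t(before, after):
    """Return (mean difference, SD of differences, t, df) for paired samples."""
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample SD (n - 1 in the denominator)
    t = mean_diff / (sd_diff / math.sqrt(n))
    return mean_diff, sd_diff, t, n - 1


mean_diff, sd_diff, t, df = paired_t(cohb_before, cohb_after)
print(f"mean increase = {mean_diff:.1f}% COHb (SD = {sd_diff:.1f}), "
      f"t({df}) = {t:.2f}")
```

Under these assumptions the sketch gives figures close to the mean increase, standard deviation, and t value quoted in the text.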


Plasma Nicotine Levels 


The comparison of plasma nicotine
concentrations after rapid smoking with the
levels obtained from normal smoking is
shown in Table 1. The average of 48.1 ng/ml
after rapid smoking is significantly higher
than the normal smoking mean of 32.4
ng/ml, t(14) = 4.01, p < .002. Plasma
nicotine decreases rapidly over the first 10-15
minutes after completing a cigarette (Isaac
& Rand, 1972), so that the levels after rapid
smoking would have been even higher had
all the blood samples been taken within 2
minutes of completing the last trial. It is
probable that a slight delay in taking the
blood accounted for the few subjects whose
rapid-smoking levels did not substantially
exceed the normal-smoking level (e.g.,
Subjects 8, 10, and 13). However, no record
was kept as to which subjects did experience 
such delay. The average relative excess of 
plasma nicotine after rapid smoking was 
ne above the normal-smoking level (Table 

The plasma nicotine after rapid smoking
was related to the number of cigarettes
smoked in the session, r(14) = .55, p < .05,
and to a lesser extent with total duration
of smoking during the session (r = .44, ns).
There was no significant difference in the
levels obtained by those who tolerated three
versus two trials, for whom the means were
48.3 and 47.5 ng/ml, respectively. Plasma
nicotine levels after normal smoking and
rapid smoking were not significantly
correlated (r = .42, ns). The association between
the subjects’ plasma nicotine levels and their
usual cigarette consumption was statistically
significant for rapid smoking, r(14) = .56,
p < .05, but not for normal smoking (r =
.37, ns). The nicotine yield of the cigarette
smoked seemed to have little effect on the
plasma nicotine level produced (r = .40, ns,
for normal smoking; and r = .06 for rapid
smoking).
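
The relative excess of plasma nicotine after rapid smoking can be recomputed subject by subject from the two plasma nicotine columns of Table 1. The sketch below is illustrative only: `percent_excess` is a hypothetical helper, and the data are transcribed from the table. It contrasts the mean of the individual percentage excesses with the excess of the group means, which need not agree.

```python
import statistics

# Plasma nicotine (ng/ml) for Subjects 1-15, transcribed from Table 1.
normal = [49.6, 16.2, 41.6, 25.7, 25.3,
          29.4, 32.6, 12.9, 29.8, 29.8, 37.2, 42.5, 49.2, 19.2, 45.5]
rapid = [64.0, 58.3, 70.0, 54.5, 29.3,
         62.8, 48.3, 10.8, 53.6, 29.1, 44.4, 58.9, 38.2, 48.1, 51.0]


def percent_excess(rapid_level, normal_level):
    """Percentage by which the rapid-smoking level exceeds the normal level."""
    return 100.0 * (rapid_level - normal_level) / normal_level


# Excess for each subject, then two different group summaries.
per_subject = [percent_excess(r, n) for r, n in zip(rapid, normal)]
mean_of_excesses = statistics.mean(per_subject)
excess_of_means = percent_excess(statistics.mean(rapid),
                                 statistics.mean(normal))

# Subjects whose rapid-smoking level did not exceed their normal level.
not_exceeding = [i + 1 for i, e in enumerate(per_subject) if e <= 0]

print(f"mean of individual excesses: {mean_of_excesses:.0f}%")
print(f"excess of group means:       {excess_of_means:.0f}%")
print(f"subjects at or below normal: {not_exceeding}")
```

The individual values for Subjects 2 and 14 are the ones singled out in the Discussion as exceptionally high.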


Heart Rate 


Pulse rate was counted for only four
subjects. However, it seemed clear that the
increase was produced by the first rapid-smoking
trial and that the second trial produced
no further increase. The four subjects had
a mean rate of 88 per minute before starting
the session. This increased to 106.5 after
the first and 104.7 after the second trial.


Table 2
Lack of Relation of Treatment Effects of the First Session to Rapid-Smoking
Variables and Plasma Nicotine and Carboxyhemoglobin (COHb) Increase
During Rapid Smoking

                                                        % reduction in cigarette
                                      Reduction in      consumption on day
Variable                              desire to smoke   after the session

M for subjects who had
  2 trialsᵃ                           3.7               75.4
  3 trialsᵃ                           2.4               55.8
  t                                   1.04              .99
Product-moment correlationsᵇ
  No. cigarettes smoked in session    .33               .27
  Time taken to smoke each cigarette  -.21              -.38
  Total smoking time in session       .13               -.04
  Increase in COHb%                   .29               .22
  Blood nicotine after rapid smoking  .30               .28
  Ratio of rapid- to normal-smoking
    blood nicotine                    .21               .29

Note. One subject (15) tolerated only one trial. For a sample size of 15, a 5%
level of significance requires a correlation of .51 or more. The reduction in
the rating of desire to smoke over the course of the session correlated .60
with the percentage of reduction in cigarette consumption on the following day.
ᵃ n = 7.
ᵇ n = 15.


Treatment Response to Rapid Smoking 


A full assessment of the value of rapid 
smoking as a treatment for smokers will be 
reported separately after a 1-year follow-up, 
but two potentially relevant variables for 
treatment outcome deserve mention here be- 
cause of their lack of significant association 
with the amount of rapid smoking in the 
session or the degree of nicotine and CO 
intake (Table 2). 

Subjective ratings of desire to smoke.
The average rating of 3.7 (SD = 1.7) before
the session decreased to .6 (SD = 1.0) just
after the last trial, t(14) = 6.87, p < .001.
The subjects had been smoking normally and
had not been deprived before the session, so
the initial rating was not very high.


Table 3
Means ± Standard Deviations for Blood Nicotine and Carboxyhemoglobin Levels
After Rapid Smoking Compared to Normal Smoking Data From Other Studies

                                                           Usual daily              Plasma
                                                 No.       cigarette    COHb        nicotine
Study                                            subjects  consumption  (%)         (ng/ml)

Other
  Russell, Wilson, Patel, Cole, and Feyerabend
    (1973); Russell, Wilson, Patel, Feyerabend,
    and Cole (1975)                              10        27.2 ± 6.9   8.2 ± 2.2   30.1 ± 10.7
  Russell, Wilson, Feyerabend, and Cole (1976)   43        33.3 ± 17.1  8.5 ± 2.6   30.1 ± 12.5
  Russell, Sutton, Feyerabend, Cole, and
    Saloojee (1977)                              21        33.1 ± 11.5  8.2 ± 2.7   31.5 ± 12.8
  Sutton, Feyerabend, Cole, and Russell (1978)   18        28.8 ± 9.8   8.5 ± 2.1   29.0 ± 17.3
Present
  Normal smoking                                 15        29.1 ± 13.7  8.9 ± 2.8   32.4 ± 11.6
  Rapid smoking                                                         12.1 ± 3.4  48.1 ± 15.7

Note. COHb = carboxyhemoglobin. Blood samples were all taken in the afternoon
approximately 2 minutes after completing a cigarette. With the exception of
rapid smoking, all subjects had spent the day smoking their usual brand of
cigarettes in their usual manner.

Self-recorded cigarette consumption. On 
the day after the rapid-smoking session, cig- 
arette consumption decreased on average to 
32.7% (SD =34.4) of the baseline level 
before treatment. Only 4 of the 15 subjects 
failed to achieve a reduction of at least 50%, 
and 3 did not smoke at all. 


Discussion 


There seems little doubt from our data 
that rapid smoking can give rise to blood 
levels of nicotine and COHb that greatly ex- 
ceed those produced by the normal smoking 
of very heavy smokers. Our data from five 
different studies, comprised of 107 smokers, 
are consistent (Table 3). They show that 
our subjects have been heavy smokers with 
an average consumption of about 30 cigar- 
ettes per day compared to a national average 
in Britain of 22 per day for men and 16 per 
day for women (Lee, 1976). With normal 
smoking, COHb levels taken during the after- 
noon approximately 2 minutes after a cigar- 
ette averaged 8.0%-8.5%, whereas plasma 
nicotine averaged about 30.0 ng/ml. In
contrast, rapid smoking produced an average
COHb of 12.1% and a plasma nicotine
concentration of 48.1 ng/ml. The average
increase in % COHb during a rapid-smoking
session was 3.2, which is the same as the
increase found by Dawley et al. (1976).
Since nicotine and CO intake from normal
smoking constitute a risk for people with
coronary heart disease (Aronow, 1976), the
risk must be greater during rapid smoking.
Unfortunately, the magnitude of the risk is
not easily assessed. In the first place, no
one has yet attempted to quantify the
short-term risk of, say, 1 day of normal smoking to
subjects with severe coronary heart disease,
let alone for subjects without evidence of
such disease. Had our data shown that
plasma nicotine and COHb levels after rapid
smoking were no higher than after normal
smoking, we could have concluded that the
risks did not increase. This was clearly not
the case, so that one is beholden to make
some attempt to assess the risks.

On the average, plasma nicotine levels
after rapid smoking were about 50% higher
and COHb levels 36% higher than after
normal smoking. It is unlikely that the
hazards increase in simple linear relation to the
blood levels. They probably accelerate
positively from some point. This probably also
occurs with the adverse synergistic
interaction between nicotine and COHb on cardiac
function (Aronow, 1976). It would,
therefore, be reasonable to assume that the risks
are probably increased far more than is
indicated by simple comparison of blood levels.
Furthermore, assessment of risk should not
be confined to consideration of averages.
Some individuals had exceptionally high
blood nicotine and COHb levels after rapid
smoking (Table 1). Subjects 2 and 14 had
nicotine levels that were, respectively, 260%
and 150% higher than their normal smoking
level.

In a recent article, Horan, Linberg, and
Hackett (1977) attempted to assess the
extent to which nicotine poisoning occurs in
rapid smoking. Most of their information was
derived from an outdated edition of a
well-known textbook of pharmacology (Goodman
& Gilman, 3rd ed., 1965, rather than
Goodman & Gilman, 5th ed., 1975), and their
article contains several errors and
misinterpretations. They stated, for example, that a
single cigarette yields about 6-8 mg of
nicotine, yet the average nicotine yield of
cigarettes smoked in the United States in 1975
was 1.2 mg, and the strongest brand on the
market yielded only 2.1 mg (Owen, 1976).
Even the hardest puffing smoker would find
it difficult to abstract more than about three
times the standard machine-smoked yield
from a cigarette. Horan, Linberg, and
Hackett also assumed that increasing the puff
rate would proportionally increase the
number of puffs, and hence the nicotine dose,
obtained from a single cigarette (puff
volume remaining the same). This is simply not
true. It is doubtful whether a 10-fold
increase in puff rate (from 1 per min in the
case of standard machine smoking to 1 per
6 sec as in rapid smoking) would increase
the number of puffs obtained from a
cigarette by more than a factor of three. The
average of 25 puffs per cigarette obtained by
our subjects with rapid smoking represents
a factor of approximately 2.5, and this was
probably due more to a reduction in puff
volume than the increase in puff rate. Horan,
Linberg, and Hackett seemed not to
appreciate that increases in puff rate produce
higher levels of plasma nicotine (but not
COHb), not so much by increasing the
nicotine dose per cigarette as by increasing the
rate at which the dose is taken.

In our view, the incidence of cardiac com- 
plications is of greater significance than the 
occurrence of nicotine poisoning or
intoxication. Nicotine intoxication per se is fairly
rapidly reversible. Indeed, the pharmaco- 
logical effects of nicotine that are evident in 
many smokers after normal smoking are man- 
ifestations of a degree of nicotine intoxica- 
tion. So-called “green-tobacco sickness,” an 
occupational illness of tobacco harvesters, is 
a form of nicotine intoxication due to ab- 
sorption through the skin during tobacco 
cropping (Gehlbach et al., 1975). The symp- 
toms include nausea, vomiting, dizziness, and 
prostration, far exceeding those usually en- 
countered after rapid smoking. Mortality and 
long-term effects of green tobacco sickness 
have not been documented, but during the 
1973 harvesting season an estimated 9% of 
North Carolina’s 60,000 tobacco growers re- 
ported such illness among their workers (Gehl- 
bach et al., 1975). As far as rapid smoking is 
concerned, the severity of manifest nicotine 
intoxication is relevant only insofar as it is an 
index of nicotine dosage, and hence the risk 
of serious cardiac complications that could 
persist after nicotine intoxication has passed. 
However, the relation of cardiac complica- 
tions to signs of nicotine intoxication is un- 
likely to be very close. One reason is that the 
incidence of cardiac complications depends 
on the extent of underlying coronary heart 
disease. Another reason is that the tolerance 
mechanisms probably differ for the effects of 
nicotine on the brain as opposed to the heart 
and circulation. Finally, in our subjects the 
experience of nausea after rapid smoking was 
not significantly associated with high nico- 
tine intake. 

In view of the complexities, it is virtually 
impossible to make any meaningful quantita- 
tive estimate of the risks of rapid smoking on 
the basis of the blood level data alone. Since 
the main concern is for the cardiac
complications, the most relevant approach to
assessing the risks should certainly include
careful electrocardiograph analysis during and
after rapid smoking (Horan, Hackett,
Nicholas, Linberg, Stone, & Lukaski, 1977).
Ironically, though it is usual to make some
estimate of risk before widespread use of a new
treatment approach, the most reassuring evi- 
dence of the relative safety of rapid smoking 
is its use for some years with “thousands 
of smokers” (Danaher et al., 1976) without 
apparent mishap. 

The most careful screening cannot exclude 
the presence of even quite advanced coronary 
artery disease. Rapid smoking must, there- 
fore, involve a greater risk to older subjects, 
35 and up, especially if they are men. Besides 
careful screening, any risk of cardiac arrhyth- 
mia due to rapid smoking could be further 
reduced by the administration of a beta 
adrenergic blocker such as oxprenolol. This 
drug has been shown to reduce the increase 
in heart rate due to normal smoking
(Carruthers, 1976). This approach might be
especially relevant to those cases who are at
greater risk but for whom it is particularly 
important to give up smoking. 

As mentioned above, a degree of risk is 
more justified if rapid smoking can be shown 
to be the most likely way to enable a person 
to stop smoking. It was not the purpose of 
this article to evaluate rapid smoking as a 
treatment for smokers. Nevertheless, the fact 
that neither COHb increase during rapid 
smoking nor plasma nicotine levels after rapid 
smoking correlated significantly with reduc- 
tion in cigarette consumption, on the one 
hand, or lowering of the desire to smoke, on 
the other (Table 2), suggests that excessive
intake of nicotine and CO may not be
necessary elements of any treatment effect.
Furthermore, it seems that there is little to be
gained by pressing subjects to undertake more
than two rapid-smoking trials per session,
since subjects who tolerate only two trials
show, if anything, a greater reduction in both
desire to smoke and cigarette consumption
than do those who go on to a third trial. This
is not explained by the possibility that
heavier smokers might be more likely to tolerate
three trials, since mean cigarette
consumption did not differ significantly in the two
groups (t = 1.15, ns). Of course, the
possibility remains that two-trial subjects might have
done better if pushed to do a third, or that 
three-trial subjects would have shown even 
less change after only two trials. 

It could be argued that the lack of
correlation between early treatment outcome and
nicotine and CO intake during rapid smoking
was due to differences in individual
tolerance. For example, a nicotine level of 30
ng/ml might be highly aversive to one subject
but relative deprivation to another whose
regular smoking level was around 50 ng/ml.
However, this is an unlikely explanation,
since the ratio of rapid-smoking to
normal-smoking plasma nicotine level did not
correlate significantly with reduction in desire to
smoke after the session or with the reduction
in cigarette consumption on the following
day (Table 2).

Our data on heart rate increase (21%
after the first trial and 19% after the second)
show greater changes than the 9% increase
reported by Hynd, O’Neal, and Severson
(1976) but are in accord with the more
substantial, controlled study by Danaher et al.
(1976). A larger effect on heart rate could be
expected when rapid smoking follows a
period of abstinence.
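
The percentage increases quoted above follow from the mean pulse rates given in the Heart Rate section (88 per minute before the session, 106.5 after the first trial, 104.7 after the second). A minimal arithmetic sketch, with a hypothetical helper name:

```python
# Mean pulse rates (beats per minute) for the four subjects, from the
# Heart Rate section of the Results.
baseline = 88.0
after_trial = {1: 106.5, 2: 104.7}


def percent_increase(value, base):
    """Percentage increase of `value` over `base`."""
    return 100.0 * (value - base) / base


increases = {trial: percent_increase(rate, baseline)
             for trial, rate in after_trial.items()}

for trial, pct in increases.items():
    print(f"Trial {trial}: {pct:.0f}% above baseline")
```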


It is concluded that rapid smoking can
produce excessively high blood nicotine and
COHb levels and that this constitutes a risk
to all but the younger smoker. On the data
available at present, the level of risk is
impossible to calculate in quantitative terms. It
could probably be reduced by a beta
adrenergic blocking drug. Since excessive nicotine
and CO intake may not be necessary for a
treatment effect, it is suggested that the
procedure could be made completely safe and
possibly no less effective if subjects are
instructed to take the smoke into the throat
but to avoid inhaling it into the lungs, for it
is only by inhalation that dangerously high
levels are produced.


The Hand Dynamometer as a Neuropsychological Measure


Carl B. Dodrill 
Department of Neurological Surgery
University of Washington School of Medicine 


The sensitivity of the hand dynamometer to the presence of brain damage and 
to its lateralization was evaluated and compared with that of the Tapping Test 
and the Tactual Performance Test. Four groups of 25 subjects each were 
studied (control, right-hemisphere damage, left-hemisphere damage, and bilat- 
eral damage). Measures of performance on each test included those of each 
hand taken separately as well as their sum. To identify the lateralization of 
brain lesions, a method was developed that used the control group as a basis 
for comparison and that simultaneously considered the relative performances 
of each hand on each task. All test variables discriminated between the control 
and brain-damaged groups at high levels of statistical significance. Furthermore, 


the dynamometer discriminated between these groups as well as did the Tapping 
Test and Tactual Performance Test. Finally, the dynamometer correctly identi- 
fied the lateralization of brain lesions in more instances than either of the other 
tests. It is concluded that the hand dynamometer is a neuropsychological mea- 


sure of considerable promise. 


Many years ago, Halstead (1947)
demonstrated that assessment of voluntary motor
movement could be useful in evaluating the
integrity of brain functions using such
measures as the Tapping Test and the Tactual
Performance Test. Reitan (1966) expanded
the use of these measures by demonstrating
that differences in performance between the
two hands are related to the relative
functioning capabilities of the two cerebral
hemispheres. Thus, by examining level of
performance and by comparing the two sides of
the body, these tests of motor speed and
agility were established as reliable indicators
of the integrity of brain functions.

Clinicians have realized that intensity of
strength of voluntary motor activity might
also be a reliable indicator of brain functions,
and many use some strength-of-grip measure
in neuropsychological assessment. Such use
has led to an establishment of its clinical
value as well as to a listing of the
dynamometer by Reitan and Davison (1974) as a
neuropsychological measure. On a formal
basis, Reitan (1974) demonstrated that the
strength of grip of young brain-damaged and
nonneurological children (ages 5-8) differed
only slightly. Boll (1974), in working with
older children (ages 9-14), found much more
striking differences. No parallel studies have
been done with adults that have directly
compared brain-damaged persons with
nonneurological controls, and none have
evaluated the dynamometer with respect to the
correct placement of lateralized lesions. The
present study addresses these areas and
evaluates the utility of the hand dynamometer in
comparison with two other better known
neuropsychological measures (Tapping Test,
Tactual Performance Test).


Method 
Subjects 


Four groups of adults (ages 15 and over) were
formed, with each group consisting of 25 persons.
Subjects in the control group had negative
neurological histories. They had never had any disease
that might have affected the nervous system
(meningitis, encephalitis, polio, diabetes, rheumatic fever,
scarlet fever, etc.), and they had no histories of high
fever, partial drowning, exposure to gas, heat
exhaustion, fainting spells, or head trauma. They were
recruited from a variety of community resources
including churches, schools, and employment agencies.

Three groups of brain-damaged persons were
selected on the basis of the primary location of brain
damage (right hemisphere, left hemisphere, both
cerebral hemispheres). In each group, there were 5
individuals with intrinsic brain tumors, 11 with a
history of head trauma, and 9 with cerebral
vascular problems. Neurological diagnoses were established
by anamnestic information, angiography,
pneumoencephalography, electroencephalography, skull
x-rays, neurosurgical findings, and autopsy.

Across all groups, a subject-by-subject matching
procedure was maintained for the variables of sex
(there were 20 males and 5 females in each group),
race (all subjects were Caucasian), and handedness
(all subjects were right-handed). Within the
brain-damaged groups, the subject-by-subject matching
procedure included the general type of neurological
difficulty (neoplastic, traumatic, vascular). Finally,
within each set of 4 persons (1 from each group),
matching was completed as closely as possible for
age and years of formal education, with the result
that each group averaged approximately 41.14 years
of age and 10.68 years of education.

As part of their neuropsychological evaluations,
all subjects were administered the dynamometer,
the Tapping Test, and the Tactual Performance
Test. Attention was given to the administrative
procedures suggested [...] emphasis on maximal
performance. To assess strength of grip, the
Smedley Hand Dynamometer was used,
which registers strength in kilograms. Two trials
were given in alternating fashion for each hand
beginning with the right (preferred) hand, and the
average of the two trials was used as the final score
for each hand.

Because the Tactual Performance Test provided a
total time score summing all trials (including right
hand, left hand, and both hands), summary scores
(right plus left) for the Tapping Test and the
dynamometer were also provided in addition to the
usual scores for each hand alone.


Analyses 
To evaluate the discriminability of the tests,
univariate analyses of variance were run across the
four groups for each test variable, and evaluations
of significant differences between groups were
assessed by the Newman-Keuls procedure (Winer,
1971). In these analyses, homogeneity of variance
was maintained by converting all data to normalized
standard scores with a mean of 50 and a standard
deviation of 10. Performances on the Tactual
Performance Test were considered on a
minutes-per-block basis.

The effectiveness of each of the three tests in 
implicating lateralized damage was assessed using 
only the subjects with lateralized lesions. The per- 
formance by the left (nonpreferred) hand was di- 
vided by the performance of the right (preferred) 
hand so that in each instance a left-to-right com- 
parison in performance could be made with a single 
score. The mean and standard deviation of this 
score for the control group were computed, and 1 
standard deviation on either side of the mean was 
arbitrarily selected as the limit of normal
performance. The performance of each subject in
the right and left brain-damaged groups was then
compared with this standard. When the score
for any brain-damaged patient on each of the
three measures considered separately indicated that
the right hand was not performing as well as would
be expected, the left cerebral hemisphere was
considered to be implicated by that measure, and vice
versa. Chi-square statistics were applied to the
numbers of subjects who were classified by each
measure.

Results 


The discriminability of each
neuropsychological variable considered on a
group-by-group basis is given in Table 1. Highly
statistically significant differences across the
groups were found with respect to every
variable, and the control group did better
than all brain-damaged groups in every
instance. The dynamometer discriminated
between the normal and brain-damaged groups
as well as did either of the other tasks.

The lateralization data are presented in 
Table 2. If performance fell within the nor- 
mal (1 standard deviation) range, neither 
hemisphere was considered implicated and 
placement was made in the “neither” group. 
When one hemisphere or the other was impli- 
cated, all three tests classified a majority of 
individuals correctly, and the dynamometer 
correctly classified the largest number. 


Discussion 
The high level of discriminability
demonstrated by the dynamometer between normal
and brain-damaged subjects was unexpected.
It is true that the brain-damaged groups had
Table 1
Data on All Test Variables for All Groups

                     Control    Right damage    Left damage    Bilateral damage
Test and variable    M    SD    M    SD         M    SD        M    SD             F
Dynamometer 
Right hand 48.12bed 13.39 33.948 12.07 31.42» 16.85 36.16° 9:30) AAN 
Left hand 44.86% 12.15 21.4204 15.18 37.21%» 12.75 32.44%b 11.83 13,73 
Total (right + left) 93.96re4 25.12 55.36% 22.24 68.61%> 26.48 68.60%> 19.61 1043 
Tapping 
Right hand 53.44>e4 6.23 41.768 9.00 37.00" 17.28 40.368 11.21 9.64 
Left hand 49.60%e¢ 5.37 30:72=e 14.30 39.2454 11.50 34.60%° 10.28 1711 
Total (right + left) 103.0404 10,43 72.488 20.27 76,248 26.08 75.44" 18.97 11.96 
Tactual Performance 
Right hand -T6b.ed 50 4.428 5.54 5.70%, 6.03 2.54%¢ 3.28 12.25 
Left hand Odbod 137 6.83804 6.85 2.778 4.69 245mb 3,308.53 
Both hands 39d 21. 4.248 5.94 2.278 4.29 1.59 3.15 813 
Total (all trials) 59bed 32 4, 14ed 5.25 2.638 4.18 1.74%» 2.16 — 11.07 


Note. F statistics were computed on the basis of T scores. All Fs were significant at the .001 level. n = 25. 
Superscripts designated groups with statistically different performances (p < .01). 


a Control.
b Right damage.
c Left damage.
d Bilateral damage.


unequivocal evidence of cerebral involve- 
ment. It is also true that the nonneurological 
group consisted of “off the street” individuals
rather than the hospital populations usually 
studied (Halstead, 1947; Reitan, 1955; Vega 
& Parsons, 1967). These facts may have 


Table 2
Numbers of Subjects in the Right- and Left-Damaged Groups Classified
According to the Lateralizing Implications of Their Performance

                         Hemisphere implicated
Test and group         Right    Left    Neither    χ²

Dynamometer
  Right damaged        12       1       12
  Left damaged         3        13      9          16.88**
Tapping
  Right damaged        10       6       9
  Left damaged         3        7       15         6.01*
Tactual Performance
  Right damaged        9        3       13
  Left damaged         2        7       16         6.39*

* p < .05.
** p < .001.


served to accentuate the general differences
between the groups, but they did not give
the Tapping Test or the Tactual Performance
Test any noticeable edge in discriminant
ability over the dynamometer. This was
particularly surprising in view of the extreme
simplicity of the dynamometer. An incidental
observation was that when performances by the
preferred hand alone were considered, control
subjects outperformed their matched
brain-damaged subjects 91% of the time with the
dynamometer, 82% of the time with the
Tapping Test, and 84% of the time with the
Tactual Performance Test. Thus, it appears
that the dynamometer does effectively
discriminate between normal and
brain-damaged adults when consideration is made
either on a subject-by-subject or on a
group-by-group basis.

The discriminability of the dynamometer
may in part relate to the age of the subjects
to whom it is administered. It is of interest to
note that Reitan (1974) showed only minimal
discrimination between normal and
brain-damaged young children with the
dynamometer, but Boll (1974) showed better
discriminability consistent with that obtained with
older children. The reasons for this are not
clear, although it is possible that the test is
most useful when a brain insult occurs well
after the development of cerebral dominance.
The relatively good lateralization of lesions
by the dynamometer was somewhat
surprising. Admittedly, the criterion of 1 standard
deviation above or below the control mean is
arbitrary. Furthermore, it leads to findings
that are, if anything, conservative in
implicating one cerebral hemisphere or the other.
For example, the performance by the control
group with the left hand on the Tapping Test
was approximately .93 that of the
performance on the right hand. The standard
deviation was .09, so that any score from .84
through 1.02 was considered within normal
limits, whereas scores less than .84 implicated
the right cerebral hemisphere and scores
greater than 1.02 implicated the left cerebral
hemisphere. If one assumes that a person’s
right (preferred) hand averages 50 on the
Tapping Test, an identical performance by
the left hand would fall in the range of
normal limits, whereas clinical interpretation
would definitely suggest that the right hand
was slow. Furthermore, the score with the
left hand would have to be 41 or less in order
to implicate the right cerebral hemisphere,
whereas in clinical practice scores of 42 or
43 would certainly raise the question of
slowness with respect to the left hand. With
the procedure being somewhat conservative,
it is not surprising to discover that 42% to
58% of the people evaluated in the
lateralization analysis (Table 2) had performances
that implicated neither cerebral hemisphere


likely to account for [...]. When decisions
were made, they were correct in 86%
of the cases for the dynamometer and in 65%
and 76% of the cases for the Tapping Test
and the Tactual Performance Test,
respectively.


The question can be raised as to whether or
not a cutoff score should be established for
the dynamometer in the same fashion that it
has been established for the Halstead
measures. This appears unwise, because (a) the
number of subjects in the present study is
too small to constitute an adequate standard- 
ization sample, (b) there are obvious sex 
differences that would require separate 
norms, (c) certain vocational and avocational 
activities of individuals may affect scores on 
this test, and (d) the accuracy of the Smed- 
ley dynamometers depends on a spring that 
may become weakened with use and lead to 
error in measurement. Therefore, no effort 
has been made to establish a cutoff score. 
Overall, the hand dynamometer both dis-
criminates between normal and brain-dam-
aged persons and lateralizes lesions as well
as do existing measures. It appears to be a
promising neuropsychological measure that 
warrants both clinical use and further formal 
evaluation, especially in consideration of the 
brief administration time required vis-à-vis 
the other two neuropsychological measures 
(Tactual Performance Test, Tapping Test) 
conventionally used in the Halstead-Reitan
battery.


An Initial Look at the
Redundancy of Specialized MMPI Scales 


James R. Clopton and Gary L. Klein 
Texas Tech University 


Numerous specialized Minnesota Multiphasic Personality Inventory (MMPI)
scales have been developed despite speculation that the information provided
by specialized scales replicates information that can be obtained from the 13
MMPI scales of the standard profile. In this study, scores of three specialized
MMPI scales (the Prejudice scale, the Ego-Strength scale, and the MacAndrew
Alcoholism scale) were found to be highly related to the scores of the standard
MMPI scales. However, individual scores of the three specialized scales could
not be accurately predicted from the standard scales. Furthermore, alcoholic
and nonalcoholic psychiatric patients were more accurately identified by the
13 standard scales than by the MacAndrew Alcoholism scale.


The Minnesota Multiphasic Personality 
Inventory (MMPI) originally included three 
validity scales that assess test-taking atti- 
tudes and 10 clinical scales that identify 
common types of abnormal behavior. In ad- 
dition to these 13 standard scales, numerous 
specialized MMPI scales have been con- 
structed. Some of these specialized scales 
have been constructed to measure common 
personality dimensions such as dependency 
and prejudice. Other specialized MMPI 
scales have been developed to identify pat- 
terns of abnormal behavior, such as alcohol- 
ism, that are not assessed directly by any of 
the standard clinical scales. Interest in spe-
cialized scales has been increasing recently
(Graham, 1978), and a number of special-
ized scales are routinely scored by automated
MMPI interpretive systems.

The latest edition of the MMPI Hand- 
book (Dahlstrom, Welsh, & Dahlstrom, 1975) 
listed 455 specialized scales that measure per-
sonality variables or identify patterns of ab-
normal behavior. Many of the specialized
scales are too limited for widespread use, but
the use of some specialized MMPI scales is
quite common. Three specialized MMPI
scales in common use are the Prejudice scale
(Gough, 1956), the Ego-Strength scale (Bar-
ron, 1953), and the MacAndrew Alcoholism
scale (MacAndrew, 1965). The Prejudice
scale was designed originally to measure anti-
Semitic prejudice, but it also appears to as-
sess the broader trait of rigidity in thinking.
The Ego-Strength scale is frequently used to
help predict the likelihood that a person will
profit from receiving psychotherapy. The
MacAndrew scale has been shown to differen-
tiate between alcoholic and nonalcoholic pa-
tients in a variety of treatment settings.

Despite the attractiveness of developing
new MMPI scales for specialized tasks and
the popularity of some specialized MMPI
scales, there are serious questions regarding
the use of specialized MMPI scales. A
basic question concerns the possible redun-
dancy or superfluity of the information pro-
vided by the specialized scales. Caldwell
(Note 1) asserted that he could predict
specialized scale scores so well from the
scores of the 13 standard MMPI scales that
the specialized scales did not appear to pro-
vide any information beyond that which




Table 1
Comparison of Observed and Predicted Raw Scores

Scale          Sex       n      M      SD     Range          R²     F        Accuracy

Prejudice
  Observed     Male     170    8.52   4.41    0-22           .71   28.87*
  Predicted    Male     170    8.52   3.70    1.54-19.16                     45.3
Ego-Strength
  Observed     Male     112   42.70   7.25    21-57          .66   14.64*
  Predicted    Male     112   42.70   5.89    26.75-52.51                    53.6
  Observed     Female    85   38.95   7.30    23-57          .67   10.87*
  Predicted    Female    85   38.95   5.96    25.62-49.70                    68.2
MacAndrew
  Observed     Male     112   24.78   5.02    14-38          .62   12.29*
  Predicted    Male     112   24.78   3.95    16.01-35.04                    59.8
  Observed     Female    85   21.84   4.70    8-32           .46    4.72*
  Predicted    Female    85   21.84   3.20    14.47-31.83                    60.0

Note. The accuracy of the predicted scores is the percentage of subjects whose
observed scale score was within the 95% confidence limits of the value predicted
from the regression equation.
*p < .0001.


could be provided by the standard scales. 
Unfortunately, Caldwell’s data have not been 
published, and other data regarding the pos- 
sible redundancy of specialized MMPI scales 
are lacking. 


Method 


MMPI data were obtained from 197 psychiatric 
patients and from 170 male applicants for jobs with 
police and fire departments. The psychiatric pa- 
tients and job applicants have been evaluated at a 

regional mental health center during the last 7 
years. MMPI data from the psychiatric patients 
were used to investigate the clinically oriented Ego- 
Strength and MacAndrew scales. MMPI data from 
the job applicants seemed more appropriate for an 
investigation of the nonclinical Prejudice scale. 

Regression analysis was used to determine how 
well subjects’ specialized scale scores could have 
been predicted from their scores on the 13 standard 
MMPI scales. For the Ego-Strength scale and the
MacAndrew scale, separate analyses were per-
formed for male (n = 112) and female (n = 85)
psychiatric patients.

A review of the psychiatric patients’ records re- 
vealed that 48 patients (24.4%) had difficulties 
directly related to alcohol abuse. Discriminant anal- 
ysis was used to determine whether alcoholic and 
nonalcoholic patients are more accurately identified 
by the MacAndrew scale or by the 13 standard 
MMPI scales. 


Results 


Table 1 presents the results of the regres-
sion analyses. The observed scores for all
three specialized scales were significantly re-
lated to the scores of the 13 standard scales,
and a large portion of the variance in each 
of the specialized scales was accounted for by 
the 13 standard scale scores. 

The regression equations were used to de- 
rive predicted specialized scale scores for sub- 
jects. It was then determined, for each of 
the three specialized scales, whether each 
observed scale score was within the 95% con- 
fidence limits of the predicted score. Approxi- 
mately 60% of the predicted scores for the 
Ego-Strength and MacAndrew scales were 
found to be accurate by this criterion (see 
Table 1). For the Prejudice scale, 45.3% of 
the predicted scores were accurate. Thus,
many of the individual scores of the three 
specialized scales could not be accurately 
predicted from the standard scale scores. 

A ready explanation for the inaccurate pre- 
dictions of the specialized scale scores is that 
the variance among the observed scores of 
each specialized scale was underestimated by 
the predicted scores (see Table 1). Imperfect 
multiple correlation (1.00 > R > -1.00) and
use of a regression equation with a least- 
squares prediction rule assured the reduced 
variance among the predicted scores (Hays, 
1963, pp. 500-501). As a consequence of the 
underestimation of the variance in specialized
scale scores, prediction was most accurate
for scores close to the mean scale score, and
the prediction became increasingly less accu- 
rate as the scores departed from the mean. 
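
The variance shrinkage described above can be illustrated with a small sketch. For any least-squares regression with an intercept, the standard deviation of the predicted scores equals the multiple correlation times the standard deviation of the observed scores, so predictions are pulled toward the mean whenever R is below 1. The example below uses a single predictor and made-up data purely for brevity (the study used 13 predictors); the variable names are hypothetical.

```python
from statistics import mean, pstdev


def least_squares_fit(x, y):
    """Return (slope, intercept) of the least-squares line predicting y from x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return slope, my - slope * mx


def pearson_r(x, y):
    """Pearson product-moment correlation between x and y."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


# Hypothetical data, not taken from the study.
standard = [1, 2, 3, 4, 5, 6, 7, 8]
specialized = [2, 1, 4, 3, 7, 5, 8, 6]

slope, intercept = least_squares_fit(standard, specialized)
predicted = [intercept + slope * s for s in standard]

r = pearson_r(standard, specialized)
sd_ratio = pstdev(predicted) / pstdev(specialized)
print(round(r, 3), round(sd_ratio, 3))  # the two values coincide

# Same check on the Prejudice scale SDs reported in Table 1.
table_sd_ratio = 3.70 / 4.41
print(round(table_sd_ratio, 2), round(table_sd_ratio ** 2, 2))
```

The predicted scores keep the observed mean but have a smaller spread, which is why extreme observed scores are the ones most often missed.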
The discriminant analysis with the 13 
standard MMPI scales as predictors cor- 
rectly classified all of the female psychiatric 
patients as alcoholic or nonalcoholic and cor- 
rectly classified 91.1% of the male patients 
(85.3% of the male alcoholics and 93.6% 
of the male nonalcoholic patients). In con- 
trast, the discriminant analysis using the 
MacAndrew scale as the predictor correctly 
classified most of the nonalcoholic patients 
(100% of the female nonalcoholic patients 
and 89.7% of the male nonalcoholic pa- 
tients), but it identified few of the alcoholic 
patients (7.1% of the female alcoholic pa- 
tients and 35.3% of the male alcoholic pa- 
tients). The optimal cutoff score for the Mac- 
Andrew scale (the scale score that would 
most correctly classify patients) was 27 for
female patients and 25 for male patients.
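
The text reports classification rates only as percentages. As a rough consistency check, the sketch below searches for the subgroup sizes that are compatible with those percentages, given 112 male patients and 48 alcoholic patients in total. The counts it recovers are inferred from the rounded percentages, not reported by the authors, and the helper name is hypothetical.

```python
def hit_counts(pct, n):
    """Return every count k for which k out of n rounds to pct (one decimal)."""
    return [k for k in range(n + 1) if round(100 * k / n, 1) == pct]


TOTAL_ALCOHOLIC = 48   # from the Method section
MALE_PATIENTS = 112    # from the Method section

# A number of male alcoholics is plausible only if every reported rate
# corresponds to a whole number of patients.
male_alc_candidates = [
    m for m in range(1, TOTAL_ALCOHOLIC)
    if hit_counts(85.3, m)                       # standard scales, male alcoholics
    and hit_counts(35.3, m)                      # MacAndrew scale, male alcoholics
    and hit_counts(7.1, TOTAL_ALCOHOLIC - m)     # MacAndrew scale, female alcoholics
]
male_alcoholics = male_alc_candidates[0]
male_nonalcoholics = MALE_PATIENTS - male_alcoholics

hits_alc = hit_counts(85.3, male_alcoholics)[0]
hits_non = hit_counts(93.6, male_nonalcoholics)[0]
overall = round(100 * (hits_alc + hits_non) / MALE_PATIENTS, 1)
print(male_alc_candidates, hits_alc, hits_non, overall)
```

The recovered counts reproduce the overall rate reported for male patients with the 13 standard scales, which suggests the percentages in the text are internally consistent.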


Discussion 


In this study an attempt was made to pre- 
dict the scores of three specialized MMPI 
scales from the scores of the 13 standard 
MMPI scales. The three specialized MMPI
scales were found to be highly related to the 
13 MMPI scales of the standard profile, but 
individual scores on the three specialized 
scales could not be accurately predicted from 
standard scale scores. The three specialized 
scales examined in this study appear to pro- 
vide information not available from the stan- 
dard MMPI scales. 

Extreme scores on the MMPI scales are 
often the scores of most importance in a 
clinical setting. In this study the prediction 
of specialized MMPI scale scores became 
increasingly less accurate as the scores de-
parted from the mean. However, this inac-
curate prediction of extreme specialized scale
scores should not be interpreted as an indica- 
tion that predictions from specialized scales 
are superior to predictions from the standard 
scales. The more extreme the scale score, the 
larger the error of measurement it probably
contains. From this perspective, the most ex-
treme specialized scale scores are also the
least reliable, and, consequently, it is unreal-
istic to expect highly accurate prediction of
extreme specialized scale scores.

In this study, the 13 standard scales were
more accurate than the MacAndrew scale in
identifying psychiatric patients who abuse
alcohol. Future research should seek to de-
termine, for other specialized scales, whether
specialized scale scores or the standard scale
scores are more closely related to external cri-
teria (independent measures of relevant per-
sonality variables or patterns of abnormal
behavior). For example, a comparison could
be made of the predictions of successful out-
come in psychotherapy by the Ego-Strength
scale and by the 13 scales of the standard
MMPI profile. It is not possible to anticipate
the outcome of such comparisons from the
finding that specialized scale scores cannot be
accurately predicted from the standard
MMPI scales.


Reference Note 


1. Caldwell, A. B. Recent advances in automated
interpretation of the MMPI. Paper presented
at the Fifth Annual Symposium on Recent De-
velopments in the Use of the MMPI, Mexico
City, Mexico, February 1970.


Effects of a Self-Control Manual, Rapid Smoking, and
Amount of Therapist Contact on Smoking Reduction


Russell E. Glasgow 
North Dakota State University 


This study evaluated a self-help treatment manual consisting of stimulus con- 
trol, rapid smoking, and coping relaxation techniques. Sixty-nine subjects who 
smoked at least 20 cigarettes per day were randomly assigned to (a) a self-help 
manual with minimal (two sessions) therapist contact, (b) a self-help manual 
with high (seven sessions) therapist contact, (c) a high-therapist-contact rapid 
smoking condition, or to (d) a high-therapist-contact normal-paced smoking
condition. Results indicate that while the overall program was moderately effec- 
tive, groups did not differ on percentage of baseline smoking or on number of 
subjects abstinent at posttreatment, 3-month, or 6-month follow-up. Informant 
reports of subjects’ smoking behavior and carbon monoxide analyses of expired 
air samples confirmed these findings. Subjects in the minimal contact condition 
generally followed through on their programs, required less therapist time, and 
were at least as successful as those in other groups in terms of long-term re-
sults. The implications of these findings for self-help manuals for smoking re-
duction are discussed.


Since the Surgeon General’s report on the 
health effects of cigarette smoking in 1963, 
there have been numerous evaluations of 
smoking reduction programs. Reviews of this 
literature (Bernstein, 1969; Lichtenstein & 
Keutzer, 1971; Schwartz, 1969) have con- 
cluded that with few exceptions, the long- 
term effects of these studies have been dis- 
appointing. The typical result has been 
short-term reduction to 10%-40% of base- 
line, with relapse to approximately 75% of 
baseline at a 4- to 6-month follow-up (Mc-
Fall & Hammen, 1971). A more recent evalu-
ation (Hunt & Bespalec, 1974) found that
initial decreases in smoking frequency gen- 
erally dissipate rapidly and asymptote around 
60% of baseline rate. These authors also 
noted that on the average only 20%-30% of 
those subjects abstinent at termination were 
still not smoking at follow-up. In light of 
these findings, recent reviews (Bernstein & 
Glasgow, in press; Bernstein & McAlister, 
1976; Lichtenstein & Danaher, 1976; 
Schwartz, 1977) have suggested that investi- 
gators focus on ways to maintain initial 
treatment effects. 

One attempted solution to the maintenance 
problem has involved the use of multicom-
ponent self-help treatment manuals. There
are several potential advantages to such pro- 
grams. By playing a more active role in a 
program, clients may learn skills and treat- 
ment techniques more thoroughly. With self- 
help programs there is a less abrupt transi- 
tion period when treatment ends than with 
therapist-administered programs, and clients 
have their manuals to refer to if maintenance 
problems arise. It has also been hypothesized 
that client-directed treatments might lead to 
greater maintenance of therapeutic gains than
therapist-directed programs by producing
more internal attributions of success (Kopel 
& Arkowitz, 1975). Although a number of 
studies have been conducted in this area 
(Conway, 1977; Danaher, 1977; Harris & 
Rothburg, 1972; Ober, 1968; Winet, 1973; 
Brengelman, Note 1; Conway & Morton, 
Note 2; Danaher & Lichtenstein, Note 3; 
Pechacek, Note 4), self-help manuals have 
generally not been found effective (Glasgow 
& Rosen, 1978). A possible explanation for 
this situation is that investigators may have 
sacrificed quality for quantity. Most manuals 
contain a smorgasbord of procedures includ- 
ing self-monitoring and hierarchical reduc- 
tion; stimulus control suggestions; self- 
reward, self-punishment, and behavioral con- 
tracting strategies; aversive smoking tech- 
niques; thought stopping or other cognitive 
interventions; information on the hazards of 
smoking and reasons for not smoking; sug- 
gestions for alternative behaviors incompati- 
ble with smoking, and so forth. Subjects are 
generally not given explicit instructions on 
how to implement these strategies. Conse- 
quently, subjects may not learn any of the 
techniques well enough to use them effec- 
tively once treatment has terminated. 

The main purpose of this investigation was 
to evaluate the efficacy of a self-help smoking 
reduction manual that presented in depth a 
few promising techniques in an organized, 
sequential manner. The manual included 
stimulus control, rapid smoking, and coping 
relaxation techniques. 

Stimulus control procedures have been 
found effective in reducing smoking until one 
reaches the level of 10-12 cigarettes per day 
(Marston & McFall, 1971; Sachs, Bean, & 
Morrow, 1970). In the present study stimulus 
control techniques were used as an initial 
component to assist in reduction of smoking 
levels and to teach subjects a behavioral 
problem-solving strategy. Rapid smoking has 
generally been found to be the single most 
effective cessation technique developed to 
date (Danaher, in press; Lichtenstein, Harris, 
Birchler, Wahl, & Schmahl, 1973; Schmahl, 
Lichtenstein, & Harris, 1972). It was in- 
cluded as a way to produce initial cessation 

and to get subjects to devalue the act of 
smoking. The relaxation procedure was in-
cluded as a coping skills technique (Gold-
fried & Trier, 1974) to provide an alterna-
tive response to smoking when subjects ex-
perienced a craving for a cigarette.

Another potential advantage of self-help
treatment manuals is substantial savings in
terms of therapist time. Survey data have
indicated that many persons wishing to quit
smoking would not attend a smoking clinic
but would use a manual (McAlister, 1975).
Thus, many more clients than are currently
seen by therapists could be treated at a
greatly reduced cost. Although several smok-
ing manuals exist (Glasgow & Rosen, 1978),
few of them have been tested under self-
administered or minimal therapist contact
conditions. Brengelman (Note 1) has tested
a manual that was mailed to participants.
Initial reports of abstinence were impressive,
but the manual did not appear to have been
compared to therapist-administered treat-
ments or control conditions. Danaher and
Lichtenstein (Note 3) reported less encour-
aging results for coverant control treatment
manuals administered under minimal con-
tact conditions.

The self-control manual was tested under
two levels of therapist contact. Therapist-
administered rapid smoking and a normal-
paced smoking condition were also included
as “component control groups” against which
to evaluate the manual, thus yielding four
treatment groups. Self-monitored frequency
of smoking was recorded throughout treat-
ment and at a 3-month follow-up. A 6-month
phone call follow-up provided a further esti-
mate of long-term effects. Carbon monoxide
analyses of expired air samples and infor-
mant reports of smoking provided additional
indices of smoking behavior. Process measures
included subjects’ therapeutic expectancies,
amount of therapist contact time, indices of
how well subjects carried out assignments
(follow through), and ratings of the un-
pleasantness of aversive smoking sessions.


Method


Subjects


Smokers were solicited through local media an-
nouncements of a smoking reduction program. Se-
lection criteria included smoking at least 20 cigarettes per
day; informed consent; and payment of a $20
deposit, $15 of which was refundable on completion 
of the project. Applicants with a history of cardio- 
vascular or chronic respiratory problems (8.9% of 
those contacting the clinic) were excluded. Subjects 
meeting the above screening criteria were required to 
obtain their physician’s consent before beginning the 
program. Of those seeking physician approval, 8.1%
were denied, most often due to pregnancy. 

Sixty-two smokers, 32 men and 30 women, com- 
prised the final sample. They averaged 32.6 years of 
age, estimated their baseline smoking rate at 31.0 
cigarettes per day, had a self-monitored baseline of 
14.7 cigarettes per day, and had smoked for an 
average of 15.1 years. 


Therapists 


Therapists were three male and three female un- 
dergraduate psychology majors. Therapists received 
extensive training over a 2-month period prior to 
the study. Training involved reading relevant back- 
ground material, observing demonstrations of pro- 
cedures, using role-playing techniques in the various 
groups, and seeing at least one pilot case. Treatment 
was standardized across therapists by means of 
procedural outlines for each session. Adherence to 
therapeutic procedures was further ensured by 
weekly group meetings and individual feedback from 
the project supervisor, who observed sessions or 
listened to audiotapes of a majority of the sessions. 
Therapists saw approximately an equal number of 
male and female subjects in each treatment condition. 


Procedure 


Subjects were given a general description of the 
program and were informed of the selection criteria 
over the telephone. Interested subjects were then
randomly assigned to one of four treatment groups. 
Following an individual intake session, subjects 
began monitoring their cigarette consumption. After 
a 1-week baseline, subjects met individually with 
their assigned therapist for their respective 3-week 
treatment programs. 


Treatment Groups 


Minimal contact self-control (n = 15). This
group received a 37-page manual detailing a multi-
component treatment program for nonsmoking
(Glasgow, Lichtenstein, & Danaher, Note 5). The
manual presented a behavioral analysis of smoking
and emphasized the importance of recording and
counteracting smoking urges. Initial chapters fo-
cused on training in progressive relaxation and
stimulus control techniques for hierarchical reduc-
tion. A three-phase relaxation training program pre-
sented relaxation as a coping strategy for use when
experiencing urges to smoke.

The rapid smoking procedure closely followed that
described by Kopel (1975). One trial consisted of
puffing on a cigarette every 6 sec until the subject
could not bear to continue or until 10 min had 
elapsed, whichever came first. Immediately after each 
trial subjects completed a checklist of possible nega- 
tive sensations (Glasgow, 1977). After a 2- to 4-min 
rest period, subjects underwent a second trial iden-
tical to the first, with the exception of a 5-min time
limit. There were six such sessions consisting of two
trials each. Sessions were held on the 1st, 2nd, 4th,
7th, and 10th days after the initial session.

Subjects’ progress through the manual and com- 
pletion of relaxation and rapid smoking sessions 
were recorded on a progress schedule contained in 
the manual. Subjects in the minimal contact group
initially met with a therapist to receive their man-
ual, a rationale for the program, and a demonstra-
tion of relaxation procedures. They then worked on
their own, meeting once more with their therapist 
midway through the program to receive their first 
rapid smoking session. Subsequent rapid smoking and 
relaxation sessions were self-administered by clients 
at home. Therapists called weekly to check on sub- 
jects’ progress and to answer questions. 

High contact self-control (n= 15). This group 
received the same manual as the minimal contact 
group but had regular meetings with a therapist. 
Subjects were assigned to read a section of the 
manual and then met with their therapist to imple- 
ment the assignments in that section. Seven meet- 
ings were held over the 3-week treatment period. 
Subjects received more direction from therapists on 
relaxation and stimulus control procedures than 
did minimal contact subjects, but rapid smoking 
and relaxation sessions were held at home after ini- 
tial demonstrations. Treatment techniques and the 
sequence of components were identical to those of 
the minimal contact group. 

High contact rapid smoking (n = 16). This
group was intended as a replication of clinic-ad-
ministered rapid smoking as used in previous re-
search. The procedure and spacing of rapid smoking
were identical to that for the manual groups, but all 
sessions were therapist administered. There was a 
9-day “preparation period” after an initial meeting 
for subjects in this group before beginning rapid 
smoking. This was to insure that all groups com- 
pleted treatment at the same time. Flaxman 
(1978) found that such a waiting period before a 
“target date” for beginning rapid smoking was more 
effective than beginning rapid smoking immediately. 

High contact normal-paced smoking (n=16). 
This group received an “aversive smoking” procedure 

Kopel,
1975; Lichtenstein et al.,
specific treatment effects. It
one’s normal rate while focusing on the unpleasant 
aspects of the pure smoking experience. Subjects were 
instructed to smoke until they could not bear to 
continue or until 5 min had elapsed, whichever came 
first. Otherwise, the rationale, number and spacing
of sessions, and procedures used were identical to 
those of the rapid smoking group. If subjects smoked 
faster than one puff every 15-20 sec, they were 
reminded to smoke at their normal rate.


Measures


Subjects’ therapeutic expectancies were assessed 
after the first treatment session and midway through 
the program following the first rapid smoking or 
normal-paced aversive smoking session. Monitoring
of number of cigarettes smoked continued through- 
out treatment and during the week after treatment 
ended. Subjects again monitored their smoking for 
a 1-week period 3 months after treatment had 
ended and were then scheduled for a final interview. 
At this meeting deposits were returned and un- 
announced breath samples were collected. Expired 
air samples, which were analyzed for carbon mon- 
oxide (CO) content, were collected to provide an 
objective index of recent smoking behavior (Dana- 
her, Lichtenstein, & Sullivan, 1976; Lando, 1975; 
Kopel, Note 6). Details of the collection procedure 
are available in Glasgow (1978). Informant reports 
of smoking were obtained by mailing informants a 
one-page questionnaire that asked for estimates of 
subjects’ recent smoking behavior. Information on 
smoking rates 6 months after treatment ended was 
obtained by telephone calls asking for subjects’ esti- 
mates of their average daily cigarette consumption 
during the preceding week.


Results 


Of the 69 subjects beginning the program, 
62 completed treatment. Five subjects 
dropped out after the first session for reasons 
unrelated to treatment, 1 subject dropped 
out after the fourth session following her 
doctor’s advice when she found out that she 
was pregnant, and 1 subject in the normal- 
paced group dropped out after four sessions 


Table 1
Treatment Group Means on Process Measures for Trials 1 and 2 Collapsed Across Sessions

                                        Manual                Aversive smoking only
                                   Minimal     High
                                   contact     contact        Rapid     Normal paced
Measure                            1     2     1     2        1     2     1     2
Expectancy of success 3.6 
After first treatment session* 5.7 5.1 5.8 ji 
After first aversive smoking 4 6d 
session® 6.1 6.1 6.3 48 
Length of trial in min.» 5.8 3.6 7.4 45 SiG. ESS, 5.0 11 
No. cigarettes smoked^b 2.5 1.4 3.0 1.9 2.7 1.6 1.2 .79
No. negative sensations checked^b,c 10.2 11.5 12.2 13.3 8.6 9.9 6.4 5.4

Aversiveness rating? 5.9 6.4 5.9 6.1 Si 6.3 48 
apparently because she was not helped by
the program. Three-month follow-up infor-
mation was obtained on 61 of the subjects.


Preliminary Analyses 


One-way analyses of variance revealed no 
significant differences between treatment 
groups on demographic indices, smoking his-
tory variables, or baseline smoking rates.
There were no between-groups differences on 
expectancy of success, either after the first 
treatment session or after the first aversive 
smoking session. All groups indicated mod- 
erate to extreme confidence in their ability 
to stop smoking in their program. 

Measures taken from the negative sensa- 
tions checklist completed during aversive 
sessions revealed significant between-groups 
differences (see Table 1) on average number 
of cigarettes smoked, number of negative sen- 
sations endorsed, and aversiveness ratings for 
both the first and second trials (averaged 
across sessions). Groups also differed on 
length of trial for the first trial only. Planned 
comparisons indicated that rapid smoking 
was significantly more intense (i.e., more
cigarettes smoked, more sensations endorsed) 
than the normal-paced procedure on each of 
these measures. The normal-paced group 
smoked at an average rate of 1 puff every 
24.2 sec, compared to the 6-sec pace set for 
rapid smoking groups. 


a Based on a 7-point rating scale where 1 = not at all, 4 = moderately, and 7 = extremely.
b Significant between-groups differences were observed on this measure.
c Possible total of 20.


Table 2
Means and Standard Deviations on Percentage of Baseline Smoking and Therapist Contact Time

                               % baseline smoking
                             Post          Follow-up       Therapist time*
Group                      M      SD      M      SD        M       SD
Manual
  Minimal contact        19.6    32.1    46.8   50.8      92.7    KYR
  High contact           19.7    39.0    48.7   45.5     188.1    33.3
Aversive smoking only
  Rapid                  10.6    17.2    63.6   42.1     167.9    23.0
  Normal                 17.4    35.4    58.2   41.8     153.9    25.9

* In minutes; between-groups differences significant at the .001 level.


Follow-through measures indicated that
subjects in all groups completed most of
their programs. Three measures of comple- 
tion of homework assignments for subjects 
receiving the manual were obtained from en- 
tries in the progress sheet in the front of 
subjects’ manuals. Subjects averaged 93%
completion of reading assignments, 95% 
completion of relaxation practice sessions, 
and 90% completion of aversive smoking 
sessions. There were no significant differences
on follow-through measures between the min- 
imal contact group and the high contact 
group. Subjects receiving clinic-administered 
rapid smoking and normal-paced smoking 
completed 100% and 94% of their aversive 
sessions, respectively. 


Effectiveness and Efficiency of Treatments 


Percentage of baseline smoking averaged
across treatment groups was 16.7% (3.4 cig-
arettes per day) during the week after treat-
ment, and 40% of subjects were completely
abstinent. At the 3-month follow-up, subjects
averaged 54.7% of baseline (12.4 cigarettes
per day), and 28% were not smoking. Corre-
lated t tests revealed a significant decrease
in smoking rate across groups from baseline
to posttreatment, t(61) = 13.29, p < .001.
Even though subjects were still smoking sig-
nificantly less at the 3-month follow-up than
they had at baseline, t(60) = 7.03, p < .001,
they were smoking significantly more than
they had at posttreatment, t(60) = 6.42, p <
.001.
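
The overall percentages quoted above can be approximately recovered from the group means in Table 2. The sketch below weights each group mean by its group size as given under Treatment Groups (15, 15, 16, 16). It is an illustrative check only: the dictionary keys are hypothetical labels, and the follow-up figure is approximate because one of the 62 subjects was lost at follow-up and the text does not say from which group.

```python
# Group sizes from the Treatment Groups section; keys are hypothetical labels.
group_sizes = {"manual_minimal": 15, "manual_high": 15, "rapid": 16, "normal_paced": 16}

# Group means (% of baseline smoking) as read from Table 2.
post_pct = {"manual_minimal": 19.6, "manual_high": 19.7, "rapid": 10.6, "normal_paced": 17.4}
followup_pct = {"manual_minimal": 46.8, "manual_high": 48.7, "rapid": 63.6, "normal_paced": 58.2}


def weighted_mean(means, sizes):
    """Grand mean of group means, weighting each group by its size."""
    total = sum(sizes[g] for g in means)
    return sum(means[g] * sizes[g] for g in means) / total


overall_post = weighted_mean(post_pct, group_sizes)
overall_followup = weighted_mean(followup_pct, group_sizes)
print(round(overall_post, 1), round(overall_followup, 1))
```

The posttreatment value agrees with the figure in the text; the follow-up value lands slightly below it, as expected when the original group sizes are used in place of the unknown follow-up sizes.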


Between-groups differences on number of 
cigarettes smoked, percentage of baseline
smoking, and number of subjects abstinent 
failed to reach significance either at post- 
treatment or at the 3-month follow-up. The 
pattern of treatment means (see Table 2) 
suggested that rapid smoking alone was 
somewhat more effective than the other 
treatments at the end of the program, but the 
great variability in the data appears to have 
precluded between-group differences. By fol- 
low-up the situation had reversed itself, with 
the manual groups being somewhat superior 
to other conditions. Again, effects were much 
too variable to obtain significance. Analyses 
of CO concentrations and informant reports 
of smoking similarly failed to reveal signifi- 
cant between-groups differences. 

It was possible to contact 50 of the 62 
subjects for the 6-month phone contact. At 
this point in time, subjects reported averag- 
ing 70.4% of their baseline smoking rate. 
Only 16% of those contacted reported being 
abstinent. The trend for manual groups (Ms 
= 66.9% and 64% of baseline for minimal 
and high contact groups, respectively) to be 
slightly superior to control groups (Ms = 
80.1% and 67.8% for rapid smoking and 
normal-paced smoking, respectively) contin-
ued, but it again failed to approach signifi-
cance on any outcome measure.

Efficiency was indexed by computing 
amount of therapist time spent in contact 
with clients from progress reports completed 
by therapists immediately after each session. 
There was a highly significant effect on this
measure, F(3, 58) = 30.4, p < .001. Tukey
post hoc tests (Winer, 1971) revealed that 
the minimal contact manual group was more 
efficient than all other treatments and that 
the normal-paced group required less thera- 
pist time than did the high contact manual 
group. The minimal contact group required 
an average of approximately 1½ hours (in-
cluding telephone calls) of therapist contact,
slightly more than half the time required by 
other groups (see Table 2). 
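
As a quick arithmetic check on the efficiency comparison, the sketch below converts the therapist contact times reported in Table 2 (in minutes) to hours and compares the minimal contact group with the average of the other three groups. The variable names are hypothetical, and the unweighted average of the other groups is an assumption made for simplicity.

```python
from statistics import mean

# Mean therapist contact time in minutes, as read from Table 2.
therapist_minutes = {
    "manual_minimal": 92.7,
    "manual_high": 188.1,
    "rapid": 167.9,
    "normal_paced": 153.9,
}

minimal_hours = therapist_minutes["manual_minimal"] / 60
other_groups = [v for k, v in therapist_minutes.items() if k != "manual_minimal"]
ratio_to_others = therapist_minutes["manual_minimal"] / mean(other_groups)
print(round(minimal_hours, 2), round(ratio_to_others, 2))
```

On these figures the minimal contact group needed roughly an hour and a half of therapist time, a little over half of what the other groups required.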


Relationships Between Measures 


The Pearson product-moment correlation
between CO concentrations and self-report
of cigarettes smoked that day was .40 (p <
.05). The correlation between subjects’ report
of time elapsed since they smoked their
last cigarette and CO levels was -.62 (p
< .001). A t test comparing abstinent and
smoking subjects on CO levels was highly
significant, t(53) = 4.89, p < .001. Mean CO
levels were 29.9 parts per million (ppm) for 
smokers and 10.4 ppm for nonsmokers. In- 
formant reports of subjects’ smoking were 
obtained on 57 of the 62 subjects. Twenty- 
one subjects reported abstinence when their 
informants were contacted, and all of these 
reports were confirmed by informants. 

In a search for predictors of treatment 
success, numerous demographic, smoking his- 
tory, and process measures were correlated 
with outcome at posttreatment and follow-up. 
The only variable found to be consistently 
related to outcome was the expectancy rating 
immediately after the first aversive smoking 
session. This rating was modestly correlated
with percentage of baseline smoking at posttreatment
(r = -.37, p < .01) and at the 3-month
follow-up (r = -.26, p < .05). Posttreatment
smoking rates were moderately
correlated with follow-up smoking rates (rs
= .38 and .47 for number of cigarettes and
percentage of baseline, respectively).


Discussion 


The overall magnitude of observed change 
was somewhat better than has typically been 
reported in the smoking literature (Hunt & 
Bespalec, 1974; McFall & Hammen, 1971) 
but was less impressive than most previous
investigations of rapid smoking (Lichten…
Note 7; Weinrobe & Lichtenstein, Note 8).
There are several possible explanations for
this result. The present investigation used
undergraduate paraprofessional therapists,
and it is possible that this led to diminished
effectiveness. However, this is unlikely, as
Weinrobe and Lichtenstein (Note 8) had
successfully used paraprofessionals as rapid
smoking therapists. Also, one of the most
experienced therapists in the present study
had previously served in studies that had
produced more impressive absolute results.
Analyses revealed that this therapist was no
more successful than other therapists in this
investigation.

A fixed number of rapid smoking sessions
and trials per session were used rather than
continuing until subjects reported abstinence
as in earlier research. While offering greater
experimental control, the standardized approach
likely reduces the clinical impact of
treatment. Other recent studies using rapid
smoking with a fixed number of sessions have
generally reported absolute results similar
to those of this study (see Danaher, in press)
in contrast to the more impressive earlier
results. Investigations of the effects of varying
termination criteria (e.g., Weinrobe &
Lichtenstein, Note 8) deserve further attention.

The main factor militating against significant
results in this investigation was the
enormous within-group variability observed.
This is a common finding in smoking reduction
outcome studies (Bernstein & …,
in press; Lichtenstein & Danaher, 1976) and
suggests the need for identifying predictors
of success for different treatments. Consistent
with past research, the present study was
relatively unsuccessful in identifying demographic
or smoking history variables that
correlate with the outcome of standardized
programs. It is suggested that future research
should pursue the alternative strategy of
evaluating different methods of individualizing
treatment programs. Investigators using
the locus of control scale (Rotter, 1966)
have been moderately successful in assigning
subjects to appropriate smoking reduction
programs (Best, 1975). It is possible that a
recent extension of the internal-external
locus of control dimension to the specific area
of health (Wallston, Wallston, Kaplan, &
Maides, 1976) would be even more effective
in tailoring treatments to subjects.

It may be that features in the design of 
the manual reduced its effectiveness. One 
possibility is that portions of the manual 
may have been too demanding or compli- 
cated. Even though the program was re- 
stricted to a few techniques, these were cov- 
ered in detail, and subjects may not have 
been provided with sufficient time to learn 
them adequately. The process of identifying 
effective (and ineffective) techniques in mul- 
ticomponent treatment programs remains a 
complicated but important task for future 
research.

This study is consistent with numerous 
other investigations in finding only short- 
term (Winet, 1973; Conway & Morton, Note 
2) or no beneficial effects (Conway, 1977; 
Danaher, 1977; Ober, 1968; Danaher & 
Lichtenstein, Note 3) from manuals. It must 
be concluded that at the present time effec- 
tive self-help manuals for smoking reduction 
do not exist. It is suggested that authors 
concentrate on evaluating alternative treat- 
ment approaches before rushing to publish 
unvalidated programs. It may be that there is 
an addictive component of smoking—likely 
associated with nicotine (Russell, 1974; 
Schacter et al., 1977)—that is resistant to 
self-control approaches. Possible alternative
strategies would be to establish a goal of
controlled smoking (Frederiksen & Petersen,
Note 9) rather than complete cessation or
the use of nicotine chewing gum as a supplement
to smoking manuals.

Despite its failure to improve treatment 
success, the minimal contact self-help manual 
condition was very efficient and produced 
long-term results at least as good as those 
of clinic-based treatment conditions. Unlike 
many other studies on self-help manuals 
(Glasgow & Rosen, 1978), minimal contact 
subjects completed the great majority of 
their programs. The high follow-through percentages
in this program may be attributable
to the progress chart in the manual, which
specified the schedule that subjects should 
adhere to in completing their programs. The 
results of the present study suggest that if 
an effective smoking reduction manual were 
developed, evaluation under minimal contact 
or self-administered conditions would be war- 
ranted. 

The failure of rapid smoking to produce 
results superior to those of the control group 
is somewhat puzzling in light of previous in- 
vestigations using similar groups (Danaher, 
1977; Kopel, 1975; Lichtenstein et al., 
1973). This result appears to be a combined 
effect of rapid smoking being less effective 
and the normal-paced procedure being more 
effective (at follow-up) than in other studies. 
Process measures indicated that rapid smok- 
ing was more intense and aversive than nor- 
mal-paced smoking, thus suggesting that the 
treatments did differ procedurally. The rela- 
tively good long-term performance of what 
was intended as a control group, along with 
the failure of process measures taken during 
aversive sessions to correlate with outcome, 
suggests that the normal-paced procedure 
may be better construed as an alternative 
treatment than as a control group. Its use 
might be recommended in cases in which 
rapid smoking is contraindicated (Lichten- 
stein & Glasgow, 1977). 

Delinquent behavior is conceptualized as a manifestation of situation-specific
social-behavioral skill deficits. The research was in two phases. In Phase 1, a 
measure consisting of 44 behavioral role-playing and problem-solving items— 
the Adolescent Problems Inventory (API)—was empirically developed, along 
with an item-specific criterion-referenced raters’ manual. The inventory was 
designed to identify strengths and weaknesses in the personal and interpersonal 
skills repertoires of adolescent boys. Phase 2 was concerned with the validation 
of the API. In an initial validation study, the API responses of institutionalized 
delinquent boys were rated as less competent than the responses of either of 
two nondelinquent groups of teenage boys (“good citizens” and “leaders”) from 
a public high school. Analyses of the inventory’s characteristics showed it to be 
reliable, to be composed of items with little or no cluster structure, and to have 
extraordinary discriminant power. A second validation study compared the API
responses of two groups: institutionalized delinquent boys who had frequent
behavioral problems within the institution and institutionalized delinquent boys
who had few acting-out problems within the institution. The former group was
judged to respond less skillfully. A third validation study replicated previous 
group differences between delinquents and carefully matched nondelinquents. 
The study also showed that the type of directions given (“What would you 
do?” vs. “What is the best thing to do?”) and test format (free response vs. 
multiple choice) significantly affected performance. It is suggested that researchers
using a social skills conceptualization of personality do more thorough
assessment studies of behavior pathologies before embarking on the development
of large-scale social skills training programs.


It has been suggested that some individuals
behave maladaptively simply because
they lack the requisite skills to do better
(e.g., McFall, 1976). In recent years, this
skill-deficit conception of deviance has been
reflected in numerous experimental skill-training
programs aimed at treating such
clinical populations as nonassertive college
students (McFall & Twentyman, 1973), shy
males (Twentyman & McFall, 1975), alcoholics
(Sobell & Sobell, 1973), psychiatric
inpatients (Goldsmith & McFall, 1975), and
male adolescent delinquents (Sarason &
Ganzer, 1971).

Unfortunately, nearly all skill-training
studies to date have been treatment oriented;
that is, they have been concerned either with
evaluating the general therapeutic utility of
skill-training programs or with assessing the
specific contributions of various training components,
such as instructions, modeling, rehearsal,
or feedback. Meanwhile, many fundamental
questions concerning the underlying
assumptions, concepts, and methods of the
skill-training approach have been ignored.
Some investigators, for example, have de- 
veloped the content of their skill-training 
programs without first conducting a thor- 
ough and systematic analysis of the perform- 
ance problems supposedly addressed by the 
programs. As a result, they have had no way 
of knowing whether their programs actually 
focused on the most relevant problem situa- 
tions for their clients or whether the be- 
haviors taught in the programs represented 
genuine solutions to these target problems. 
Furthermore, some investigators have offered 
skill training without first establishing that 
their clients actually were deficient in the 
particular skills being taught. 

What is needed at this point is research
aimed at developing a taxonomy of the particular
problem situations and skill deficits
most characteristic of particular clinical populations.
In the absence of such basic research,
it will be difficult to develop valid
methods for assessing and classifying the 
skill deficits of individual clients, and it will 
be difficult to determine what new behaviors 
clients need to acquire in order to perform 
more competently. Clearly, basic taxonomic 
research is a prerequisite to further treat- 
ment-oriented research. 

The present research was based on a so- 
cial-skills conception of delinquency among 
adolescent boys. Specifically, it was hypothe- 
sized that boys who have gotten into trouble 
with the law (i.e., adjudicated delinquents) 
would show situation-specific skill deficits 
when their performance in selected tasks was 
compared to that of matched nondelinquent 
boys. The research was conducted in two
phases. The first was concerned with identi- 
fying problem situations facing today’s teen- 
agers that might differentiate between the 
performance skills of delinquent and nonde- 
linquent boys and, concomitantly, with de- 
veloping explicit situation-specific criteria for 
evaluating performance competence. The first 
phase culminated in the creation of a be- 
havioral role-playing measure of social skills, 
the Adolescent Problems Inventory (API). 
The second phase was concerned with vali- 
dating the API by empirically evaluating its 
ability to differentiate between delinquent 
and nondelinquent boys. By implication, this
phase also represented an indirect test of the 
utility of the underlying social-skills concep- 
tion of delinquency. Although the research 
did not involve any treatment efforts, one 
of its long-term aims was to provide a solid 
foundation on which to build future skill- 
training programs. 


Development of the Adolescent 
Problems Inventory 


The procedures for the first phase were 
adapted from Goldfried and D’Zurilla’s 
(1969) guidelines for the behavioral analy- 
sis of social competence. There were five se- 
quential steps: (a) situational analysis, (b) 
item development, (c) response enumeration, 
(d) response evaluation, and (e) construc- 
tion of the inventory and rater’s manual. 
The purpose, procedures, and products of 
each step were as follows: 

Situational analysis. The first step in- 
volved the identification of problem situa- 
tions that might be related to delinquency. 
Initially, a large pool of promising situations 
was gleaned from a variety of sources: the 
sociological and psychological literature on 
the etiology of delinquency; case files of 
institutionalized delinquents; structured in- 
terviews with nondelinquent boys (ages 15- 
17); interviews with correctional psycholo- 
gists, social workers, youth counselors, and 
teachers; and an open-ended questionnaire, 
given to 22 institutionalized delinquent boys 
(ages 14-18, M=15.8, SD=1.2), that 
asked about the problems of today’s teenagers.
This initial pool of situations subsequently
was reduced to a sample of 51 general
descriptions of common problem situations;
this reduction was achieved by eliminating
redundancies, by condensing similar
situations into a single version, and by ex- 
cluding situations that seemed unrelated to 
social skills. 

The ultimate aim was to develop an inventory,
not a scale. If we had been developing
a scale, we probably would have selected
maximally similar or highly related items
from among a pool of items assumed to rep- 
resent a common domain, factor, or attribute. 
Since we were developing an inventory, how- 
ever, we tried to select maximally dissimilar, 
nonoverlapping items. In addition, we did 
not assume the existence of any underlying 
factors, domains, or general attributes. We 
assumed only that the items had two things 
in common: (a) They were descriptive of 
problem situations with which many teenage 
boys are familiar, and (b) the problem situ- 
ations were ones that if mishandled could 
get a teenage boy into trouble—conceivably 
into legal trouble. Thus, the assumed rela- 
tionship between inventory performance and 
delinquency was essentially one of risk. The 
more frequently a teenager handles problem 
situations competently, the less likely he will 
get into trouble and be judged a delinquent.

The more universal and difficult a particular
situation, the more appropriate it was
considered to be for the purposes of the
present research. Therefore, each of the 51
situations was rated on two 4-point scales
by a new sample of 22 institutionalized delinquent
boys (ages 14-18, M = 15.8, SD =
1.2). The first scale assessed how “common”
they felt the situation was for boys their age; 
the second scale assessed how “difficult” they 
thought the situation would be for them to 
handle. A composite index of these common 
and difficulty ratings was constructed, and 
the 51 situations were rank ordered on that 
index to determine their appropriateness for 
inclusion in the study. Based on this rank- 
ing, nine situations were eliminated for being 
too uncommon or too easy, leaving a final 
pool of 42 problem situations. 
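
The screening step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the article does not report how the commonness and difficulty ratings were combined, so the sketch assumes the composite index is the sum of the two mean ratings. The function name `screen_situations` and the ratings shown are hypothetical.

```python
from statistics import mean


def screen_situations(ratings, n_drop):
    """Rank situations by a composite index and drop the lowest-ranked ones.

    ratings: dict mapping a situation id to a list of (common, difficult)
             pairs, one pair per rater, each rating on a 4-point scale.
    n_drop:  number of lowest-ranked situations to eliminate.

    Assumption (not stated in the article): the composite index is the
    mean commonness rating plus the mean difficulty rating.
    """
    composite = {
        sid: mean(c for c, _ in pairs) + mean(d for _, d in pairs)
        for sid, pairs in ratings.items()
    }
    # Highest composite first: most common and most difficult situations.
    ranked = sorted(composite, key=composite.get, reverse=True)
    return ranked[: len(ranked) - n_drop]


# Hypothetical ratings from three raters for four situations.
demo = {
    "A": [(4, 4), (3, 4), (4, 3)],
    "B": [(1, 1), (2, 1), (1, 2)],
    "C": [(3, 2), (3, 3), (2, 3)],
    "D": [(2, 2), (1, 2), (2, 1)],
}
kept = screen_situations(demo, n_drop=1)
```

In the study itself the same kind of ranking was applied to 51 situations, and the nine lowest-ranked were eliminated to leave 42.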

Item development. The purpose of this 
second step was to translate the 42 general 
problem descriptions into specific items suitable
for use as stimuli in a behavioral role-playing
test. Ninety potential test items were
written in the format of narrative descriptions
in which the scene, characters, history,
and goals of each incident were presented.
At the end of each item, adolescents were
asked to indicate how they would respond
if they found themselves facing such a problem.
The particular question posed to respondents
took one of two forms, depending
on the particular item: (a) “What
would you say or do now?” or (b) “What
can you do to go about resolving this problem?”
The following are examples of the
two types of items:1


You’re visiting your aunt in another part of town,
and you don’t know any of the guys your age there.
You’re walking along her street, and some guy is
walking toward you. He is about your size. As he is
about to pass you, he deliberately bumps into you,
and you nearly lose your balance. What do you say
or do now?


It is Saturday morning, and you have nothing planned 
for the whole day. There’s nothing to look forward 
to all day. You feel bored already, just thinking about 
it. You need some kicks. What can you do to go about
solving this problem? 


Some items posed isolated problems,
whereas other items posed related or sequential
problems. For instance, the example
in which the respondent was bumped into 
by a stranger was followed by this related 
item: 


Now what if he had done the same thing, bumped
into you, and you nearly lost your balance, and this
time he said, “Look where you’re going, clumsy.”
What do you say or do now?


Response enumeration. The next step was
to obtain a sample of possible responses,
representing a wide range of social competency,
to each of the 90 items. Twenty-three
subjects participated in this step. There were
12 institutionalized delinquent boys (M age
= 16.2, SD = 1.0); 6 nondelinquent boys
from public and parochial high schools (M
age = 16.9, SD = .9); and 5 adults (4 men,
1 woman) with professional experience in
working with delinquent boys.


1 Copies of all 90 items are available on request.

The items were administered to subjects
in individual sessions lasting between 1½ and
2½ hours. Adolescents were instructed to respond
as they typically would in each problem
situation; adults were asked to play the
role of “expert advisors” and give what they 
considered to be the best response for a 
teenage boy to make in each situation. The 
examiner read each item aloud, the subject 
responded orally, and the response was re- 
corded verbatim. In addition, any questions 
or difficulties were noted and used as guides 
in subsequent efforts to refine and clarify in- 
structions and item wording. 

Response evaluation. The purpose of this
step was to evaluate the quality of the 23
responses per item obtained in the preceding
step. Judges, working independently,
were asked to rate the competence of these
responses using whatever subjective criteria
they felt were important. Thirteen adults (4
men, 9 women) volunteered to serve as
judges; they were advanced undergraduate
psychology majors, clinical psychology interns,
and professional psychologists. Depending
on the amount of time that they
were able to volunteer, each judge rated between
22 and 90 items. For any particular
item, the number of judges ranged from 8
to 13, with a median of 11. Each judge was
given a 145-page packet containing the 90
items along with typed transcripts of 23 responses
to each. Judges worked independently,
at their own pace, at home. They were
blind as to the purpose of the study and the
origins of the responses that they were evaluating.
To minimize possible order effects, the
23 responses to each item were arranged
randomly; furthermore, 5 judges progressed
from Item 1 to Item 90, 5 went from Item
90 to Item 1, and 3 worked from Item 45
outward in both directions.
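
The three item orders used to counterbalance the judges can be illustrated with a short Python sketch. The article does not say exactly how the three judges moved outward from Item 45, so the alternating pattern below (45, 46, 44, 47, 43, and so on) is an assumption, and the function names are hypothetical.

```python
def forward_order(n_items):
    """Items in ascending order: 1, 2, ..., n_items."""
    return list(range(1, n_items + 1))


def backward_order(n_items):
    """Items in descending order: n_items, ..., 2, 1."""
    return list(range(n_items, 0, -1))


def outward_order(n_items, start):
    """Begin at `start`, then alternate one step above and one step below.

    Assumption: "outward in both directions" is read here as strict
    alternation; the article does not specify the exact sequence.
    """
    order = [start]
    for step in range(1, n_items):
        for item in (start + step, start - step):
            if 1 <= item <= n_items:
                order.append(item)
    return order


middle_out = outward_order(90, 45)
```

Whatever the exact sequence, each judge still sees every item once; only the position of an item within the packet changes.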

All of the judges first classified each of
the 23 responses to each item as “competent,”
“incompetent,” or “neither competent
nor incompetent.”2 Then they rated the relative
competence of each of the 23 responses
per item on a 50-point scale, ranging from
1 = maximally incompetent to 50 = maximally
competent. In addition, four judges
were asked to specify in writing, as explicitly
as possible, which criteria they had used to
evaluate all of the responses to a particular
item, and a fifth judge orally explained the
criteria that he had developed. The 50-point
ratings and the written criteria were used
later in the construction of a rater’s manual.

Based on judges’ evaluations, the original 
list of 90 items was reduced to a final set of 
44 items. Interjudge agreement and item dif- 
ficulty were the two criteria used for item 
selection. An item was retained (a) if there
was 75% or higher interjudge agreement in 
the competency classification of the 23 re- 
sponses to it and (b) if 25% or more of 
the 23 responses to the item were judged 
to be incompetent by at least 75% of the 
judges. 
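
The two retention criteria can be expressed as a small Python check. This is a sketch under assumptions, not the authors' computation: the article does not define how interjudge agreement was calculated, so the code assumes it is the mean, across an item's responses, of the proportion of judges who chose the most frequent category. The function name `retain_item` and the example data are hypothetical.

```python
from collections import Counter


def retain_item(classifications, agree_min=0.75, incompetent_share_min=0.25):
    """Decide whether an item meets both retention criteria.

    classifications: one list per response, holding each judge's label
                     ("competent", "incompetent", or "neither").

    Assumption: agreement for an item is the mean proportion of judges
    choosing the modal category for each response.
    """
    modal_shares = []
    n_incompetent = 0
    for labels in classifications:
        counts = Counter(labels)
        modal_shares.append(counts.most_common(1)[0][1] / len(labels))
        # Criterion (b): response called incompetent by >= 75% of judges.
        if counts["incompetent"] / len(labels) >= agree_min:
            n_incompetent += 1
    mean_agreement = sum(modal_shares) / len(modal_shares)
    share_incompetent = n_incompetent / len(classifications)
    return (mean_agreement >= agree_min
            and share_incompetent >= incompetent_share_min)


# Hypothetical item: four responses, each classified by four judges.
example_item = [
    ["incompetent"] * 4,
    ["competent"] * 4,
    ["competent"] * 4,
    ["competent", "competent", "competent", "neither"],
]
too_easy_item = [["competent"] * 4 for _ in range(4)]
```

An item such as `too_easy_item`, on which judges agree perfectly but no response is judged incompetent, fails the second criterion and would be dropped.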

Construction of the inventory and rater’s 
manual. The 44 items meeting both criteria 
were rewritten and polished in a final effort 
to eliminate any ambiguities or other prob- 
lems noted in preceding steps. These items, 
in their final form, comprised the API, which 
was designed to be administered as a be- 
havioral role-playing test. 

A rater’s manual for the API was also de- 
veloped in this step. The manual presented 
explicit criteria for rating subjects’ responses 
to each of the 44 items on a 5-point scale. 
Scale values were 8, 6, 4, 2, and 0; these 
values corresponded to judgments ranging
from very competent to very incompetent. 
The criteria, which were item specific, were 
based on data provided by judges in the pre- 
ceding response evaluation step. The manual’s 
criteria reflected the distinguishing charac- 
teristics of the responses that judges had 
identified as competent, incompetent, or 
neither. Finer distinctions among responses 
were made by taking into consideration 
judges’ ratings on the 50-point scale. Whenever
possible, the manual incorporated judges’
statements concerning the criteria that they
used in their evaluation of responses to par- 
ticular items. Sample responses representing 
different levels of competency were also in- 
cluded in the manual. 


2 Judges were allowed to sort the responses without
concern for their distribution across the three categories;
for example, it was possible for a judge to
classify all 23 responses to a particular item as “competent.”

To illustrate the item format and scoring
manual, one item (12) is presented in full
below. In addition, Table 1 presents a summary
abstract of all 44 problem situations
retained in the final version of the API.3


It is 1:30 at night, and you’re walking along a street 
near your home. You’re on your way home from your 
friend’s home, and you know it is after curfew in your 
town. You weren’t doing anything wrong. You just 
lost track of time. You see a patrol car cruising 
along the street and you feel scared, because you 
know you can get into trouble for breaking curfew. 
Sure enough, the car stops next to you, the policeman 
gets out, and he says, “You there, put your hands on 
the car. Stand with your feet apart.” What do you 
say or do now? 

Score: 


8—EITHER the subject does it without saying anything
or he asks a brief general question respectfully.
EXAMPLE: “What’s wrong, officer?” “Is something
the matter?” or he explains honestly and
convincingly where he was.

6—The subject explains where he was, etc., but in
a less assertive or less convincing manner.
EXAMPLE: “I just got out of Pete Jones’ house.
You can call him if you want to.”

4—No specific criteria . . . midway between responses
scored 6 and 2.

2—The subject is antagonistic or flippant or insolent.

0—EITHER the subject hits the policeman or he runs
away.


Validation of the Adolescent 
Problems Inventory 


In this phase of the research, the API was 
subjected to two tests of concurrent dis- 
criminant validity. The first was a study 
comparing the API responses of three groups 
of adolescent boys assumed to represent three 
points along a continuum of social compe- 
tence: a group of institutionalized delin- 
quents, a comparison group of nondelinquent 
peers (“good citizens”), and a group of non- 
delinquent adolescent “leaders.” The second 
study involved a comparison between the
API responses of two groups of institutionalized
delinquents, who differed in terms of
the frequency of their placement in a confinement
cottage as a result of so-called
acting-out behaviors, according to institutional
records.

Study 1
Method 


Subjects. There were three groups, with 20 Cau- 
casian boys in each, for a total of 60 subjects.4 The 
three groups were as follows: 

1. Delinquents: These boys were all residents of a
state correctional institution for juvenile offenders.
They ranged in age from 14.1 to 17.8 years (M =
16.4, SD = .94). According to Hollingshead’s (Note
1) index of socioeconomic status, which is based on
the head-of-household’s educational level and occupation,
the group’s mean socioeconomic level was 50.9
(SD = 10.52); this indicates that the typical delinquent
subject came from the working class (Class
IV). The various offenses for which the boys had
been institutionalized were car theft, sale and possession
of drugs, burglary, robbery, battery, vandalism,
truancy, runaway, forgery, and arson. It was the first
stay in the state correctional institution for 14 of the
boys, the second stay for 3, the third for 2, and the
fourth for 1 (M = 1.5, SD = .88). The length of stay
at the time of testing ranged from 3 days to 13
months.

2. Good citizens: These nondelinquent boys were
attending a public school. They were selected by the
school’s guidance counselors, who were asked to
nominate individuals who were law-abiding, mature,
responsible, able to get along well with peers and
adults, and involved either in extracurricular activities
or after school jobs. Their ages ranged from 14.3 to
17.7 years (M = 16.4, SD = .94). The mean socioeconomic
level for the group was 45.7 (SD = 12.03);
like the delinquent subjects, they tended to come primarily
from the working class. Prior to testing, each
nondelinquent was asked whether he had ever experienced
legal difficulties; no one admitted to having a
police record.

3. Leaders: These subjects were also public high
school students selected on the basis of guidance counselors’
nominations. In addition to possessing all of
the attributes of the good citizens, these subjects were
recognized as student leaders; they were the editors
of the school newspaper and yearbook, student senators,
class presidents, and star athletes. Their ages
ranged from 14.8 to 17.9 years (M = 16.8, SD = …).

… only one significant difference: The mean socioeconomic
status of the leaders (M = 34.4, SD = …)
was higher than that of the other two groups (p <


3 The Adolescent Problems Inventory was copyrighted
by the first author and may not be used in
any form without her written permission.

4 This initial study was limited to Caucasian boys
due to the availability of subjects and to a concern
about not examining too many variables in one
study. Obviously, the experimental results can be generalized
only to Caucasian boys until subsequent research
examines the relevance of the Adolescent Problems
Inventory for non-Caucasian boys.


Table 1
Abstract of 44 Problem Situations Covered in the Final Version of the
Adolescent Problems Inventory

1. A male, peer, stranger deliberately bumps into you on the street.ᵃ
2. Same as #1, plus he blames you.ᵃ
3. A gym teacher picks on you, makes you do extra pushups.ᵃ,ᵇ
4. A friend suggests buying booze illegally.ᵃ
5. Your father tells you to stay home on Saturday night.ᵃ
6. You want to break up with your girlfriend without hurting her.ᵃ,ᵇ
7. The school principal threatens to suspend you for hassling a substitute teacher.ᵃ
8. You come home late at night and your father is waiting up for you and is angry.ᵃ,ᵇ
9. You are called names by some guy in the schoolyard.
10. Your mother tells you to put on decent clothes before leaving the house.ᵃ,ᵇ
11. A friend wants you to deliver some drugs; he offers drugs and money in return.ᵃ
12. You are stopped on the street by a policeman after curfew.ᵃ
13. Your father wants you to stop seeing one of your male friends.ᵃ
14. Another boy makes an insulting remark about your mother.ᵃ
15. A friend suggests that you two steal a handgun from a discount store.ᵃ
16. You back your car over the neighbor’s trash can; he yells at you.ᵃ
17. Your friend is upset because you dated a girl he likes.ᵃ
18. You’ve been grounded. A friend urges you to sneak out of the house.ᵃ
19. Your father gives you an ultimatum about getting your hair cut.ᵃ
20. A policeman comes to your door and asks for you.ᵃ
21. A teacher accuses you of writing obscene words on the walls in the men’s room.ᵃ
22. A friend suggests joy riding in a car with the keys left in it.
23. You run out of gas, get to work late, and get fired.ᵃ
24. Your father gets upset when you ask to borrow the car.ᵃ
25. A friend asks you to steal something for him from where you work.ᵃ
26. While with a friend, your father angrily tells you to go clean your room.ᵃ
27. An older friend asks you to help hold up a gas station.ᵃ
28. You want to ask the manager of a McDonald’s for a job.ᵃ
29. Your girlfriend offers you a joint at a party.ᵃ
30. You ask a girl for a date and she says that her father won’t let her go out with you.ᵇ
31. A girl’s father meets you at the door and says he won’t let her go out with you.ᵃ
32. Peers at school hassle you about your criminal record.ᵃ
33. A job interviewer is biased by your criminal record.ᵃ
34. A teacher hassles you about your criminal record.
35. You wake up in a bad mood.ᵃ,ᵇ
36. You need more money, your parents can’t give it to you, and you are too young for a regular part-time job.ᵇ
37. You are bored and want some fun.ᵃ
38. You are studying for a final exam. A friend wants you to go to a concert instead.ᵃ
39. Your mother forbids you to see a friend again.ᵃ
40. Your girl breaks up with you. You feel miserable.ᵃ
41. You don’t feel like delivering your paper route today.ᵃ
42. You feel hopelessly lost in a geometry class.ᵃ
43. You have a car and want something exciting to do.ᵃ
44. Your mother hassles you about going to church.

ᵃ Good citizens > delinquents.
ᵇ Leaders > good citizens.


.005); the leaders tended to come from the upper
middle class (Class II).

Procedure. The API was administered to subjects
by a female examiner in individual sessions lasting
approximately 1 hour each.5 After presenting the instructions,
the examiner played an audiotape containing
the 44 test items. For items involving interactions
with men, the tape-recorded voice was that of
a man; for interactions involving women, the voice
was a woman’s. The examiner presented the test items
one at a time, using a remote control switch to start
and stop the stimulus tape. Subjects’ oral responses
were recorded on a second machine for subsequent
evaluation by trained “blind” judges.

5 It is an unanswered empirical question as to
whether the use of an adult female examiner had any
significant effect on the teenagers’ responses. Future
research must examine this question.

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Working independently, two raters listened to sub- 
jects’ recorded test responses and rated the com- 
petence of each response on a scale ranging from 0 
(very incompetent) to 8 (very competent), according 
to the criteria outlined in the raters’ manual. Inter- 
mediate rating values (1, 3, 5, and 7) were permitted 
if judges felt that responses were midway between 
categories. The raters were a male and a female, both 
seniors majoring in psychology. The protocols of 10 
subjects (5 delinquents, 5 nondelinquents) were used 
for training, Interrater reliability for the remaining 
50 protocols was analyzed by computing the Pearson 
product-moment correlation between raters’ scores 
for individual responses to all 44 items across all 50 
protocols (i.e., the correlation was among 2,200 pairs 
of ratings). There was a high level of interrater agree- 
ment (r= .99). The mean of the two raters’ judg- 
ments was used in all subsequent analyses, 
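The reliability check described above can be sketched in a few lines. This is a minimal illustration only: the function name `pearson_r` and the six pairs of ratings are hypothetical stand-ins for the 2,200 pairs of ratings in the study.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical ratings (0-8 scale) from two raters for six responses.
rater_a = [0, 2, 4, 6, 8, 5]
rater_b = [0, 2, 4, 6, 7, 5]

interrater_r = pearson_r(rater_a, rater_b)
# Score carried into later analyses: the mean of the two judgments.
mean_ratings = [(a + b) / 2 for a, b in zip(rater_a, rater_b)]
```

In the study itself the two lists would each hold one rater's scores for every response to every item, in the same order.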

In addition, data on IQ and grade point average
for the preceding year were obtained on most sub- 
jects from institutional and school records. Unfor- 
tunately, the IQ estimates for nondelinquent and de- 
linquent subjects were based on different tests: For 
nondelinquents the estimates came from the Henmon- 
Nelson test; for delinquents the estimates came from 
the Culture Fair Test, a nonverbal measure, and 
from the Wide-Range Vocabulary Test, a verbal mea- 
sure. 


Results 


Across all 44 items, delinquents earned a 
mean score of 2.73 (SD=.77); the mean 
for good citizens was 5.86 (SD = .65); and 
the mean for the leaders was 6.77 (SD = 
.60). Planned group comparisons, based on 
total API score, revealed that the leaders 
performed significantly better overall than 
did the good citizens, F(1, 57) = 19.63, p < 
.001. An item-by-item comparison revealed 
that the leaders significantly outperformed 
the good citizens on 7 out of the 44 in- 
dividual items. 

Good citizens, in turn, performed signifi- 
cantly better overall than did the delin- 
quents, F(1, 57) =217.71, p< .001. This 
latter comparison is the one of greatest in- 
terest, since these two groups were compar- 
able in mean age and socioeconomic level. An 
item-by-item comparison showed that the 
good citizens also performed significantly bet- 
ter (p < .05) than the delinquents on 42 
out of the 44 individual items. In absolute 
terms, the delinquent group’s deficient per- 
formance was reflected in their earned mean 
rating of less than 4.0 (i.e., a value more 
indicative of incompetence than competence)
on 36 of 44 items. In contrast, good citizens
earned mean ratings of less than 4.0 on only 
1 item. Leaders did even better, earning no 
mean ratings below 4.0 and a mean of 5.0 
or better on all but one item. 
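As a rough plausibility check, the planned comparisons can be approximated from the published means and standard deviations alone. The sketch below assumes 20 subjects per group (consistent with the sample of 60 and the error df of 57) and a pooled error term; because it works from rounded summary statistics, it should land near, not exactly on, the reported F values of 19.63 and 217.71. The function name is hypothetical.

```python
def planned_comparison_f(mean_1, mean_2, n_per_group, pooled_error_variance):
    """F(1, df_error) for a pairwise contrast between two group means."""
    difference = mean_1 - mean_2
    return difference ** 2 / (pooled_error_variance * (2 / n_per_group))

# Summary statistics reported in the text; n = 20 per group is an assumption.
n = 20
means = {"delinquents": 2.73, "good_citizens": 5.86, "leaders": 6.77}
sds = {"delinquents": 0.77, "good_citizens": 0.65, "leaders": 0.60}

# With equal group sizes, the pooled error variance is the mean of the variances.
ms_error = sum(sd ** 2 for sd in sds.values()) / len(sds)
df_error = 3 * n - 3  # 57, matching the reported F(1, 57)

f_leaders_vs_good = planned_comparison_f(
    means["leaders"], means["good_citizens"], n, ms_error)
f_good_vs_delinquent = planned_comparison_f(
    means["good_citizens"], means["delinquents"], n, ms_error)
```

Under these assumptions the two contrasts come out at roughly 18 and 214, the same order of magnitude as the published values.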

Table 1, which provides a brief abstract 
of the 44 API situations, also indicates which 
items yielded significant group differences 
(p < .05) in the comparisons between leaders
and good citizens and between good citizens
and delinquents.

Grade point averages and IQ scores for 
the three groups were obtained from insti- 
tutional and school records. The mean IQ 
scores of the two nondelinquent groups were 
not significantly different: leaders = 111.8
(SD = 6.91) and good citizens = 102.7 (SD
= 16.93). However, leaders earned significantly
higher grade point averages than good
citizens: 3.25 (SD =.50) and 2.75 (SD= 
.63), respectively; F(1, 35) = 5.20, p < .05. 
The good citizen group, in turn, had a sig- 
nificantly higher mean IQ than the delin- 
quent group, whose mean score on the Cul- 
ture Fair Test was 93.2 (SD = 10.13) and 
on the Wide-Range Vocabulary Test was 
87.4 (SD = 10.09); Fs(1, 35) = 4.33 and
11.12, respectively (both ps < .05). The delinquent
group also had a significantly lower
grade point average (M = 1.80, SD = .93),
F(1, 35) = 25.83, p < .05. Overall, there
was a strong correlation between total API
scores and verbal IQ scores (r = .70, p <
.05).6 When correlations were computed separately
within the delinquent and the nondelinquent
samples, however, the significant
relationship disappeared (for delinquents, r
= .14; for nondelinquents, r = -.03).

The troublesome relationship between IQ
score and API performance could not be
ruled out as a contributing factor in the
obtained group differences on the API. Nor
could the effect of IQ be satisfactorily controlled
through statistical manipulations, such
as analysis of covariance, because of the reasons
cited by Lord (1967). Despite these
limitations, the IQ data were examined in
another way for heuristic purposes. Two subsamples
of eight subjects each were drawn
from among the most intelligent of the delinquents
and the least intelligent of the
nondelinquents; in this way, two subsamples
equated on mean IQ were composed (delinquents’
IQ = 101.6; nondelinquents’ IQ =
101.0). The mean API scores of these groups
(2.51 and 6.60, respectively) were still highly
discrepant, F(1, 14) = 85.00, p < .05. Of
course, these results are only suggestive, due
to the problem of statistical regression and
the fact that the selected subsamples may
not have been representative of their parent
groups. These results tend to imply that a
subject’s verbal skills may contribute to his
scores on both the IQ and API measures,
but the API seems to measure something
above and beyond verbal intelligence.

6 In this correlation, nondelinquent IQ scores were
based on the Henmon-Nelson test; delinquent scores
were based on the Wide-Range Vocabulary Test.
When delinquents’ IQs were estimated by the Culture
Fair Test, the correlation was somewhat lower (.58).


Characteristics of the Inventory 


Reliability. The API was designed to be 
an inventory, rather than a scale; neverthe- 
less, a reliability analysis was performed 
using the entire sample of 60 subjects. The 
results of the analysis appear in Table 2. 
These statistics should be interpreted with 
caution, since they were computed on a 
sample containing only extreme groups rather 
than on a random sample of adolescent boys. 
The effect of using extreme groups is to 
inflate estimates of internal consistency such 
as the coefficient alpha and the corrected 
item-total correlation. 
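One way to see how internal consistency depends on test length and item intercorrelation is the standardized form of coefficient alpha. The sketch below is an illustration, not necessarily the computation the authors used; it reads the mean interitem correlation in Table 2 as .39 and takes the 44 items as given. The function name is hypothetical.

```python
def standardized_alpha(n_items, mean_interitem_r):
    """Standardized coefficient alpha (Spearman-Brown form)."""
    return n_items * mean_interitem_r / (1 + (n_items - 1) * mean_interitem_r)

# 44 API items; mean interitem r read from Table 2 as .39.
alpha = standardized_alpha(n_items=44, mean_interitem_r=0.39)
```

Rounded to three places, this agrees with the coefficient alpha of .966 given in the note to Table 2. It also shows why extreme groups inflate the estimate: anything that raises the mean interitem correlation pushes alpha toward 1.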

Table 2
Reliability Analysis of the Adolescent Problems Inventory

Variable                      M       SD
Total score                 225.48   9.06
Item M                        5.12   .15
Item SD                       2.86   .48
Interitem r                    .39   .15
Corrected item-total r         .62   3

Note. Coefficient alpha = .966.

Item relationships. Because the number
of situations relative to the number of subjects
was large, it was not appropriate to
perform a factor analysis with the present
data. However, four hierarchical cluster-analytic
techniques were used in an attempt
to group situations on the basis of similarity
of subjects’ performance competence. A complete
linkage and an average linkage cluster
analysis were performed on both the correlation
and the squared Euclidean distance matrices
(Lance & Williams, 1967; Sneath &
Sokal, 1973). The situational clusters that
emerged differed from one technique to the
other and were generally uninterpretable in
terms of their content. Between 8 and 14
weak clusters were identified, depending on
the solution used. For example, one of the
clusters in the complete linkage cluster analysis
of the correlation matrix consisted of four
situations: (a) “Your father forbids you to
go out with a friend,” (b) “you're embarrassed
to ask a teacher for help,” (c) “you're
fired by your boss for accidently being late to
work,” and (d) “a policeman comes to your
door and asks for you.” The majority of
clusters showed a similar lack of interpretability.
Even the two individual items with
the strongest association (r = .79) were not
related in a readily apparent way: #19.
Your father gives you an ultimatum about
getting your hair cut; #29. Your girlfriend
offers you a joint at a party.

The lack of clear results in the cluster 
analyses indicates that competence scores are 
not the proper measure on which to construct 
a situational taxonomy. Instead, one probably
would do better to classify situations either 
on the basis of the similarity among their 
stimulus properties or on the basis of the 
similarity of the specific behavioral task re- 
quirements of the situations. The lack of con- 
sistent clustering of the situations, in con- 
junction with their moderate intercorrela- 
tions, leads to the conclusion that competence 
scores show a rather high degree of situa- 
tional specificity, especially when the items 
are specifically designed to be nonoverlapping 
in their content, as in the API. 

Discriminating power. Although planned 
comparisons indicated that good citizens per- 
formed significantly better than delinquents 
on 42 of the 44 API items, the significant Fs 
do not reveal the degree to which the API
can actually discriminate between these
groups. A discriminant analysis, which is
more appropriate, was performed using four
sets of items from the API. Mean scores were 
computed for each of the 20 delinquents and 
20 good citizens across 6 situations involving 
aggression (AGR), across 21 situations in- 
volving interactions with adult authorities 
(AUT), across 2 situations involving bad
moods (BM), and across 8 situations involving
resistance to temptation (RTT). The
resulting discriminant function was Y =
-.10AGR - .32AUT - .02BM - .14RTT
+ 2.53. Probabilities of misclassification for
this function were computed using the U 
method of Lachenbruch and Mickey (1968), 
who showed that this method is superior to 
the method of validating the function by 
resubstituting the original data. The U 
method gives good estimates of the results 
that would be obtained by using a fresh vali- 
dation sample. 
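To make the discriminant function concrete, the sketch below evaluates it for two hypothetical subjects. Several things here are assumptions: the AUT coefficient is read as .32, the classification cutoff of 0 and its direction (positive Y meaning "delinquent") are not stated in the text, and the two example profiles simply reuse the overall group means from Study 1 for all four content scores.

```python
def discriminant_score(agr, aut, bm, rtt):
    """Evaluate Y = -.10*AGR - .32*AUT - .02*BM - .14*RTT + 2.53."""
    return -0.10 * agr - 0.32 * aut - 0.02 * bm - 0.14 * rtt + 2.53

def classify(score, cutoff=0.0):
    """Assumed rule: scores above the cutoff are classified as delinquent."""
    return "delinquent" if score > cutoff else "good citizen"

# Hypothetical subjects whose four content means all equal their group's
# overall mean API score from Study 1 (2.73 and 5.86).
low_scorer = discriminant_score(2.73, 2.73, 2.73, 2.73)
high_scorer = discriminant_score(5.86, 5.86, 5.86, 5.86)
```

Because every coefficient is negative, lower competence ratings raise Y, which is why the assumed rule treats positive scores as indicating delinquent group membership.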

Results indicated that the estimated prob- 
ability of misclassifying a delinquent is .00, 
and the estimated probability of misclassify- 
ing a good citizen is .11. In other words, the 
discriminant function based on the four con- 
tent scores of the API was 89% correct when 
it was used to classify the subjects in the 
derivation sample. In particular, its success 
rate was 100% in correctly identifying the 
delinquents. 
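The logic of the U method (hold out one subject, derive the rule from everyone else, then classify the held-out subject) can be sketched as follows. This is a deliberately simplified illustration: a nearest-mean rule on a single score stands in for the four-variable discriminant function, and all scores are hypothetical.

```python
def leave_one_out_error(scores_by_group):
    """Leave-one-out misclassification rate per group for a nearest-mean rule."""
    errors = {}
    for held_group, held_scores in scores_by_group.items():
        wrong = 0
        for i, held_out in enumerate(held_scores):
            # Recompute each group mean without the held-out subject.
            centroids = {}
            for group, scores in scores_by_group.items():
                kept = [s for j, s in enumerate(scores)
                        if not (group == held_group and j == i)]
                centroids[group] = sum(kept) / len(kept)
            predicted = min(centroids, key=lambda g: abs(held_out - centroids[g]))
            wrong += predicted != held_group
        errors[held_group] = wrong / len(held_scores)
    return errors

# Hypothetical one-dimensional composite scores for two small groups.
sample = {
    "delinquent": [2.1, 2.6, 2.9, 3.3, 3.0],
    "good citizen": [5.2, 5.9, 6.4, 3.1, 6.0],
}
loo_errors = leave_one_out_error(sample)
```

In this toy sample no delinquent is misclassified and one of five good citizens is, the same asymmetric pattern as the .00 and .11 estimates reported in the text.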

Since the proportion of delinquents in the 
derivation sample was .5, it would be mis- 
leading to use these results to estimate the 
performance of the API if it were applied to 
a more typical population of adolescent boys. 
Therefore, an additional analysis was per- 
formed using various base rates, selection 
ratios, and costs of misclassification, all of 
which are important factors that must be 
considered when evaluating the ability of an 
instrument such as the API to discriminate 
between delinquents and nondelinquents. 

Considering the difficulties in correctly 
identifying a population with a very low base 
rate (Meehl & Rosen, 1955), the API can 
be expected to perform remarkably well. To 
illustrate, consider a hypothetical situation: 
Suppose there is a population in which the 
base rate of adjudicated delinquents is known 
to be .03, the selection ratio is fixed at .08, 
and it would cost 25 times as much to misclassify
a delinquent as to misclassify a nondelinquent.
Under these conditions, the results
of our discriminant analysis indicate that the 
API could be expected to perform as follows 
if used to classify 100 boys: We could expect 
to be correct 95 times out of 100 in identi- 
fying delinquents and nondelinquents. This 
is superior to what we could expect if we 
were to try to identify boys on a random 
basis, in which case we would be expected to 
be correct only 90 times out of 100. A selec- 
tion ratio of .08 means that we would be 
interested in selecting 8 out of every 100 boys 
(e.g., candidates for a training program). 
Applying our discriminant rule, 3 of these 8 
would be delinquent (i.e., valid positives), and 
5 would be nondelinquents. If we had elected 
instead to select 8 boys randomly, we could 
expect virtually no delinquents to be selected. 
Finally, the validity coefficient of the API, in 
this context, is .60, which is the mathemati- 
cally derived, theoretical upper limit to the 
validity that any instrument could attain. 
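The arithmetic behind this hypothetical example can be laid out explicitly. The sketch below takes the stated base rate, selection ratio, and outcome counts as given; treating the validity coefficient as the phi coefficient of the resulting 2 x 2 table is an interpretation on our part, and the variable names are illustrative.

```python
from math import sqrt

n_boys = 100
base_rate = 0.03        # proportion of adjudicated delinquents
selection_ratio = 0.08  # proportion of boys selected

# Outcome counts stated in the text for the discriminant rule.
true_pos, false_pos = 3, 5
false_neg = round(n_boys * base_rate) - true_pos      # no delinquents missed
true_neg = n_boys - true_pos - false_pos - false_neg

correct_with_rule = true_pos + true_neg

# Expected correct decisions if 8 boys were picked at random.
correct_at_random = n_boys * (
    selection_ratio * base_rate + (1 - selection_ratio) * (1 - base_rate)
)
expected_random_delinquents = n_boys * selection_ratio * base_rate

# Phi coefficient of the 2 x 2 table (selected vs. delinquent).
phi = (true_pos * true_neg - false_pos * false_neg) / sqrt(
    (true_pos + false_pos) * (false_neg + true_neg)
    * (true_pos + false_neg) * (false_pos + true_neg)
)
```

These figures line up with the text: 95 correct decisions with the rule, about 89.5 (roughly 90) by chance, about 0.24 delinquents expected among 8 randomly selected boys, and a phi of about .60. With marginals of .03 and .08, phi cannot exceed this value, which is the sense in which .60 is an upper limit.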


Study 2 


Subjects in the preceding validation study 
were presumed to represent three levels of 
performance along the full continuum of social
competence. The purpose of this second
validation study was to determine how well 
the API could differentiate between two 
groups of boys representing less extreme 
points on that continuum. Specifically, the 
study compared the API performances of 
two groups of institutionalized delinquent 
boys who were known to differ in their his- 
tory of disruptive behaviors and rule viola- 
tions within the institution. 


Method 


There were two groups, with 15 Caucasian delinquent
boys in each, for a total of 30 subjects. All
were residents of a state correctional facility at the
time, and none had been subjects in the first study.
One group was comprised of boys whose records indicated
that they had a history of frequently engaging
in disruptive behaviors within the institution; that
is, they had spent more than 25 days (M = [illegible]
days) during the preceding 6 months in the institution’s
security cottage as a result of running away,
possessing drugs or contraband, assaulting peers or
staff, or other serious misbehaviors. The second group
was comprised of boys who had spent less than 5
days (M = 1.74 days) in the security cottage during
the same period.

The boys ranged in age from 14 to 17 years. The 
mean age of the high-disruptive group was 16.46 
years (SD = 2.82); for the low-disruptive group, it 
was 16.40 (SD=.75). A comparison of the mean 
socioeconomic backgrounds of the two groups was not 
statistically significant; however, the high-disruptive
boys tended to come from Class V families, whereas 
the low-disruptive boys tended to come from Class IV 
families. The testing procedure was identical to the 
one used in the first study. The two raters in the
present study achieved absolute agreement on 93.5% 
of their ratings for individual responses to each of the 
44 items by each of the 30 subjects. 


Results 


A one-way analysis of variance revealed 
that the low-disruptive subjects earned sig- 
nificantly higher total scores on the API than 
did high-disruptive subjects, F(1, 28) =
5.92, p < .025. Furthermore, in an item-by-
item comparison, the low-disruptive group
performed better on 32 of the 44 items, χ²(1)
= 8.64, p < .01.


Study 3 


This study addressed two important and 
unanswered questions from the previous stud- 
ies. First, it assessed whether poor API per- 
formance is actually due to skill deficits, as 
hypothesized, or whether it is simply an arti- 
fact of the task’s instructions. Standard API 
instructions are “Imagine that you're ac- 
tually in the situation, and tell me in your 
exact words what you would say or what you 
would do if you were really there.” Presuma- 
bly, the delinquent subjects faithfully fol- 
lowed the instructions and reported their 
typical responses. It is possible that they 
would have given more competent responses, 
however, if they had been asked to give the 
best responses that they could think of, 
rather than their typical responses. This possibility
was explored in this study.

Second, assuming that delinquents do have 
skill deficits, what is the specific nature of 
such deficits? In previous studies, the API 
was administered in an open-ended, free-
response format. The present study examined
whether delinquents would show comparable
deficits if the API were administered in a
multiple-choice format. Delinquents may be
deficient at generating competent responses, 
but would they also be deficient at recogniz- 
ing competent responses? 

The present study also introduced two 
methodological refinements to the basic 
paradigm used in the preceding studies. The 
first dealt with subject selection. In Study 1, 
good citizens were selected on the basis of 
guidance counselor nominations. Post hoc 
analyses indicated that the good citizens and 
delinquents were comparable on mean age and 
socioeconomic status but not on IQ. In the 
present study, each delinquent subject was 
carefully prematched on age, socioeconomic 
status, and IQ with a nondelinquent subject, 
who was drawn from public school files. 

The second refinement was in the area of 
response ratings. In Studies 1 and 2, raters 
listened to the audiotaped responses of each 
subject to all 44 API items before proceeding 
to rate the next subject’s responses. This 
procedure may have introduced a rating bias. 
For example, a subject’s early responses may 
have contaminated ratings of his later re- 
sponses. A subject’s grammar, diction, or 
revelation of illegal behavior may have led a 
rater to give that subject a generally nega- 
tive evaluation. To control such influences in 
the present study, typed transcripts of sub- 
jects’ responses were rated. Furthermore, 
raters evaluated all subjects’ responses to a 
given item before proceeding to the next 
item; the order of answers within each item 
was randomized. 


Method 


Subjects. The subjects were 40 delinquent boys
from the Wisconsin School for Boys at Wales and 40
nondelinquent boys from the Madison Public School
System. All were Caucasian, 16 or 17 years old, with
IQs between 82 and 117. All were from families in
which the head of the household was a small independent
business person; clerical or sales worker; or
skilled manual, semiskilled, or unskilled worker. Public
school boys were asked if they had ever been in
any trouble with the police; only those who indicated
that they had not were included in the study.

Each delinquent subject was matched with a nondelinquent
youngster of approximately the same age,
IQ, and socioeconomic status. Of necessity, available
scores were used to estimate subjects’ IQs. Unfortunately,
the intelligence tests previously administered
to the two populations were not the same; the Wide-
Range Vocabulary Test was used for the delinquents,
and the Lorge-Thorndike Intelligence Test
was used for the nondelinquents. Socioeconomic status
was estimated using the Occupational scale of the
Two-Factor Index of Social Position (Hollingshead,
Note 1).

Design. A 2 × 2 × 2 × 2 factorial design was used.
The four factors were subjects (delinquent vs.
nondelinquent); instructions (standard, “What would
you do,” vs. alternate, “What is best to do”); test
format order (free-response format administered first
vs. multiple-choice first); and item-group order. (The
half of the items administered first to one group of 
subjects was administered second to the other group 
of subjects, and vice versa.) These four factors yielded 
16 experimental cells, with five subjects per cell. 
Matched pairs of delinquent and nondelinquent sub- 
jects were randomly assigned to cells across the 
remaining three factors. 
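The cell structure of this design can be enumerated directly. The sketch below is illustrative only; the level labels (for example, "A first") are hypothetical names for the conditions described above.

```python
from itertools import product

factors = {
    "subjects": ("delinquent", "nondelinquent"),
    "instructions": ("standard", "alternate"),
    "format_order": ("free-response first", "multiple-choice first"),
    "item_group_order": ("A first", "B first"),
}
subjects_per_cell = 5

# Every combination of one level per factor is one experimental cell.
cells = list(product(*factors.values()))
total_subjects = len(cells) * subjects_per_cell
```

Crossing the four two-level factors gives 16 cells, and five subjects per cell accounts for the 80 boys (40 delinquent, 40 nondelinquent) in the study.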

Each subject was administered all 44 API items 
under only one set of instructions—either standard or 
alternate. However, every subject was administered 
one half of the test in a free-response format and 
the other half in a multiple-choice format, with the 
format order balanced across subjects. To control for 
possible differences due to item content, the 44 API 
items were split into two equivalent groups of 22 
items each. The presentation order of item groups 
(A and B) was balanced across subjects, instructions, 
and test formats. 

Development of stimuli. The alternate instructions 
for the API told the subject to describe “What you 
think someone should say or do in that situation, not
necessarily what you would do, but what you think
is the best way someone could solve that problem.”

The multiple-choice version of the API was constructed
by writing five response alternatives for each
of the 44 items. The five options per item were de- 
rived from the response criteria given in the API 
scoring manual. The five responses corresponded to 
the points on the manual’s 5-point rating scale: very 
competent, competent, neither competent nor incom- 
petent, incompetent, and very incompetent. 

Item Groups A and B, with 22 items each, were 
constructed as follows: All API items were classified 
according to the type of interaction involved (e.g.,
with parents, teachers, male peers, girls, police, etc.). 
The items from each category were divided as equally 
as possible between Item Groups A and B. The 
equivalency of the resulting groups was then checked 
by examining the scores on A and B items previously 
earned by subjects in Study 1: delinquents: t(42) =
.408; good citizens: t(42) = .973; and leaders:
t(42) = .198 (all ns). The item groups appeared to be
satisfactorily equivalent. 
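The splitting procedure described above (classify items by interaction type, then divide each category as evenly as possible) can be sketched as follows. The function name, the item numbers, and the category labels are hypothetical; the authors do not describe the exact assignment rule they used.

```python
from collections import defaultdict

def split_by_category(item_categories):
    """Alternate items within each category between groups A and B."""
    by_category = defaultdict(list)
    for item, category in sorted(item_categories.items()):
        by_category[category].append(item)

    group_a, group_b = [], []
    for category in sorted(by_category):
        for item in by_category[category]:
            # Give the next item to whichever group is currently smaller,
            # so odd-sized categories do not all favor the same group.
            target = group_a if len(group_a) <= len(group_b) else group_b
            target.append(item)
    return group_a, group_b

# Hypothetical classification of six items by interaction type.
example_items = {1: "parents", 2: "parents", 3: "parents",
                 4: "police", 5: "teachers", 6: "teachers"}
group_a, group_b = split_by_category(example_items)
```

Because assignment strictly alternates, the two groups never differ by more than one item overall or within any category. Equivalence would still need to be checked empirically, as the authors did with the Study 1 scores.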

Procedure. The procedure was identical to that in 
Study 1, except for the following important modifica-
tions or additions:

Each subject was individually administered one of 
the eight possible versions of the API by a female 
experimenter. To control for differences in reading
ability, the subject listened to an audiotaped presentation
of the API at the same time that he read from
the written test. If multiple-choice items were being
presented, the five alternative responses also were
presented on the audiotape, as well as in the written
test.

During the free-response half of the test, the sub- 
ject responded orally to each item, and his response 
was tape-recorded for subsequent transcription and 
scoring by raters. In the multiple-choice half of the 
test, the subject responded to each item by circling the 
letter corresponding to his response choice on the test 
booklet. The experimenter subsequently scored the 
multiple-choice responses with a key, assigning values 
on a 5-point scale (0, 2, 4, 6, and 8), corresponding 
to the competence criteria in the original API scoring 
manual. 
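Keyed scoring of the multiple-choice half can be sketched as below. The 0, 2, 4, 6, 8 values follow the text; the key layout, option letters, and example answers are hypothetical, since the actual key is not reproduced here.

```python
# Competence values for the five ordered response levels.
LEVEL_SCORES = {
    "very incompetent": 0,
    "incompetent": 2,
    "neither": 4,
    "competent": 6,
    "very competent": 8,
}

def score_multiple_choice(answers, key):
    """Mean competence score for circled letters, given a per-item key.

    key[item][letter] names the competence level of that option.
    """
    values = [LEVEL_SCORES[key[item][letter]] for item, letter in answers.items()]
    return sum(values) / len(values)

# Hypothetical key and answers for two items.
key = {
    1: {"a": "competent", "b": "very incompetent", "c": "very competent",
        "d": "neither", "e": "incompetent"},
    2: {"a": "incompetent", "b": "very competent", "c": "neither",
        "d": "competent", "e": "very incompetent"},
}
answers = {1: "c", 2: "d"}
mean_score = score_multiple_choice(answers, key)
```

Scoring both formats on the same 0 to 8 metric is what allows the multiple-choice and free-response means to be compared later.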

Typed transcripts were made of subjects’ oral re- 
sponses under the free-response format. These tran- 
scripts then were edited to remove conspicuous use 
of grammar, word choices, or self-disclosure that 
might inadvertently reveal the identities of delin- 
quent subjects. Finally, these typed responses were 
grouped by item and randomized for subsequent rat- 
ing by four volunteers from an advanced under- 
graduate psychology class. One pair of raters rated 
half of the items; the other pair rated the rest. Across 
all items in Group A, the mean interrater agreement 
was .81, with the intercorrelations on individual items 
ranging from .64 to .93. Across all items in Group B, 
the mean agreement was .74, with interrater agree- 
ment on individual items ranging from .49 to .94. 
When raters were asked to repeat their ratings of 
selected items, as a way of estimating intrarater 
reliability, the mean was .90, with a range between 
.76 and .96.


Results 


To evaluate whether the subject selection
procedures yielded equivalent groups, t tests
were performed on the delinquent and nondelinquent
groups for the three matching
variables of age, IQ, and socioeconomic
status. There were no significant differences
on age, t(39) = .545, or socioeconomic status,
t(39) = .206. There was, however, a significant
difference on IQ scores, t(39) = 2.961,
p < .05, with nondelinquents scoring higher.
Although the IQ difference between groups
was statistically significant, the absolute magnitude
of the difference was too small to be
very meaningful. One standard deviation on
the intelligence tests that were used is [illegible]
points; yet, the mean IQ difference between
delinquent subjects and their matched controls
was only 3.8 points, with a range between
0 and 13 points. Moreover, a Pearson
product-moment correlation computed across
all subjects between IQ and total API score
was nonsignificant and relatively low (.06).
This suggests that there was a negligible relationship
between IQ scores and performance
on the API.

Table 3
Mean Response Ratings and Standard Deviations for Subjects in Each Testing Condition

                          Standard instructions      Alternate instructions
                          Test A       Test B        Test A       Test B
Group                     1     2      1     2       1     2      1     2

Free-response format
  Delinquents
    M                   2.90  3.47   3.98  4.62    4.55  4.65   4.06  4.83
    SD                   .14  1.23   1.10  1.03    1.50  1.25    .94  1.42
  Nondelinquents
    M                   4.73  4.82   4.88  5.80    5.29  4.46   6.14  5.43
    SD                   .64   .40    .45   .58     .39   .54    .66   .44
Multiple-choice format
  Delinquents
    M                   5.96  5.49   3.87  4.22    5.78  5.20   5.87  5.60
    SD                   .82   .53   1.38  2.22    1.30   .18   1.07  1.31
  Nondelinquents
    M                   6.38  5.42   6.09  5.96    6.29  6.27   5.91  6.16
    SD                   .49   .67    .55   .40     .60   .99    .81   .49

Note. 1 and 2 refer to items taken first and second, respectively.

Results were analyzed by two separate 
analyses of variance, one for free-response 
items and one for multiple-choice items; 
since it could not be assumed a priori that 
free-response and multiple-choice formats 
were equivalent, it was not appropriate to 
treat them as repeated measures within a 
single overall analysis of variance. Table 3 
presents the mean response ratings and stan- 
dard deviations for the subjects in each test- 
ing condition. The group means, collapsed 
across format order and item-group order, are 
graphically presented in Figure 1.

There was a significant performance dif- 
ference between delinquents and nondelin- 
quents both on the free-response version of 
the API, F(1, 64) = 21.68, p < .001, and on 
the multiple-choice version, F(1, 64) = 
10.59, p < .005. Delinquents earned a mean
rating of 4.13 on free-response items (SD =
1.33), and nondelinquents earned a mean of
5.19 (SD = .76). On the multiple-choice
version, delinquent subjects had a mean score
of 5.25 per item (SD = 1.48); for nondelinquents,
the mean was 6.06 (SD = .65).


Within the free-response format, there was 
a significant main effect due to instructions, 
F(1, 64) = 5.30, p < .05, with the alternate 
instructions (which called for subjects to give 
the best response) leading to higher scores 
than the standard instructions (which asked 
subjects to tell how they would actually 
react in the various situations). This same 
instruction effect was not present in the 
multiple-choice version of the API, F(1, 64) 
= 3.42. 

The performances of delinquents and non- 
delinquents also were compared under stan- 
dard versus alternate instructions within each 
of the test formats (i.e., free response and
multiple choice). The Kolmogorov-Smirnov 
two-sample test was used for these compari- 
sons; it assesses the similarity of the distri- 
butions of scores for the two subject groups. 
Significant differences were found between 
delinquents and nondelinquents in the free- 
response format with both standard instruc- 
tions (p< .01) and alternate instructions 
(p < .05). In the multiple-choice format,
however, no significant differences were found 
for either set of instructions. 
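The statistic underlying the Kolmogorov-Smirnov two-sample test is the largest vertical gap between the two groups' empirical distribution functions. A minimal sketch with hypothetical scores follows; converting the statistic to a p value would require tables or a statistics library and is not shown.

```python
def ks_two_sample_statistic(sample_1, sample_2):
    """Largest gap between the two empirical distribution functions."""
    points = sorted(set(sample_1) | set(sample_2))
    n1, n2 = len(sample_1), len(sample_2)
    largest_gap = 0.0
    for x in points:
        cdf_1 = sum(value <= x for value in sample_1) / n1
        cdf_2 = sum(value <= x for value in sample_2) / n2
        largest_gap = max(largest_gap, abs(cdf_1 - cdf_2))
    return largest_gap

# Hypothetical mean competence scores for two small groups.
delinquent_scores = [2.9, 3.5, 4.0, 4.6, 4.1]
nondelinquent_scores = [4.7, 4.8, 4.9, 5.8, 4.4]
d_statistic = ks_two_sample_statistic(delinquent_scores, nondelinquent_scores)
```

Unlike a comparison of means, this statistic responds to any difference in the shape or location of the two score distributions.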


[Figure 1: bar graph of mean response competence (API) for delinquent and nondelinquent subjects, by instructions (standard, alternate) within each format (free response, multiple choice).]

Figure 1. Mean response competence of delinquents and nondelinquents under two sets of instructions
(standard vs. alternate) and two test formats (free response vs. multiple choice).

To compare within-group performance on
the multiple-choice and free-response formats,
matched t tests were performed. Delinquents
performed significantly better on the multiple-choice
version than on the free-response
version, t(39) = 8.86, p < .001. The same
pattern was also found with nondelinquents,
t(39) = 3.55, p < .001.
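A matched t test of this kind works on each subject's difference between the two formats. The sketch below uses hypothetical per-subject means for four subjects; with the 40 subjects per group in the study, the degrees of freedom would be 39, as reported.

```python
from math import sqrt

def paired_t(first, second):
    """t statistic and degrees of freedom for matched pairs (first - second)."""
    differences = [a - b for a, b in zip(first, second)]
    n = len(differences)
    mean_diff = sum(differences) / n
    variance = sum((d - mean_diff) ** 2 for d in differences) / (n - 1)
    return mean_diff / sqrt(variance / n), n - 1

# Hypothetical per-subject mean scores under each format.
multiple_choice = [5.0, 6.0, 5.5, 4.5]
free_response = [4.0, 4.5, 5.0, 3.0]
t_value, df = paired_t(multiple_choice, free_response)
```

Because each subject serves as his own control, stable individual differences in overall competence do not enter the error term.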

Significant differences were found between 
delinquents and good citizens on 42 out of 44 
items in Study 1. In the present study, how- 
ever, an item-by-item comparison between 
mean scores obtained under standard instructions
in the free-response format (i.e., the
instructions and format used in Study 1) 
yielded significant group differences (p < 
.05) on only 13 items. This result must be 
qualified by a number of considerations: The 
present study was designed with different 
objectives in mind; the number of subjects 
contributing to the group mean on each item 
was only 10, 5 of whom were administered 
the free-response test following the multiple-
choice test; and there was a tendency for
subjects to perform better on free-response
items after being exposed to the multiple-
choice items and answers. Furthermore, responses
in this study were rated from typed
transcripts rather than from audiotapes, and 
this may have eliminated some of the dis- 
tinguishing response behaviors. It was 
deemed inappropriate to attempt a more 
detailed item analysis with the present data, 
since there were only 5 subjects per cell. 


Discussion 


The first phase of this research was concerned
with the development of a
role-playing inventory of social skills in adolescent
boys (the API) and a criterion-referenced
raters’ manual to accompany the inventory.
The second phase was concerned
with evaluating the API. Three studies were
conducted, along with psychometric
analyses of the inventory itself.

Study 1 found that the API could differentiate
between delinquents and nondelinquents
and that nondelinquents who differed in
social competence, according to school guidance
counselors’ nominations, also differed in
their overall API performance. The API was
shown to be reliable; to lack any strong, co- 
herent, or interpretable item structure; and 
to be an unusually strong predictor of sub- 
jects’ group membership. Study 2 demon- 
strated that the API was not merely sensi- 
tive to gross differences between delinquent 
and nondelinquent groups. Institutionalized 
delinquents who were known to differ in their 
disruptiveness within the institution also were 
significantly different in their API performance.
This finding suggests that subcultural
differences alone probably do not account
for differences in API performance.


The third study replicated the previous 
findings. The API again differentiated be- 
tween delinquent and nondelinquent subjects. 
Such differences were found even though sev- 
eral methodological improvements were in- 
troduced—namely, in subject-matching and 
in response-rating procedures. Moreover, the 
study found that instructions affected the
API performances of both delinquents and
nondelinquents: When told to respond with
“the best” solution, all subjects did better
than when they were told to say what they
“would actually do.” The study also showed
that all subjects did better when given the
API items in a multiple-choice format than
when given the items in a free-response
format.

Taken as a whole, the present research at- 
tests to the utility of a competence model of 
assessment. Specifically, the research
suggests that the API is a valid measure of
social competence in adolescent boys. Generally,
the research provides support for the
hypothesized relationship between social skill
deficits and interpersonal/legal difficulties.
The findings of the present research are
consistent with the results of previous studies
in the area of delinquency. Numerous
investigators have repeatedly demonstrated that
delinquents, as a group, do not perform as
satisfactorily as nondelinquents on a variety
of measures. The critical difference between
this research and previous work is not in its
demonstration of performance deficiencies
among delinquents; rather, the difference is
how it proposes that such deficiencies
should be identified, assessed, and interpreted.
Previous investigators typically have sought
to isolate either a core cause (e.g., genetic
flaw; weak superego) or a unitary personality
trait (e.g., aggressiveness) to explain
delinquency. The present research, in contrast,
was based on the notion that a wide
and varied array of skill deficits may be
related to delinquent behavior. Both
delinquents and nondelinquents are likely to
perform competently in some tasks and
incompetently in others, and the pattern of particular
deficiencies will vary considerably among 
individuals and within groups. No single 
deficit or pattern of deficits is likely to ex- 
plain delinquency. Rather, the probability 
that an individual will be classified as a de- 
linquent increases as a function of at least 
three factors: First, it increases to the extent 
that the individual lacks the requisite skills 
to deal effectively with the everyday problem 
situations confronting him; second, it in- 
creases as a function of the frequency with 
which he encounters such problem situations; 
and third, it increases as a function of the 
degree to which his incompetent solutions to 
such problem situations take the form of 
illegal behaviors. 

Although the present research is a first 
step toward identifying some of the relevant 
problem situations and common performance 
deficits characteristic of delinquent boys, a 
number of important questions remain to be 
answered by future research. 

First, it will be important to gather addi- 
tional evidence on the external validity of 
the API; that is, do subjects’ role-played 
responses to the API correlate with their 
actual responses to the same situations in 
real life? 

Second, far more attention must be devoted 
to the task of defining situation-specific re- 
sponse competence before we can feel confi- 
dent about using the API situations and scor- 
ing criteria to develop a valid skill-training 
program for adolescent boys. The criteria for 
evaluating competence in the present re- 
search were based on the judgments of so- 
called experts. Even though it is encouraging 
that there was consensus among the judges 
and that their criteria successfully
discriminated among subject groups, it is possible
that such criteria simply are not valid when 
applied to the life contexts of delinquents. A
response that is an effective solution to an
interpersonal problem in one milieu may not 
be very effective in another. Rather than 
relying on the subjective opinions of judges 
to define competence, future research should 
take a more empirical approach; competence 
in a specific situation should be defined in
terms of an objective assessment of the ac- 
tual consequences of the various response 
alternatives. 

The competence model and the methodol- 
ogy of the present study might profitably be 
applied to the assessment of other patho- 
logical behaviors (e.g., depressive behaviors, 
alcohol abuse). Gradually, a classification 
could be developed of particular problem 
situations and skill deficits that are associ- 
ated with a variety of clinical populations. 
Such a classification could be individualized 
to take into account the specific strengths 
and weaknesses in the skills repertoire of any 
new client. Since treatment programs are 
only as good as the underlying assessment, 
those of us involved in social skills training 
might do well to call a moratorium on the 
further elaboration of treatment programs 
until we have more systematically assessed 
both the nature of the clinical problem and 
the type and extent of the skill deficits of the 
individuals with whom we are working. 


Reference Note 


1. Hollingshead, A. B. Two factor index of social
position. Unpublished manuscript, Yale University,
1957.

The present study had three primary focuses: (a) to identify major empirical
clusters of multiple drugs; (b) based on these empirical drug clusters, to de- 
velop an empirical typology of multiple drug abusers; and (c) to characterize 
each derived drug cluster and type of multiple drug user in terms of a coherent 
set of theoretically based psychosocial variables. To accomplish these objec- 
tives, drug use and psychosocial data were collected from 440 clients in four 
drug and alcohol treatment programs. A cluster analysis was performed on 
chronicity/frequency indices that had been calculated for each of 15 drug 
classes. Four multiple drug clusters were identified by this analysis: (a) cocaine/ 
other opiates and synthetics/methaqualone/illegal methadone; (b) inhalants/ 
codeine/nonnarcotic analgesics; (c) marijuana/amphetamines/hallucinogens ; 
and (d) minor tranquilizers/barbiturates. Two substances, heroin and alcohol, 
did not cluster with any other substances but were frequently used by this 
sample, and consequently these two substances were retained in further anal- 
yses, yielding six basic drug clusters. Next, a typology of drug abusers, rather 
than drug clusters, was developed empirically by means of proximity cluster 
analysis. Eight quantitatively and qualitatively distinct types of multiple drug 
abusers were identified solely by analysis of their standing on the use of the 
six basic clusters of drugs. Finally, the set of psychosocial measures was found 
to be differentially related to use of the six types of drugs and to the eight
types of drug abusers. These differential findings were discussed in terms of the
adequacy of the theory underlying the measures and in terms of alternative
analytic strategies.


Evidence has accumulated in recent years
documenting a substantial increase in the
polydrug and multiple drug abuse patterns
encountered in a variety of drug treatment
settings (Benvenuto & Bourne, 1975; Carlin
& Post, 1971; Duncan, 1975; Fisher &
Brickman, 1974; Smith & Wesson, 1973),
involving a concomitant proliferation of distinctive
multiple drug use patterns, that is, regular 
use of two or more kinds of drugs. Conse- 
quently, there is a pressing need for sys- 
tematic research into identification of the 
major types of multiple drug abusers requir- 
ing some kind of drug treatment. To develop 
such a typology, an initial task is to identify 
major patterns of multiple drug abuse and to 
describe the psychosocial factors that are 
associated with each major pattern. The 
present study has three primary aims: (a) 
to identify major empirical clusters of multi- 
ple drugs; (b) based on these empirical 
clusters, to develop an empirical typology of
multiple drug users; and (c) to describe each
cluster of drugs and type of user in terms of 
the same set of theoretically derived psycho- 
social variables. 

From a review of the existing research 
literature in this area, it is our contention 
that the simultaneous investigation of these 
three aims would represent a distinct advance 
over present research strategies. This is so 
because nearly all of the present literature 
germane to the above aims reflects one of 
four approaches: 

1. Some investigators have attempted to 
identify distinct patterns of multiple drug 
abuse but have provided no information 
about the distinguishing characteristics of 
persons using different patterns of drugs (e.g., 
Simpson & Sells, 1974; Smart & Whitehead, 
1972). 

2. Others have provided systematic de- 
scriptions in terms of theoretically articu- 
lated sets of psychosocial variables and their 
relationships to various forms of social devi- 
ance, but these investigators have not focused 
on patterns of multiple drug use nor have 
they studied significant numbers of drug 
abusers with problems of the magnitude 
typically encountered in drug treatment 
populations (e.g., Braucht, 1974; R. Jessor, 
1976; R. Jessor, Jessor, & Finney, 1973). 

3. A third approach is represented by stud- 
ies concerned with only one pattern of multi- 
ple drug abuse, often providing anecdotal or 
demographic descriptors not based on any 
identifiable theoretical perspective (e.g.,
Campbell & Freeland, 1974; Chambers, 
1969; Devenyi & Wilson, 1971; Hamburger, 
1964; Kirby & Berry, 1975; Ludwig & Le- 
vine, 1965). 

4. A fourth approach has attempted to 
identify multiple drug use patterns among 
general high school or college populations 
(e.g., Blum, 1969; Brehm & Back, 1968;
Groves, 1974; Johnston, 1973) or has at- 
tempted to describe initial stages in the 
progression of drug use among general ado- 
lescent populations (Kandel, 1975; Kandel & 
Faust, 1975; Single, Kandel, & Faust, 1974) 
—populations without significant numbers of 
persons having the extensive involvement, 
experience, and advanced problems with drugs
that constitute the norm among drug
treatment clientele.

Against this background of extant research 
strategies and knowledge, it would be advan- 
tageous to develop research that would inte- 
grate the best features of these lines of in- 
quiry. Thus, the focuses of the present arti- 
cle are on the identification of distinct types 
of multiple drug use patterns and multiple 
drug users in a large sample of persons hav- 
ing major drug problems and on an analysis 
of the psychosocial characteristics of those 
types of multiple drug users. 

With the objectives and rationale of the
present study stated, the literature suggests
that an optimal investigation of these
objectives must possess several key
methodological features. To achieve the first
two objectives (identifying major empirical
clusters of drugs and types of multiple drug
abusers), three requirements must be met:
(a) the reduction of a large potential number
of drug use patterns to some manageable
number without loss of information, a
requirement met by any of several forms of
dimensional analysis; (b) obtaining a large,
maximally heterogeneous sample of persons
who are regular users of more than one kind
of drug; and (c) assessment of the frequency
and chronicity of each kind of drug used for
each person in the sample.

To address the third objective (theoretically
based differential descriptions of multiple
drug use patterns and types of multiple
drug abusers), an approach that appears to be
promising in accounting for drug abuse must
be identified, and appropriate measures
derived from the theory must be selected. In the
present study, the theoretical approach is a
variant of social learning theory as developed
by Jessor and his colleagues (R. Jessor,
Graves, Hanson, & Jessor, 1968; R. Jessor &
Jessor, 1977). This theory has been fruitful
in understanding and predicting adolescent
deviant behavior, including alcohol and
marijuana use. Principal measures in this study
included a set of sociocultural and
personality variables developed by R. Jessor et al.
(1968), R. Jessor, Collins, and Jessor (1972),
R. Jessor et al. (1973), R. Jessor (1976), R.
Jessor and Jessor (1975), and Braucht
(1974). It should be emphasized that the
present study is not intended to be a
theory-testing endeavor; rather, as an initial
exploratory inquiry, the present study uses
these theory-based variables in a purely
descriptive role.


Method 


Structured interviews were conducted with 440
clients receiving treatment in seven drug programs
located in the Denver metropolitan area. The largest
percentage were patients in the Polydrug Treatment
Center (n = 164); 100 were patients in the Denver
General Hospital Alcohol Detoxification Ward; 102
were patients in two methadone programs; and the
remaining 74 were patients in three other Denver
drug programs. The anonymity of respondents was
assured. Across this sample of 440 clients, the average
age was 30.5 years, 73% were male, and the ethnic
groups represented were as follows: 48% Anglo
American, 31% Mexican American, 14% Black
American, and 7% American Indian or Oriental. The
socioeconomic level of the sample was predominantly
lower middle class. It may give some perspective on
the nature of this sample to note that 59% reported
having been convicted of at least one
(nontraffic-related) crime, whereas 44% reported two or more
convictions.

Data collected included age, sex, ethnicity, socio- 
economic status level, number of arrests, and number 
of convictions (lifetime basis, whether drug related or 
not); a two-factor self-concept scale (Miskimins & 
Braucht, 1971); a scale of stressful life events 
(Holmes & Rahe, 1967); a measure of assaultiveness 
(Buss & Durkee, 1957), and a measure of hostility 
guilt (Mosher, 1966). The interview also included 
34 other psychosocial scales; because these measures 
and the theoretical framework from which they were 
derived have been described previously (Braucht, 
1974; R. Jessor et al., 1972; R. Jessor et al., 1968;
R. Jessor & Jessor, 1975; S. L. Jessor & Jessor, 1974), 
they will not be elaborated on here. 

Many investigators, including Sadava (1975), 
Walizer (1975), and Gorsuch and Butler (1976), have 
noted that most studies of drug use have limited 
value because they have relied either on simple mea- 
sures of use versus nonuse of a particular drug or 
have measured drug use along a set of absolute fre- 
quency of use categories, ignoring both (a) the length 
of the period of time during which the drug had been 
used and (b) the total pattern of multiple polydrug 
use. The present approach attempted to redress both
of these limitations by assessing the use of a number 
of drugs in terms of both the frequency of use and 
the length of time that these drug(s) had been used. 

Fifteen drug classes were studied: heroin, illegal 
methadone, other opiates and synthetics, alcohol,
barbiturates, amphetamines, cocaine, marijuana, hal- 
lucinogens, codeine, nonnarcotic analgesics, volatile 
inhalants, major tranquilizers, minor tranquilizers, 
and methaqualone. Each client was questioned ex- 
tensively as to current and past drug use history for 
each of the 15 drug classes.


During pilot work with clients of three of the drug 
programs, it was found that they could reliably pro- 
vide information about which drugs they had been 
using on a recent, regular basis, and that they could 
also reliably report how frequently they used these 
drugs and how long they had been using them. To 
incorporate as much of this information as possible 
in a single index for each of the 15 drugs, a chro- 
nicity/frequency (C/F) index was calculated for each 
drug class used, representing the product of (a) the 
number of months of recent, regular use of that drug 
and (b) the number of times per week that the
person used the drug across the time in (a) above. Thus,
for example, a drug used either for 6 months on a
daily basis (6 × 7) or for 3 months on a twice daily
basis (3 × 14) yields a C/F index of 42; a drug used
daily for 1 year yields a C/F index (12 × 7) of 84;
and a drug used three times per day across a period
of 1 year (12 × 21) yields a C/F index of 252. It
should be noted that except for alcohol use, the 
definition of “regular use” was left to the subject. 
That is, the interviewer did not attempt to impose a 
predetermined definition for the amount of a drug 
that constituted a single use of the substance. With
regard to alcohol use, the questions asked specifically 
about drinking to drunkenness, so that drinking to a 
drunken state constituted a single use episode for this 
drug. 
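
The chronicity/frequency index defined above is a simple product, and a minimal sketch of the calculation may make the worked examples easier to check. The function name `cf_index` and its parameter names are illustrative assumptions, not taken from the original study materials.

```python
def cf_index(months_of_regular_use: float, uses_per_week: float) -> float:
    """Chronicity/frequency (C/F) index: months of recent, regular use
    multiplied by the number of uses per week during that period."""
    if months_of_regular_use < 0 or uses_per_week < 0:
        raise ValueError("months and weekly frequency must be non-negative")
    return months_of_regular_use * uses_per_week


# The four worked examples given in the text.
print(cf_index(6, 7))    # 6 months, daily            -> 42
print(cf_index(3, 14))   # 3 months, twice daily      -> 42
print(cf_index(12, 7))   # 1 year, daily              -> 84
print(cf_index(12, 21))  # 1 year, three times daily  -> 252
```

As the first two examples show, the index does not distinguish a shorter period of more frequent use from a longer period of less frequent use when the products are equal.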


Results 


Cluster Analysis of Chronicity/Frequency 
Indices 


To identify the most general groups of 
drugs defined by the 15 C/F drug use in- 
dices, a full cycle key cluster analysis was 
performed (Tryon & Bailey, 1970). Four 
cluster dimensions were identified involving 
11 of the original 15 drugs. The remaining 
four drugs included (a) heroin, (b) alcohol, 
(c) major tranquilizers, and (d) the
over-the-counter drugs.

Inspection of the mean C/F indices across 
the entire sample for the latter four drugs 
showed that the major tranquilizers and over- 
the-counter drugs were little used (12th and 
14th ranked of the 15 drugs), whereas heroin 
(6th ranked) and alcohol (2nd ranked) were 
used to a considerable degree. For this reason, 
both heroin and alcohol were included as the 
5th and 6th basic drug clusters in addition to
the four identified by the empirical key clus- 
ter analysis. Even though these two single 
substances are not clusters of multiple drugs, 
it was decided to retain them throughout
subsequent analyses in order to provide a
comparative perspective against which the four
true clusters could be viewed.

The resulting six drug substance dimen- 
sions (patterns of multiple drug use involving 
14 of the original 15 drugs) accounted for 
99% of the original communality and 92% 
of the mean square of the original correlation 
matrix. Thus, these six clusters of drugs can 
be studied as they relate to each other and 
to other variables with very little loss of 
information. In the section below, each cluster 
is described in terms of (a) the drugs defin- 
ing the cluster and (b) statistics indicating 
the strength of association among usage of 
the drugs defining each cluster.

Drug Cluster 1: (cocaine, other opiates 
and synthetics, methaqualone, and illegal 
methadone). The use of the four drug 
classes comprising this basic drug cluster is 
highly interrelated—The heavy user of one 
drug class is very likely to be a heavy user of 
the other drugs and vice versa. This is evi- 
denced by the domain validity for this first 
drug cluster of .85 and the cluster score’s 
alpha reliability coefficient of .72. Across the 
entire sample the average intercorrelation of 
the C/F drug use scores among these four 
drugs was .40. 

Drug Cluster 2: (inhalants, codeine, and 
nonnarcotic analgesics). The use of inhal- 
ants, codeine, and the nonnarcotic analgesics 
is also strongly related in our sample—Per- 
sons who use one heavily are likely to be 
heavy users of the other two. This drug
cluster had a domain validity of .82 and an alpha
reliability of .68. Across the entire sample,
the average intercorrelation of these three
C/F drug use scores was .36.

Drug Cluster 3: (marijuana, ampheta- 
mines, and hallucinogens). As shown by this 
third cluster’s domain validity (.76) and the 
cluster score’s alpha reliability (.57), the 
degrees of use of marijuana, amphetamines, 
and hallucinogens are strongly associated with 
one another—The heavy user of one is likely 
to be a heavy user of the others. Across the 
sample as a whole, the average intercorrela- 
tion of the C/F drug use scores for these 
three drugs was .28. 

Drug Cluster 4: (minor tranquilizers and
barbiturates). The use of minor
tranquilizers and barbiturates was significantly
associated in our sample (r = .36, p < .001).
This drug cluster’s domain validity was .62,
and its alpha reliability was .38. Thus, the
heavy user of barbiturates is also likely to be
a heavy user of the minor tranquilizers and
vice versa.

As stated earlier, neither heroin nor
alcohol formed a multiple drug pattern with one
another or with any other drugs. Because of
the significant levels of use of both heroin
and alcohol in this sample, however, these
two drugs have been included as the fifth and
sixth basic drugs in the remaining analyses.


Correlations Among the Six Clusters of Drugs


The correlations among the cluster scores 
for the six patterns of multiple drugs are 
presented in Table 1, which shows that the 
six clusters form moderately intercorrelated 
oblique clusters of multiple drugs. 


Relationships Among Use of the Six Basic 
Patterns of Drugs and Other Drug 
Use Variables 


Table 2 presents correlations among the 
six basic drug cluster scores and scores on 
four indices of polydrug use: (a) number of 
drugs (up to 15 possible) ever used; (b) 
number of drugs used recently on a regular 
basis; (c) number of drugs used recently on 
a daily basis; and (d) extent to which sub- 
jects reported the deliberate use of drugs in 
combinations for the express purpose of 
achieving some desired effect. 

It may give some perspective on the rela- 
tionships in Table 2 to note that across the 
entire sample, the mean number of drugs 
ever used was 6.5 drugs (out of a possible 
15), that the average number of different 
drugs currently used on a regular basis was 
2.8 drugs, and that an average of 2.0 different 
drugs were being used on a daily basis. 
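
The three count-based polydrug indices described above (drugs ever used, drugs used recently on a regular basis, and drugs used recently on a daily basis) could be tallied from one client's interview record roughly as follows. This is only a sketch: the `DrugUse` record, its field names, and the rule that seven or more uses per week counts as daily use are assumptions made for illustration, not details reported in the study.

```python
from dataclasses import dataclass


@dataclass
class DrugUse:
    ever_used: bool
    recent_regular: bool   # self-defined "regular use", as in the interview
    uses_per_week: float   # frequency during the period of regular use


def polydrug_counts(history: dict[str, DrugUse]) -> tuple[int, int, int]:
    """Return (drugs ever used, drugs used regularly, drugs used daily)."""
    ever = sum(u.ever_used for u in history.values())
    regular = sum(u.recent_regular for u in history.values())
    # Assumption: "daily" means at least seven uses per week.
    daily = sum(u.recent_regular and u.uses_per_week >= 7
                for u in history.values())
    return ever, regular, daily


# Hypothetical client record covering 4 of the 15 drug classes.
history = {
    "heroin": DrugUse(ever_used=True, recent_regular=True, uses_per_week=14),
    "alcohol": DrugUse(ever_used=True, recent_regular=True, uses_per_week=3),
    "marijuana": DrugUse(ever_used=True, recent_regular=False, uses_per_week=0),
    "cocaine": DrugUse(ever_used=False, recent_regular=False, uses_per_week=0),
}
print(polydrug_counts(history))
```

Averaging each count over all clients would give sample-level figures of the kind reported in the text.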


Relationships Among Use of the Six Types of 
Drugs and Psychosocial Variables 


Table 3 presents the correlations among
use of the six clusters of drug use (based on
simple-sum, nominally weighted cluster
scores) and selected psychosocial variables.


Table 1
Intercorrelations, Internal Consistency, and Mean Use Levels of the Six
Empirical Clusters of Drugs

Drug cluster                                 1      2      3      4      5      6

1. Cocaine/other opiates & synthetics/
   methaqualone/illegal methadone            --    .06   .24**  −.12*   .10*   .05
2. Inhalants/codeine/nonnarcotic
   analgesics                                      --     [?]    .24   −.04    .00
3. Marijuana/amphetamines/hallucinogens                   --     .11*   .19**  .05
4. Minor tranquilizers/barbiturates                              --    −.03    .07
5. Heroin                                                               --    −.06
6. Alcohol (to drunkenness)                                                    --

Domain validity of drug type cluster        .85    .82    .76    .62    --     --
Alpha reliability of drug type cluster
   score                                    .72    .68    .57    .38    --     --
M chronicity/frequency of use              68.8   27.6  293.0   94.8   45.6  118.8

* p < .05.
** p < .01.

Table 3 shows that the six drug cluster scores 
are differentially associated with 36 psycho- 
social variables (7 other variables that were 
available for this analysis do not appear in 
Table 3 because correlations with them 
failed to reach the p < .01 level). 

First, Table 3 shows that the level of use 
of the first drug cluster (cocaine/other
opiates and synthetics/methaqualone/illegal
methadone) is significantly related to only 
two psychosocial variables. A high level of 
use of drugs in this first cluster is associated 
with (a) the presence of a peer subculture 
that is supportive of such drug use and (b) 
a high level of life stresses (the only group of 
multiple drugs to be significantly related to 
life stress level). 

Second, Table 3 shows that the use of
the inhalants/codeine/nonnarcotic analgesics
cluster is significantly related to six
psychosocial variables. The greater the use of drugs
belonging to this second cluster, (a) the less
the religiosity, (b) the greater the
alienation, (c) the more external the locus of
control orientation, (d) the more important the
personal effects (escape from problems)
functions of such use, (e) the more strained the
communication regarding drugs with one’s
family, and (f) the greater the discrepancy
between the kind of person one feels one is
versus the kind of person one would ideally
like to be. A group picture suggested by this
array of relationships is that
inhalants/codeine/nonnarcotic analgesics use is associated
with a psychosocial syndrome of alienation,
lack of values and norms, lack of meaningful
channels of communication with one’s
family, feelings of helplessness and
inferiority, and a strong desire to escape from the
pressure of personal problems.


Table 2
Significant Correlations Among the Six Empirical Clusters of Drugs and
[remainder of title truncated in source]

Rows (polydrug index): No. drugs ever used; No. drugs used recently on a
regular basis; No. drugs used recently on a daily basis; Extent of the use
of drugs in combination to achieve an effect.
Columns: Clusters 1 to 6.
[Cell entries survive only without column positions:
16.40 16 −.20 / 37 27 66 23 15 / 96 13 / −14 17 60 30]

Note. All table entries are correlations significant at the p < .01 level;
correlations that failed to attain significance at the p < .01 level have
been omitted.

Inspection of Table 3 shows that the use 
of the third class of drugs (marijuana/am- 
phetamines/hallucinogens) is related to 25 
psychosocial characteristics (presented in the 
third column of Table 3). The general psy- 
chosocial condition suggested by this array of 
significant relationships is one of marijuana/ 
amphetamines/hallucinogens use on the part 
of youth embedded in family and peer groups
who model and support drug use. The use of
drugs in this cluster appears to be firmly 
entrenched in a drug subculture with all of 
its characteristics as popularly conceived.
Use of this cluster is not only associated with 
the desire to escape from personal problems 
but is also associated with the function of 
enhancing social occasions, the only class of 
drugs that is perceived to serve this social 
enhancement function. 

In contrast, Table 3 shows that use of the 
minor tranquilizers/barbiturates cluster is as-
sociated with a pattern of psychosocial char- 


Table 3 
Significant Correlations Among the Six Drug Clusters and Psychosocial Variables 
Cluster 
Psychosocial variable 1 2 3 4 5 6 
Age in years —.28 14 25 
Need value for conformance goals = 19 
Intolerance of deviance =.17 
Religiosity —.13 —.25 —.17 
Need value for conformance success —.16 —.19 
Alienation 18 25 AT 
Social agents’ agreement - 1 7 : 
Life chances disjunctions ; 15 
Peers’ advice salience —.17 -12 
Expectation of conformance success t — 23 
Family advice salience a 
Peers’ value for conformance goals eas 5 
Internal locus of control 16 aaa ah 
Peers’ value for conformance goals > = hie 
Opportunity to procure drugs Fu 
Peer support for drug use 16 a ou 
Positive social function of drugs i fe 
Personal effects function of drugs ae 
Conforming social functions of drugs eh be si 13 
Family value for conformance goals f 
Exposure to parental medical drug use JA 
Family support for drug use Be 
Family proscription of alcohol use ee 
Internal negative functions of drug use 1 
External negative functions of drug use 2 
Hostility guilt score AS 
Lay aes likelihood 7e 
ase of communication re drugs with famil: eH 
Ease of communicati i “i =14 
Assaultiveness mi o ea -14 
Total life stress score «13 = 28 
No. arrests p5 
No. convictions —.16 ay T 
Socioeconomic status 0 
Self ideal-self discrepancy 4 
Self-others, over valuing others lt 19 a 


Note. All table entries are correlations significant at the p < .01 level;
correlations that failed to attain significance at the p < .01 level have
been omitted.


Table 4
Drug Use Cluster Score Profiles of the Eight Types of Drug Users

                                                       Profile                    Overall
Type                            n     %      1     2     3     4      5     6   homogeneity

1. [illegible] experimenters   230  55.0   48.0  48.5  45.8  47.1   48.1  45.3     .99
2. Alcoholics                   67  16.0   47.9  48.7  46.9  47.2   47.[?] [?]     [?]
3. Minor tranquilizer/
   barbiturate users            17   4.1   49.9  49.0  47.3  80.0   47.5  46.7     .90
4. Heavy heroin and some
   cocaine/other opiates (no
   illegal methadone users)     16   3.8   50.9  48.4  50.3  47.5  101.2  45.2     .91
5. Hallucinogen and minor
   tranquilizer/barbiturate
   users                        11   2.6   48.0  48.8  60.4  60.9   47.8  46.6     .86
6. Heavy minor tranquilizer/
   barbiturate and some
   codeine/inhalant/analgesics
   users                         8   1.9   50.0  54.0  45.2  58.6   48.4  46.6     .93
7. Codeine (heavy), inhalant/
   analgesics and some minor
   tranquilizer/barbiturate
   users                         7   1.7   49.6  77.0  45.4  52.5   47.5  44.7     .89
8. Hallucinogen users           62  14.9   48.5  48.6  61.2  47.3   48.8  45.6     .92

Eta                                         .98   .98   .91   .98    .97   .97


acteristics suggestive of older users who are 
intolerant regarding deviance from conform- 
ist ethos, who have pervasive feelings of help- 
lessness in controlling their own destiny, and 
who lack confidence in their own vocational 
career success. 

The use of heroin is related to only two 
psychosocial variables: Heavy heroin use is 
associated with having been arrested several 
times and with a family ethos that places 
a high value on the importance of conformist 
goals.

Finally, Table 3 shows that the use of the 
last basic type of drug (alcohol) is related to 
older user age, importance attached to using 
alcohol as a means of conforming to social 
expectations, elevated levels of criminal ar- 
rests and convictions, low socioeconomic 
status, and a classic sign of depression: the
feeling that others think more of one than 
one thinks of oneself. 


Identifying Distinct Types of Drug Abusers 


Thus far, six drug clusters (i.e., six types
of drugs) have been identified, and the
patterns of relationships of each cluster of drugs
with a set of psychosocial characteristics have
been described. At this point, a typology of
drug abusers, rather than a typology of
drugs, was developed empirically by means of
proximity cluster analysis. Eight quantita- 
tively and qualitatively distinct types of mul- 
tiple drug abusers were identified using the 
standard procedures of Tryon and Bailey’s 
(1970, p. 147) method of iterative condensa- 
tion on centroids. Of the 440 abuser profiles, 
418 were classified into one (and only one) 
of the eight drug abuser types. The remain- 
ing 22 abusers’ profiles were too discrepant 
or unique to be included in any of the eight 
types. 
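
The general idea of grouping user profiles around centroids can be sketched as follows. This is not Tryon and Bailey's iterative condensation procedure, whose details are not given here; it is a plain nearest-centroid reassignment loop (in the style of k-means) offered only as an illustration. The function names, the two-dimensional toy profiles, and the starting centroids are all hypothetical.

```python
import math


def nearest(profile, centroids):
    """Index of the centroid closest to a profile (Euclidean distance)."""
    return min(range(len(centroids)),
               key=lambda k: math.dist(profile, centroids[k]))


def condense(profiles, start_centroids, max_iter=100):
    """Alternate assignment and centroid updates until assignments settle."""
    centroids = [list(c) for c in start_centroids]  # do not mutate the input
    assignment = None
    for _ in range(max_iter):
        new_assignment = [nearest(p, centroids) for p in profiles]
        if new_assignment == assignment:
            break
        assignment = new_assignment
        for k in range(len(centroids)):
            members = [p for p, a in zip(profiles, assignment) if a == k]
            if members:  # keep the old centroid if a type has no members
                centroids[k] = [sum(col) / len(members)
                                for col in zip(*members)]
    return assignment, centroids


# Toy profiles on two hypothetical cluster-score dimensions.
profiles = [[48, 47], [49, 46], [47, 48], [80, 47], [78, 49]]
assignment, centroids = condense(profiles, [[50, 50], [70, 50]])
print(assignment)
print(centroids)
```

A fuller version would also need a rule for leaving very discrepant profiles unclassified, as happened with 22 of the 440 profiles in the study; a maximum allowed distance to the nearest centroid is one possible rule, though that is an assumption and not something the text specifies.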

Thus, independent of their standing on 
psychosocial variables, eight distinct types of 
drug users were identified solely by analysis 
of their standing on the use of the six basic 
clusters of drugs. Table 4 provides a rough 
descriptive label for each type of drug user 
and presents the distinctive profile of the use 
of the six basic clusters of drugs for each 
type of user. 

Eta statistics in the bottom row of Table 4 
indicate the degree to which each of the six 
drug clusters accounts for type membership. 
Conversely, the eight homogeneity statistics 
indicate the average degree to which member- 
ship in each of the eight drug user types 
specifies scores on the six drug use cluster 
score dimensions. All of these statistics are at 
highly acceptable levels, indicating that the 
typology obtained possesses a high degree of 
statistical integrity and “tightness” (Tryon 
& Bailey, 1970, p. 161). 


Table 5 
Drug User Types: Significant One-Way Analyses of Variance 
of Psychosocial Dependent Variables 

Variable                      1     2     3     4     5     6     7     8 

n                           230    67    17    16    11     8     7    62 
Total sample percentage    55.0  16.0   4.1   3.8   2.6   1.9   1.7  14.9 

Row variables, in their original order: 
Chronicity/frequency indices of drug use: Heroin; Methadone; Opiates; 
Alcohol; Barbiturates; Amphetamines; Cocaine; Marijuana; Hallucinogens; 
Codeine; Analgesics; Inhalants; Tranquilizers (Major, Minor); Methaqualone 
No. drugs ever used 
No. drugs used recently and regularly 
No. drugs used daily 
Sum chronicity/frequency 
Use of drugs in combination 
No. arrests 
No. convictions 
Socioeconomic status 
Self-others, overvaluing others 
Age 
Sex (legible entries: “All”, “Female 65%”) 
Ethnicity 
Alienation 
Internal locus of control 
Opportunity to procure drugs 
Positive social functions 
Conforming social functions 
Personal effects functions 
Internal negative functions of drug use 
Assaultiveness 
Peers’ value for conformance 
Peers’ value for deviant 

Note. The plus and minus cell entries are not legible in this copy. 
a Oldest. 
b Youngest. 
c No blacks. 
d Mexican Americans, few whites. 
e No Mexican Americans. 


Table 5 presents a more complete descrip- 
tion of each type, both in terms of drug use 
patterns and psychosocial characteristics. 
With drug user type as the independent 
variable, a series of one-way analyses of vari- 
ance was performed for each psychosocial 
variable. Each row of Table 5 represents a 
significant F ratio or chi-square (p < .01) 
that indicates a significant difference among 
the eight drug user types on that psycho- 
social characteristic. The plus and minus 
entries in each row of Table 5 describe the 
results of a posteriori multiple-range tests 
(by the method of least squared differences) 
that were performed following the significant 
F test. On any given psychosocial variable, 
drug user types with a plus sign are rela- 
tively high on that characteristic and are 
significantly different (p < .01) from user 
types with no table entry (who are “average” 
on that characteristic), who are, in turn, sig- 
nificantly different (p < .01) on that psycho- 
social variable from user types with a minus 
sign. 
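
For readers unfamiliar with the statistics involved, the following sketch shows how a one-way F ratio and the corresponding eta (correlation ratio) are computed from grouped scores. The function name and the toy scores are hypothetical and are not taken from the study's data.

```python
def one_way_anova(groups):
    """Return (F, eta) for a list of groups, each a list of scores."""
    all_scores = [x for g in groups for x in g]
    grand_mean = sum(all_scores) / len(all_scores)
    means = [sum(g) / len(g) for g in groups]

    # Between-groups and within-groups sums of squares.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)

    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    eta = (ss_between / (ss_between + ss_within)) ** 0.5
    return f_ratio, eta


# Three hypothetical user types scored on one psychosocial variable.
f_ratio, eta = one_way_anova([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
```

The F ratio tests whether the type means differ at all; the multiple-range comparisons described above are a separate, subsequent step that this sketch does not cover.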

Table 5 has been designed to communicate 
a wealth of significant findings about the 
characteristics of each type of drug user as 
directly and completely as possible. The char- 
acteristics of each type of drug user suggested 
by the pattern of findings in Table 5 are 
briefly summarized below. 

Drug User Type 1: Infrequent experiment- 
ers. Comprising the bulk of our sample, this 
type of user has experimented with over 6 
different drug classes (of 15 possible) during 
his/her lifetime and has recently used 1.7 
drugs on a regular basis, but is unlikely to 
have used any on a daily basis. This type 
tends not to use drugs in combinations very 
often—42% have never consciously used 
drugs in combinations to achieve an effect. 

In contrast to the other types of drug 
users, these infrequent experimenters feel 
that they themselves are in control of the 
direction of their lives. In general, they, 
their families, and their peer groups are rela- 
tively conformist in orientation, and they 
enjoy a relatively high socioeconomic status. 
Finally, the infrequent experimenters tend 
not to use drugs to escape the press of per- 
sonal problems. 

Drug User Type 2: Alcoholics. The alco- 
holic type of drug user is older, poorer, and 
has been convicted of crimes more often (11 
times) than any other type of drug user. He 
or she has had experience with fewer drugs 
during his/her lifetime than any other type 
of user (on the average, about three dif- 
ferent drugs). The alcoholic type was cur- 
rently using 1.9 drugs on a regular basis and 
1.1 drugs (alcohol) on a daily basis. More 
than any other type, alcoholics tend not to 
use drugs in combinations, at least not inten- 
tionally—65% reported “never” using drugs 
in combinations to achieve an effect. Alco- 
holics are relatively conformance oriented in 
their own values and beliefs, and although 
they perceive little support for drug use 
among their peers, they report drinking as 
one means of conforming to their social 
group. Compared to any other type of drug 
user, they recognize fewer drawbacks to 
their own use of drugs. 

Drug User Type 3: Barbiturate and minor 
tranquilizer users. The user of barbiturates 
and minor tranquilizers is likely to be an 
older, white, middle-class housewife who feels 
powerless to control the direction of her own 
life. Relatively conformist in orientation, and 
with the lowest rate of arrests and convic- 
tions, the users of barbiturates and minor 
tranquilizers were the only type of drug 
users in our sample who had never had any 
experience with heroin or methadone. In re- 
gard to the conscious use of drugs in com- 
binations to achieve an effect, 31% of these 
barbiturate/minor tranquilizer users reported 
“always” using such combinations. Also, they 
reported their access to drugs to be easier 
than any other type of drug user. 

Drug User Type 4: Narcotics users. 
These users of heroin, illegal methadone, 
other opiates, cocaine, and (in conjunction 
with these narcotics) hallucinogenic drugs, 
tend to hold beliefs contrary to the conform- 
ist ethos: they are assaultive and tolerant 
of deviance. The average narcotics user in 
our sample had been arrested 28 times but 
had been convicted only three times. They 
feel that their peers devalue conformance 
success more than does any other 
type of drug user. Their use of these “hard” 
drugs is heavy, and 31% of this type of user 
report always using drugs in combinations 
to achieve an effect. In our sample, 57% of 
the members of this type were of Mexican- 
American descent, the only type composed 
primarily of this ethnic group. 

Drug User Type 5: Amphetamines, mari- 
juana, and major tranquilizer users. Users 
of amphetamines, marijuana, and the major 
tranquilizers have used more different drugs 
during their lives (11 drugs); they have used 
the widest variety of drugs recently on a 
regular basis (7 drugs); they have used the 
most different drugs on a daily basis (3.8 
drugs); and they are more likely than any 
other type of user to be using drugs in 
combination to achieve an effect—73% of 
these users reported always using such com- 
binations; and none reported “never” using 
combinations. 

These heavy multiple drug users are ex- 
tremely young, alienated, and assaultive. 
They are more likely than any other type 
of user to value drugs as a means of escap- 
ing from their personal problems, and they 
also see drugs as social enhancers and as 
means of conforming to their social group. 
In our sample, no member of this drug user 
type was of Mexican-American descent. 

Drug User Type 6: Methaqualone, minor 
tranquilizer, and barbiturate users. These 
drug users feel themselves to be in a highly 
stressful environment while feeling relatively 
helpless to change the course of their lives. 
Very unlikely to be extraverted or assaultive, 
they do not view drugs as a means of “going 
along” with the group. In fact, they feel that 
their peer group disapproves of drug use. 
These users see more drawbacks to their own 
use of drugs (in terms of loss of self-control, 
self-respect, and friends) than does 
any other type of drug user. 

Drug User Type 7: Codeine, 
analgesic, and methaqualone users. This 
type of user tends to use alcohol (to drunk- 
enness) very infrequently, if at all. They are 
relatively high in socioeconomic status. In 
our sample, there were no black members of 
this drug user type, and all members of this 
type were male. In general, members of this 
type hold relatively conformist values and 
attitudes. Their drug use occurs despite their 
report that (a) they find drugs hard to get 
and (b) their peers do not support drug use. 


Drug User Type 8: Hallucinogenic users. 
Primarily users of marijuana and the hallu- 
cinogenic drugs (they use more than any 
other type of drug user), they also are using 
some amphetamines. These are young, as- 
saultive users, and in contrast to members 
of Drug User Type 5 (who use many of 
the same kinds of drugs), they do not feel 
alienated. More than any other type of user, 
they reported that their peers are supportive 
of drug use, and they see the primary func- 
tion of drug use to be as a means of enhanc- 
ing social occasions. 


Discussion 


To return to the three primary aims of 
this study, six basic classes of drugs were 
identified empirically. In the present sample, 
four multiple drug clusters were found, and 
neither the use of heroin nor alcohol was 
strongly associated with the use of any other 
drug. 

In general, the four multiple drug clusters 
found here do not correspond to patterns 
previously reported in Simpson and Sells's 
(1974) investigation of the problem of iden- 
tifying multiple drug abuse patterns among 
a drug treatment population. In their study 
of 11,380 patients who were included in the 
initial 2 years of the National Institute of 
Mental Health-Texas Christian University 
Drug Abuse Reporting Program (DARP), 
28 distinctive patterns were found. The most 
frequent pattern, representing over 28% of 
the patients, was the daily or weekly use 
of heroin alone. Moreover, daily or weekly 
use of heroin with cocaine, with marijuana, and 
with both were also prevalent patterns. These 
four patterns, each involving heroin, ac- 
counted for just over 52% of their patients. 

There is, however, a major difference be- 
tween the Simpson and Sells (1974) study 
and the present one: the differing require- 
ments for admission to the participating treat- 
ment programs. The treatment agencies 
within DARP were primarily treating opiate 
addicts, and during the first 2 years, users of 
other drug classes usually were not admitted 
unless some level of opiate use was also in- 
dicated. Thus, given these selection criteria, 
the finding of a high prevalence of heroin 
use, singly and in combination, would be 
expected. In contrast, the present study con- 
tained subjects from a variety of programs, 
and although 23% were drawn from an al- 
cohol treatment program and 23% from a 
heroin treatment program, the majority 
(54%) were from programs not having these 
selective admission requirements. 

In exploring our second objective—the de- 
velopment of an empirical typology of mul- 
tiple drug abusers—eight distinct types were 
identified. These types were an intriguing 
finding, as some of the types, such as Type 
8 (marijuana/hallucinogen users), conform 
to previously reported and commonly de- 
scribed types, whereas other types would 
not be expected on an a priori basis. For 
example, Type 5 (amphetamine/marijuana/ 
major tranquilizer users) and Type 7 (co- 
deine/nonnarcotic analgesic/inhalant/metha- 
qualone users) are types that are not widely 
described in the literature. 

As with any study of this sort, it should 
be noted that the typological findings ob- 
tained here are functions not only of “real” 
drug-using phenomena but also of the 
“method” used to observe those phenomena. 
Strictly speaking, the present typology must 
be regarded as a function of the particular 
drug use measures used, the composition of 
the sample studied here, and the specific 
statistical typing procedure used. In this re- 
gard, however, the finding of significant de- 
grees of communality between the key C/F 
measures and other indices of polydrug use 
(see Table 2), the achievement of a large 
sample from a variety of drug program 
sources, and the selection of a well-known 
standardized set of cluster analysis proce- 
dures all suggest that the obtained typology 
is relatively robust with regard to these 
method factors. 

The third objective was to determine the 
differential association of a coherent set of 
psychosocial variables with each of the six 
drug clusters and the eight drug user types. 

The particular set of psychosocial measures 
used in this study is based on social learning 
theory as articulated by the Jessors and 
their colleagues, and these measures have 
been shown previously to be powerful pre- 
dictors of adolescent deviant behavior. Be- 
cause the users of the drugs comprising the 
third drug cluster (marijuana/amphetamines/ 
hallucinogens) most closely approximate the 
subject population on which this theoretical 
approach has been developed, the strong as- 
sociations of these psychosocial variables 
with this drug cluster were not unexpected. 
Nevertheless, if the explanatory scope of the 
underlying theory were adequate, one would 
have expected stronger associations with the 
remaining multiple drug clusters. The gen- 
eral absence of significant relationships be- 
tween these psychosocial measures and the 
other multiple drug clusters (particularly with 
Clusters 1 and 5), as well as the striking 
differences in strength of association, raises 
questions about the utility of this theoretical 
approach in explaining a wide variety of mul- 
tiple drug use patterns. It may be that alter- 
native theories and their associated con- 
structs will be required to provide sets of 
descriptive variables that relate to differing 
drug use patterns and users, as has been im- 
plied by several recent articles (see Bentler 
& Eichberg, 1975; Braucht, Brakarsh, Fol- 
lingstad, & Berry, 1973; Lettieri, 1975; Mc- 
Glothlin, 1975). On the other hand, it may 
be that merely introducing key moderator 
variables or the use of alternative analytic 
strategies such as those suggested by Gorsuch 
and Butler (1976) and Dunnette (1975) will 
reveal additional explanatory power now la- 
tent in the present social learning theory. 

Although the present study has succeeded 
in examining specific subgroups of people in 
relation to specific types of drugs (advocated 
by Kessler, Paton, & Kandel, 1976), it should 
be stressed that this study represents an ex- 
ploratory effort. Future research is needed 
to build on this step, replicating and ex- 
tending the present investigation with larger 
numbers of subjects from differing locales 
and differing subcultural groups. To provide 
perspective, a sample of “normals” should 
be interviewed using the same set of mea- 
sures employed with drug-abusing groups. 
There is also a need for further sampling of 
theories and theoretical constructs in order 
to provide an arena for the comparative as- 
sessment of competing theories in explaining 
various patterns of multiple drug abuse. 

If this line of research were to be pursued 
successfully, we believe that the resulting 
typology would not only contribute to a basic 
understanding of these problems, but it 
would also be of great value to those charged 
with the prevention and treatment of these 
problems. Given a comprehensive, general 
typology, differential public health program- 
ming could be developed for each type of 
user; the planning, locating, and staffing of 
these programs could be expected to benefit 
from the knowledge base provided by a com- 
prehensive typology. Finally, given such a 
typology, it should be clear that differential 
evaluations of prevention and treatment pro- 
grams could be done in a way that is not now 
possible: type of program, type of drug 
user, and their interaction could be ex- 
amined. Thus, both in terms of pure under- 
standing and practical utility, additional re- 
search along the lines of the present study is 
needed. 


This study investigated the relative effectiveness of three therapeutic compo- 
nents common to behavioral marital therapies: procedures designed to change 
behavior, procedures to change attitudes, and nonspecific therapeutic effects. A 
hierarchical ordering of these components produced three treatment conditions 
(nonspecific, behavioral, and behavioral-attitudinal). Twenty-seven couples ex- 
periencing marital distress were randomly assigned to one of the three treatment 
conditions and one of five paraprofessional counselors. After four therapy ses- 
sions, the groups were compared on measures of self-reported satisfaction, daily 
reported pleasing (or displeasing) relationship events, and observations of com- 
munication skillfulness. All groups showed significant decreases in negative re- 
lationship behaviors. The behavioral-attitudinal group, compared to the other 
groups, showed significantly greater improvement in reported marital satisfac- 
tion, pleasing behaviors, and positive communication responses. 


Comprehensive reviews on the outcome of 
marital therapies present evidence supporting 
the efficacy of behavioral strategies to im- 
prove marital functioning but caution that 
much of the evidence is not based on rigorous 
empirical investigation (Gurman & Kniskern, 
in press; Jacobson, 1978a). The consist- 
ently positive findings on behavioral marital 
therapy come from series of uncontrolled 
case studies (e.g., Weiss, Hops, & Patterson, 
1973) and, more recently, from studies com- 
paring behavioral marital therapy to no- 
treatment controls (e.g., Jacobson, 1977). 
Studies comparing behavioral marital ther- 
apy to other theoretical approaches provide 
more equivocal results (e.g., Liberman, Le- 
vine, Wheeler, Sanders, & Wallace, 1976). In 
both controlled and comparative studies, the 
utility of behavioral marital approaches is 
evidenced more consistently on behavioral 
outcome measures than on traditional self-re- 
port measures (e.g., Harrell & Guerney, 
1976). 

Overall, the recent proliferation of out- 
come studies supports the use of behavioral 
strategies to increase marital accord, but the 
studies do not identify which ingredients of 
the behavioral treatment package are respon- 
sible for change. For the most part, investi- 
gators have applied multiple intervention 
procedures without using experimental de- 
signs that permit identification of the effica- 
cious treatment components. According to 
Jacobson and Martin (1976), communica- 
tion skill training and contingency contract- 
ing are the two therapeutic elements most 
commonly used in behavioral marital thera- 
pies. However, recent investigations by Ja- 
cobson (in press) and by Turkewitz and 
O'Leary (Note 1) suggest that communica- 
tion training, by itself, may be sufficient to 
produce marital improvement. The present 
study was designed as a components analy- 
sis to examine the relative efficacy of thera- 
peutic ingredients associated with communi- 
cation training in behavioral marital therapy. 

The present study served to isolate two 
features that are typically included in be- 
havioral marital therapies but are not identi- 
fied as the components that promote thera- 
peutic change. A nonspecific therapy group 
was included to isolate and control for non- 
specific therapeutic effects that accompany 
and perhaps enhance behavior change tech- 
nologies. In addition to controlling for stan- 
dard nonspecific effects, such as relationship 
and expectancy factors, this nonspecific 
group also controlled for the unintentional 
effects related to behavioral assessment pro- 
cedures. A recent study by Jacobson (1978b) 
is the only example in which improvement in 
behavioral therapy clearly is not a function 
of nonspecific effects: He found that couples 
who received a behavioral treatment im- 
proved significantly more on three out of 
four outcome measures than couples in a 
nonspecific control group. 

The present study also isolated cognitive 
restructuring as a potentially active compo- 
nent of behavioral marital approaches. A 
behavioral-attitudinal treatment was de- 
signed that intentionally included procedures 
to change attitudes as well as procedures to 
change behaviors. The purpose of the cogni- 
tive restructuring procedures was to help 
spouses reattribute the source of their mari- 
tal problems to “bad relationship skills” 
rather than “bad people” in the relationship. 
When spouses attribute marital difficulty to 
the personality of the other spouse, they tend 
to discount that person’s attempts to be re- 
inforcing and maintain their pessimistic im- 
pressions regarding the relationship (Mar- 
golin, Christensen, & Weiss, 1975). The 
cognitive restructuring procedures in this 
study were designed to help spouses to (a) 
be less blaming; (b) become more accepting 
and responsive to their partners’ efforts to be 
pleasing; and (c) realize that the goal of 
relationship improvement was mutual, rather 
than personal, gain. Rather than being a 
substitute for behavior change procedures, 
the cognitive restructuring procedures were 
intended to make it easier for spouses to 
engage in behavior change without losing 
face. 

The three treatment conditions in this 
study systematically introduced therapeutic 
components to determine the most efficacious 
combination of components: (a) nonspecific 
effects (NS condition); (b) behavioral train- 
ing plus nonspecific effects (BT condition); 
and (c) attitudinal restructuring plus be- 
havioral training and nonspecific effects (AB 
condition). It was predicted that couples in 
the AB condition would experience greater 
behavioral improvements in their relation- 
ships than those receiving the more standard 
BT treatment. It was also predicted that 
couples in treatment groups based on social 
learning theory (AB and BT) would experi- 
ence greater behavioral improvements in their 
relationships than couples in the NS group. 
Since nonspecific effects have been related to 
more superficial, as opposed to behavioral, 
changes, it was hypothesized that couples re- 
ceiving that treatment would make gains, 
perhaps even equal to the other groups, on 
the measures indicating self-reported marital 
well-being. 


Method 


Design 


This research is best conceptualized as an analogue 
outcome study, since the treatment offered was an 
intensified, abbreviated, and standardized version of 
naturalistic therapy. Treatment parameters not salient 
to the behavioral or cognitive components were con- 
stant across treatment groups, for example, number 
of sessions and amount of therapist contact. Spouses 
were seen conjointly at weekly intervals by their own 
individual therapist over a period of 4 weeks for a 
total of four 2-hour sessions. The fixed treatment 
sequence consisted of (a) a 1-week baseline evalua- 
tion; (b) a 2-week intervention phase; and (c) a 1-week 
posttreatment evaluation. Each couple was randomly 
assigned to one of the three treatment conditions and 
to one of five paraprofessional counselors for 4 weeks 
of couples counseling. 


Subjects 


Twenty-seven couples experiencing marital distress 
participated in the study. Although couples were re- 
cruited through publicity on radio, television, and in 
the local newspapers, acceptance into this project was 
based on evidence of marital distress. Marital distress 
was initially assessed by the experimenter in a tele- 
phone interview and a 1-hour intake session. In addi- 
tion, couples had to meet two of the following three 
criteria to be accepted into the project: (a) a mean 
spouse Locke-Wallace score < 100; (b) a total couple 
Areas of Change score > 15 major problem items; 
and (c) a therapist rating <4 based on an 8-point 
rating scale ranging from high marital distress to high 
marital adjustment. The mean spouse Locke-Wallace 
score of participating couples was 71.8 (SD = 22.5). 
Of the 31 couples who volunteered for the project 
and who completed screening procedures, 2 couples 
did not meet the criteria for marital distress, 1 couple 
declined treatment, and 1 refused treatment after the 
first session. Upon entering counseling, all couples paid 
a $50 refundable security deposit that was contingent 
on keeping scheduled appointments and completing 
homework assignments. The total deposit was re- 
funded to all except 1 of the 27 participating couples. 
Client couples varied substantially in age (range 
from 20 to 72 years; Mdn = 31.5) and income (range 
from $3,500 to $27,000; Mdn = $15,000), and had a 
relatively high education level. (For 19 couples, one 
or both spouses had obtained a college degree.) 
Seven of the couples were childless, but overall the 
sample averaged 1.4 children per family. For 11 of 
these couples, one or both partners had been engaged 
in prior therapy. Analyses of variance were used to 
examine possible group differences in age, length of 
marriage, number of children, and income; no sig- 
nificant between-group differences were found. 
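
The two-of-three acceptance rule described above can be expressed compactly. The sketch below uses the cutoffs stated in the text; the function and parameter names are hypothetical, and the example scores other than the reported sample mean of 71.8 are invented for illustration.

```python
def meets_distress_criteria(locke_wallace_mean, areas_of_change_total, therapist_rating):
    """Return True if a couple meets at least two of the three screening criteria.

    locke_wallace_mean:    mean spouse Locke-Wallace score (criterion: below 100)
    areas_of_change_total: total couple Areas of Change score (criterion: above 15)
    therapist_rating:      rating on the 8-point scale (criterion: below 4)
    """
    criteria = [
        locke_wallace_mean < 100,
        areas_of_change_total > 15,
        therapist_rating < 4,
    ]
    return sum(criteria) >= 2


# A couple at the reported sample mean with many desired changes is accepted
# even though the (hypothetical) therapist rating of 5 misses its cutoff.
accepted = meets_distress_criteria(71.8, 20, 5)
```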


Therapists 


Of the five persons serving as paraprofessional 
therapists, two had completed college and three were 
undergraduates. Since none of the therapists had 
previous experience in counseling couples, they 
underwent 40 hours of training before seeing their 
first cases and continued to participate in both group 
and individual supervision during the time they saw 
cases. All training and supervision was conducted by 
the first author. 


Measures of Treatment Validity 


Two measures were used to assess the validity of the major assumptions behind the 
design: that (a) treatment [...] and (b) [...] social learning theory applied [...] 
problems. A posttest treatment evaluation [...] was designed to measure satisfaction with [...], 
the first assumption. The Consultation Readiness In[...], consisting of multiple-choice questions fol- 
lowing brief vignettes of marital conflicts (similar to 
the Inventory of Marital Conflicts by Olson & Ryder, 
1970), was used to measure spouses’ knowledge of 
and capacities for applying behavioral principles. 
Spouses independently read the vignettes of hypothet- 
ical marital problems and identified the statement that 
best described factors contributing to each problem 
and/or strategies for remedying each problem. 


Measures of Treatment Outcome 


Multidimensional outcome criteria were used to de- 
termine how the separate therapeutic components 
contribute differentially to specific outcomes. Six dis- 
tinct measures were used: three self-report inven- 
tories, two daily reported observational indices from 
the home, and observational data from videotaped 
interactions in the laboratory. 

Self-report satisfaction inventories. Three ques- 
tionnaires were used to measure global marital satis- 
faction of spouses prior to the first session and after 
the final session. The Locke-Wallace Marital Adjust- 
ment Inventory (Locke & Wallace, 1959), which was 
chosen for its wide usage in traditional marital litera- 
ture, provides an index of marital compatibility and 
stability. The Areas of Change Questionnaire (Weiss 
et al., 1973) provides an index for measuring the 
amount of change that each spouse desires in the 
partner’s performance of specific relationship be- 
haviors. This 34-item inventory asks each respondent 
to indicate on a 7-point scale whether (a) she/he 
desires the spouse to change a particular behavior and 
(b) it would please the partner if the respondent were 
to change. Each spouse also completed the Adjective 
Check List (Gough & Heilbrun, 1965) indicating 
which adjectives accurately described the partner, 
thereby capturing the evaluative trait labels that 
spouses cognitively assign to one another. 

Daily reported home observations. Every evening 
throughout the 28 days of treatment, each spouse re- 
corded the number of pleasing and displeasing rela- 
tionship events that she/he received from the partner 
as well as the frequency of pleasant thoughts that 
she/he had about the partner. Pleasing and displeas- 
ing behavioral events were recorded by means of the 
Spouse Observation Checklist (SOC; Weiss et al., 
1973; Weiss & Margolin, 1977), which provides 400 
sample relationship behaviors representative of 22 
areas of marital functioning (e.g., communication, 
affection, companionship). The specific behaviors have 
either a pleasing (e.g., “Spouse gave me a massage”) 
or displeasing (e.g., “Spouse criticized me”) effect on 
the recipient. Each spouse also recorded the occur- 
rence of pleasant thoughts about the partner on a 
tracking card. 

Laboratory observational data. During the first 
and last sessions, each couple engaged in two 10-min- 
ute negotiation sessions during which the spouses 
attempted to solve a relationship problem without any 
interruptions or assistance from the therapist. Topics 
for both negotiation sessions were chosen by the 
couple before they began the discussions. Therapist 
instructions specified that the couple was to act as 
they would normally and to problem solve as best 
they could. These negotiation sessions were videotaped 
and then coded by trained observers using the Marital 
Interaction Coding System (MICS; Hops, Wills, Pat- 
terson, & Weiss, Note 2), a 29-category observational 
system that provides a sequential accounting of verbal 
and nonverbal communication processes. Validation 
of the MICS comes both through treatment studies, 
which demonstrated that the MICS was moderately 
sensitive in discriminating preintervention to post- 
intervention changes (Patterson, Hops, & Weiss, 
1975), and analogous studies, which demonstrated 
that the MICS accurately discriminated couples who 
defined themselves as distressed or nondistressed 
(Birchler, Weiss, & Vincent, 1975). MICS coders were 
undergraduates, who were trained for a total of 20 
hours or until they reached the minimum criterion of 
70% reliability. Coders worked in pairs, and two 
pairs coded each 10-minute segment. To calculate 
percentage of agreement, one coder was randomly 
chosen to be “calibrator.” The total frequency of 
agreements between coders was divided by the total 
number of codes recorded by the calibrator, that is, 
agreements plus disagreements. This quotient was then 
multiplied by 100. Any observation that did not reach 
the criterion of 70% reliability was recoded by an- 
other coder pair. High interobserver reliability was 
maintained by once per week random spot checks 
on individual coders. 
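
The percentage-of-agreement calculation can be sketched as follows. The sketch assumes that the two coders' records have already been aligned position by position, which the text does not specify, and the two-letter code labels are placeholders rather than actual MICS categories.

```python
RELIABILITY_CRITERION = 70.0  # minimum percentage agreement stated in the text


def percent_agreement(calibrator_codes, other_codes):
    """Agreements divided by the calibrator's total codes, times 100.

    Assumes both sequences are aligned by position (an assumption of this
    sketch). Codes the second coder did not record count as disagreements.
    """
    if not calibrator_codes:
        raise ValueError("calibrator recorded no codes")
    agreements = sum(1 for c, o in zip(calibrator_codes, other_codes) if c == o)
    return 100.0 * agreements / len(calibrator_codes)


# Placeholder code labels for one 10-minute segment.
calibrator = ["AG", "CR", "PS", "AG", "IN", "CR", "PS", "AG", "AG", "CR"]
second = ["AG", "CR", "PS", "DG", "IN", "CR", "PS", "AG", "CR", "PS"]

agreement = percent_agreement(calibrator, second)
needs_recoding = agreement < RELIABILITY_CRITERION
```

Because the denominator is the calibrator's record alone, the index is not symmetric: choosing the other coder as calibrator could give a different percentage, which is presumably why the calibrator was selected at random.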


Intervention 


Communication training. The main therapeutic 
purpose of all three treatment conditions was to im- 
prove communication skills. The general structure 
used by all groups consisted of 10-minute negotiation 
discussions, during which spouses attempted to re- 
solve a conflictual issue, and 10-minute feedback 
periods, during which spouses shared reactions to the 
process of the preceding negotiation discussion. Each 
2-hour therapy session included four negotiation ses- 
sions interspersed with four feedback discussions. 
Specific procedures used during the negotiation and 
feedback discussions varied according to group assign- 
ment. 

Communication training for the BT group incorporated
principles common to many behavioral marital
treatments, namely that spouses must (a) explicitly
define (pinpoint) specific behaviors that the
partner is to accelerate and then (b) faithfully reinforce
the occurrence of these behaviors. The specific
behaviors to be increased were “helpful” communications,
which were operationally defined by the partner.
During each negotiation discussion, one spouse
was designated as the “sender” and the other as the
“receiver” of helpful communications. While attempting
to resolve a problem issue, BT couples used an
electromechanical apparatus to increase the rate at
which the sender emitted helpful responses.

The electromechanical apparatus consisted of (a)
a raucous buzzer functioning as a negatively reinforcing
stimulus and (b) a pleasant single-tone chime
functioning as a reward. The buzzer was activated by
a recycling interval timer, which timed an interval
of 1.75 minutes. The spouse designated as receiver
coded, by means of a silent hand-held button, each
time the sender emitted a helpful response. Each coding
of a response (a) activated the pleasant chime
signaling to the sender that she/he had been helpful
and (b) reset the timer to postpone the buzzer. If
the 1.75-minute interval expired without a helpful response,
the timer activated the buzzer, signaling to the
sender that she/he must emit a response to terminate
the buzzer. Spouses alternated roles for each new
10-minute negotiation discussion. During the feedback
periods, the receiver for the previous discussion identified
which behaviors the sender was to accelerate.
The sender then rehearsed unfamiliar responses before
his/her next trial as sender. In these BT procedures,
each spouse identified what the partner could do to
increase communication helpfulness.
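The chime-and-buzzer contingency described above can be sketched as a small simulation. This is a minimal illustration under stated assumptions, not a description of the actual hardware: the function name and the example press times are hypothetical, and the sketch assumes that once the buzzer starts it sounds continuously until the next coded response (or until the discussion ends).

```python
def simulate_session(press_times, interval=105.0, session_length=600.0):
    """Simulate one negotiation discussion.

    press_times    -- seconds at which the receiver coded a helpful response
    interval       -- timer interval in seconds (1.75 minutes in the text)
    session_length -- length of the discussion in seconds (10 minutes)

    Returns (chime_count, buzzer_onsets, buzzer_seconds).
    """
    chimes = 0
    buzzer_onsets = []
    buzzer_seconds = 0.0
    last_reset = 0.0  # the timer is assumed to start with the discussion

    for t in sorted(press_times):
        if t < 0 or t > session_length:
            continue  # ignore presses outside the discussion
        deadline = last_reset + interval
        if t > deadline:
            # The interval expired first: the buzzer ran until this response.
            buzzer_onsets.append(deadline)
            buzzer_seconds += t - deadline
        chimes += 1      # every coded response sounds the chime
        last_reset = t   # and resets the timer

    deadline = last_reset + interval
    if deadline < session_length:
        # No further response: the buzzer runs to the end of the discussion.
        buzzer_onsets.append(deadline)
        buzzer_seconds += session_length - deadline

    return chimes, buzzer_onsets, buzzer_seconds


if __name__ == "__main__":
    # Hypothetical press times, in seconds.
    print(simulate_session([30, 100, 300, 350]))
```

With these hypothetical presses the sender hears four chimes and two buzzer episodes, which shows how a higher rate of coded helpful responses keeps the buzzer silent.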

In the AB condition, spouses also defined behaviors
that they wanted accelerated. However, only those
behaviors agreed on by both partners were accepted
as targets for change. This strategy was adopted to
avoid the problems that arise when one spouse demands
a change that the other is reluctant to make.
In a manner similar to BT couples, AB couples used
the electromechanical apparatus and alternated between
sender and receiver roles. The difference, however,
was that AB couples worked to increase the
rate at which they agreed on the coding of helpful
responses rather than to increase one spouse’s output
of these responses (cf. Margolin & Weiss, in press).
In the AB condition, both spouses simultaneously
coded helpfulness. The sender coded helpful behaviors
that she/he emitted, and the receiver coded the
sender’s behaviors that were perceived as helpful. The
goal of the exercise was to increase sender-receiver
agreements, which were defined as simultaneous responding
by both partners within a 2-sec interval.
Each of these agreements activated the chime and
reset the buzzer. During feedback periods, AB spouses
identified which communication behaviors they had
jointly defined as helpful and then pinpointed and
rehearsed these behaviors. Success on this task was
labeled as couple agreement; that is, the spouses held
congruent perceptions about the process of their communication
even though they were still in conflict
over content issues.
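The agreement rule for the AB condition (both partners responding within a 2-sec interval) can be illustrated with a simple matching routine. The sketch below is an assumption-laden illustration: the function name and the press times are hypothetical, and "within a 2-sec interval" is read here as an absolute difference of at most 2 seconds between a sender press and a receiver press, with each press counted in at most one agreement.

```python
def find_agreements(sender_presses, receiver_presses, window=2.0):
    """Pair sender and receiver presses that fall within `window` seconds.

    Both inputs are lists of press times in seconds. Each press is used in
    at most one pair; pairing proceeds in time order.
    """
    s = sorted(sender_presses)
    r = sorted(receiver_presses)
    i = j = 0
    pairs = []
    while i < len(s) and j < len(r):
        if abs(s[i] - r[j]) <= window:
            pairs.append((s[i], r[j]))  # an agreement: chime and timer reset
            i += 1
            j += 1
        elif s[i] < r[j]:
            i += 1  # sender press with no matching receiver press
        else:
            j += 1  # receiver press with no matching sender press
    return pairs


if __name__ == "__main__":
    # Hypothetical press times, in seconds.
    print(find_agreements([10, 50, 120], [11.5, 80, 121, 200]))
```

Only the matched pairs would activate the chime and reset the buzzer under this reading; unmatched presses by either partner would have no effect.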
followed a similar 
d feedback dis- 
training, Dur- 
spouses did not use 
us, nor were they 
they engaged in 
e discussions that focused on a 


relationship problem. 
the therapist encourage 
ing statements and discuss percep! 
aged prolonged discharge of pent-up emotion. Rather 
than helping spouses to define specific behaviors, the 
therapist directed his/her activity toward the reflec- 
tion and acceptance of each spouse's feelings. 
Increasing the pleasing behaviors. The secondary 
therapeutic purpose across treatment conditions was 
to generate immediate gratifications for couples to 
offset the tedium of communication skill training. 
This treatment goal was achieved through homework 
assignments using the SOC. NS couples continued to 
use the SOC in the same manner as all couples had
done during the assessment phase. Each evening, NS 
spouses indicated which pleasing and displeasing items 
had occurred during the previous 24-hour period. In 
addition to tracking those events, each spouse in the 
BT group pinpointed desirable events on the SOC 
and worked toward a 100% increase in his/her output 
of those events that the partner had identified as 
pleasing. AB couples first agreed on a subset of events 
that they both identified as pleasing and then worked 
as a team toward a 100% increase in the identified 
couple pleasing behaviors. 

Reading assignments. Assignment of the following 
readings lent rationale and credibility to the proce- 
dures for each group: NS couples read a chapter 
from Lederer and Jackson’s (1968) Mirages of Mar- 
riage about communication in a marital relationship. 
BT couples read Chapter 2 from a manuscript by 
Weiss and Ford (Note 3) that defines behavioral 
principles, such as pinpointing, reinforcement, and
shaping, and demonstrates how these principles apply
to marital interactions. AB couples read Chapters 1 
and 2 of the Weiss and Ford manuscript; Chapter 1 
describes how unsatisfying relationships are a function 
of poor relationship habits rather than either spouse’s 
bad intentions. 

Cognitive restructuring. In addition to the other 
therapeutic components, AB treatment included pro- 
cedures to modify spouses’ cognitions regarding the 
source of their marital discontent. The cognitive re- 
structuring component, which was conveyed through 
therapist explanations and readings, contained the 
following messages: (a) Blame does not reside with
one or the other spouse; (b) both persons suffer as
participants in a dysfunctional relationship; and (c)
frustration from feeling powerless to improve the
relationship is often confused with resentment toward
the spouse. Spouses were encouraged to apply nonblaming
explanations to situations in which the
partner’s behavior was perceived as undesirable.


Results 
Validity of Treatment 


Two inventories completed at the conclusion
of treatment provided an internal validity
check on whether the treatment conditions
captured the intent of the experimental
design. A one-way analysis of variance of
the Posttreatment Evaluation Questionnaire,
which measured consumer satisfaction, revealed
no group differences in how couples
perceived or evaluated the different treatment
modalities. The Consultation Readiness Inventory
provided information on the dual
questions of (a) whether couples in both behavioral
groups were adequately instructed
in social learning theory principles and (b)
whether, in fact, couples in the nonspecific
group were privy to the same social learning
theory information provided to the two behavioral
groups. Findings on this inventory
also confirmed design expectations. Mean
item scores for behavioral (22.44) and
behavioral-attitudinal (22.11) couples were
significantly higher than the scores for nonspecific
couples (12.22), F(2, 24) = 11.06,
p < .01.


Outcome Criteria 


Each outcome variable was analyzed for
within-group and between-groups changes.
Separate matched-pair t tests were used for
within-group analyses to determine pretreatment
to posttreatment changes for each
group on each variable. Given the large number
of analyses generated by this statistical
approach, any particular test meeting the
.05 level of statistical significance must be
interpreted cautiously. The relative effectiveness
of different treatment conditions was
examined by between-groups analyses. Covariate
procedures were used on all between-groups
analyses to control for differences in
prescores on each measure. Planned-comparison
t tests on the adjusted posttreatment
means were used when between-groups differences
had been predicted; analysis of covariance
was applied when there were no
such predictions. In each statistical analysis,
couples were analyzed as units; couples’
scores were either the sum or the average of
husband-wife scores.

Self-report satisfaction inventories. Table
1 presents the results for the four self-report
measures. Within-group effects were measured
by comparing pretreatment and posttreatment
scores, obtained prior to therapist
contact and after the posttreatment assessment
week; t values are presented for pretreatment
to posttreatment comparisons. Between-groups
effects were examined through
the adjusted posttreatment means.

The within-group marital satisfaction
scores on the Locke-Wallace revealed that
both the AB and the NS groups reported significantly
increased satisfaction. In addition,
significant between-groups differences were
found for adjusted posttreatment means,
F(2, 23) = 3.59, p < .05; a post hoc Scheffé
analysis on these data supported the conclusion
that the AB group’s posttreatment
scores were significantly higher (p < .05)
than those of the other two treatment conditions.

A similar pattern of results was obtained
for the Areas of Change Questionnaire: Only
the AB group significantly reduced total conflict
scores from pretreatment to posttreatment
measurements. Since all groups improved,
between-groups differences were not
significant.

The Adjective Check List results were examined
by separate analyses of mean frequencies
of checked positive and negative
adjectives. The AB group significantly reduced
their mean frequency of negative labels,
although the mean frequency of positive
labels remained the same. The BT group reduced
the frequency of both negative and
positive adjectives, perhaps reflecting a tendency
to use all trait labels less often.
Couples in the NS group did not change
from baseline rates on either adjective measure.
No between-groups differences were
found for mean frequencies of either positive
or negative adjectives checked.

Overall, the AB group, relative to the
other groups, consistently demonstrated a
greater degree of improvement on the three
self-report measures. Contrary to prediction,
reported marital satisfaction, as evidenced on
the Locke-Wallace, increased in only two of
the three groups.

Table 1

                                   Within group                       Between-groups
                       Pretreatment     Posttreatment                 adjusted
Dependent variable       M      SD        M      SD     Pre-post t    posttreatment M

Locke-Wallaceᵃ ᵇ
  BT                   75.05  22.22     80.27  22.84     -1.10        [illegible]
  AB                   63.22  14.63     87.77  27.15     -3.99***     [illegible]
  NS                   77.05  21.38     84.16  24.16     -2.89**      78.55
Areas of changeᶜ ᵈ
  BT                   11.72   3.09      9.86   5.11      1.50        10.06
  AB                   13.44   5.19      7.22   4.93      3.25**       6.30
  NS                   11.00   3.67      9.66   4.93      1.34        10.37
ACL positive adjectivesᵈ
  BT                   66.22  14.92     53.44  17.39      4.48†       42.30
  AB                   45.55  20.40     45.88  23.49      -.26        54.74
  NS                   51.66  19.61     49.77  21.33       .48        52.06
ACL negative adjectivesᶜ ᵈ
  BT                   15.88   8.22     13.66   8.57      2.79*       14.55
  AB                   21.00   7.87     12.88   8.05     [illegible]   9.97
  NS                   14.33   7.28     12.77  10.68     [illegible]  11.41

Note. BT = behavioral training plus nonspecific effects; AB = attitudinal restructuring plus behavioral
training and nonspecific effects; NS = nonspecific effects; ACL = Adjective Check List.
ᵃ Average of husband-wife scores.
ᵇ AB is significantly higher than BT and NS scores.
ᶜ Low score = high marital accord.
ᵈ Sum of husband-wife scores.
† Change occurred in direction opposite to prediction.
* p < .025, one-tailed.
** p < .01, one-tailed.
*** p < .005, one-tailed.
Daily reported home observations. Spouses
recorded daily pleasing and displeasing
events received from the partner, that is,
SOC items, as well as pleasant thoughts
Table 2
Treatment Effects on Measures of Mean Daily Recorded Home Data (Frequency of Items)

                                        Within-group                                  Between-groups
                     Pretreatment     Treatment      Posttreatment                    adjusted
Dependent variable     M      SD      M      SD       M      SD     Pre-post t        posttreatment M

SOC pleasesᵃ
  BT                 25.35  10.02   36.48  20.87    36.57  24.85     -1.45            [illegible]
  ABᵇ                26.31  12.38   37.78  13.99    41.96  15.22     -4.74            [illegible]
  NS                 25.73  16.18   21.95  14.08    24.13  14.58       .55            [illegible]
SOC displeases
  BT                  5.41   4.75    4.28   2.77     1.94   1.90      2.52*           [illegible]
  AB                  5.01   3.45    3.35    .81     3.29   2.28      2.54            3.1
  NS                  5.67   4.64    3.90   2.38     2.10   2.12     [illegible]      2.00
Pleasant thoughts
  BT                  4.79   2.68    5.02   3.06     5.31   2.96     -1.17            4.74
  AB                  3.68   1.47    4.33   2.69     5.18   2.04     -2.37*           5.46
  NS                  3.77   4.16    3.97   3.76     4.06   4.36      -.38            4.13

Note. BT = behavioral training plus nonspecific effects; AB = attitudinal restructuring plus behavioral
training and nonspecific effects; NS = nonspecific effects; SOC = Spouse Observation Checklist.
ᵃ BT and AB scores are significantly higher than NS scores.
ᵇ Home observation data were missing for one AB couple since the husband was scheduled to leave for
reserve duty immediately upon completing the 2-week treatment.
* p < .025, one-tailed.
** p < .01, one-tailed.
*** p < .005, one-tailed.

about that person. The AB and BT groups
were expected to demonstrate greater improvements
than the NS group, and the AB
group was expected to display greater improvement
than the BT group. Separate
planned-comparison t tests were used to analyze
these directional predictions.

Table 2 presents mean daily spouse pleasing,
displeasing, and pleasant thought totals.
Since the data were collected throughout
treatment, the table also includes midtreatment
scores averaged across the 14-day intervention.
The pretreatment and posttreatment
means are for 5 and 7 days, respectively,
of data collection.² Pleasing events


d as an intervention 


This measure assesg 
had followed the i 
pleasing events duri 


drawn, 


During intervention both behaviorally
oriented groups increased their mean daily
rates of pleases approximately 43% over
their baseline rates. At posttreatment assessment,
the AB group further increased
its mean rate of pleasing behaviors, whereas
the BT group merely maintained its intervention
rate. Only the AB group demonstrated
significant pretreatment to posttreatment
change in rates of pleasing behaviors.
Planned-comparison t tests confirmed
the prediction that both behavioral groups
would exchange more pleasing events than
the NS group, t(23) = 2.15, p < .025. Examination
of rates of displeasing behaviors
revealed results unlike those for pleasing behaviors;
the mean daily rate of displeasing
behavior decreased significantly for all three
groups.

The cognitive restructuring procedures
were expected to produce significant increases
in spouse-related pleasant thoughts.
The AB group confirmed this expectation.
Even though the other groups showed


² Data from the first 2 days of pretreatment were
discarded because of their excessive variability across
groups.
changes indicating improvement, they were
not significant.

Results of the daily recorded home data
suggest that pleasing and displeasing behaviors
were differentially affected by the
treatment components. All groups displayed
decreases in displeasing behaviors that were
not even the target of any specific intervention
technique. Yet increases in pleasing behaviors
occurred only when there was a direct
intervention to change those responses;
merely tracking the occurrence/nonoccurrence
of pleasing behaviors was not associated
with changes in reported frequency of
those behaviors.
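The "approximately 43% over baseline" figure for pleases can be checked directly against the pretreatment and treatment-phase means reported in Table 2. The short sketch below is only an arithmetic illustration; the function name is an assumption, and the means are those printed in the table.

```python
def percent_change(baseline: float, later: float) -> float:
    """Change from baseline expressed as a percentage of baseline."""
    return 100.0 * (later - baseline) / baseline


# Mean daily SOC pleases from Table 2: (pretreatment, treatment phase).
pleases = {
    "BT": (25.35, 36.48),
    "AB": (26.31, 37.78),
    "NS": (25.73, 21.95),
}

if __name__ == "__main__":
    for group, (pre, mid) in pleases.items():
        print(group, round(percent_change(pre, mid), 1))
```

Both behaviorally oriented groups come out near 44%, in line with the rounded figure in the text, while the NS group's treatment-phase mean falls below its baseline.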

Laboratory observational data. Communication
skillfulness was assessed by MICS
coding of two pretreatment and two posttreatment
10-minute videotaped samples of
each couple negotiating a relationship problem.
Average point-by-point observer agreement
for the 108 coded samples was 83.8%.
Coding was reduced to two summary scores
for purposes of data analysis: positive responses
(agree, approval, accept responsibility,
compromise, humor, problem solution,
attention, assent, laugh, positive physical
contact, and smile) and negative responses
(complaint, criticize, deny responsibility, excuse,
no response, not tracking, put down,
and turn off). Neutral behaviors such as
unfocused problem description were not included
in the analyses. To control for differences
in individual activity rates, each of
these summary categories was expressed as
the proportion of total responsiveness; individual
husband and wife proportions were
averaged for the couple score.

Table 3 provides pretreatment, posttreatment,
and adjusted mean proportion scores
for MICS coded positive and negative responses.
Only the AB group significantly increased
positive communication skills between
pretreatment and posttreatment measurements.
The mean increase in the positive
response proportion was approximately 35%
over baseline for the AB group, as compared
to 9% and 20% increases for the BT and
NS groups, respectively. A planned-comparison
t test for between-groups measures revealed
that the AB group increased significantly
more than the BT group.

The between-groups differences found with
the positive communication score were not
paralleled with the negative communication
score. Regardless of treatment mode, all
groups reduced their mean proportion of
negative behaviors to less than 50% of their
baseline levels. The magnitude of these reductions
was comparable for all treatment
groups.

These results are consistent with the findings
for SOC pleasing and displeasing behaviors;
the between-groups difference for
increases in desirable behaviors demonstrates
strong support for the inclusion of the additional
features of attitudinal restructuring.
Reductions in negative responses do not appear
to be related to specific therapeutic
procedures, since all treatments were equally
effective in producing a decrease in these
behaviors.

Table 3
Treatment Effects on Laboratory Observational Data (Mean % Positive and Negative MICS Scores)

                                   Within group                       Between-groups
                       Pretreatment     Posttreatment                 adjusted
Dependent variable       M      SD        M      SD     Pre-post t    posttreatment M

Proportion positive
  BT                   38.51  12.56     41.99   7.81    [illegible]   [illegible]
  ABᵃ                  35.29   8.44     47.71   5.38    [illegible]   41.66
  NS                   34.01  11.73     40.83   6.67     -1.83        [illegible]
Proportion negative
  BT                    9.33  [illegible] 3.15  4.18      1.86*       [illegible]
  AB                    8.80   6.60      3.49   2.82    [illegible]    3.91
  NS                   11.30   6.30      5.50   4.38      4.21        [illegible]

Note. BT = behavioral training plus nonspecific effects; AB = attitudinal restructuring plus behavioral
training and nonspecific effects; NS = nonspecific effects.
ᵃ AB score is significantly higher than the BT score.
* p < .05, one-tailed.
** p < .025, one-tailed.
*** p < .005, one-tailed.
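The scoring of the MICS summary categories (each category as a proportion of total responsiveness, then averaged across husband and wife) can be sketched as follows. This is an illustration under stated assumptions: the function names and the response counts are hypothetical, and "total responsiveness" is taken here to include neutral codes in the denominator, which the text does not state explicitly.

```python
def response_proportions(positive: int, negative: int, neutral: int):
    """Return (% positive, % negative) of one spouse's total coded responses."""
    total = positive + negative + neutral  # assumed definition of total responsiveness
    if total == 0:
        raise ValueError("no coded responses")
    return 100.0 * positive / total, 100.0 * negative / total


def couple_score(husband, wife):
    """Average the husband's and wife's proportions for the couple score."""
    return tuple((h + w) / 2 for h, w in zip(husband, wife))


if __name__ == "__main__":
    # Hypothetical response counts for one 10-minute sample.
    husband = response_proportions(positive=40, negative=10, neutral=50)
    wife = response_proportions(positive=30, negative=20, neutral=50)
    print(couple_score(husband, wife))
```

Expressing each category as a proportion keeps a talkative spouse from dominating the couple score, which is the stated reason for controlling individual activity rates.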


Summary of Results 


Overall, nine outcome measures spanning
three modes of observation figured into the
within-group and between-groups analyses.
Out of the nine within-group t tests, the AB,
BT, and NS groups showed significant
changes in the predicted direction on 8, 3,
and 3 measures, respectively. Significant between-groups
differences were found on the
following three outcome criteria: Locke-Wallace
scores, SOC pleasing behaviors, and
MICS positive behaviors. The AB group displayed
significantly greater improvement
than the BT group on the first and third
measure and greater improvement than the
NS group on the first measure. Thus, the AB
group demonstrated significantly greater
improvement than one or both of the other
groups on at least one measure in each assessment
mode.


Discussion 


This study presented a method for implementing
a therapeutic intervention for
marital distress while maintaining careful
control over the application of specific treatment
procedures. Although a representative
clinical sample was used, the following features
of the study limit its applicability to
naturalistic behavior marital therapy: (a)
a highly abbreviated intervention period; (b)
inexperienced paraprofessional therapists;
and (c) highly standardized procedures. Because
of the abbreviated nature of the intervention,
the results of this study reflect
changes that would occur during the initial
stage of a standard course of therapy. It is
as yet undetermined whether the demonstrated
improvement would persist over time.
Due to clinical considerations, follow-up data
were not collected to examine the maintenance
of treatment effects. Since couples in
this study received only a portion of a complete
treatment, they were offered additional
counseling through a marital treatment
group. Follow-up data on the original study
would have been confounded by this additional
treatment, since only a portion of the
couples participated in these groups and the
participating couples joined the groups at
different times after completing the original
treatment.

The purpose of this study was to explore
how specific therapeutic components differentially
affect the initial stage of behavioral
marital therapy. The results of this study
revealed two predominant trends. Couples in
the AB group, who received all three of the
identified therapeutic components, consistently
demonstrated greater improvement
than couples in the other treatment groups.
However, this finding applied only to criteria
measuring change in a positive direction,
that is, the measurement of attained
relationship benefits. Criteria measuring dysfunctional
coupling behaviors revealed significant
reduction for all treatment groups;
the treatment groups were not differentially
effective in lessening problematic relationship
behaviors.

There are several explanations to account
for the reduction in negative behaviors displayed
by all treatment groups. First, it is
possible that demand characteristics associated
with entering counseling heighten the
level of negativity displayed at that time.
Negative MICS behaviors would be high if
spouses entered therapy with the notion that
therapy is a time to argue and air grievances.
The negative behaviors might then lessen
over the weeks if the therapist did not reinforce
these behaviors or if the spouses themselves
grew tired of these unproductive patterns.
Likewise, SOC displeasing behavior
would be high if spouses used that instrument
as a vehicle to demonstrate to the
therapist the faults of the partner. SOC displeasing
behavior might also decrease once
the therapist communicated that she/he understood
what each spouse was experiencing.

Second, since the NS group received the
same assessment procedures as the
treatment groups, the reductions in negative
behaviors displayed across all three groups
may be indicative of clients’ reactivity to the
behavioral assessment procedures (cf. Johnson
& Bolstad, 1973; Jones, Reid, & Patterson,
1975). That is, spouses’ knowledge that
their behaviors were being observed may
have motivated them to reduce their negative
behaviors in an effort to avoid censure
by the partner or therapist. Unfortunately,
reactivity to observations by the partner (on
the SOC) or the therapist (through the
MICS) cannot be distinguished from other
nonspecific effects. It is interesting to note,
however, that the data do not support an
interpretation of similar reactivity in the
positive direction; that is, spouses in the NS
group did not increase positive or facilitative
behaviors.

In light of more positive outcome literature
on behavioral marital therapies (e.g.,
Jacobson, 1977; Weiss et al., 1973), what
accounts for the BT group's relative ineffectiveness
in this study? The brevity of the
treatment may be one explanation; although
the cognitive component appears to generate
more rapid improvement, standard BT therapy
might produce the same changes over a
normal course of therapy. Second, the experimental
design of this study excluded all cognitive
restructuring factors from the BT
group. Perhaps the systematic elimination
of cognitive restructuring features in the BT
treatment actually removed a component that
is typically included in behavior therapy under
the rubric of reeducation or therapist
persuasiveness. Most writings about behavioral
treatments emphasize specific behavioral
techniques and take for granted
the procedures used to help clients generate
a cognitive set that facilitates behavior
change. Even though this study illustrates
the pitfalls associated with behavioral training
conducted in the absence of adequate
cognitive preparation, more careful description
of therapeutic procedures is needed to
determine the extent to which behavior therapists
currently use cognitive restructuring
procedures.

Overall, the results of the present study
suggest that the combination of the three
identified therapeutic components in the AB
group provided an effective method to control
aversive exchanges as well as to increase
rewarding marital interactions.
These results pose the question as to which
aspect of the cognitive restructuring component
made the AB treatment more effective
than the BT treatment. The two cognitive
restructuring features that differentiated the
AB from the BT treatment were (a) the
modification of therapeutic exercises to focus
on mutual rather than individual goals and
(b) the reattribution message conveyed
through readings and therapist instructions.
Anecdotal feedback from clients suggested
that the focus on mutual goals was helpful
for distressed spouses who were reluctant to
engage in behavior changes requested by the
partner. The mutual goal setting allowed each
spouse to exercise more control in defining
the behaviors that she/he was to change.
However, for spouses who share very few
common goals, the mutual focus may limit
the scope of therapeutic change and impede
overall progress.

Cognitive restructuring within a marital
therapy framework provides the most innovative
aspect of the AB treatment. Recently,
a number of investigators working on a variety
of problem areas (cf. Mahoney, 1974;
Meichenbaum, 1977) have used cognitive restructuring
procedures to reduce the incapacitating
emotions that tend to accompany dysfunctional
behaviors, making those behaviors
resistant to change. The cognitive component
in this study was an initial attempt at using
cognitive restructuring procedures to interrupt
the negative thoughts that can impede
relationship improvement. The therapeutic
procedures encouraged spouses to abandon
blaming attributions, to accept greater personal
responsibility for relationship failure,
and to be more accepting of their partners’
positive efforts. This approach, which is
somewhat similar to Ellis’ (1962) rational-emotive
psychotherapy, was expected to decrease
the likelihood of mislabeling ambiguous
behaviors or overresponding to negative
relationship exchanges. The reattribution
procedures in this study were presented in a
standardized fashion that disallowed their
being tailor-made to a couple’s particular
needs. Furthermore, no attempt was made to
rehearse the cognitive restructuring materials
or to monitor how spouses used them. Nonetheless,
the encouraging findings for the AB
treatment strongly support the application
of cognitive restructuring procedures to reduce
spouses’ blaming attributions and to
foster prerequisite attitudes for constructive
problem solving.
Reade and Wertheimer (1976) reported an
analogue diagnostic study in which a brief case
description was given to two groups of clinical
judges who rated the probability that the fictitious
patient was schizophrenic. The two case
descriptions differed only in their second sentence,
which stated either that “Mr. S has an
identical twin, who 7 years ago joined the army
and has visited home only once for a 2-week
stay” (p. 878), or “Mr. S has an identical
twin, who 4 years ago was placed in a mental
Reset with a diagnosis of simple schizo- 
ae (p. 878). The remainder of the case 
RA lescribed parental roles (father was domi- 
Save and quick-tempered; mother was at- 
a ad spoiling), emphasized the patient’s 
with aa erona] relations (introverted, shy 
i a tls, few close relationships), and referred 
its possibly symptomatic behaviors (refus- 
his pas to his father for several days after 
“job, a a berated him for not looking for a 
i ma 7: and “a slow but pronounced change 
neglect pee which included untidiness, 
havior Personal appearance, “strange” be- 
change of his father’s funeral, and frequent 
malingeri jobs due to poor peer relations and 
S had a The clinicians who read that Mr. 
mean probar paren identical twin rated a 
aS the bability of schizophrenia of .66, where- 

Clinicians who read that Mr. S’s twin 
Comments


Bias or Artifact? A Reply to Reade and Wertheimer


David J. McDowell
University of Maine and Worcester State Hospital, Worcester, Massachusetts 


In a recent Journal of Consulting and Clinical Psychology brief report, Reade 
and Wertheimer reported an analogue study of the diagnosis of schizophrenia 
and inferred from their data that diagnosticians’ judgments of the probability 
of schizophrenia are biased inappropriately by the information that a patient 
has an identical twin diagnosed as schizophrenic. In rebuttal, it is argued that 
the procedures used in that study compelled the diagnostic behavior that the 
authors criticized and that the implicit generalization from this analogue study 
to reasonable clinical practice was unjustified. 


had joined the army rated a mean probability 
of schizophrenia of .39. The two groups’ ratings 
differed significantly (p<.01) by a Mann- 
Whitney U test, and the authors concluded that 
the information that Mr. S had a twin diagnosed 
as schizophrenic was responsible for this dif- 
ference. 

Should such information increase the clinician’s
likelihood of diagnosing a patient as schizophrenic?
Reade and Wertheimer (1976) argued
that diagnosis “should ideally be based on an
analysis of individuals’ . . . behavior . . . [and
not on the] knowledge that a predisposition to
schizophrenia may be inherited.” They concluded
that

“prior to availing themselves of a family history,
diagnosticians [should] attempt to obtain relatively
conclusive behavioral information, lest they inadvertently
increase the likelihood of a diagnosis of psychopathology
where none exists” (p. 878).


Although this argument and conclusion are
sound, neither is relevant to the data reported.


In this study, clinicians (a) were not permitted
to observe the patient's behavior; (b)
were given pertinent family history
instead of good behavioral records; and (c)
were asked to indicate the probability of a
specific diagnosis rather than make a
diagnosis in a free-response format. The procedures
used in the study thus compelled the
clinical-diagnostic behavior that the investigators
then decry!

What the reported data do indicate is that
in the absence of direct observation or good behavioral
records, and with an extremely restrictive
set of data from which to assess the probability
of a specified diagnosis, these clinicians
attended to a relevant family history variable
that has been shown to correlate significantly
with the diagnosis in question. Reade and Wertheimer
implied that the diagnostic judgment of
these clinicians was biased by nonbehavioral data.
On the contrary, their experiment demonstrates
(a) that clinicians are aware of the significant
concordance rates for schizophrenic identical twins
and their undiagnosed probands and (b) that
the analogue study of diagnostic judgment may
differ sufficiently from reasonable clinical practice
that generalization to the latter is unjustified.
In a recent article in this journal, Amante,
Van Houten, Grieve, Bader, and Margules
(1977) described the racial and socioeconomic
differences obtained on commonly administered
tests of perceptual functioning. Although the
article raises some important and interesting
issues concerning environmental factors related
to neuropsychological functioning, it unfortunately
stumbles into the logical abyss occupied
by much of the literature concerning learning
disabilities and neurological deficits.

The authors tested

“a random sample of 225 third-grade public and parochial
school children of both sexes representing a
cross-section of subjects drawn from all of the major
ethnic groups and socioeconomic levels” (Amante et
al., 1977, p. 525)

present in their community. The results indicated
that between 40% and 60% of the sample
exhibited “evidence” of perceptual-motor, visual-perception,
and auditory discrimination handicaps
as measured by the Bender-Gestalt Test,
the Frostig Developmental Test of Visual Perception,
and the Wepman Auditory Discrimination
Test, respectively. The authors then proceeded
to interpret such results as indicating
neuropsychological deficits and discussed the
implications of the ethnic and socioeconomic
differences.

I wonder how psychologists and educators can 
Continue to support the use of these tests for 


lagnosing neuropsychological problems when 
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Psychometric Phrenology Revisited: 
Comments on Neuropsychological Testing 


Steven Waksman 
Multnomah County School Mental Health Program 
Portland, Oregon 


Amante, Van Houten, Grieve, Bader, and Margules’ interpretation of results 
obtained from testing third-grade children with the Bender-Gestalt Test, the 
Frostig Developmental Test of Visual Perception, and the Wepman ‘Auditory 
Discrimination Test is critically evaluated. It is pointed out that such tests not 
only lack empirical and linguistic validity, but they also tend to confuse their 
users and often hamper more appropriate remediation services to children with 
learning disabilities. The continued support of such tests is questioned. 


they identify 40%-60% of randomly selected 
children as neuropsychologically handicapped or 
impaired. Undoubtedly, the vast majority of 
these “handicapped” children were functioning 
appropriately or excelling both academically and 
socially in their environments. The lack of lin- 
guistic (Payne, 1975) and empirical (Anderson, 
1965; Blakemore, 1965; DiCarlo, 1965; Gold- 
berg, 1959; Larson, Rogers, & Sowell, 1976) 
validity of such widely used instruments is ap- 
palling. As Jastak and Jastak (1976) have 
pointed out, not only are such tests poorly con- 
ceived and validated, but in reality they “con- 
vey no meaningful information” and only tend 
to confuse their users as well as the teachers 
and parents of the children on whom they are 
used. 

This process is witnessed all too often by 
school personnel who refer students with learn- 
ing problems for evaluation and recommenda- 
tions. The evaluation reports that often return 
with the referred student describe perceptual 
deficits or handicaps as the “cause” of the 
learning problem. According to the Amante et al. 
(1977) data, every third-grade child has a 40%- 
60% chance of being diagnosed as perceptually 
handicapped. To confound the issue even more, 
there is a myriad of perceptual training pro- 
grams available that offer no research support 
for their effectiveness (McIntosh & Dunn, 1973) 
and that often stand in the way of appropriate 
remediation attempts (Mann, 1970). In short, 
as Mann (1971) has previously argued, such 
“psychometric phrenology” and the perceptual 
training programs that often follow such ability 
assessment offer little more than the “philosophies
of faculty psychology.”

Although perceptual problems are often a correlate
of neurological impairment (Chalfant &
Scheffelin, 1969; McIntosh & Dunn, 1973), the 
converse of this is not necessarily true. Learn- 
ing problems may result from many factors 
(Eichenwald & Fry, 1969; Jastak & Jastak, 
1976; McIntosh & Dunn, 1973; Moyer & New- 
comer, 1977), and there is still no evidence as 
yet (Goodman, 1977) that neurologically im- 
paired children (except in very rare cases) 
should be taught any differently from nonim- 
paired children. 

The time has come for us to stop blaming 
children for their lack of academic achievement, 
to stop scaring parents with brain-damaged 
labels, and to start sweeping the field of learn- 
ing disabilities clear of faulty measurement and 
foggy logic. 


References 


Amante, D., Van Houten, V. W., Grieve, J. H., Bader,
C. A., & Margules, P. H. Neuropsychological deficit,
ethnicity, and socioeconomic status. Journal of
Consulting and Clinical Psychology, 1977, 45, 524-
535.

Anderson, J. M. Marianne Frostig Developmental
Test of Visual Perception (3rd ed.). In O. K. Buros
(Ed.), The sixth mental measurement yearbook.
Highland Park, N.J.: Gryphon Press, 1965.

Blakemore, C. B. Bender-Gestalt Test. In O. K. Buros
(Ed.), The sixth mental measurement yearbook.
Highland Park, N.J.: Gryphon Press, 1965.

Chalfant, J. C., & Scheffelin, M. A. (Eds.). Central
processing dysfunctions in children: A review of
research, Phase III (NINDS Monograph No. 9).
Washington, D.C.: U.S. Government Printing Office,
1969.

DiCarlo, L. M. Auditory Discrimination Test. In
O. K. Buros (Ed.), The sixth mental measurement
yearbook. Highland Park, N.J.: Gryphon Press,
1965.

Eichenwald, H. F., & Fry, P. C. Nutrition and learn-
ing. Science, 1969, 163, 644-648.

Goldberg, L. R. The effectiveness of clinicians’ judg-
ments: The diagnosis of organic brain damage from
the Bender-Gestalt Test. Journal of Consulting
Psychology, 1959, 23, 25-33.

Goodman, J. F. The diagnostic fallacy: A critique of 
Jane Mercer’s concept of mental retardation. Jour- 
nal of School Psychology, 1977, 15, 197-206. 

Jastak, J. F., & Jastak, S. R. The Wide Range
Achievement Test: Manual of instruction. Wilming-
ton, Del.: Guidance Associates of Delaware, 1976.

Larson, S. C., Rogers, D., & Sowell, V. The use of 
selected perceptual tests in differentiating between 
normal and learning disabled children. Journal of 
Learning Disabilities, 1976, 9, 85-90. 

Mann, L. Perceptual training: Misdirections and re- 
directions. American Journal of Orthopsychiatry, 
1970, 40, 30-38.

Mann, L. Psychometric phrenology and the new 
faculty psychology: The case against ability assess- 
ment and training. Journal of Special Education, 
1971, 5, 3-14.

McIntosh, D. K., & Dunn, L. M. Children with 
major specific learning disabilities. In L. M. Dunn 
(Ed.), Exceptional children in the schools. New 
York: Holt, Rinehart & Winston, Inc., 1973. 

Moyer, S. B., & Newcomer, P. L. Reversals in read- 
ing: Diagnosis and remediation. Exceptional Chil-
dren, 1977, 43, 424-429.

Payne, J. L. Principles of social science measurement. 
College Station, Tex.: Lytton, 1975.


Received November 28, 1977


Journal of Consulting and Clinical Psychology
1978, Vol. 46, No. 6, 1491-1492


Response to Psychometric Phrenology Revisited: 
Comments on Neuropsychological Testing 


Dominic Amante 
West Shore Mental Health Clinic, Muskegon, Michigan 


Steven Waksman has raised some controversial issues in his critique of a neuro- 
psychological study. The study dealt with the ecological distribution of central 
nervous system pathology in working class and socially disadvantaged black 
children and white children. Some of the issues that Waksman raises can be 
resolved, at least partially, but others cannot because of the fact that basic
differences of opinion continue to exist in the absence of uncontested theory
and decisive empirical data.


Waksman (1978) has raised some vital issues
in his critique of a neuropsychological field study 
that we reported in this journal recently 
(Amante, Van Houten, Grieve, Bader, & Mar- 
gules, 1977). Unfortunately, he has also intro- 
duced a variety of irrelevant issues. 

It is important to observe that Waksman’s 
critique is highly selective. For example, it does 
not deal with the intelligence test results. Con- 
sequently, it fails to refer to perhaps the most 
positive feature of the Amante et al. study— 
namely, the possibility that there are no differ- 
ences in the levels of general intelligence or 
auditory discrimination in black children and 
white children of comparable socioeconomic 
status when the level of visual-motor function- 
ing is controlled. His interest is in deficit con- 
ditions only, but even in this case he misrepre- 
sents our data and overgeneralizes the con-
clusions.

Based on an extensive clinical and research 
tradition, many medical and behavioral scien- 
tists have concluded that various forms of mild 
or severe neuropathology measurably influence 
a broad spectrum of psychological functions 
including language development, affective states,
perceptual-motor skills, cognitive abilities, and
various other behavioral parameters (Dimond &
Beaumont, 1974; Reitan, 1975; Strub & Black,
1977). Consequently, there is no essential reason
to conclude, as does Waksman, that the majority
of perceptually impaired children were ade-
quately functioning in their environments. Per-
formance on the Bender-Gestalt, for example,
does relate to nontest factors such as school 
readiness, intellectual level, academic achieve- 
ment, emotional and behavioral problems, and 
many diverse forms of neurological pathology 
(Koppitz, 1975). 

It is doubtful if the test instruments that 
were used are as psychometrically unsophisti- 
cated as Waksman and others would have us 
believe. In fact, such techniques may be pref- 
erable considering the reliability and validity of 
simple behavioral observation, subjective im- 
pression, or “clinical intuition” (Korchin, 1976; 
Landy & Trumbo, 1976). It is true that modern 
medicine has been unable to devise neurodiag- 
nostic techniques of perfect reliability and valid- 
ity (Kennedy & Ramirez, 1964), and as nearly 
everyone now realizes, psychologists and educa- 
tors have no instruments that are diagnostically 
impeccable (Anastasi, 1968). There is, however, 
a plausible rationale behind neuropsychological 
assessment (Boll, 1977), but in all cases we 
necessarily deal with diagnostic probabilities— 
not certitudes (Amante, 1976). 

Some other issues should be considered.
Waksman’s (1978) primary interest appears to 
involve learning disabilities and related prob- 
lems as well as tertiary prevention, which Cap- 
lan (1964) defines as an attempt to reduce the 
residual defect(s) associated with various con- 
ditions of physical and behavioral pathology. 
However, Amante et al. (1977) dealt with more 
general issues involving a diversity of neuro- 
psychological outcomes, and the central thrust 
of the article pertains to primary and secondary 
prevention. That is, our basic concern involved 
the identification of environmentally based etio- 
logical factors, the control of which may reduce
the incidence and the prevalence of central ner-
vous system pathology in children.

It is conceded that the differential diagnosis 
and treatment of learning disorders and related 
problems is a controversial area (Larsen & Ham- 
mill, 1975; Ross, 1976; Spache, 1976). The 
interesting point that Waksman (1978) over- 
looks, however, is that it is also true that many 
of the leading theoreticians and researchers in 
the field believe that most (if not all) learning- 
disabled children are neurologically impaired, 
that they frequently do appear to have percep- 
tual-motor and behavioral problems, and that 
various specialized educational methods individ- 
ualized for the child often are called for (Cant- 
well, 1975; Johnson & Myklebust, 1967; Kenny 
& Clemmens, 1971; Kephart, 1969; Myers &
Hammill, 1976; Reitan & Boll, 1973). William
M. Cruickshank’s (1972) position is not atypi-
cal:


I would like to suggest . . . that irrespective of the 
presence or absence of diagnosed neurological dys- 
function, learning disabilities are essentially and al- 
most always the result of perceptual problems based 
on the neurological system (p. 383). 


[...] be denied [...] helpful to them, and prev[ention] [...]
will be seriously undermined.


Can Clients Provide Valuable Feedback to Clinicians About
Their Personality Interpretations? A Reply to Greene


C. R. Snyder, Mitchell M. Handelsman, and Janet R. Endelman 
University of Kansas 


Greene’s article on the Barnum effect is criticized on conceptual, methodologi- 
cal, and statistical grounds. A reanalysis of the data is presented, along with 
background research that conflicts with Greene’s conclusions regarding (a) the 
ability of people to judge the accuracy of personality interpretations that they 
receive and (b) the role of feedback uniqueness in that judgment. 


In commenting on previous Barnum effect re- 
search, Greene (1977) suggested that the re- 
cipient of personality feedback “accepts an inter- 
pretation as being descriptive if it is accurate, 
regardless of its uniqueness” (p. 965, emphasis
added). Contrary to this assertion, however, 
previous studies have indicated that the unique- 
ness of feedback does influence acceptance on 
two levels. First, a consistent finding is that the 
higher the base-rate accuracy for people in gen- 
eral (i.e., the lower the uniqueness), the greater 
the acceptance of the feedback (e.g., Collins, 
Dmitruk, & Ranney, 1977; Merrens & Richards, 
1970; Mosher, 1965; O’Dell, 1972; Snyder & 
Shenkel, 1976; Sundberg, 1955; Weisberg, 
1970). That is, the content of the interpretive 
feedback should not be unique in order to elicit 
acceptance on the part of the recipient. Second, 
it is important for the recipient of diagnostic 
feedback to believe that the feedback is derived 
specifically for him or her (i.e., that it is
unique). Research on this point has shown that
recipients of feedback are more acceptant of
the feedback when they believe that it was
“specifically derived for them” on the basis of
psychological tests, as compared to “statements
that are generally true of people” (Snyder, 
1974; Snyder, Larsen, & Bloom, 1976; Snyder 
& Larson, 1972; Snyder & Shenkel, 1976; Ziv 
& Nevenhaus, 1972). 

Other comments are warranted on the meth-
odology used in the Greene (1977) study.
Greene used a fairly typical procedure in the
Barnum effect paradigm (cf. Snyder, Shenkel, 
& Lowery, 1977, for review). Undergraduate 
students in two classes completed a psychologi- 
cal test in one class session, and in the next 
class session each student was given the same 
interpretive feedback. Students then rated the interpretation

according to the extent to which this personality
interpretation described their own personality. Second,
they rated the extent to which it described
them as a unique individual, that is, as different
from their classmates. Third, they rated the extent
to which it described one of their classmates.
(Greene, 1977, p. 965)



Subjects responded to each question on a 5-point
scale (5 = excellent, 4 = good, 3 = average,
2 = poor, 1 = very poor). Methodologically, the
within-subject sequence may have sensitized sub- 
jects to become highly skeptical. It would have 
been more desirable to counterbalance the order 
of items to handle this possibility. Also, it 
should be noted that any within-subject differ- 
ences may be due in part to shifting internal 
anchor points resulting from different content 
across questions. 

Given the aforementioned [...], a 2 (class:
senior vs. junior-sophomore) × 3 (questions: 1,
2, 3) analysis of variance would have been
preferable to the single one-way analysis of 
variance that was reported. (In this subsequent 
analysis the reader should be cautioned not to 
overinterpret the results because of the previ- 
ously mentioned problems with the three ques- 
tions.) When Greene’s (1977) data were reana- 
lyzed this way, the main effects of class and 
questions were significant. The main effects were
qualified by a significant Class × Questions
interaction, F(2, 98) = 3.22, p < .05. This
interaction resulted because the senior and
junior-sophomore classes did not differ significantly on
the “describe classmates” question, but the
junior-sophomore class (a) accepted the
interpretation more highly, t(49) = 2.06, p < .045, and
(b) perceived the interpretation as being more
unique for them than did the senior class, t(49)
= 2.90, p < .006. Whether these differences be-
tween classes are due to greater sophistication 
on the part of the senior as compared to junior- 
sophomore class members, as Greene (1977) 
suggested, is not discernable from these data. 
This follows because the two classes may have 
differed on a multitude of possible dimensions 
that may influence their reactions to feedback 
(e.g., time of course in terms of day and semester,
course content, instructor, or class composition
in terms of sex, age, race, etc.). Also,
the inference made by Greene (1977) that
students could realize “that the same interpretation
could as accurately be applied to any of their 
classmates” (Greene, 1977, pp. 965-966, empha- 
sis added) is overstating the results, since the 
actual statement read “rate the extent to which 
it describes one of your classmates.” One other 
suggestion regarding the present procedure is 
that performing the experiment in a classroom 
setting may have aroused far more suspicion on
the part of subjects than the individual testing 
and feedback situation that is typical of actual 
clinical practice and recent research within this 
paradigm. The literature suggests that the situ-
ation in which a client receives interpretive feed- 
back from a clinician may yield a rather highly 
acceptant and undiscerning recipient of feed- 
back (Snyder & Clair, 1977).
Greene’s conclusion that “the Barnum effect
can become a less formidable adversary if the
clinician is careful to instruct the student or
client as to exactly what questions are to be
answered” (p. 966) is unwarranted [...]
interpretations are seen as more accurate than
bona fide ones (Merrens & Richards, 1970;
O'Dell, 1972).
An old quote comes to mind in regard to this 
entire issue. Upon hearing of his death, Mark 
Twain wrote the following cable to the Asso- 
ciated Press in 1897, “The reports of my death 
are greatly exaggerated” (cited in Bartlett, 1968, 
p. 763). Given the present state of research, the 
same could be said of the Barnum effect.


Can Clients Provide Valuable Feedback to Clinicians About
Their Personality Interpretations? Greene Replies


Roger L. Greene 
Texas Tech University 


Snyder, Handelsman, and Endelman’s response to Greene overlooked several 
significant issues that are important in reconceptualizing research on students’
acceptance of generalized personality interpretations. These issues are stated
explicitly, and their relevance to research in this area is described. 


Snyder, Handelsman, and Endelman (1978) 
have provided an interesting commentary on 
Greene’s (1977) article on student acceptance of 
generalized personality interpretations. Somehow 
they have managed to overlook the major thesis 
of his article, and instead they have chosen to 
comment on several tangential issues. Since 
they overlooked the major point of his article, 
it will be reiterated here so that other research- 
ers will not make the same oversight. General- 
ized personality statements are constructed so 
that these statements are an accurate (true) 
description of all students. Hence, it does not 
make sense then to ask students whether these 
statements are accurate when in fact they are 
constructed to be so. Unfortunately, most pre- 
vious research on students’ acceptance of gen- 
eralized personality interpretations has asked 
the student how accurately such statements fit 
his/her personality (e.g., Forer, 1949; Snyder, 
1974; Ulrich, Stachnik, & Stainton, 1963). The 
issue of importance is whether students believe 
these generalized personality interpretations to
be unique. 

Since Snyder et al. (1978) use the term 
uniqueness in two different senses, it is not clear 
exactly what point they are trying to make in 
their comment on this issue. Uniqueness can be 
used to describe the base-rate accuracy of a 
generalized statement (i.e., the lower the base-rate
accuracy of any statement or group of
statements, the more unique the statement
would be). Uniqueness is also used to describe
the procedures whereby the student is led to
believe that the particular generalized statements
that he/she receives have resulted from
the interpretation of some type of assessment
process. The appropriateness and effectiveness of
the procedure used by the researcher to create
this illusion of uniqueness of test interpretations
no doubt will affect students’ ratings of generalized
interpretations, as Snyder et al. indicate.
However, this procedure can be and should be
independent of the question whether the set of
generalized interpretations that the student receives
is unique or descriptive of only one person.
Greene (1977) used the word unique in the
latter sense, in that he asked students whether
they realized that generalized personality statements
could be applied as appropriately to one of
their classmates as to themselves. Snyder et al.
(1978) stated that “the content of the interpretive
feedback should not be unique in order
to elicit acceptance on the part of the recipient”
(p. 1493). Why they should want to explicitly
eliminate the only pertinent question
that can be asked of generalized personality
interpretations is not clear. To reiterate an earlier
point, it makes little sense to ask students
whether generalized interpretations are accurate.
The imperative question is whether students
recognize that generalized interpretations are
trivial because they can be applied to [anyone],
and Greene (1977) demonstrated that students
can indeed recognize the triviality of generalized
statements at least in one context.

Snyder et al. (1978) do raise one valid methodological
issue. Since a repeated-measures design
was used in Greene’s (1977) original study,
it is necessary to conduct a similar study [using]
either a between-subjects design or a counterbalanced
repeated-measures design to [assess]
the generality of his results. Only empirical [...]
can answer the methodological question that
they have raised.

One final issue remains to be discussed— 
whether clients can provide clinicians with feed- 
back about their personality interpretations. 
Snyder et al. (1978) stated that “although cli- 
nicians in actual practice may ask clients to give 
their acceptance reactions to feedback, they 
probably do not ask clients about the unique- 
ness of the feedback” (p. 1494). Clinicians who 
are interested in avoiding the pervasive influence 
of the Barnum effect should be specifically ask- 
ing their clients the latter question. If clients 
can recognize whether personality interpreta- 
tions are unique, which needs to be empirically 
tested, clients can be a valuable source of feed- 
back to clinicians. 

In concluding this reply, a quote comes to 
mind: “What’s gone and what’s past help/ 
Should be past grief.” (Shakespeare, 1963, p. 
65: The Winter’s Tale, Act III, Scene 2). Hope- 
fully, experimenters will allow the previous ap- 
proach to studying students’ acceptance of gen- 
eralized personality interpretations to die a 
peaceful death and instead infuse some new
direction into an interesting research area.


Premature Conclusions Regarding Black and White
Suicide Attempters: A Reply to Steele


Richard C. Bedrosian and Aaron T. Beck 


Department of Psychiatry 
University of Pennsylvania 


The conclusion reached by Steele, that his samples of black and white suicide
attempters were clinically similar, is questioned, as is his inference that his re-
sults require a reexamination of the supposed need for separate black and white
psychologies. Specific criticisms are raised regarding some of the variables
chosen by Steele for his comparisons and the manner in which he chose to
interpret his data.


As a result of his comparison of black and 
white suicide attempters, Steele (1977) concluded 
not only that the two groups were clinically 
quite similar but also that his data necessitated 
a reexamination of the contention that separate 
psychologies are needed for blacks and whites
(Mosby, 1972). Although data contrary to the 
latter conclusion (which is viewed by us as pre- 
mature) will no doubt be drawn from a variety 
of sources by other critics, the point to be made 
herein is that the proclamation of “no difference” 
between the black and white suicide attempters
in Steele’s sample was likewise ill-advised. 

Steele concluded that the black-white dif- 
ferences in his sample were negligible after hav- 
ing observed in his comparison between the two 
groups that “only 11 out of the 42 clinical varia- 
bles are statistically significant at the 5% level 
of confidence” (Steele, 1977, p. 983). Implicit in 
his statement is the assumption that on an a 
priori basis, many more of the variables should 
have revealed a difference between the two 
groups. In other words, Steele seems to have 
assumed that all or most of the variables that 
he examined represent equally important aspects
of suicidal behavior and/or ideation, and that 
the way to test for differences between the two 
groups is to compare them on a host of variables 
and then assign a “box score.” Apparently no
effort was expended to select suicidal risk varia- 
bles that reflect either observed or hypothesized 
differences between white and black subcultures.
What the box score approach also overlooks, of
course, is that some variables may be more
useful or interesting than others. With regard to
suicidal behavior, “pessimism and hopelessness”
might be more relevant as a predictor than
“paranoid delusions” or “patient’s acknowledged
attempt to get some sleep,” for example.
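
As a rough illustration of what a box score of 11 significant results out of 42 comparisons amounts to, the sketch below computes how many results would be expected at the 5% level by chance alone, and the binomial probability of observing 11 or more. It is only a sketch: it assumes that the 42 comparisons are statistically independent and that no true group differences exist, neither of which is established by the data discussed here, and the function and variable names are illustrative.

```python
from math import comb


def binomial_tail(n: int, k: int, p: float) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Figures taken from the text: 42 clinical variables, 11 significant at 5%.
n_variables = 42
n_significant = 11
alpha = 0.05

# ASSUMPTION (not from the source): the 42 tests are independent and
# every null hypothesis is true.
expected_by_chance = n_variables * alpha
tail = binomial_tail(n_variables, n_significant, alpha)

print(f"Expected significant by chance: {expected_by_chance:.1f}")
print(f"P(at least {n_significant} of {n_variables} by chance): {tail:.2e}")
```

Under these assumptions only about 2 of the 42 comparisons would be expected to reach significance by chance, so the count alone says little until the individual variables are weighed for their relevance.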

Closer examination of the indices used by
Steele (1977) reveals some variables that might
have been expected to bear little relation to the
phenomenon of suicidal behavior. “Depressive
delusions,” “paranoid delusions,” and “thought
disorder” all presumably referred to the presence
of psychotic processes on the part of the
suicide attempter. Yet there is evidence that the
neurotic-psychotic distinction may not be particularly
relevant to the study of suicidal behavior.
In a study of attempters, neurotics and
psychotics showed no differences on self-report
and clinical measures of suicidal intent, as well
as on ratings of the lethality of their attempts
(Lester & Beck, 1976). On the other hand, in
view of the data supporting a relationship between
alcohol intake and suicidal behavior (Mayfield
& Montgomery, 1972), Steele might well
have included a measure of alcohol consumption
at the time of the attempt. These examples were
not raised in order to second guess Steele; they
have been included here in order to [caution]
against premature closure on the issue of the
black-white comparison.

If one assumes, as Steele (1977) apparently
does not, that his data represent random results,
then his findings do have some clinical [...].
Whites scored higher than blacks on ratings of
depression and hopelessness, two variables that
traditionally have been used to assess suicidal
intent and that have consistently surfaced in
the literature as predictors of suicidal risk (Beck,
Kovacs, & Weissman, 1975; Minkoff, Bergman,
Beck, & Beck, 1973). Moreover, Steele reported
that blacks were less motivated to appeal for
help and that they spent less time between planning
the act and actually making the suicidal
attempt. In trying to assess the suicidal risk in a
black patient, the psychotherapist may find that
the usual cues (e.g., hopelessness, depression,
“cry for help”) for making such judgments are
less useful than if a white patient had been involved.
Further, Steele's data suggest that the
black patient may require a more rapid intervention
in order to avoid an attempt once the suicidal
ideation has been detected.

A more adequate test of whether black and
white suicide attempters represent different populations
should use dependent variables that are
(a) known to bear a relationship to the phenomena
of suicidal behavior and/or ideation and
are (b) presumed to reflect some area of dissimilarity,
either observed or hypothesized, between
black and white subcultures. This critique raised
questions about the way in which Steele (1977)
satisfied the first criterion, but it seems clear
that he paid no attention to the second. Some
well-informed hunches about where the differences
between black and white suicide attempters
might lie could have provided Steele’s project
with a clearer focus and, perhaps, greater
credibility.


Steele’s Reply to Bedrosian and Beck


Robert E. Steele 
University of Maryland 


The two basic criticisms of Bedrosian and Beck are addressed: the appropriate-
ness of the variables chosen and the interpretation of the results with regard
to black psychology. A rationale is given for the inclusion of clinical variables
that were not particularly relevant to suicide attempt behavior. Two additional
variables, history of alcohol and illicit drug abuse, are reported to support the
similarity position between black and white suicide attempters. A final com-
ment is made on the relevance of the need for a separate black psychology.


Bedrosian and Beck’s (1978) criticism of my 
recent article (Steele, 1977) revolves around two 
issues—(a) the appropriateness of the variables 
chosen to examine black-white differences in 
suicide attempt behavior and (b) the prematur- 
ity of my claim that one empirical “box score” 
study is sufficient to question the viability of a 
black psychology. Bedrosian and Beck chose to 
address the first point themselves and suggested 
that “other critics” would attack the latter. Let 
me take this opportunity to counter both Bed- 
rosian and Beck as well as to address the other 
critics on the second issue. 

Bedrosian and Beck’s (1978) first line of 
attack was that I assume that “all or most of 
the variables . . . represent equally important 
aspects of suicidal behavior and/or ideation” 
(p. 1498). I made no such assumption. In fact,
the data summarized and condensed in the first 
paragraph of Footnote 1 (Steele, 1977, p. 984) 
were initially presented in three tables in the 
original manuscript submitted to this journal. 
These variables were selected after an extensive 
literature review of suicide attempt behavior. 
Key variables identified by Farberow and
Shneidman (1961) and Stengel (1964) were
included in this study. 

The rationale for including the other “host of 
variables” was that in addition to focusing on 
suicide attempt behavior between blacks and 
whites, I wanted to see what an extended 
“clinical profile” between these two groups 
would look like in light of the claims made by 
others that there is a need for a separate black
psychology. This claim does not represent my
point of view. Therefore, Bedrosian and Beck
are incorrect to suggest that I expected “on a
priori basis” black-white differences.

Bedrosian and Beck’s (1978) second line of 
attack was my inappropriate inclusion of psy-
chotic clinical variables (depressive delusions,
paranoid delusion, and thought disorder) in light
of the findings by Lester and Beck (1976). I
would like to point out to my critics that my
study was planned, the literature review was 
conducted in 1969, and the results of the Lester 
and Beck study were reported in 1976. 

Bedrosian and Beck’s third line of attack was
that I committed the “sin” of omission; that is,
I left out a key variable known to be related to
suicide attempt behavior, namely alcohol con-
sumption. I concede that my study would have
been improved if this variable had been in-
cluded. Let me hasten to add, however, that
the residents did take a history of alcohol and
illicit drug abuse as a part of their medical work-
up. On these two variables, which were not re-
ported in the original study, there were no
statistically significant differences between black
and white suicide attempters that are consistent
with my interpretation that there are no mean-
ingful “clinical” differences between these two
groups.

Let me make a final comment on the second
point that other critics will address. It was not
my original intention to suggest that this one
study is sufficient to invalidate the claim that a
black psychology is needed. However, my inter-
pretation of these results does not provide
supportive evidence for a black psychology posi-
tion. I feel that my call for a reexamination of
the need for a black psychology in light of em-
pirical evidence is a valid one. A recent study
by Steele (1978) that examined psychosocial
factors in depression (a variable related to sui-
cide) found that there were no signifi-
cant relationships between race and measures of
depression. Race did interact significantly with
other factors such as social mobility. Upwardly
mobile blacks and downwardly mobile whites
showed more psychological disturbances on fac-
tors relating to depression. It is my growing
conviction, which has some empirical support, 
that race and/or ethnic identity is a critical 
factor in determining behavior in complex inter- 
actions with other significant variables such as 
sex, religion, social class, physiological correlates, 
and so forth. Just because each of these fac- 
tors may be important, I do not feel that it 
makes theoretical and/or empirical sense to 
elevate any of these factors to a salient status, 
such as a “female” and/or “black” psychology. 
I agree with Kluckhohn (1949) that human 
beings regardless of our cultural backgrounds 
share certain universal characteristics, that cer-
tain characteristics are shared with other mem-
bers of our ethnic groups, and that we share 
certain characteristics as individuals. It is the 
goal of psychology to understand the complex 
interrelationships among these three factors. I
do not feel that a narrowly conceived ethnic
and/or individual psychology will achieve this 
goal.


Psychotherapy or Massage Parlor Technology?
Comments on the Zeiss, Rosen, and Zeiss Treatment Procedure


Kent G. Bailey 


Virginia Commonwealth University 


There are numerous ethical, moral, philosophical, and social psychological issues 
involved in modern sex therapy. Psychologists have accorded sex therapy a 
warm reception into the field, but present ethical guidelines are insufficient to 
protect clients from psychological damage in the form of massive intrusions on 
privacy and reoriented moral and religious values. Further, the more explicit
procedures seem to carry a message to society that “anything goes.” The Zeiss,
Rosen, and Zeiss procedure is used as a reference point for discussing these
various issues.


In a recent review, Byrne (1977) outlined 
the progressive evolution of research on sexual 
behavior from animal studies in the early phases, 
to primitive cultures, abnormal sexuality, and 
finally up to the contemporary focus on nor- 
mality emanating from Freud, Kinsey, and Mas- 
ters and Johnson. Byrne stated that social psy- 
chologists were attracted to the area by virtue 
of revolutionary societal changes in attitudes and 
actual behavior, by research initiated by the 
Commission on Obscenity and Pornography, and 
by the pressing need to solve the problem of 
unwanted conception. Following his in-depth his- 
torical analysis of research on sexuality, Byrne 
concluded that social psychologists have much
to gain and much to offer by providing a “warm
welcome” to this fundamental area of social 
psychological interest. 

Perusal of recent issues of the Journal of 
Consulting and Clinical Psychology (JCCP) re- 
veals that sex research is being warmly received 
by clinical researchers and therapists as well, and, 
further, it appears that there is a parallel trend 
to use increasingly explicit materials, methods, 
and procedures in the sex laboratory. A recent
JCCP issue, for example, included a study in
which introductory psychology students viewed
films of masturbation and subsequently responded
to various measures of sexual arousal and af-
fectivity (Mosher & Abramson, 1977), and an-
other study used a behaviorally oriented mastur-
bation procedure for treating inorgastic women
(Zeiss, Rosen, & Zeiss, 1977). Although both of
these studies produced some data of interest, the
explicit, high-impact treatment procedures raise
numerous ethical, moral, philosophical, and po-
tential legal issues that were not addressed, and
the crucial issue of cost versus benefit was vir-
tually ignored. In other words, there was little
concern as to whether the data or treatment
effects generated outweighed the potential psy-
chological harm to the subject or client, and,
more importantly, there was little concern re-
garding the cumulative social impact of the
“anything goes” message that such procedures
communicate to the subject, the client, other
professionals within and out of the field of
psychology, and, finally, at some point, to the
public at large.

Using the Zeiss et al. (1977) study as a ref-
erence point, I would like to address some of
the general issues raised above in greater detail. 


The Ethical Issue 


Although intangibles like “intent,” “compe-
tence,” and even “confidentiality” possess un-
clear ethical boundaries, all therapists agree on
one thing—sexual liaison between therapist and
client is unethical and is to be uniformly pro-
scribed (American Psychiatric Association, 1973;
American Psychological Association, 1977).
Furthermore, such contact produces negative
psychological effects on the client in most in-
stances (Taylor & Wagner, 1976) and typically
represents a form of exploitation of female cli-
ents by male therapists (American Psychologi-
cal Association, 1975). Indeed, sexual contact
within the therapy relationship may be seen as
tantamount to breaking the incest taboo (Taylor
& Wagner, 1976), for like it or not our desperate, 
anxious, and lonely clients cast us in a parental 
role, and our own needs catalyze the process. 
Now, we have here an instance in which a 
particular form of sexual expression is con-
demned as morally reprehensible, contrary to
client welfare, and patently unethical. It is rare 
that behavioral scientists are so definitive about 
anything, and I cannot think of any other in- 
stance in the sexual realm in which the blue 
pencil applies with such stringency. Given the 
innumerable possibilities for sexual behavior and 
misbehavior, it is surprising that only therapist- 
client sexual contact should warrant our ire. I 
suggest that we be fair, and, rather than “warmly 
welcoming” any new combination or permutation 
of arms, legs, and genitalia that leads to orgasm, 
we should use caution and discretion in intruding 
into the sex lives of our clientele. 
Unfortunately, the Zeiss et al. (1977) study 
fails to exercise such caution and discretion, and 
what we have instead is a simplistic, functional- 
istic, treatment “program” that successfully 
yields orgasm, ignoring all the while the more 
profound psychological implications of the pro- 
cedure. I in no way question the intent or moti- 
vation of Zeiss et al., for they behaved as ethi- 
cal behavior therapists should, by using the 
simplest, most direct, and most effective tech-
niques to achieve treatment goals in the shortest
time possible (AABT, 1976). I do, however, 
question their methods and a system of profes- 
sional ethics (or lack thereof) that allows thera- 
pists to declare open season on the sexually mal- 
adjusted client, with the only interdiction being 
that the therapist cannot join in on the fun. If 
sexual liaison with the client is the therapist’s
version of “playing house,” then no-holds-barred
treatment procedures like those of Zeiss et al.
make for “playing doctor,” with the willing,
doll-like client going along with anything—and I
mean anything! The gradual fading in of Mr. A
comes to mind here (Zeiss et al., 1977, p. 893),
along with Mrs. A apparently thrusting and vi-
brating herself into a case of pneumonia, or,
better still, Mrs. B’s sudden, mysterious sexual
longing for organic vegetables following her
reading of Friday’s (1974) My Secret Garden.
All of this would be humorous were it not so
tragic. It is a mystery to me how conversational
psychotherapy has made the sudden transition to
massage parlor technology involving vibrators,
mirrors, surrogates, and now even carrots and
cucumbers! Can we say that the empirical, theo-
retical, and philosophical groundwork has been
laid for such procedures? Do we have data on
the psychological, social, and spiritual side ef- 
fects of such procedures? I think not; it is more 
parsimonious to assume that therapists are giv- 
ing clients what they want (or think they want) 
whether they need it or not. Furthermore, the 
distinction between “legitimate” and “illegiti- 
mate” sex therapists is becoming increasingly 
blurred as the money pours in and the pop 
therapy bandwagon rolls on. The Zeiss et al. 
procedure represents one of the more dramatic 
instances of “legitimacy by acclamation,” with 
applause supplied by the behavioral community 
on the one hand and the instant gratification 
crowd on the other. The Zeiss et al. procedure 
and others like it do little to allay Albee’s (1977) 
fear that modern society is creating “an im- 
pulse-indulgent society of consumers, and psycho- 
therapists have become the new gurus explaining 
life’s elusive purpose” (p. 150). 

Our system of professional ethics, then, should 
develop the breadth and depth to accommodate 
more than the obvious parameters of competence, 
confidentiality, interpersonal relations, use of 
tests, effectiveness of treatment, and so forth. It
should also grapple with the subtler aspects of 
client welfare, including intrusions of privacy, 
indirect assaults on the client’s moral and re- 
ligious values, and, most importantly, abuse of 
the power to direct and facilitate social change. 
Just as Freud, Skinner, and even Spock have 
effected massive social change directly with new 
ideas and techniques, and indirectly with in- 
numerable covert stipulations that emanate from 
their work, the modern sex therapist carries a
powerful message to a credulous public looking
for guidance. Although the field
of psychology in general implicitly espouses ex-
aggerated individualism (Hogan, 1975), liberal
ideology (Ornstein, 1975), and instant gratifica-
tion (Albee, 1977), it is the sex therapist who
most seriously assaults the traditional institu-
tions (see Campbell, 1975) of marriage, family,
and religious values. It is time that we come to
grips with these admittedly elusive side effects to
therapy, and it is time that we set proper ethical
guidelines and sanctions within the field before
inevitable legal and professional disasters befall
us from without. The recent brouhaha at the
State University of New York at Albany (Smith,
1977) is a frightening illustration of the catas-
trophes that can befall the psychologist when
ethical guidelines are ignored or are given mere
token consideration.


The Legal Issue


As Tryon (1976) pointed out, concern with 
the ethics and legality of behavioral practices 
increases in proportion to the effectiveness of 
learning principles applied to human behavior. 
When the therapeutic procedures are highly spe- 
cific and highly effective, the independent varia- 
bles are laid bare for legal scrutiny, and this 
naturally leads to discussion of the whos, whens, 
and wheres of using such powerful procedures. 
The Zeiss et al. (1977) procedure ranks at the 
top in both specificity and “effectiveness” in the 
orgastic sense, and the question arises as to how 
such procedures would fare in the courtroom. 

Tryon (1976) reminds us that we live in an 
increasingly litigious society, and behaviorally 
oriented clinicians especially need to take nec- 
essary defensive precautions in order to avoid 


suffering 
virtue of 


lowed from treatment. Could 
Procedure withstand thetorical as 


sel for the plaintiff, who would, no doubt, dwell
on the grievous “wounding of spirit” suffered by
his or her clients? 

The client’s right to privacy is another im-
portant issue with potential legal implications.
Until sex therapy came along, the psychothera- 
pist merely intruded into every cognitive crack 
and crevice possible, but now we “play doctor” 
to the fullest and perform behavioral “opera- 
tions” that intrude into the client’s private life to 
a degree unimaginable a decade ago. Unless the 
sex therapist can demonstrate that less intrusive 
alternate procedures are not feasible on a cost- 
benefit basis, then it seems to me that he or she 
could be subject to a malpractice suit if chal- 
lenged by a client on that basis. I am unaware of 
any empirical research that has addressed the 
question of differential effectiveness vis-à-vis 
“hard” sex therapy procedures like those of 
Zeiss et al. versus “soft” conversational or less 
direct approaches. Unless the hard, more in-
trusive approaches are found to be vastly su-
perior in effectiveness to softer ones, then appli-
cation of the former approaches would be at
worst unethical and possibly illegal, and at best,
just bad therapy. 


The Power Issue 


One of the most important and generally 
ignored variables in the therapy situation is that 
of therapist power and dominance over the 
client. Anyone who has supervised therapy 
practicum for a 2nd- or 3rd-year clinical student,
however, is immediately struck with the jockey-
ing for control that goes on between this young
pre-PhD and the client, and when we are in-
formed that “rapport has been established,” we
know, at long last, that the neophyte therapist 
has finally gotten the upper hand. Were it not 
for the almost hypnotic power of the therapist, 
one wonders whether therapy would occur at all, 
for a great amount of what we call therapy is 
just plain persuasion and the selling of agendas 
of various sorts. It is difficult to imagine therapy 
proceeding on the basis of client aversion and
disrespect for the therapist, and it seems to me
that client cooperation is the most precious
commodity in the therapy situation.

With such a gross status asymmetry and im-
balanced dominance relationship, the therapist is
in a position to extract levels of “obedience”
that Milgram could only dream of, and such
power necessarily requires judicious control and
self-monitoring throughout the entire therapy 
process. In view of these considerations, I am 
not surprised that Zeiss et al. (1977) had little
difficulty in introducing the dildo in treatment,
nor would I disagree that “clients will accept the 
procedures [the six-step program] when they are 
presented with an appropriate rationale” (p. 
894). It might be more instructive to ask, “What 
will a client who needs therapy not do for the 
marketer of the product that he or she wishes to 
consume?” The answer—practically nothing! If 
anyone doubts this assertion, I suggest that they 
read Rosen’s (1977) article dealing with the 
ready tendency for clients to sign away their 
rights to privacy under sign-away pressure, one 
type of which is the implicit message “do it my 
way or you don’t get therapy.” Thus, we see 
that mere client consent, or client capitulation as 
the case may be, is a suspect criterion for 
legitimizing a particular therapeutic intervention. 

There is a related issue that I wish to bring
up at this point. Despite intellectual acceptance 
of the trappings and regalia of the sex labora- 
tory, I see little evidence that sex therapists wish 
to ply their trade on members of their own fami-
lies, friends, neighbors, or loved ones. Rather, it
is the impersonal stranger who is the modal
consumer of the sex therapists’ goods, and, as
such, may be treated within the context of a 
dangerous underlying double standard. According 
to W. D. Hamilton's kinship selection theory 
(Barash, 1977), we tend to be more altruistic 
and beneficent toward those with whom we share 
the greatest number of genes, whereas those who 
share few genes with us fall in the despised, or, 
at best, tolerated outgroup. At the social level, 
we see a similar in-group versus out-group theme 
in Nietzsche’s master morality and slave moral-
ity, a double standard in which the aristocratic
class exploits and oppresses the masses for their
own good. Perhaps comparing Hamilton’s theory
to the therapy relationship is a false analogy,
and perhaps suggesting that therapists can, in
some respects, be seen as an aristocratic class
ministering to the masses is a bit strong, but
there is little evidence to suggest that clients are
treated with the dignity and respect that we
ignificant others. For 
example, I have often seen tl 
cavalierly of “divorce; 
rogate partners,” and the 
clients while their own divorces 
are treated with the utmost reverence. I per-
sonally believe that clients deserve the respect
and reverence that we accord our loved ones,
and I question whether the Zeiss et al. proce-
dure, and similar ones, operate on that basis.


The Philosophical-Theological Issue


I believe that sex therapy is appropriate and 
desirable when performed by a psychotherapist
who combines sensitivity to philosophical, moral, 
and theological issues with his or her armamen- 
tarium of technological skills. Without this sensi- 
tivity, poking and punching with dildos and the 
like becomes little more than animal husbandry 
at the human level. When science discards its 
humanity for the sake of technology, its subjects 
and clients become things rather than people 
(Goodfield, 1977), and science becomes es- 
tranged from the society that sustains it. As 
Goodfield so eloquently reminds us, this is noth- 
ing new, for science has, from its very begin- 
ning, been alienated from both the arts and the 
laity. She summarizes the historical background 
for the two most recurring criticisms of science: 
(a) First, science is cold and inhuman and does
not concern itself with the needs of society; and 
(b) second, somehow science manages to ex- 
tract the warmth and beauty of the world, and, 
in the process, drains itself. In concluding her 
essay, Goodfield asserted that science and soci- 
ety can no longer afford estrangement, and the 
scientist must work to understand the public as 
well as vice versa. Furthermore, the complexi- 
ties of modern society disallow professional al- 
legiance to a methodological ethic alone; rather, 
the scientist must work to apply “knowledge of 
facts in new compassionate ways” (Goodfield,
1977, p. 585).
One good way to start is by recognizing that a
plurality of perspectives is

minimal understanding of something so ap-
parently straightforward as the masturbatory
behavior of a given client. It is naive and arro-
gant to assume that masturbation can be insti-
tuted or facilitated in a treatment program with- 
out due consideration of the impact this may 
have on the moral and spiritual well-being of the 
client. Campbell (1975) has argued 
psychotherapy’s unwarranted rejection of moral 
tradition, and, more recently, the issue of thera- 


none of this applies to 
but knowledgeable behavior therapists 
Lazarus, Mahoney, Meichenbaum) have 
ed 


eg 
that the cognitive sphere cannot be
artificially separated from action and behavior.
On the face of it, a masturbation program of 
any type would appear to carry innumerable 
hidden messages and covert stipulations—all or 
most of which are antithetical to Judeo-Christian 
traditions—and it is high time that sex thera- 
pists pay some attention to the impact that 
their procedures have on clients’ values and 
religious beliefs. 


Concluding Comment 


As Kuhn (1970) tells us, science in its ad- 
vanced stages is naturally insulated from society 
and social problems, and each more-or-less iso- 
lated scientific group shares common values and 
beliefs, and a restricted view of the world is 
taken for granted. It is only within such a con- 
text that the more explicit forms of sex therapy 
operate, hidden behind behavioristic jargon and 
academic degrees. Only on closer scrutiny by the 
disciplinary nonbeliever or the lay outsider do 
we see that distinctions between legitimate and 
illegitimate forms of sex therapy are extremely 
difficult to make; and only then do we see the 
tenuous ethical, moral, and professional nature 
of the enterprise.


Is Masturbation Still Wrong? Comments on Bailey's Comments


Nathaniel N. Wagner 
University of Washington 


Although Bailey, in commenting on the Zeiss, Rosen, and Zeiss article on mas-
turbatory techniques in sex therapy, sees ethical, moral, philosophical, social,
and psychological issues, he lists no specific negatives except “potential side
effects.” Bailey demands that therapy using directed masturbation must be
“vastly superior” to conventional techniques or else it is unethical. No rationale
for this discrimination is presented except potential side effects. The antiscien-
tific nature of this argument is noted.


Bailey’s (1978) comments on the Zeiss, Rosen,
and Zeiss (1977) article on “Orgasm During In-
tercourse: A Treatment Strategy for Women”
continue a long tradition of intellectual and emo-
tional arguments against masturbation. This tra-
dition reached its zenith in the mid-18th cen- 
tury when S. Tissot of France wrote “Onana, a 
Treatise on the Diseases Produced by Onanism” 
(Dearborn, 1966). Following the intellectual and 
moral position set forth by Tissot, in 1866 a
British physician, Isaac Baker Brown, described 
the surgical removal of the clitoris to prevent 
masturbation and its serious side effects. The 
title of his volume communicates the flavor: “On 
the Curability of Certain Forms of Insanity, 
Epilepsy, Catalepsy and Hysteria in Females” 
(Corea, 1977). 

The essence of Bailey’s (1978) criticism of 
Zeiss et al. is that they are insensitive to the 
“psychological, social, and spiritual side effects 
of such procedures” (p. 1503). A little later, 
Bailey acknowledges that “these admittedly 
elusive side effects to therapy” (p. 1503) have
not occurred as yet. This does not prohibit
Bailey from warning of the serious negatives that 
may accrue to “the traditional institutions of 
marriage, family, and religious values” (p. 1503). 

Let us be clear about what Zeiss et al. (1977) 
recommend in their article. A six-step treatment 
program for women who are inorgasmic during
intercourse is described. The program helps 
these women experience orgasm with vaginal 
containment of their partner's penis, whereas
previously they could only experience orgasm
with manual stimulation of the clitoris. Mastur- 
bation with a dildo is used as a means of gen- 
eralizing the response from clitoral stimulation 
to vaginal containment. There is a thoughtful 
discussion of the work of Masters and Johnson 
(1966) demonstrating the physiologic similar- 
ity of orgasm regardless of the main source of 
stimulation. Zeiss et al. (1977) argue that if a 
woman wants to experience orgasm with inter- 
course, this “is a justifiable behavioral goal” (p. 
891). Although controversial among sex thera-
pists, their position is a reasonable one that has
many adherents. The opposing view is that all
orgasms are equivalent and that it is a waste of 
time and may be unsuccessful to attempt to 
change the orgasmic pattern. 

That many women seek medical and psycho-
logical assistance with orgasmic problems is well
established. Masters and Johnson (1970), Lo-
Piccolo and Lobitz (1972), Kaplan (1974), Bar-
bach (1974), and Heiman, LoPiccolo, and Lo-
Piccolo (1976) provide clear evidence of con-
tinued innovation, treatment, and research into
this aspect of female sexual dysfunction. 

Bailey's (1978) critique suggests a return to 
more conventional talking therapy 

erous of potentially serious side effects. He talks 
about the harm that can befall therapists if they 
“caution and discretion in intrud-
ing into the sex lives of our clientele” (p. 1503).
It is hard to understand how one can intrude 
e who comes for 


into the sex 
assistance with a Sex! 


In an interesting turn for " 
from an ethical and moral stance, Bailey (1978) 


criticizes sex therapists for not using these tech- 
niques on “their own families, friends, neigh-
bors, or loved ones” (p. 1505). Apparently, the
need for the therapist to be as objective as
possible eludes him. It is an elementary fact of 
therapeutic practice, as medicine has long ac- 
knowledged, that it is foolish and dangerous to 
be professionally involved with persons with 
whom one has a preexisting emotional bond. 


The Story of Onan 


When the arguments that Bailey raises are 
carefully examined, it is clear that masturba- 
tion, and its presumed serious side effects, is 
the central issue. It may be helpful to examine 
the historical origin of our attitudes toward 
masturbation. Within the western tradition, the
Old Testament Biblical story of Onan (Genesis, 
38, 6) is central. Onan refused to impregnate his 
older brother’s widow, practicing coitus interrup- 
tus instead, which brought the wrath of God 
and Onan’s death. The biblical scholars’ argu- 
ment as to whether Onan’s sin was in refusing 
to obey the levirate requirement to give seed to 
his brother or whether it was in the spilling 
of his seed on the ground is not at issue here 
(Bullough, 1976). The word onanism has his- 
torically been synonymous with masturbation 
(Freud, 1927), and it is currently the word for 
masturbation in German and other European 
languages. Clearly, Onan’s spilling of the seed 
on the ground soon evolved to other forms of 
spilling seed on the ground, that is, masturba- 
tion. The masculine bias should be noted. 

Within the confines of a propopulation stance,
which was rational and justifiable in Biblical 
times, masturbation threatened the survival of 
the species. High infant mortality and an over- 
all high death rate helped to develop a pro- 
natalist policy of severely 


(Hardin, 1968), one could argue that masturba-
tion is a constructive way of dealing with sexual 
interest. To continue to perceive masturbation 
as intrinsically psychologically damaging is ar- 
chaic and simply not true. If there are serious 
side effects, where is the scientific literature 
documenting these effects? Bailey has not made 
the case. 


The Ethical Issue 


Having responded in a general way to the 
highly emotionally charged criticisms of Bailey, 
I would like to respond specifically to each issue 
raised. The first is the ethical one. Bailey (1978) 
states that Zeiss et al. are allowed “to declare
open season on the sexually maladjusted client,
with the only interdiction being that the thera-
pist cannot join in on the fun” (p. 1503). The
fact that all therapies include value judgments 
is not new and has been thoughtfully discussed 
by London (1964). Behavioral therapists are 
probably less guilty of making value judgments 
in the guise of therapy than such nonbehavioral 
therapists as psychoanalysts or Gestalt or hu- 
manistic therapists. A thoughtful discussion of 
the ethical problems in behavior therapy is pro- 
vided by Goldfried and Davison (1976). Zeiss 
et al. are not any more unethical in using mas- 
turbation as a therapeutic technique than scores 
of talking therapists who have used their cli- 
ent’s masturbatory fantasies in therapy. Zeiss 
et al. are probably even more effective in help- 
ing the client reach the goal that brought the 
person into treatment.


The Legal Issue 


Bailey’s (1978) argument here is that the 
techniques of Zeiss et al. would probably not 
“survive a malpractice suit by virtue of failure 
to fully consider potential side effects” (p. 1504). 
He suggests that clients might look back a year 
later and find themselves “demeaned,” “humili- 
ated,” and “suffering irreparable psychological 
harm.” Here we have rhetoric about potential 
side effects without data. There is no empirical 
substantiation other than the imagined plea of 
a counsel concerning “the grievous wounding of
spirit.” Every one of these legal issues was 
raised against contraceptive practice, and if per-
sons like Margaret Sanger (Reed, 1978) had not
been willing to risk the wrath of others, we
would still be in the Dark Ages on that issue. 
Despite the proliferation of sex therapists, the 
only legal issues raised in the courts have been 
those of the use of surrogates and the clearly 


— 
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unethical practice of sexual relations between 
therapists and clients (Taylor & Wagner, 1976). 
Bailey's fears of legal problems are just that— 
fears. 

The privacy issue raised in Bailey’s (1978) 
discussion of legal issues is hard to understand. 
He argues that unless the “hard” behavioral 
techniques are “found to be vastly superior” to 
the “soft” conversational ones, the application 
of these techniques would be “at worst un- 
ethical and possibly illegal, and at best, just bad 
therapy” (p. 1504). No reason is set forth as 
to why these techniques must be “vastly su- 
perior” rather than just equal. This is a strange 
double standard—that techniques that use di- 
rected masturbation must be vastly superior or 
else they are unethical. Bailey does not offer a 
single reason for the necessity of the different 
standard except the vague argument concerning 
potential side effects. 


The Power Issue 


There is not much need to comment here, 
for Bailey has simply stated that therapists 
have more power than clients, which is certainly 
true. He suggests that Zeiss et al. misuse their 
power, again without documentation except the 
previously noted assertion that they do not prac- 
tice on their friends, lovers, and spouses. 


The Philosophical-Theological Issue


The sum of Bailey’s (1978) argument is that 
“on the face of it, a masturbation program of 
any type would appear to carry innumerable 
hidden messages and covert stipulations—all or 
most of which are antithetical to Judeo-Christian
traditions” (p. 1506). There it is! “On the
face of it,” masturbation is wrong: Anyone who
recommends it, even if it helps people to become
well-functioning human beings, is guilty
of defective professional judgment or, to use
Bailey’s term, is unethical.


Concluding Comment 


Bailey has raised a number of historic concerns
about nonreproductive sexual behavior in
expressing concern about the use of masturbatory
techniques in therapy. [...]
entirely based on potential dangers and on
highly charged value judgments. [...]
antiscientific, in that it recommends that no new
technique be considered until it is [...]
be “vastly superior” to conventional techniques.

Interestingly, Bailey does [...]
research to be conducted; he [...]
of procedures that he finds unacceptable.


Ethical and Professional Issues in Sex Therapy:
Comments on Bailey's
“Psychotherapy or Massage Parlor Technology?”


G. Terence Wilson 
Rutgers—The State University 


Sex therapy should be conducted by a skilled therapist who is sensitive to the 
professional and ethical issues that are inherent in treatment. Potentially
adverse side effects of therapy should be continually assessed. To date, however,
the findings from sex therapy, including the Zeiss, Rosen, and Zeiss procedure,
show predominantly positive consequences. It is imperative that the fully
informed client have decision-making primacy in setting treatment goals.
Value-free therapy does not exist, and therapists must be careful not to impose their
own personal biases. The principle of the least intrusive treatment alternative
and the nature of intrusiveness are discussed. 


In his response to Zeiss, Rosen, and Zeiss 
(1977), Bailey (1978) raises some important 
ethical and professional issues that relate to
psychotherapy in general and sex therapy in
particular. It is therefore particularly unfortunate
that Bailey does more to distort than to clarify 
the critical issues in a reply that contains ir- 
relevant, arbitrary, and extremist position state- 
ments that are empirically unsupported. Nor is 


Therapeutic Ethics 


It is fundamental to the behavioral treatment
of sexual (or any other) problems that the client
should have the major say in setting the
[...] voluntariness, and competency (Friedman,
1975). The [...] described by Zeiss et al. (1977)
[...] on a voluntary basis and certainly seem
[...] assume that they were given a full description
and explanation of the treatment procedures.
No less is required in order for such methods
to be carried out successfully. However, Bailey
(1978) criticizes Zeiss et al. for being “totally 
insensitive to the serious matter of side effects” 
(p. 1503). 


Side Effects of Therapy 


Accountability in therapy requires adequate 
assessment of the behavior that is the target of 
treatment. Side effects, or changes in behaviors 
that are not directly targeted for change, are 
possible in all forms of psychological treatment,
including behavior therapy. Accordingly, relevant,
related behaviors should be assessed in
addition to the specifically treated behavior.
Some of these changes may be judged to be desirable,
whereas others may not. In behavior
therapy, more often than not, these side effects
have been positive (Kazdin & Wilson, 1978b).
A problem faced by all practitioners is how
widely to cast this assessment net in monitoring
concomitant or generalized behavior change.
Of course, one can always resort to the therapist’s
clinical acumen and sensitivity in detecting
untoward (or positive) consequences of intervention.
Although the therapist’s judgment
is not to be overlooked, more systematic measurement
of the broad effects of treatment outcome
is needed.

Although no formal guidelines exist, two general
classes of behaviors would appear to be
directly related to the specific goal of facilitating
coital orgasm in the cases in point. One is the
frequency and nature of the clients’ sexual be- 
havior as a whole. The other concerns the qual- 
ity of the clients’ emotional and interpersonal 
relationships with their partners. In this con- 
nection it should be noted that Zeiss et al. 
(1977) obtained pretreatment and posttreatment 
measures of both marital adjustment and mu- 
tual sexual behaviors using widely accepted mea- 
surement inventories (Locke & Wallace, 1959; 
LoPiccolo & Steger, 1974). The increases in 
marital satisfaction reported by Zeiss et al. are 
consistent with evidence from other sex therapy 
programs that have included directed masturba- 
tion training (Leiblum, Rosen, & Pierce, 1976). 
Since both partners were actively involved in 
the treatment program, the therapist had access 
to another source of clinical data pertaining to 
treatment effects and was in a favorable position 
to detect any adverse impact on the clients’ in- 
terpersonal relationships. To the degree that the 
therapist, supported by evidence from the in- 
ventories, was unable to notice any negative 
side effects, Bailey’s objection is undermined. 
One of the problems with Bailey's (1978)
reply is that it is consistently negative and
does not offer any concrete constructive leads 
as to how to conduct ethically responsible ther- 
apy. For example, he fails to specify what side 
effects are to be feared. Repeatedly, he issues 
vague, value-laden caveats about sex therapists 
assaulting “the traditional institutions of mar- 
riage, family, and religious values” (p. 1503). 
He ignores the evidence showing that exposure 
to pornography does not appear to alter sexual 
morality (Commission on Obscenity and Pornog- 
raphy, 1970), that the direct behavioral treat- 
ment of sexual disorders enhances marital satis- 
faction and often prevents otherwise certain 
divorce and family disruption (e.g., Masters &
Johnson, 1970), and that serious negative side 
effects of such therapy have yet to be reported. 
Quoting questionably relevant position state- 
ments by Albee (1977) and by Campbell (1975) 
is not good enough. At worst it amounts to
ill-concealed moralizing.

Nor does Bailey (1978) specify how potential
side effects are to be assessed or under what
conditions sex therapy is ever to be appropriate.
Rather, he appeals to therapy performed by “a 
psychotherapist who combines sensitivity to phil- 
osophical, moral, and theological issues with his 
armamentarium of technological skills” (italics 
added, p. 1505). Who would disagree with this? 
Reflecting the view of behavior therapists engaged
in the treatment of sexual dysfunction, I
have previously emphasized that effective


treatment involves more than instruction in the art of body
massage and the use of vibrators. Inadequate inter- 
personal relationships and lack of communication 
are more often than not the reasons for sexual dis- 
tress. Accordingly, sex therapy requires trained 
therapists who are skilled and experienced. (Franks 
& Wilson, 1977, p. 401) 


Zeiss et al. (1977) explicitly caution that thera- 
pists should remain sensitive to these broader 
and more subtle issues of marital (interpersonal) 
interaction and self-esteem. Sex therapy does not 
obviate the need to consider these factors care- 
fully. It often is directly responsible for im- 
proving marital harmony, enhancing self-esteem, 
and providing greater self-fulfillment (Annon, 
1974; O'Leary & Wilson, 1975). 

If this is so, what then does Bailey find so 
objectionable? It appears that it is both the 
goal and the method of Zeiss et al.’s (1977)
therapy.


Who Sets the Goals of Therapy? 


Behavior therapy requires that the client have
decision-making primacy in setting treatment
goals. The therapist’s role is to assist clients in
evaluating the probable consequences of different
courses of action (Bandura, 1969). In this
process the therapist inevitably influences the
client (Wilson & Evans, 1976). In doing so, it is
crucial that the therapist's biases be recognized
and honestly declared. Particular care should
be taken in helping the client to differentiate
between advice and information that has some
empirical basis (in sex therapy, e.g., see Annon,
1974). This candor, in addition to the description
of explicit treatment methods directed toward
specific goals in a manner that allows
continual assessment of progress (accountability),
renders viable the notion of informed consent
in clients such as those described by Zeiss
et al. (1977). In this context the client's goal
is to be respected. If the therapist is unwilling
to cooperate with the client for either personal
or professional [...] should say so
[...] refer the client elsewhere.
A procedure, with which Zeiss et al.’s
(1977) [...] seems to be consistent, can be
contrasted with Bailey's [...].
His view is that “open season” [...]
on the client; that the
[...] is being declared
[...] in which “anything goes” and
[...]rred”; that therapists like Zeiss
et al. are “playing doctor” with “dolllike”
clients. [...] may judge for themselves how
accurate a representation this is of the Zeiss
et al. procedure in particular or in sex therapy
in general. It is my own view that these hyper- 
bolic horrors tell us more about Bailey’s per- 
sonal values than about the treatment and its 
perception by, and effects on, the clients. Of
course, Bailey, or anyone else for that matter, 
is entitled to be upset by the goals of sex ther- 
apy (in this case, the mutual enjoyment of 
masturbation and coital orgasms by a consent- 
ing husband and wife that is made possible by 
a direct behavioral method). Others, including 
sex therapists, will find this unobjectionable— 
even desirable. What is important is that neither 
view be imposed on the client. 

There is a danger that in his no doubt well- 
intentioned desire to protect clients, Bailey is 
encouraging the subtle imposition of a particular 
set of values on clients. The theme that “the 
therapist knows best” and should be careful before
condoning the self-indulgence in his or her
clients runs throughout Bailey’s (1978) reply.
Nowhere is it more obvious than when he ob- 
jects to “therapists . . . giving clients what they 
want (or think they want) whether they need 
it or not” (emphasis added, p. 1503). The im- 
plication is clear: The therapist becomes the 
arbiter of what the client “really” wants and 
what he/she really needs. Among other contemporary
treatment approaches, behavior therapy,
in principle, involves a deliberate attempt
to avoid this insidious and patronizing attitude. 
Therapists would do well to take seriously Mischel’s
(1977) reminder that our clients “are
the best experts on themselves and are eminently
qualified to participate in the development
of descriptions and [...]
mention decisions—about [...]


Selecting Treatment Methods: The Criteria of
Effectiveness and Intrusiveness


Effectiveness. [...]cal and legal considerations dictate that the
[...]
Wilson, 1977; Kazdin & Wilson, 1978b; and
Marks, 1976.) Suffice it to state here that there
are studies indicating the greater efficacy of
direct (behavioral) compared to indirect ([...]
psychotherapy) techniques (Lazarus, 1961;
Marks, 1976; Obler, 1973) or a waiting-list
control group (Munjack et al., 1976). More- 
over, the more direct (“harder”) the method, 
the greater the apparent efficacy (Kockott, Ditt- 
mar, & Nusselt, 1975; Mathews et al., 1976). 
This latter tendency is consistent with the more 
general finding that performance-based (direct) 
methods are more effective than those that rely 
on verbal or vicarious procedures (cf. Bandura, 
1977a; Wilson, 1978). 

There are as yet no unambiguous data show- 
ing that orgasmic reconditioning significantly 
alters sexual behavior (Conrad & Wincze, 1976).
However, uncontrolled clinical reports continue
to suggest that it may be a useful method (e.g.,
Lobitz & LoPiccolo, 1972; Wilson, 1973). Al- 
though most of the evidence for the efficacy of 
behavioral methods rests primarily on uncon- 
trolled clinical reports, evaluation of the rules 
of evidence must take into account the fact that 
the consistently high success rates that have 
been reported by widely differing programs are 
unprecedented (e.g., Kaplan, 1974; Masters &
Johnson, 1970). Nothing like it has ever been 
reported even with highly selected clients in un- 
controlled clinical reports (O'Leary & Wilson, 
1975). Although the final verdict must await the 
appropriate controlled outcome studies, the avail- 
able evidence unquestionably indicates that di- 
rect performance-based methods are the only 
reasonable alternative for treating most forms 
of sexual dysfunction. 

Intrusiveness. Bailey’s assumption is that 
direct sex therapy methods such as orgasmic 
reconditioning are highly intrusive while tradi- 
tional psychotherapy is not. The issue is more 
complex than this simplistic dichotomy implies, 
however. Orgasmic reconditioning is a well-speci- 
fied technique that is self-administered by two 
informed sexual partners. It has a limited goal 
(orgasm); its effects on sexual behavior are reasonably
predictable and clearly observable to
the client. Is this what intrusiveness means? 
Simply because it is a direct behavior change 
method that involves sexuality does not necessarily
make a technique intrusive.

Compare these procedural criteria with traditional
psychotherapy. The success of psychodynamic
therapy is predicated on the development
of a workable transference relationship. During
the course of this intense emotional relationship,
unconscious thoughts, forbidden impulses, hidden
fantasies, and a wealth of deeply intimate material
are probed. Therapy has the relatively
vague goal of insight without specific operational
referents that are immediately observable to
the client (cf. Bandura, 1969). Could this not
be construed as intrusive? Is this approach not
more likely to delve into “life’s elusive purpose”
than the limited attempt to enhance orgasm
during coitus? The answers to these questions
will vary depending on the specific circumstances
of each case and cannot be brushed aside by
oversimplifications about what at face value
seems to be the case.

* Interestingly, a major reason why Masters and
Johnson (1970) emphasized the importance of a
dual-sex therapy team was to avoid the development
of a transference relationship. Their treatment
is geared to enhancing emotional communication
between the clients themselves. This would
be impeded by the client forming an emotional
attachment to the therapist.


Causal Models and the Therapeutic Relationship


Bailey (1978) portrays therapy as a process
in which extremely powerful treatment methods
are administered to dolllike clients by omnipotent
therapists who “extract levels of ‘obedience’
that Milgram could only dream of” (p. 1504).
Although this view is occasionally shared by
some proponents and opponents of behavior
modification alike, it fares poorly under critical
scrutiny. Bandura (1978) has discussed the conceptual
inadequacies of this sort of unidirectional
causal model of human behavior, making
a compelling case for the reciprocal determinism
of behavioral influence. The therapist’s influence
may be considerable, but it is far from
total. (See Davison, 1973, and Wilson & Evans,
1976, for a fuller discussion of the ways in which
the therapist may influence the client, how unilateral
therapist control is limited, and why
self-regulated change is preferable to external
influence.) It is more realistic, not to mention
more humble, to note that the therapist is more
a consultant than a controller, skillfully directing
consciously involved clients in active, self-regulated
problem-solving strategies instead of
dominating puppetlike figures through “almost
hypnotic power.”

Conjuring up images of automatic conditioning,
obedience training, and master-slave relationships
makes for lively polemics but faulty
psychology (Bandura, 1977b). The sense of this
discussion can be summarized by observing that
considerations of both ethics and efficacy require
a therapeutic relationship in which the
client is an active, self-directed participant.


Concluding Comments


Bailey’s response contains several irrelevant 
and arbitrary assertions that, despite his disclaimer,
could be read to impugn the professional
ethics of sex therapists. Two of the more
obvious examples can be mentioned. First, it is 
difficult to see how the issue of sexual contact 
between therapist and client is relevant to a 
serious analysis of the Zeiss et al. (1977) pro- 
cedure. This is a mischievous juxtaposition that 
may mislead by “guilt through contiguity.” Sec- 
ond, there is the allegation that sex therapists 
hold to a double standard of ethics in prac- 
ticing massage parlor technology on clients who 
are strangers while shunning the same methods 
with their “loved ones.” In fact, I know of 
cases that contradict this claim. If the reader 
finds my own subjective observation less than 
convincing, then my point about Bailey’s specu- 
lation has been made. 

Unquestionably, sex therapists should monitor
both the direct and indirect effects of their
treatment methods on clients’ functioning. This
attention to potential side effects of therapy
should be part of an expanded evaluation of psychotherapy
within a broader set of criteria than
has usually been the case up to now (cf. Kazdin
& Wilson, 1978a). The consequences of a
pluralistic society are that different cultural
groups will have different social values that are
sometimes difficult to reconcile. There is no
such thing as value-free therapy, and the impact
of the therapeutic process on personal and
social mores requires constant attention and
analysis. In part, the resolution of these thorny
issues will depend on their public airing as in
the present interchange of ideas.² However, it
is my contention that the judicious use of
direct behavioral methods for the treatment of
sexual dysfunction by a skilled therapist who is
sensitive to these ethical issues constitutes sound
clinical practice. The benefits to our clients are
often considerable, and I suggest that the onus
is now on those who would oppose these methods
to show sufficient cause for continued
recalcitrance.

² The questions [...] goals of treatment in [...]
considered elsewhere (cf. Davison & Wilson, [...];
Garfield, 1974; Kohlenberg, 1974; Strupp, 1974).
The present discussion should be viewed in this
broader context of ethics and sexual behavior change.


Comments on Levin et al. and Rosen and Kopel:
Internal and External Validity Issues 


Gary M. Farkas 
University of Hawaii 


This comment discusses problems of internal and external validity relevant to 
applied laboratory research in the modification of sexual behaviors. Used as 
examples are two recent case studies published in this journal. Suggestions are 
presented for improving the internal validity and the generalizability of studies
using the laboratory paradigm.


The author is indebted to Raymond C. Rosen
and Jack S. Annon for their contributions to his
learning experience.
Requests for reprints should be sent to [...]
Psychology, University of Hawaii, [...] Hawaii
96822.
[...] the American Psychological Association, Inc. 0022-006X/78/4606-1515$00.75


In a recent issue of this journal, Levin, Barry,
Gambaro, Wolfinsohn, and Smith (1977) and
Rosen and Kopel (1977) presented attempts to
modify sexual response in two patients. Levin
et al. sought to reduce the degree of penile response
to pedophilic stimuli in a man with a
19-year history of pedophilic behavior. Rosen
and Kopel attempted to modify tumescence in
a man who had exhibited both transvestite and
exhibitionistic behavior for 30 years. Each group
of authors reported apparently successful treatments
with regard to those measures assessed.
The issue for discussion in this comment will
be the selection of dependent variables, both
during baseline measurement and with regard
to the wider issues of external validity. Although
the previously cited articles are the subjects of
this reply, they exemplify other studies that
have similar shortcomings.

The primary purpose of baseline measurement
is to provide a reference by which treatment
efficacy can be evaluated. Thus, pretreatment
measures provide an internal validation of an
intervention’s effect. One might question, however,
the relevance of the pretreatment conditions
used in these studies as a check for internal
validity. In this discussion, I will examine
only one of the measures used—percentage
of full erection to deviant stimuli.

In Levin et al. (1977) and Rosen and Kopel
(1977) the primary threat to internal validity
was the ability of males to possess inhibitory
control over tumescence. This skill has been
shown by normal subjects (Henson & Rubin,
1971), and may result from cognitive mediation
(Farkas, Sine, & Evans, in press). Both Levin
et al. and Rosen and Kopel made a crucial
error in assuming that incarcerated patients subject
to contingencies of the judicial system
would provide truthful data. Since this is unlikely,
some measure of deception is needed.
For these particular patients, the index would
be the degree to which they possessed voluntary
control of tumescence to problematic stimuli.
If, for example, a patient exhibits the ability
to suppress arousal to 50% of full tumescence,
should not therapy be designed to enhance this
skill and assessment be formulated to measure
the effect of therapy above and beyond that
ability presented de novo? To my knowledge,
no single case or group study has taken this
voluntary control issue into account.

There are additional problems of external
validity that necessitate response. As Rosen and
Kopel (1977) noted, strain gauge measurement
is a precise and reliable indicant of physiological
sexual arousal; however, the degree to which
this measure provides externally valid evidence
of behavior change is subject to several challenges.
First, whereas tumescence may be
the first and most reliable physiological index
of arousal (Zuckerman, 1971), it does not provide
a measure congruent with self-report.
Farkas et al. (in press) have demonstrated that
under ideal conditions, the correlation between
these measures averages .46, with considerable
variability. Moreover, these authors maintain
that a high degree of subjective arousal may be
reported while minimal tumescence is evident.

Second, to what degree is penile arousal an
essential criterion of sexual behavior? Stated
differently, is there evidence that modification
of penile responsivity in laboratory settings
predicts overt behavioral changes in extralaboratory
environments? Much data, both in biofeedback
and aversion therapy studies of sexual
problems, indicate that this is not necessarily
the case (e.g., Barlow, Agras, Abel, Blanchard,
& Young, 1975).

A final issue concerning generality regards the 
external validity of the treatment stimuli com- 
pared to stimuli encountered in the natural set- 
ting. Certainly, slides of nude girls and video- 
tapes of transvestism-exhibitionism are unlike 
those stimuli evoking problematic behaviors in 
the extralaboratory situation. Thus, just as we 
have witnessed a generalization gradient to non- 
treatment slides (e.g., Levin et al., 1977), how 
steep will this gradient become when extralabor- 
atory cues are assessed? 

The present experimental literature suggests 
extreme caution in interpreting studies that examine
reduced tumescence response to problematic
stimuli as a function of some treatment.
However, if one continues to accept the labora- 
tory paradigm as having utility in treatment 
studies, what modifications in procedure might 
prove useful? 

First, given the finding of unreliability between
self-report and tumescence, an assessment
of this factor for each patient would be valu- 
able during the prebaseline phase to establish an 
upper limit of confidence in the generalizability 
of the data. 
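
As an illustration only, the per-patient check described above could be computed as a simple correlation between the two measures across prebaseline trials. The sketch below is a minimal example under stated assumptions: the function name `pearson_r`, the rating scale, and the sample values are all hypothetical and are not taken from any of the studies discussed.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a series is constant")
    return sxy / sqrt(sxx * syy)


# Hypothetical prebaseline data for ONE patient (illustrative values only):
# self-reported arousal on an assumed rating scale, and percentage of full
# erection recorded on the same trials.
self_report = [1, 3, 2, 5, 4, 6]
percent_full_erection = [10.0, 35.0, 40.0, 55.0, 30.0, 70.0]

r = pearson_r(self_report, percent_full_erection)
print(f"self-report vs. tumescence correlation: r = {r:.2f}")
```

A low coefficient for a given patient would suggest treating that patient's tumescence data with correspondingly lower confidence; how low is too low is a judgment the text does not specify.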
Second, therapists should assess the degree of
inhibitory control that clients possess at prebaseline
measurement. For the incarcerated patient
with a high degree of voluntary control,
the internal validity of any treatment effect measured
by this method would be seriously questioned.

Third, the nature of [...]
should be modified. Most [...]
not because of arousal [...]
in laboratories but because [...] behaviors
elicited by highly [...]
external setting. At this [...]
of cue generation (Abel, [...]
Mavissakalian, 1975) [...]
promise, at least [...]
of intervention.

Finally, tumescence responses rarely [...]
to constitute the only behavior [...]
incarceration in cases of sexual problems. Rather,
a complex series of motor behaviors (e.g., fond- 
ling young children) appears to contribute to, 
or wholly constitute the basis for, society’s un- 
willingness to accept certain aberrations. Thus, 
should not these motor behaviors be appropriate 
targets for direct intervention? Targeting penile 
response without concomitant attempts to mod- 
ify cognitive and behavioral factors that may co- 
occur or predominate as the basis for maladap- 
tive sexual behaviors does not appear to hold 
promise for maintenance and generalization of
appropriate responses in the natural situation. 

Although direct measurement of sexual arousal 
has proved a welcome innovation, the limita- 
tions of the method must be recognized. With 
continued discussion and experimental refine- 
ments, it is hoped that the validity of this 
method will improve and that more effective 
treatments will be developed. 


Penile Tumescence as a Measure of Sexual Arousal:
A Reply to Farkas


Saul M. Levin and Salvatore Gambaro
Lawrence Wolfinsohn
Dane County Mental Health Center, Madison, Wisconsin


This reply to Farkas’ criticism of our recent case study recognizes the susceptibility
of the penile measure to faking. However, we are doubtful about the
value of Farkas’ suggested solution of obtaining a measure of voluntary control
prior to treatment in getting around the problem. Instead of regarding faking
as an assessment problem, it might be used to facilitate treatment, as Rosen
and Kopel did and as Farkas suggests. We did not rely solely on the penile
measure but obtained self-report data and indirect information about motor
behavior relevant to child molesting. Also, we did not treat penile erection in
response to children; we treated imagined versions of real-life situations, including
the cognitive events associated with such situations.


[...] sent to Saul M. [...]
Mendota Mental [...]


Farkas (1978) has criticized the use of penile
erection as a dependent variable in our case
study of the effects of covert sensitization on
pedophilic behavior (Levin, Barry, Gambaro,
Wolfinsohn, & Smith, 1977). Farkas is concerned
with both the internal and external validity of
the measure. He contends that control of penile
tumescence (Henson & Rubin, 1971) makes it
inappropriate as an index of sexual arousal during
baseline. He also raises questions about the
value of the measure as an indication of general
pedophilic tendencies.

There are several points to be made in rebuttal.
Although normal subjects and sex offenders
(Laws & Holmen, Note 1) can control
penile response, the solution Farkas suggests,
obtaining a measure of such control prior to
treatment, does not get around the problem.
For one thing, if a subject is able to exert substantial
voluntary control over penile response
when instructed to do so prior to treatment, it
does not follow that subsequent changes are due
to faking rather than to genuine modifications in
the sexual attractiveness of the stimuli. For
example, if our patient had been able to voluntarily
suppress erection in response to viewing
slides of children before treatment and had then
manifested a response decrement during treatment
without suppression instructions, it would
not necessarily follow that the treatment changes
were due to a conscious effort to fake indifference
to the slides rather than to a genuine decrease
in attraction to children. Conversely, the
really clever individual might realize that the
baseline procedures were designed to pick up
faking and try to “beat the game” by purposely
not suppressing response during baseline
evaluation and then faking later during and after
treatment.

An additional point is that instead of regarding
faking as an assessment problem, it
might be used to facilitate therapy. Treatment
would consist of systematically training the patient
to suppress erection to a variety of real
and imagined stimuli. This, of course, is similar
to what was done by Rosen and Kopel (1977).
These investigators used biofeedback procedures
to train a man to suppress penile erection to a
videotape of himself engaging in deviant sexual
behaviors. Our treatment procedure was different,
in that we attempted to associate aver[...]
[...]ical and psychological
[...] thoughts, and feelings re[...]
a child. The relative effectiveness [...]
events (as we tried to do), penile erection, or a combination
of both is, of course, an empirical question.

Barlow (1977) has presented a comprehensive
framework for evaluating sexual behavior. He
pointed out the value of assessing self-report,
motor, and physiological responses. In our study
we used all three measures to support the in- 
ference of therapeutic change. We were able to 
demonstrate decrements in pedophilic tendencies 
during and immediately after treatment and on 
extended follow-up, with changes evident in 
penile measures, self-report, and also on neces- 
sarily indirect measures of motor behavior. The 
last index is perhaps most important, since the 
ultimate criterion is whether or not the patient 
molests children or gets into other difficulties 
following treatment. At last report, our pa- 
tient had not done so. Farkas (1978) implies 
overemphasis on the penile measure, which we 
feel is not the case. 

A final point needs to be made regarding a 
possible misunderstanding on Farkas’ (1978) 
part. In discussing the importance of treating 
motor behaviors, he says, 


Targeting penile response without concomitant attempts
to modify cognitive and behavioral factors
that may cooccur or predominate as the basis for
modification of sexual behaviors does not appear
to hold promise for maintenance and generalization
of appropriate responses in the natural situation.
(p. [...])


This implies that we used penile response as 
the reaction to be treated. This is not the case. 
In treatment we used imagined versions of real- 
life situations that had in the past, and might in 
the future, serve as occasions for sexual attrac- 
tion to children. We also instructed the patient 
to imagine the pedophilic thoughts and fantasies 
that occurred in these situations.

In summary, changes in penile erection mea-
sures obtained in a laboratory setting are ob-
viously not to be regarded as the sole criterion 
of therapeutic change. Such measures need to 
be supplemented with self-report techniques and 
evaluations of motor behavior. The fact that
erection can be voluntarily controlled is, indeed, 
a problem that may very well be almost im- 
possible to get around (Laws & Holmen, Note 
1). As Farkas himself seems to suggest, it may 
be best to capitalize on such control by using 
it in treatment instead of regarding it as an 
assessment problem. 


Reference Note 


1. Laws, D. R., & Holmen, M. L. Sexual response
faking by pedophiles. Unpublished manuscript,
Atascadero State Hospital, Atascadero, California,
undated.


Role of Penile Tumescence Measurement in the
Behavioral Treatment of Sexual Deviation: Issues of Validity 


Raymond C. Rosen and Steven A. Kopel


Department of Psychiatry, College of Medicine and Dentistry of New Jersey 
Rutgers Medical School, Piscataway 


Farkas has raised three specific issues concerning the validity of penile tumes-
cence assessment of sexual arousal in the laboratory. Some of these issues had
been addressed in our original case study, and this reply to Farkas is intended
to place the issues of validity in a broader perspective. While acknowledging
the limitations of the erection measure, we nevertheless see an important place
for it in both the assessment and treatment of sexual disorders.


In Rosen and Kopel (1977), we reported the
results of a case study using penile tumescence
measurement and biofeedback to modify the
transvestite/exhibitionistic pattern in a 45-year-
old male client. Despite an initial positive out-
come, remission of the deviant behavior was re-
ported to us by the client's spouse approxi-
mately 2½ years after treatment. Farkas (1978)
has raised issues concerning the validity of the
tumescence assessment as described in Rosen
and Kopel, in addition to the broader issues of
the usefulness of tumescence measurement in
clinical studies. Some of the issues addressed
by Farkas had, in fact, been dealt with in our
original study. However, the validity issues that
have been raised deserve a broader perspective
than that offered by Farkas. The purpose of this
reply is to examine the question of validity of
tumescence measurement in greater depth.

To begin with, it is worth considering the
growing appeal of penile plethysmography as a
laboratory assessment measure. Rosen and
Keefe (in press) have reviewed the burgeoning
field of tumescence measurement and have con-
cluded that it is a key dependent measure for
assessment of male sexual arousal. Historically,
the need for objective determina[...]
preference and response has [...]
on self-report and greater [...] measures
of sexual arousal (Zuckerman, [...]). Especially
when treating [...] offender, there are
obviously good reasons for supplementing the
client’s self-report with objective physiological 
assessment. In fact, good clinical practice dic- 
tates the use of multiple criterion measures for 
a comprehensive assessment of clinical outcome 
(Keefe, Kopel, & Gordon, 1978). For example,
in our case study we described the use of a 
standardized paper-and-pencil instrument for 
evaluation of sexual interaction (LoPiccolo & 
Steger, 1974), subjective ratings of arousal to 
erotic stimuli, client and spouse reports of mari- 
tal/sexual satisfaction, and laboratory tumes- 
cence assessments. 

Regarding the validity of tumescence mea- 
surement, Farkas (1978) raises three specific 
issues: (a) The fact that males are able to 
exercise some degree of voluntary control of 
tumescence (Henson & Rubin, 1971; Rosen, 
1973) is presented as a threat to the internal 
validity of the measure. (b) With respect to 
external validity, Farkas challenges the associa- 
tion between strain-gauge data and self-report of 
overt behavior. (c) The generalization from
laboratory to “real-life” stimuli is viewed as 
a further threat to external validity. We will 
deal with each of these three issues in turn. 

Levin, Gambaro, and Wolfinsohn (1978), in
reply to Farkas (1978), have pointed out logical
inconsistencies in the suggested independent
[...] of voluntary control of penile response.
The potential for response “faking” is
[...] “prebaseline assessment” as [...]
though we agree [...] need to be concerned [...]
tumescence, Farkas’ suggestions do not appear
to offer a real solution. Further, in our case
study we examined several sources of evidence
regarding the possibility of tumescence faking.




From the gradual, progressive learning curve [of 
penile tumescence] it appears unlikely that suppres- 
sion of response was due to distraction or demand 
characteristics . . . Furthermore, the client's sub- 
jective ratings of arousal changed more gradually 
than actual penile responses—further evidence 
against an explanation in terms of demand charac- 
teristics. Finally, penile tumescence suppression con- 
tinued to improve even below the limit of sensory 
awareness (approximately 10% full erection). 
(Rosen & Kopel, 1977, pp. 914-915) 


The second issue, the external validity of 
tumescence measurement (i.e., the notion of re- 
sponse generalization), is complex and merits 
further discussion. In part, this concern involves 
the consistency between physiological, self-re- 
port, and overt behavioral indices of sexual 
response. According to Farkas, the correlation 
between self-report and tumescence is weak. 
However, other researchers (e.g., Abel, Blan- 
chard, Murphy, Becker, & Djenderedian, in 
press) have found strong associations between 
these two measures. Furthermore, with respect to 
the relationship between laboratory control of 
tumescence and subsequent overt sexual behavior 
in the natural environment, other reports (e.g., 
Csillag, 1976) have shown meaningful general- 
ization. Thus, there is independent empirical 
support for the external validity of the tumescence
measure. However, in any given instance,
external validity should not be taken for granted.
Rather, clinicians or researchers should seek
independent evidence for convergent validity
(Campbell & Fiske, 1959) across response measures
through a multiple-criterion assessment
package.


In the treatment of sexual deviations, one [...]
lack of consistency [...] response systems. For
instance, a number of exhibitionists treated in
[...]. For example, with respect [...]
we used his actual transvestite garments in
assessment of sexual arousability. It is
noteworthy that tumescence measurement
revealed that contrary to our
expectations, fondling of these garments with 
active imagery failed to elicit arousal. However, 
subsequent videotaping of the complete transves- 
tite-exhibitionistic script did prove to be a reli- 
able eliciting stimulus for tumescence response. 
This suggests that the issue of real-life versus 
laboratory stimuli raised by Farkas may be an 
oversimplification. The videotape, although a 
more contrived stimulus in some respects, ap- 
peared to incorporate subtle but critical param- 
eters of the real-life pattern. It appears from 
this case that treatment stimuli should not be 
selected a priori but rather should be determined 
by an empirically based, comprehensive assess- 
ment. 

Finally, the unsuccessful long-term result re-
ported in our case study raises the possibility
that too much reliance was placed on penile
plethysmography in assessment and treatment.
On the contrary, after intensive review we have
concluded that the absence of tumescence
assessment beyond the initial 4-month follow-up
period was a strategic error. Our reliance on
the client’s self-report during the subsequent
2-year period provided a series of false outcome
assessments. We believe that tumescence
measurement, as originally used, might have cued
the therapists to the gradual return of the deviant 
arousal pattern, thus providing a critical oppor- 
tunity for reinstatement of the treatment pro- 
gram. 


A Note on Psychosomatic Factors in the Etiology of Neoplasms


Watson and Schuld attempted to study the relationship between psychopathol-
ogy and subsequent development of neoplasms. Their results, which indicated 
no apparent connections between these variables, are limited due to methodo- 
logical flaws, both relating to sample selection. The study sample, composed of 
psychiatric patients, was highly restricted along one of the variables studied— 
psychopathology—and was further confounded by uneven distribution of a po- 
tentially carcinogenic factor—alcoholism. Though there is no empirical evidence 
of psychological causation of neoplasms, the Watson and Schuld study is not
the one to lay this issue to rest.


The recent study by Watson and Schuld 
(1977) concerning the relationship between psy- 
chopathology and subsequent development of 
malignant and benign neoplasms deserves credit 
for attempting to test several psychosomatic 
theories in a prospective manner. In this way, 
the authors commendably strove to eliminate 
methodological problems inherent in retrospective 
studies. Unfortunately, while dealing with one 
set of confounding variables, Watson and Schuld 
have neglected to consider another: the issue of 
restricted sample. 

In the discussion of their results, Watson and 
Schuld made note of several caveats regarding 
their results, including possible weaknesses of 
the Minnesota Multiphasic Personality Inventory 
as a comprehensive catalogue of personality 
traits, premorbid differences between groups on 
variables not measured, and inability of such a 
study to address the issue of immediate premorbid
loss or trauma. (To this last factor I would
add that the study can say nothing whatsoever
with regard to premorbid loss, immediate or
[...].) They neglect, however, to consider
the effect of using a sample of psychiatric
patients as subjects.

In view of the specialized nature of the sample,
it is not surprising that no significant differences
in psychopathology were found between
groups. The basic finding of Watson and
Schuld’s (1977) study is that among a group of 
individuals for whom psychopathology had been 
diagnosed, the degree of psychopathology was 
not related to subsequent development of neo- 
plasms. This is quite different from comparing 
groups of pathological versus nonpathological 
individuals and making a statement regarding 
psychopathology in the general (nonhospitalized)
population. The use of a sample of people who
represent a highly specialized subsample along the 
distribution of one of the major variables in the 
study (psychopathology) places strong limita- 
tions on the kinds of conclusions that can be 
drawn from Watson and Schuld’s results. The 
only justifiable conclusions derived from the 
data are with regard to the degree of psycho-
pathology within a pathological subgroup as it
relates to neoplastic vulnerability. 

In addition, the use of subjects suffering from 
alcohol-related problems injects another con- 
found into Watson and Schuld’s (1977) results 
in that such individuals may be more prone to 
specific types of malignancies, and this factor
may obscure or outweigh any psychopathology-
neoplasm relationship. In fact, these individuals 
are overrepresented by a ratio of 6:1 within the 
malignant group. 

This is not to say that there is a great deal of 
prospective, objective data in support of psycho-
pathological theories of neoplastic etiology such
as that of Bahnson and Bahnson (1966), for 
there is not, and work is needed in this area. 
Such research, however, will need to carefully 


Psychosomatic Etiological Factors in Neoplasms:
A Response to Kellerman 


Charles G. Watson 
Veterans Administration Hospital 
St. Cloud, Minnesota 


Kellerman argues that our use of psychiatric patients in a study designed to 
search for psychosomatic etiological factors in neoplasms may have led to our 
negative results. However, we suggest that the use of psychiatric patients in- 
creased the heterogeneity of the sample and probably enhanced, not limited, 
the likelihood of positive findings. He also suggests that our inclusion of alco- 
holics in the study may have masked real differences between neoplasm and 
control subjects. However, new analyses run on subsets of our malignancy and 
malignancy-control samples from which alcoholics were first deleted failed to 
support his contention. New analyses run to test for differences between the 
frequencies of various high-scale Minnesota Multiphasic Personality Inventory 
types in neoplasm and control groups also failed to support the view that neo-
plasm patients are qualitatively different from controls on the inventory. 


In an earlier article (Watson & Schuld, 1977), 
we described a study comparing the diagnoses 
and Minnesota Multiphasic Personality Inven- 
tory (MMPI) responses of psychiatric patients 
who later developed neoplasms to those of 
psychiatric controls who did not. We found a 
considerably smaller number of significant dif- 
ferences between the two groups than would have 
been expected to appear on a chance basis, and 
we concluded that our results failed to support 
the increasingly popular theory that such growths 
are psychogenic. 

Kellerman (1978) argues that our negative 
results may reflect the use of a “highly special- 
ized” psychiatric sample and that we should 
have used a psychologically normal sample in- 
stead. He suggests that the homogeneity that can 
be inferred from the limitation of the sample 
to psychiatric patients may have attenuated rela[...]

[...] our samples showed considerab[...]
in type of pathology than one [...]
find among normals, including, as it did,
schizophrenics, manic-depressives, involutional
psychotics, neurotics, alcoholics, personality
disorders, and brain-damaged patients. In fact, this
heterogeneity may have increased the probability 
of significant neoplasm-personality relationships 
beyond that which might have been expected 
among normals. We would expect such relation- 
ships to be most conspicuous in samples with 
diverse and substantial amounts of psychopa- 
thology, and we suspect that our choice of a
psychiatric ward sample enhanced, not reduced,
the probability of positive results. Despite this
presumed bias, our results were strikingly nega-
tive. 

Kellerman (1978) also criticizes the inclusion 
of alcoholics in our samples. (Six appeared in 
our malignant neoplasm group and one among 
its controls.) He notes that since large quanti- 
ties of alcohol may be carcinogenic, their inclu- 
sion might have obscured psychopathology-neoplasm
relationships. If his reasoning is correct,
it seems to us that if anything, the inclusion of 
alcoholics should have increased the probability 
of our finding relationships between personality
types and neoplasm proneness, since more alco-
holics appeared in the malignancy sample than in 
its controls. Despite this (possible, at least) bias 
toward positive results, our results were nega- 
tive. Nevertheless, we have recalculated t tests 
for MMPI scales in subsamples of our malig- 
nancy and malignancy-control groups from which 
subjects with alcohol-related diagnoses were first
deleted. These analyses were based on 28 sub- 
jects, 14 from the malignancy group and 14 con- 
trols matched with them for age. None of the 13 
resulting ts (which ranged from .06 to 1.55) run
to test for differences between the MMPI scale 
means of the samples was significant, even at 
the .10 level. The view that our negative findings 
resulted from the physical effects of alcohol was 
not supported. 

The Kellerman (1978) critique’s emphasis on 
our finding that the relationships between neo- 
plasm proneness and degree of pathology were 
low suggests the possibility that neoplasm risk 
may be related to quality, rather than amount, 
of measured psychopathology. Since we used a 
wide variety of (qualitatively different) depen- 
dent variables—eight diagnostic categories, 13 
MMPI scales, and several hundred individual 
MMPI items—and found little evidence that 
any of these variables is related to neoplastic 
vulnerability, we have been skeptical about the 
possibility that qualitative differences exist. Nev- 
ertheless, we have run additional analyses to test 
for them. 

To evaluate the qualitative difference hypothe- 
sis, we determined the two highest MMPI clinical 
scale T scores for each subject and compared 
their frequencies in our neoplasm and control 
samples with sign tests. Only 1 of the 20 sign 
tests (10 each for malignant /malignant-control 
and benign/benign-control comparisons) was sig- 
nificant at the .05 level. No significant differences 
were associated with malignant growths, but 
Hysteria (Hy) peaks were less common (p =
.02) among patients who later developed benign
growths than among their controls. Since high 
Hy scores are generally positively correlated with 
psychosomatic tendencies, and since only 1 of 
the 20 differences was significant, these findings 
offer little support for the suggestion that our 
earlier procedures may have masked qualitative 
differences between neoplasm and control sam- 
ples or the view that neoplasms are psychogenic. 

Kellerman (1978) is correct in noting that our 
results can be generalized only to psychiatric 
patients. However, our negative results in psychi- 
atric subjects, and the absence of any prospective 
study that has shown premorbid differences be- 
tween neoplasm and appropriate control patients 
on either personality inventories or blind ratings 
of projective tests, lead us to doubt that impor- 
tant neoplasm-psychopathology relationships will 
emerge among normals, since both the quantity 
and diversity of psychopathology in the latter 
group is limited. Nevertheless, the question is 
ultimately an empirical one, and the issue will 
have to be resolved with additional studies, 
rather than speculations such as those offered by 
Kellerman and by us. 


The Therapist-as-Fixed-Effect Fallacy
in Psychotherapy Research 


Colin Martindale 


University of Maine 


Studies of psychotherapy involve sampling two sets of subjects from two pop- 
ulations: patients and therapists. Conclusions about psychotherapy should thus 
be based on statistical evidence that results are reliable across both patients 
and therapists. In most published research concerning psychotherapy, no sta- 
tistical evidence is provided that findings can be generalized beyond the partic- 
ular sample of therapists studied. In spite of this, researchers tend to draw 
conclusions concerning psychotherapy and therapists in general. Analysis of 
variance designs that allow generalization of results across both therapists and 
patients are described. The serious problems with inappropriate analyses of 
variance, treating therapists as a fixed effect or ignoring the therapist factor
altogether, are discussed. A review of recently published studies of psycho-
therapy reveals that most researchers have done one or the other of these
inappropriate analyses.


Psychotherapy and related procedures involve 
two participants or sets of participants, therapists
and patients. Consequently, the researcher
who wishes to study psychotherapy is faced with
the necessity of generalizing findings to two
populations: a population of therapists and a
population of patients. To do this, it is necessary
to select random samples of both patients
and therapists and then to perform statistical
analyses that will provide assessments of reliability
for both samples. However, the majority
of studies on psychotherapy concentrate solely
on reliability across patients. The consequence
is that most of these studies tell us nothing
about psychotherapy in general and a lot about
[...] data concerning [...] have been im[...]
[...] effects are handled in either of these ways, it is
impossible to generalize results beyond the im- 
mediate sample of therapists used in one’s study. 
This comment presents no original ideas or 
methods; it is merely a reminder to psychother- 
apy researchers that they tend regularly to use 
the analysis of variance in what is known in 
other contexts to be an inappropriate manner. 
Since this inappropriate usage can drastically 
inflate the possibility of Type I errors, the re- 
minder would seem to be in order.¹


Fixed Versus Random Factors 


Factors in analysis of variance are either fixed 
or random. In general, a factor is fixed if all 
treatment levels of interest are included in an 
experiment, whereas a factor is random if its 
treatment levels are drawn at random from a 
larger population of possible levels. It is clear
that in research on psychotherapy there must be
two random factors: therapists and patients. 


¹The necessity for generalization to more than
one population is not unique to psychotherapy re-
search. Clark (1973) has pointed out almost the
same problems in psycholinguistic studies in which
it is necessary to generalize to both populations of
words and of subjects. Brunswik (1947), Ham-
mond (1954), and Rosenthal (1966) have also
dealt with the necessity of generalizing to more
than one population in a variety of situations.

This has crucial implications for the choice of
correct error terms in analysis of variance. In 
an analysis in which subjects are the only ran- 
dom factor, error terms are derived from esti- 
mates of within-cell variation. The degrees of 
freedom of these terms are dependent on the 
number of subjects. However, in analyses in 
which there are random factors besides subjects, 
this is not the case. Many error terms are not 
properly derived from within-cell error, and the 
degrees of freedom of these terms are not de- 
pendent on the number of subjects but on the 
number of levels of the other random factors. 
Consider an experiment comparing the relative
effectiveness of psychoanalysis and covert
sensitization. There are several ways of designing
such an experiment. In a crossed design,
one randomly selects n therapists and N patients.
All n therapists perform both types of
treatment, and a subset of patients is assigned
to each of the therapist-treatment combinations.
If the therapist factor is taken as a random one,
then to assess treatment effects, an F ratio is
formed by dividing the treatment mean square
by the Treatment × Therapists mean square,
with df = 1 and n - 1 in the case of two treatments.
For one reason or another, it may be
preferable to use a hierarchical design in which
n therapists are nested under the two treatments.
For example, n₁ therapists could be assigned to
do covert sensitization, and n₂ different therapists
could be assigned to do psychoanalysis. The N
patients would be assigned at random to therapists
as above. In this case, the correct denominator
for the treatment mean square is the
therapists within treatments mean square in forming
an F to assess the main effect of treatments.
In the case of two treatments, this denominator
will have 2(n₁ - 1) degrees of freedom. In this
[...] one collapses over patients altogether
[...] making the analysis. It does not matter
whether each of the n therapists sees one or a
[...] patients. A third possible design would
involve a completely randomized one-way analysis
of variance. This would necessitate random
[...] of n therapists and N patients (where
[...]), random pairing of therapists and patients,
and random assignment of these pairs to
the two treatments. In this case the treatment
[...] is assessed against the within-cell mean
square with n - 2 degrees of freedom for the
case of two treatments. These are elementary
considerations that are discussed in detail in
[...] standard book on design such as Winer
(1971) or Kirk (1968). The important point is
that in each case, the effective upper limit on
degrees of freedom is given by n rather than
by N. The number of therapists in one’s study,
not the number of patients, determines the
degrees of freedom of F ratios and, hence, their
power.
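
The crossed design described above can be sketched numerically. The following is only an illustration under stated assumptions: a balanced layout with the same number of patients in every therapist-treatment cell, made-up scores, and a hypothetical helper name (crossed_design_f) that does not come from the article. It forms the treatment F ratio twice, once against the Treatment × Therapists mean square (therapists random) and once against the within-cell mean square (therapists fixed), so that the difference in denominator degrees of freedom is visible.

```python
def mean(values):
    values = list(values)
    return sum(values) / len(values)


def crossed_design_f(y):
    """Treatment F ratios for a balanced crossed design.

    y[i][j] is the list of patient scores for treatment i and therapist j.
    Every cell must hold the same number (at least 2) of scores.
    """
    a, n, k = len(y), len(y[0]), len(y[0][0])
    cell = [[mean(y[i][j]) for j in range(n)] for i in range(a)]
    treat = [mean(cell[i]) for i in range(a)]
    ther = [mean(cell[i][j] for i in range(a)) for j in range(n)]
    grand = mean(treat)

    ss_treat = n * k * sum((t - grand) ** 2 for t in treat)
    ss_inter = k * sum(
        (cell[i][j] - treat[i] - ther[j] + grand) ** 2
        for i in range(a) for j in range(n)
    )
    ss_within = sum(
        (x - cell[i][j]) ** 2
        for i in range(a) for j in range(n) for x in y[i][j]
    )

    df_treat = a - 1
    df_inter = (a - 1) * (n - 1)      # depends on the number of therapists
    df_within = a * n * (k - 1)       # depends on the number of patients
    ms_treat = ss_treat / df_treat
    return {
        "F_random": ms_treat / (ss_inter / df_inter),
        "df_random": (df_treat, df_inter),
        "F_fixed": ms_treat / (ss_within / df_within),
        "df_fixed": (df_treat, df_within),
    }


# Made-up scores: 2 treatments x 3 therapists x 2 patients per cell.
scores = [
    [[1, 3], [2, 4], [3, 5]],
    [[4, 6], [3, 5], [8, 10]],
]
print(crossed_design_f(scores))
```

With two treatments, the denominator degrees of freedom for the random-therapist test equal the number of therapists minus one, however many patients each therapist sees.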


Typical Problem and Appropriate Solutions 


If the effective N for a therapy study is n
rather than N, and if it is difficult to obtain
large numbers of therapists, then one is faced 
with a problem. An inherently variable event 
such as psychotherapy process or outcome in- 
tuitively calls for a large number of degrees of 
freedom if tests of hypotheses are to have much 
power. Where are these degrees of freedom to 
come from? The only fully satisfactory solution 
is to design one’s study to include a lot of thera- 
pists. If this is not possible, the only legitimate 
hope is that if n must be small, one could pool.
To do so, interactions of treatments with thera-
pists or therapists-within-treatment effects must
be small. In a crossed design, if the Treatment
× Therapists interaction can be shown to be
zero, at a liberal α, it may be legitimate to pool
the sum of squares for Therapists × Treatments
with the within-cell sum of squares to obtain a 
within-cell mean square estimate with a larger 
number of degrees of freedom for testing all 
effects. In a hierarchical design, it would be nec- 
essary to show that the therapists-within-treat- 
ment effect was zero. Then, this sum of squares 
could be pooled with the within-cell sum of 
squares. In either case, to avoid Type II errors 
(accepting the hypothesis that the effect in ques- 
tion is zero when it should be rejected), it is 
necessary to test the hypothesis that the effect 
to be eliminated is zero at a high value of α:
Kirk (1968, p. 215) suggests α = .25, but Winer
(1971, p. 379) suggests α = .20 or .30 for such
tests. It is inappropriate to use α = .05 or .01
for such tests, since this would inflate the prob-
ability of Type I errors in tests of treatment
effects beyond the nominal α for such tests. It
should be noted that many statisticians advo- 
cate a conservative “never pool” approach, and 
even those with a more moderate attitude advise 
against pooling effects when there is any a pri- 
ori reason to expect that such effects might exist. 
Given this, pooling involving Treatment × Ther-
apists interactions or therapists-within-treatment
effects must always be viewed as an extremely 
“liberal” practice. 
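
The pooling rule described above can be written out as a short sketch. The function name and arguments are hypothetical, and the critical F value at the liberal α is assumed to be looked up by the reader in a table for the relevant degrees of freedom and passed in; nothing here computes it. The sketch only shows the decision: pool the Treatment × Therapists (or therapists-within-treatment) sum of squares with the within-cell sum of squares when the preliminary test is not significant, and otherwise keep the therapist-based error term.

```python
def choose_error_term(ss_effect, df_effect, ss_within, df_within, f_crit_liberal):
    """Decide whether a therapist-related effect may be pooled with within-cell error.

    ss_effect, df_effect: sum of squares and df for the Treatment x Therapists
        interaction (crossed design) or therapists within treatments (hierarchical).
    f_crit_liberal: tabled critical F at a liberal alpha (for example .25),
        supplied by the caller; it is an input, not computed here.
    """
    ms_effect = ss_effect / df_effect
    ms_within = ss_within / df_within
    f_preliminary = ms_effect / ms_within

    if f_preliminary < f_crit_liberal:
        # Preliminary test not significant even at the liberal alpha: pool.
        pooled_df = df_effect + df_within
        return {
            "pooled": True,
            "f_preliminary": f_preliminary,
            "ms_error": (ss_effect + ss_within) / pooled_df,
            "df_error": pooled_df,
        }
    # Otherwise the therapist-based mean square stays the error term.
    return {
        "pooled": False,
        "f_preliminary": f_preliminary,
        "ms_error": ms_effect,
        "df_error": df_effect,
    }


# Made-up sums of squares; the critical values below are placeholders.
print(choose_error_term(8.0, 2, 12.0, 6, f_crit_liberal=3.0))
print(choose_error_term(8.0, 2, 12.0, 6, f_crit_liberal=1.5))
```

The cost of not pooling is visible in the returned df_error: without pooling, the error term keeps only the degrees of freedom tied to the number of therapists.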


Inappropriate Analyses 


Treating the Therapist Factor as Fixed 


There are several inappropriate solutions to 
the problem. One obvious solution is to treat 
the therapist factor as fixed rather than random.
In this case, the divisor in obtaining the F ratio
for treatments will always be the within-cell
mean square, which will have degrees of freedom
primarily dependent on N rather than n. The
problem with this solution is that it completely 
destroys the scientific value of the study, since 
all conclusions must be restricted to the par- 
ticular therapists used in the study. (Of course, 
if one were deliberately carrying out a study 
of a specific set of therapists, as in a program 
evaluation, it would be quite proper to treat the 
therapist factor as fixed.) 

Psychotherapy researchers who explicitly in- 
clude a therapist factor in their analysis of vari- 
ance designs do in fact tend to regard this fac- 
tor as fixed. That this has serious effects can 
be demonstrated by a reanalysis of the data 
from several well-known studies. Paul (1966) 
investigated the efficacy of two types of therapy 
(desensitization and insight) and an attention- 
placebo control treatment. Five therapists each
saw three patients in each of the three treat-
ment conditions. Patients were assigned at ran-
dom to these conditions. This was a crossed
design: Therapists were crossed with treatments,
and patients were nested under Therapists ×
Treatments. Paul treated the therapist factor as
fixed and obtained a number of F ratios for 
treatment effects. These Fs were obtained by 
dividing the treatment mean square by the 


Table 1

Analyses of Variance of Therapist Ratings on
Six Dependent Variables From Paul (1966)

                                  Therapist    Therapist
Treatment effect                  fixed F      random F

Specific improvement in
  performance anxiety             5.57         6.88*
Improvement in other [...]        12.04        5.78*
[...]                             [...]        4.36
Appropriateness of type
  of therapy                      17.55        4.[...]
Appropriateness of
  length of therapy               12.51        5.39*
Therapist comfort                 [...]        [...]

* p < .05.
** p < .01.
*** p < .001.

Table 2
Analysis of Variance of Therapist Ratings of
Global Patient Improvement From Truax et al.

Source                     df     MS      Fᵃ         Fᵇ

High vs. low
  conditions (A)           1      8.10    10.28**    3.06
Therapist within
  conditions
  (B within A)             2      2.65    3.36*      -
Role induction vs.
  no role
  induction (C)            1      3.60    4.57*      1.18
A × C                      1      .10     [...]      [...]
C × B within A             2      3.05    3.87*      -
Error                      32     .79

ᵃ Therapists fixed, from Truax et al. (1966, p. 397).
ᵇ Therapists random.

within-cell mean square. Significant Fs from the 
Paul study are given in Table 1. It may be
asked, what would happen to the Fs if we were
willing to believe that therapists were selected
at random? The answer is shown in the second
column of Table 1. The Fs here were obtained
by dividing the treatment mean square by the
Treatment × Therapists mean square. Two of
the Fs are no longer significant (p > .05), and
p falls from less than .01 and .001 to less than
.05 for the others. Paul’s original F values tell
us what would probably happen if his study were 
replicated with exactly the same therapists and 
different patients. Our F values tell us what 
might be expected if his study were replicated 
with other therapists and other patients. 

An example of an often-cited hierarchical
study in which the therapist factor was fixed
was done by Truax et al. (1966). The study
concerns four psychiatrists who were rated on
the therapeutic conditions of accurate empathy,
genuineness, and warmth. The psychiatrists were
placed into a group that offered high levels of
conditions and a group that offered low
levels. Each of the therapists saw 10 patients,
half of whom had previously undergone a “role
induction” interview and half of whom had not.
Thus, therapists were nested under levels of
therapeutic conditions; therapeutic conditions
and therapists within therapeutic conditions
were crossed with role induction; and patients
were nested under Therapists × Role Induction.
Truax et al. treated therapists as fixed and thus
used the within-cell mean square to test all
effects and interactions. 

Results of their analysis of therapist ratings 
of global improvement are given in Table 2. If 
therapists are treated as random rather than 
fixed, then the correct divisor for the conditions 
mean square is the therapists within con- 
ditions mean square, and the correct divisor for 
the role induction mean square and for the Con- 
ditions X Role Induction mean square is the 
Role Induction X Therapists Within Conditions 
mean square. Results when the therapist factor
is taken as random are shown in the last column 
of Table 2. As can be seen, none of the Fs even 
approach significance. Truax et al.’s results apply 
only to the four therapists in their study. There 
is no reason to expect them to replicate. 
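
The change of error term described above can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming the mean squares as they are read from Table 2; the dictionary keys and the helper `f_ratio` are illustrative names, not part of the original analysis. Small discrepancies from the tabled Fs are expected because the mean squares are rounded to two decimals.

```python
# Mean squares as read from Table 2 (Truax et al., 1966); rounded values.
MS = {
    "conditions": 8.10,                # A
    "therapists_within": 2.65,         # B within A
    "role_induction": 3.60,            # C
    "conditions_x_role": 0.10,         # A x C
    "role_x_therapists_within": 3.05,  # C x B within A
    "error": 0.79,                     # within-cell
}


def f_ratio(effect, error_term):
    """F ratio of one mean square over the chosen error mean square."""
    return MS[effect] / MS[error_term]


# Therapists fixed: every effect is tested against the within-cell mean square.
fixed_model = {
    "conditions": f_ratio("conditions", "error"),
    "role_induction": f_ratio("role_induction", "error"),
}

# Therapists random: conditions are tested against therapists within
# conditions; role induction against Role Induction x Therapists Within
# Conditions.
random_model = {
    "conditions": f_ratio("conditions", "therapists_within"),
    "role_induction": f_ratio("role_induction", "role_x_therapists_within"),
}

for effect in fixed_model:
    print(f"{effect}: fixed F = {fixed_model[effect]:.2f}, "
          f"random F = {random_model[effect]:.2f}")
```

With therapists random, both tests have only 1 and 2 degrees of freedom, for which standard F tables give a .05 critical value of about 18.5, so neither ratio comes close.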


Ignoring the Therapist Factor 


Another illegitimate approach is to ignore 
therapists altogether in the analysis. That is, 
one might do a study involving a crossed or 
hierarchical design and then analyze it as if it 
were a completely randomized one-way design. 
This is, of course, incorrect for a variety of 
reasons. Not the least of these reasons is that in
doing this, nonindependent observations are 
treated as if they were independent. The num- 
bers in psychotherapy research are often things 
such as ratings of improvement by therapists.
If one therapist makes more than one of these 
ratings, then these ratings cannot be expected 
to be independent. Inclusion of a therapist fac- 
tor allows the researcher to take account of this. 
One ought to include in an analysis all factors 
that might reasonably be expected to have some 
systematic effect. It would be difficult to argue
that individual therapists would have no such
effect.

It is quite true that ignoring a relevant source
of variation in analysis of variance will gen-
erally inflate the within-cell error term, but this
misses the point if the correct error term is
not used in the first place. For the types of de-
signs that I have been discussing, it is definitely
not the case that the within-cell mean square
will be inflated sufficiently to allow its use. In
fact, as M becomes large, one virtually guaran-
tees a significant treatment F even given that
the treatments variance is 0 if there are any
therapist differences in effectiveness.
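
The inflation described in this paragraph can be illustrated by simulation. The sketch below is not taken from any of the studies discussed; the design (two treatments, two therapists per treatment, 10 patients per therapist), the standard deviations, and the function names are all assumptions chosen for illustration. The true treatment effect is zero, therapists differ at random, and the data are analyzed as a one-way design that ignores therapists. The critical value 4.10 is the approximate tabled .05 value of F for 1 and 38 degrees of freedom.

```python
import random
import statistics


def naive_f(groups):
    """One-way ANOVA F ratio that ignores any nesting within groups."""
    means = [statistics.fmean(g) for g in groups]
    n_total = sum(len(g) for g in groups)
    grand = sum(sum(g) for g in groups) / n_total
    ss_between = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    k = len(groups)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


def simulate(n_sims=2000, therapists_per_treatment=2,
             patients_per_therapist=10, therapist_sd=1.0,
             patient_sd=1.0, f_crit=4.10, seed=1):
    """Proportion of null data sets in which the naive F exceeds f_crit."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sims):
        treatments = []
        for _ in range(2):  # two treatments with identical true means
            scores = []
            for _ in range(therapists_per_treatment):
                effect = rng.gauss(0.0, therapist_sd)  # random therapist effect
                scores += [effect + rng.gauss(0.0, patient_sd)
                           for _ in range(patients_per_therapist)]
            treatments.append(scores)
        if naive_f(treatments) > f_crit:
            rejections += 1
    return rejections / n_sims


rate_with = simulate(therapist_sd=1.0)     # therapists differ
rate_without = simulate(therapist_sd=0.0)  # no therapist differences
print(f"Type I error rate, therapists differ: {rate_with:.3f}")
print(f"Type I error rate, no therapist differences: {rate_without:.3f}")
```

Under these assumptions the rejection rate stays near the nominal .05 only when therapists do not differ; once they do, it rises far above that level even though the treatments are identical.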

Since ignoring the therapist factor leads to
omission of necessary error terms in an analysis in
which the therapist factor is random, it involves
the implicit assumption that therapists constitute
a fixed factor and is inappropriate for the rea-
sons already given. Finally, ignoring the thera-
pist factor leaves no statistical way of specifying
the generality of our findings vis-à-vis the ther-
apist population or even the sample of thera-
pists used in a particular study.²


Discussion 


Several counterarguments to the one presented 
here can be raised. Perhaps the most persuasive 
is that therapists in psychotherapy studies are 
not in fact chosen in a purely random manner 
from a defined population and that the therapist 
factor must therefore be conservatively con- 
sidered as fixed. In reality, regarding therapists 
as fixed is not conservative at all, since it is 
generally much easier to obtain statistically sig- 
nificant results if the therapist factor is treated 
as fixed. Even if the researcher is scrupulous in 
pointing out that this factor is being considered 
as fixed and that results should not be general- 
ized across therapists, the very fact that the 
results have been published strongly implies 
that they are believed to be of some generality. 

Another objection would be that these criti- 
cisms could just as well be leveled at psychology 
experiments in general, that to ask psychother- 
apy researchers to include therapists in their 
designs is no more reasonable than to ask ex- 
perimental psychologists to include experimenters 
in theirs. Actually, this is not a bad idea. Hope-
fully, experimentalists who use several experi-
menters do in fact assess the reliability of their
results across experimenters either by including
them in preliminary analyses or by some less


2 Virtually all recent psychotherapy research has 
handled therapists in one or the other of these 
inappropriate manners. To get some idea of the 
pervasiveness of these design problems, the 1975 
issues of the Journal of Consulting and Clinical 
Psychology and the 1973 and 1974 issues of the 
Journal of Abnormal Psychology were surveyed.
Brief reports and articles with n=1 were excluded. 
The search yielded 33 articles reporting the appli- 
cation of analysis of variance to results of studies 
of psychotherapy or related techniques such as in- 
terviewing, counseling, and behavior modification. 
In only 1 of these was the therapist factor treated 
as random. It can be inferred from reported de- 
at the therapist factor was 
e studies, Therapist ef- 
(by, e.g., failing 


to test all effects using 


in 8 of the studies. 
sored in the analyses of variance reported in the 


remaining 21 articles. 
formal procedure. This becomes important to
the extent that the experimenter’s role goes be- 
yond that of a passive observer. Be this as it 
may, it is hopefully the case that experimenters 
are usually not explicitly included in analyses, 
in that their effects are negligible and their pres- 
ence would merely clutter the presentation. The 
point is not that every conceivable factor should 
be included in analyses of variance but merely 
that factors of generally agreed-upon substan- 
tive importance should be included. 

Another argument that can be quickly dis- 
posed of is that the results of psychotherapy 
studies do in fact replicate well. In this view, the 
theoretical problem of lack of generality is com- 
pensated for by the empirical fact of repeated 
replication. One may consult the appendix to 
the review by Luborsky, Chandler, Auerbach, 
Cohen, and Bachrach (1971) to see the fallacy 
of this argument. With few exceptions, there is 
unimpressive consistency in findings concerning 
the same hypothesis. Ultimately, the reliability 
of any phenomenon is demonstrated by repeated 
independent replications. The p values attached 
to F ratios serve to give other researchers some 
idea as to whether attempts at replication would 
probably be successful. Viewed from this per- 
spective, p values derived from analyses of var- 
iance in which the therapist factor is fixed are 
grossly misleading. 

A random factor is actually defined by two 
attributes: (a) its levels are sampled, and (b) 
this sampling is done at random. What does one 
do if levels of the therapist factor have been 
sampled but the sampling was not random? The 
conventional advice is that if levels of a factor 
are not sampled randomly, then the factor must 
be regarded as fixed. But this advice is given in
the hope that if a sampled factor were so re-
garded, no credence to the generality of the re-
sults would be given,

even if they were not randoml; 

3 y 

E E A has occurred, That is, the 
es t tought the particular subjects 3 

hiatus with the investigator wou 

respect to the experiment in Question. The 


Sampled is that 
or that one should not make every effort at
random sampling of therapists. I have merely 
tried to outline a rationale for arguing that some 
studies analyzed with the therapist factor as 
fixed could legitimately be reanalyzed with this 
factor considered as random rather than being 
discounted completely. 

To what extent do the criticisms discussed in 
this comment invalidate published psychother- 
apy research that has been inappropriately ana- 
lyzed? Unless treatment Fs can be recomputed— 
and this is seldom possible—in studies that have 
treated the therapist factor as fixed or have ig- 
nored it altogether, there is good reason to be 
wary of generalizations based on them. As the 
reanalyses described above demonstrate, treating 
the therapist factor as random rather than as 
fixed can have drastic effects on the significance 
of F values. Effects originally reported as sig- 
nificant at the .01 and .001 levels may be re- 
duced to nonsignificant levels by this change in 
analysis. Given this, it is quite likely that the
published research on psychotherapy contains an 
extremely high proportion of Type I errors. We 
know, then, a lot less about psychotherapy than 
we thought we did. 
Fallacy of Statistical Absolutes in Psychotherapy Research


Gordon L. Paul and Mark H. Licht 
Martindale’s assertions are criticized for failure to recognize the lack of uni-
formity in domains and classes of variables involved in psychotherapy research,
and the need to fit design and analyses to the data obtained and the questions
asked. Fallacies regarding absolute or invariant use of statistical models and
techniques are discussed. Included are inappropriate statistical bases for gen-
eralization, confusion in the contribution of statistical analyses to internal ver-
sus external validity, and misapplication of statistical procedures—particularly
“therapists as random effects” for most psychotherapy research. Systematic
accumulation of knowledge from psychotherapy research will more likely come
from adequate specification of classes of variables and focused experimental
questions, combined with thoughtful design and tactics, not from rigid applica-
tion of textbook statistical models to areas in which they do not fit.

Kiesler (1966) drew attention to the “uni- 
formity assumption myths” that had historically 
plagued research in psychotherapy. Nearly all
classes of variables of interest in psychotherapy
research are more heterogeneous than our short-
hand descriptive labels suggest. Failing to recog-
nize such heterogeneity resulted in neither inde-
pendent nor dependent variables being ade-
quately considered and in overgeneralizations
from inadequately specified data. Recent articles
give evidence that these myths of uniformity are
being resurrected (e.g., Martindale, 1978; Smith
& Glass, 1977; Mariotto, in press). Martindale
resurrects uniformity assumption myths through
assertions that results of psychotherapy research
should be generalized to psychotherapy, patients,
and therapists “in general”—a focus on ques-
tions known not to be fruitful for over a decade.
He further extends inappropriate assumptions of
uniformity to statistical “absolutes,” which are 
asserted as requirements for research in psycho- 
therapy. 


Complexity of Variables in the Research 
Enterprise 


Before addressing Martindale's (1978) falla-
cious assumptions (space limitations prohibit
discussion of his points with which we agree),
a brief reminder of the complexity of variables 
and the nature of the psychotherapy research 
enterprise may aid in orientation. These issues 
are developed and described in detail along with 
recommended designs and tactics by Paul 
(1969). 

Other than the greater number and complexity 
of variables involved, the principles and methods 
required for psychotherapy research appear to 
be no different than those of any other area. 
Because the knowledge obtained in psychother- 
apy research should ultimately find its way to 
clinical practice, both internal and external va- 
ally important (Campbell & 
Stanley, 1966). Internal validity refers to the 
degree to which plausible rival hypotheses have 
study. Without internal 


validity, a study is uninterpretable. There is 
for establishing inter- 
that 


ty can be established only by 


of uniformity of labeled 
clients, therapists, or treatments, any study of


psychotherapy should include descriptive, mea- 
trol operations for each of sev- 


classes of variables to establish
ing external validity—hypotheses that can later
be subjected to empirical test. Within the client 
domain, these classes of variables include (a) 
the clients’ distressing behaviors that are the 
focus of treatment; (b) the clients’ relatively 
stable personal-social characteristics; and (c) 
the clients’ physical-social life environment. 
Within the therapist domain, these classes of 
variables include (a) the specific therapeutic 
techniques through which change in problem be- 
haviors is attempted; (b) the therapists’ rela- 
tively stable personal-social characteristics; and 
(c) the physical-social treatment environment. 
The third major domain, time, serves to further 
specify the set of circumstances for other classes 
of variables and to determine the focus and na- 
ture of assessments needed within and between 
periods related to treatment, dependent on the 
questions asked. 

Paul (1969) proposed the ultimate questions 
to be answered in clinical research (including 
psychotherapy) as: “What treatment, by whom, 
is most effective for this individual with that 
specific problem, under which set of circum- 
stances, and how does it come about?” (p. 44). 
Although no single study can ever answer these 
questions, specification of the aspect of the 
question for which answers are sought, combined 
with adequate description, measurement, or con- 
trol of each of the classes of variables, allows 
stronger internal validity and meaningful accu- 
mulation of knowledge across studies. The means 
of obtaining answers or partial answers then be- 
comes a question of design and tactics.


Fallacies of Statistical Absolutes 


With the above reminders of the complexity
of variables in the enterprise, let us turn speci-
fically to Martindale's (1978) assumptions and
assertions. These fall into two groups: absolutes
that fail to reflect reality and absolutes that
inappropriately assume uniformity of design
and analyses.


Absolutes That Fail to Reflect Reality 


therapists used in the 
in this assertion is the assi 


ing is random sampli. 
concerning the contribu-
tion of statistical analyses to internal versus
external validity in an experimental study. 
Campbell and Stanley (1966) noted the “re- 
luctance to accept Hume’s truism that induction 
or generalization is never fully justified logi- 
cally” (p. 17). Unlike random sampling in sur- 
vey research in which the logic of probability 
statistics provides justification for extrapolation 
of sample findings to a defined population, the 
focus of randomization in experimental research 
is nearly always on internal validity, not repre- 
sentativeness of the sample. Contrary to Mar- 
tindale’s assumptions, calling a factor or levels 
of a factor “random” in an investigation of psy- 
chotherapy does not make it so. Such a practice 
is likely to be misleading, in that others may 
assume statistical justification for extrapolation 
on inappropriate grounds, resulting in overgener- 
alization of findings from less powerful analyses. 

The “scientific value” of a study can only be 
determined by multiple criteria. Scientifically 
meaningful conclusions from a particular study 
are dependent on the internal validity of the 
experimental operations—the degree to which 
plausible rival hypotheses have been ruled out 
of cause-effect relationships in that study. The 
generality or external validity of findings in 
psychotherapy research, in practice, can seldom 
be more than a rational-empirical undertaking, 
rather than a statistical one. Hypotheses con- 
cerning the generalizability of findings are, thus, 
strengthened to the extent that a study provides
strong internal validity and thorough measure- 
ment or description of the relevant domains and 
classes of variables over which generalization 
might be expected on the basis of knowledge 
obtained from other sources. Conventional in- 
terpretations in the majority of texts on statis- 
tical design point out that generalization of find- 
ings from “fixed factors” should only be within 
the levels actually included in a study. With re- 
gard to therapists as a fixed factor, this would 
refer to therapists with similar characteristics—
not just the particular therapists involved any
more than conclusions would be restricted to
the particular settings, instruments, times, and
so forth, involved in a specific study. However,
such generalizations remain hypotheses until ex-
tensions and limitations on the conditions in
which findings hold are empirically tested—a
problem in experimental design and tactics, not
statistical models (see Paul, 1969).


Absolutes That Inappropriately Assume 
Uniformity of Design and Analyses 


Several of Martindale’s (1978) assumptions 
and assertions could be appropriate for a specific 
experimental question regarding psychotherapy,
but they appear patently inappropriate for other
questions—in fact, for most questions that in-
vestigators might rationally approach. Abstract- 
ing from Martindale’s comment, these include 
assertions that both therapists and clients must 
be randomly selected for experimental inclusion; 
treating therapists as a fixed effect in analyses 
of variance drastically inflates the possibility of 
Type I errors; the number of therapists rather 
than the number of clients determines the de- 
grees of freedom and power of statistical tests; 
and the only way to increase the power of hy- 
pothesis testing is to increase the number of 
therapists included. 

Not only do the above assertions reflect the 
myth of uniformity of therapists and clients in 
general, but they also extend the myth to ex- 
perimental design and analyses requirements as 
well. As with any other area of scientific re- 
search, investigations of psychotherapy must fit 
the design to the questions, or aspects, of the 
ultimate clinical research question asked. As- 
serting requirements as absolutes, irrespective of 
the purpose of an investigation or of the purpose 
of a particular set of analyses can only per- 
petuate a “cookbook” approach that fails to 
recognize that statistics are only tools to aid in
decision making—they cannot replace thought
or careful experimental design and controls. 

Martindale (1978) provides a prime example 
of the outcome of such a cookbook approach 
in his selection of data for reanalysis from Paul 
(1966). He asks, “What would happen to the Fs 
if we were willing to believe that therapists 
were selected at random?” (p. 1528) (a totally 
unwarranted assumption) and recomputes Fs 
treating the therapist factor as a “random ef- 
fect.” In fact, such an analysis would have been 
inappropriate for the data selected even if the 
therapists had really been “randomly sampled,” 
since the major purpose of those particular data 
analyses was to investigate the therapist ratings 
to identify possible limiting conditions on the 
validity of treatment effects previously evaluated 
by objective, external means. Maximum power 
within reasonable assumptions would logically 
call for fixed effects handling of that data, no 
matter how therapists came to participate in the 
study.

Martindale (1978) also implies that reliability 
of therapist contributions cannot be assessed if 
treated as fixed effects, and that p values asso-
ciated with F ratios reflect the probability of
replication. Elementary considerations of experi-
mental design and statistical inference are, in
fact, discussed in most texts on the subject.
However, it seems worthwhile to note that p
values are dependent on N and have no reflec-
tion on practical or scientific “significance”—
they merely indicate the believability that a
difference in a particular direction was obtained. 
They tell nothing about the probability of rep- 
licating a difference of that size again—with the 
same or different subjects, whether in a fixed 
or a random model. The reliability and strength 
of effects for either fixed or random factors in 
analyses of variance can be examined—not by 
ps but by correlational analyses and such statis- 
tics as coefficient alpha (Winer, 1971) and epsi- 
lon-squared (Peters & VanVoorhis, 1940). 


Conclusion 


The need to abandon uniformity assumption 
myths is no less now than when Kiesler (1966) 
originally summarized them. Rather than ther- 
apists and clients in general, adequate descrip- 
tive, measurement, and control operations fo- 
cused on the specific domains and classes of 
variables detailed by Paul (1969) appear neces- 
sary for establishing the internal validity of a 
given study and for aiding in rational general- 
ization to practice and further investigations. 
The failure to replicate findings in psychother- 
apy research is more likely a function of in- 
appropriate attempts to generalize without rec- 
ognition of the myths of uniformity than of the 
error terms selected in a particular analysis. The 
fallacy of statistical absolutes is equally worthy 
of abandonment if we are to progress systemati- 
cally in accumulation of knowledge. Neither a 
“box score” approach to determining treatment
effects over studies without adequate representa-
tion of their internal validity or the internal
purpose of measurement (e.g., Smith & Glass,
1977), nor absolutes in requirements of design
and analyses (e.g., Martindale, 1978) add clari-
fication—no matter how sophisticated the mathe- 
matics involved. We find ourselves in strong 
agreement with Myers (1972) that “in practice 
we expect the experimenter to use his brains as 
well as his F ratios to draw inferences” (p. 169). 
A Question as to the Validity of the Verbal Scale IQ
as a WAIS Short Form


William M. Reynolds 
State University of New York at Albany 


Wildman and Wildman’s contention as to the validity of the Verbal Scale IQ
as a Wechsler Adult Intelligence Scale short form is shown to be based on an
inappropriate analysis of their data. Upon reanalysis, the claim for validity is 
unsupported on the basis of original criteria. Results are presented that also 
fail to support the validity of the Verbal IQ as a short form measure. 


The development of short forms of standard-
ized intelligence tests has been of concern to 
both researchers and clinicians. Due to their 
differential component structure, the Wechsler 
tests have been the subject of many dissections 
into the short-form status (Finch, Childress, &
Ollendick, 1974; Levy, 1968; Luszki, Schultz,
Laywell, & Dawes, 1970; Silverstein, 1970). Re- 
cently Wildman and Wildman (1977) reported 
results that they contend support the validity of 
using the Verbal IQ as a short form of the 
Wechsler Adult Intelligence Scale (WAIS; 
Wechsler, 1955). They cite three criteria, origi- 
nally proposed by Resnick and Entin (1971), 
as necessary for the validation of a short-form 
test: (a) There should be a significant positive 
correlation between the short form and the stan- 
dard test form; (b) the obtained score means 
between the two forms should not be statisti- 
cally different; and (c) there should be a high 
degree of concordance between forms on the 
IQ classification level assigned to examinees. 
Wildman and Wildman (1977) found (a) a 
product-moment correlation of .97 between Ver- 
bal Scale IQ and Full Scale IQ (n= 100), (b) 
a t-test value of .95 (p > .05) between test form
means, and (c) a change in diagnosis of mental
retardation in 13% of the cases. They concluded
from their data that the Verbal Scale IQ meets
the above stated criteria and therefore is a valid
short form for the WAIS Full Scale IQ. A re-
examination of their data, however, disputes
this conclusion. Besides the fact that, as Mc-
Nemar (1949) pointed out, a spurious correla-
tion will result when a subscore that is part of
the total score is correlated with the total score,
Wildman and Wildman tested the significance
between the means of the Verbal Scale IQ and
the Full Scale IQ via a t test for independent
samples. The inappropriate use of independent
samples t tests with high positively correlated
dependent samples will, as Glass and Stanley 
(1970) pointed out, result in an overestimation 
of the standard error of the differences between 
the means, subsequently resulting in nonsignifi- 
cant differences where significant differences be- 
tween the two means exist. If for related 
measures,

$$
t = \frac{\bar{D}}{\sqrt{\dfrac{S_1^2 + S_2^2 - 2 r_{12} S_1 S_2}{N}}}
$$

where $\bar{D}$ is the mean difference, $S^2$ is the vari-
ance, $r$ is the correlation coefficient, and $N$ is the
sample size, is used to reanalyze Wildman and
Wildman’s results, a t of 5.46 (p < .001) is
obtained, which is highly disparate from the t
value originally determined. One would there-
fore reject the validity of the Verbal Scale short
form.

To check the possibility that the above results 
could in part be due to heterogeneity of Wild- 
man and Wildman’s (1977) sample (psychotics, 
brain damaged, and normals), WAIS results 
from 42 normal adults, ranging in age from 20 
to 65 years with a mean age of 30.46 years (SD 
= 12.70), were analyzed. The mean Verbal Scale
IQ of this group was 120.00 (SD = 11.37), and 
the mean Full Scale IQ was 117.71 (SD= 
10.61). The IQ ranges were 87-141 for the Ver- 
bal Scale and 89-134 for the Full Scale. The 
obtained product-moment correlation (uncor- 
rected) between the Verbal and Full Scale IQs 
was .92 (p < .01). A t test for correlated mea-
sures between the means of the Verbal Scale
and Full Scale IQs produced a t of 3.33 (p <
.01).
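
The t for correlated measures reported for the 42 normal adults can be reproduced from the summary statistics given above. The sketch below is illustrative only; the function names are hypothetical, and the second function simply drops the correlation term to show how an independent-samples style standard error understates t when the two scores are highly correlated.

```python
import math


def t_correlated(mean1, mean2, sd1, sd2, r, n):
    """t for related measures, using the correlation-adjusted standard error."""
    se = math.sqrt((sd1 ** 2 + sd2 ** 2 - 2 * r * sd1 * sd2) / n)
    return (mean1 - mean2) / se


def t_ignoring_correlation(mean1, mean2, sd1, sd2, n):
    """Same statistic with the correlation term dropped (treats r as 0)."""
    se = math.sqrt((sd1 ** 2 + sd2 ** 2) / n)
    return (mean1 - mean2) / se


# Summary statistics for the 42 normal adults: Verbal Scale IQ vs. Full Scale IQ.
t_corr = t_correlated(120.00, 117.71, 11.37, 10.61, 0.92, 42)
t_ind = t_ignoring_correlation(120.00, 117.71, 11.37, 10.61, 42)

print(f"t, correlated measures: {t_corr:.2f}")
print(f"t, correlation ignored: {t_ind:.2f}")
```

The same mean difference that is significant under the correlated-measures formula yields a t below 1 when the correlation is ignored, which is the pattern the reanalysis turns on.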

The results of this study also fail to support 
the contention of the Verbal Scale IQ as a valid 
WAIS short form when nonsignificant differences 
between test form means are used as a criterion. 
Validity of the Verbal Scale IQ as a WAIS Short Form:
A Reply to Reynolds 


Robert W. Wildman and Robert W. Wildman II 
Central State Hospital, Milledgeville, Georgia 


We were criticized because of our use of improper statistics in supporting the
validity of the Wechsler Adult Intelligence Scale Verbal IQ as a short form of
the Full Scale IQ. We agree with Reynolds that we erred in the choice of one
statistical test. However, we contend that from a practical standpoint, the use 
of the Verbal IQ as a short form had considerable value as a screening device 
and should not be discarded because of this one criticism. 


We (Wildman & Wildman, 1977) presented 
data interpreted as being supportive of the
validity of the Wechsler Adult Intelligence Scale
(WAIS) Verbal IQ as a short form of the Full
Scale IQ. Resnick and Entin’s (1971) three cri-
teria for evaluating the validity of the short
form were used in the 1977 study: (a) There
had to be a significant positive correlation be-
tween the short form and the standard test
form; (b) the obtained score means between
the two forms should not be statistically differ-
ent; and (c) there should be a high degree of
concordance between forms on the IQ classifi-
cation level assigned to examinees.

It was decided to use the t test for unmatched
samples because one could make either of two
assumptions—that the samples were similar or
dissimilar. At the time the decision was made,
we felt that we did not want to make the a
priori assumption that our short form using the
Verbal Scale would be a valid short form and
predict the Full Scale IQ. We thought that this
was a biased alternative to choose and neglected
to think through the fact that we were using
the same sample of subjects and the difference
this would make statistically in testing the sec-
ond criterion mentioned above. We agree with
Reynolds (1978) that a matched-sample t test
would have certainly been the logical one to use.

The Verbal Scale of the WAIS has a very
significant positive correlation with the Full
Scale IQ, and there is a high degree of agree-
ment regarding the IQ classification levels as-
signed to the subjects. We might say that “two 
out of three ain’t bad.” Even though the Verbal 
Scale short form did not meet the second cri- 
terion of Resnick and Entin (1971), from a 
practical standpoint the main point of a short 
form is to save time and to classify individuals 
accurately, which then makes it of value as a 
screening device in many clinical situations. Rey- 
nolds’ data with 42 normal adults indicate a 
high correlation between the Verbal and Full 
Scale IQ. However, Reynolds fails to mention 
how many subjects would have been misclassi- 
fied by placing them in Wechsler IQ categories.

In short, Reynolds’ criticism was correct, but
it seems that the Verbal Scale should be ac- 
cepted as a good short form because of its dem- 
onstrated clinical value rather than discarded 
because it failed on one of the three criteria pro-
posed in one article.
Brief Reports


Effects of Type of Social Reinforcement on the 
Intelligence Test Performance of Lower-Class Black Children 


Francis Terrell, Jerome Taylor, and Sandra L. Terrell 
University of Pittsburgh 


This study examined the effects of different types of reinforcement on the perform- 
ance of black males on the Wechsler Intelligence Scale for Children-Revised. After 
each correct response, participants were given either no reinforcement, a candy re- 
ward, traditional social reinforcement, or culturally relevant social reinforcement. 
Children given candy or culturally relevant social reinforcement obtained significantly 
higher scores than children given either no reinforcement or traditional social re-
inforcement.


A somewhat consistent finding is that lower- 
class black children perform better on cognitive 
tasks when given tangible rewards as opposed 
to social reinforcers (Schultz & Sherman, 1976). 
Typically, social reinforcers have consisted of 
such verbalizations as “good” and “fine.” A
problem with these studies is that little atten- 
tion has been given to the type of social rein- 
forcers used. In view of the different value sys- 
tem among blacks (Terrell & Taylor, Note 1), 
it is possible that appropriate social reinforcers 
have not been used in previous studies. This
study explored the effects of culturally relevant 
reinforcers on lower-class black children’s per- 
formance on the Wechsler Intelligence Scale for 
Children-Revised (WISC-R).


logist, 
In the no reinforcement condition, the children were given
no reward; in the candy reward condition, chil-
dren were given an M&M after each correct
response; in the social reward condition, chil- 
dren were given verbal praise such as good and 
fine after each correct response; and in the 
culturally relevant condition, children were given 
verbal praise such as “good job, blood” and 
“nice job, little brother” after each correct re- 
sponse. Scoring was done by an experienced 
master’s level clinician who was unfamiliar with 
the participants and purposes of this study. 

Children in the control group obtained a mean
IQ score of 81.55 (SD = 9.88); the social re-
ward group obtained a mean IQ score of 84.85
(SD = 12.17); the tangible reinforcer group ob-
tained a mean IQ score of 92.85 (SD = 11.39);
and the culturally relevant group obtained a
mean score of 99.15 (SD = 10.49). Significant
differences were found among the groups, F(3,
76) = 10.37, p < .01. Using Scheffé’s method of
post hoc comparisons, no significant differences
were found between the control and social rein-
forcement groups. Also, no significant differences
were found between the tangible and social re-
inforcement groups. However, children given
tangible reinforcement had significantly higher
mean IQ scores than children in the control
group (p < .05). Further, children given cul-
turally relevant rewards obtained significantly
higher IQ scores than children in the control
group (p < .01) and children in the social rein-
forcement group (p < .01).


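As a rough check on the omnibus test reported above, the F ratio can be recomputed from the group means and standard deviations alone. The sketch below assumes equal groups of 20 children (inferred from the 76 error degrees of freedom, not stated in the surviving text), and the function names are illustrative rather than part of the original report.

```python
# Group summaries as reported in the text: (mean IQ, SD).
GROUPS = {
    "control": (81.55, 9.88),
    "social": (84.85, 12.17),
    "tangible": (92.85, 11.39),
    "cultural": (99.15, 10.49),
}
# ASSUMPTION: equal n of 20 per group, inferred from df = (3, 76).
N_PER_GROUP = 20


def mean_square_within(groups):
    """Pooled within-group variance (valid for equal group sizes)."""
    return sum(sd ** 2 for _, sd in groups.values()) / len(groups)


def omnibus_f(groups, n):
    """One-way ANOVA F ratio from group means and SDs (equal n)."""
    means = [m for m, _ in groups.values()]
    grand = sum(means) / len(means)
    ss_between = n * sum((m - grand) ** 2 for m in means)
    ms_between = ss_between / (len(groups) - 1)
    return ms_between / mean_square_within(groups)


def scheffe_f(groups, n, a, b):
    """Scheffe statistic for the pairwise contrast a - b.

    The result is compared with the critical F on (k - 1, N - k) df.
    """
    diff = groups[a][0] - groups[b][0]
    contrast_f = diff ** 2 / (mean_square_within(groups) * (2 / n))
    return contrast_f / (len(groups) - 1)


if __name__ == "__main__":
    print("omnibus F:", round(omnibus_f(GROUPS, N_PER_GROUP), 2))
    for pair in [("cultural", "control"), ("tangible", "control"),
                 ("social", "control")]:
        print(pair, round(scheffe_f(GROUPS, N_PER_GROUP, *pair), 2))
```

Under the equal-n assumption the recomputed omnibus F lands close to the reported F(3, 76) = 10.37, and the pairwise statistics are ordered the same way as the reported Scheffé comparisons.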
Thus, it is suggested that the type of social 
reinforcer has an important effect on black chil- 
dren’s performance on cognitive tests. Indeed, 
the results suggest that the use of appropriate


The Child With Cancer: Patterns of Communication and Denial


John J. Spinetta and Lorrie J. Maloney 
San Diego State University 


What role do communication and denial play in the coping efforts of children with
cancer? Although no conclusions are drawn regarding cause-effect relationships,
results indicate that the level of family communication about the illness, as expressed
in the mother’s judgment of communication, is correlated with three of the four
hypothesized responses in the child. The study demonstrates the usefulness of three
instruments as effective tools in measuring the child’s reactions to the illness. Several
questions are raised as possible avenues for further cross-sectional and longitudinal
research on the communication hypothesis with the use of the instruments.


If it is true that a child with cancer as young 
as 6 years of age is aware of the serious nature 
of the illness (Bluebond-Langner, 1977; Spin- 
etta, 1974), what does the child do with this 
knowledge and awareness? Does the child com- 
municate with the parents about the prognosis 
in an effort to seek emotional support or does 
the child live in silence with the knowledge? 
The present study is a pilot effort to test instruments
that might help clarify the issue.

To delimit the term coping and make it oper- 
ational for the study, the following behaviors in 
the 6- to 10-year-old child were defined as suc- 
cessful attempts on the part of the child to 
master troublesome situations relative to the 
illness: a nondefensive [...]
[...]ness to parental figures,
and the freedom to [...]

[...] painful, is both a healthier state of [...]
[...] Spinetta, 1977). It was hypothesized [...]
whose family allows discussion of the illness [...]
[...]bush, 1960); (b) express closeness to family
members, as measured by an interpersonal dis- 
tance scale (Spinetta, Rigler, & Karon, 1974); 
(c) express happiness with self, as measured by 
Messages to self in the Family Relations Test 
(Anthony & Bene, 1957); and (d) feel free to 
express negative feelings within the family, as 
measured by the Family Relations Test. 

Levels of openness within the family regard- 
ing communication about the illness were mea- 
sured by a questionnaire filled out by the mother. 
Five items, each scaled 1-4, questioned the 
mother’s view of (a) how much the patient- 
child knows about the illness, (b) what kinds of 
questions the child asks, (c) how the parent 
responds to the questions, (d) what kinds of 
questions the siblings ask, and (e) how the 
parent responds to the siblings’ questions. The 
total score of the combined categories repre- 
sents the level of communication within the 
family, with the highest score indicating the 
fullest levels of communication. The study was
envisioned as an effort to see whether there is 
a correlation among the variables; it was not 
intended to demonstrate a cause-effect relation- 
ship. The subjects of the study were 16 chil- 
dren aged 6-10 years with a diagnosis of leu- 
kemia who were being treated in outpatient 
clinics in three local children’s facilities (Spin- 
etta & Maloney, 1975). 

Results are as follows: Using the level of 
communication within the family as a criterion 
measure regarding the illness, its prognosis, and 
treatment, five predictors were subjected to a 
multiple regression analysis, as summarized in 
Table 1. Three of the predictors contributed 
significantly to the amount of variance explained, 
yielding a multiple correlation of .71. Defense
was the strongest predictor, followed by personal
space difference—father and negative self.
These three predictors supported three of the 
four hypotheses. 

The data were then analyzed by a multivariate
analysis of variance using the same variables,
with families divided into communicative (above
the median, n = 7) or quasi-communicative (at
or below the median, n = 9). Results are summarized
in Table 2. The combination of variables
significantly differentiated the communicative
from the quasi-communicative families
(p<.038). The multivariate analysis of vari- 
ance supports the first three hypotheses in the 
predicted direction. 
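
The scoring and grouping rule described above (five items scaled 1-4, summed, then split at the median) can be sketched as follows. The family labels and item responses are hypothetical and are not study data; the function names are likewise illustrative.

```python
from statistics import median


def communication_score(items):
    """Sum the five questionnaire items, each scaled 1-4."""
    if len(items) != 5 or any(not 1 <= x <= 4 for x in items):
        raise ValueError("expected five item scores between 1 and 4")
    return sum(items)


def median_split(scores):
    """Label each family by its position relative to the sample median.

    Families above the median are 'communicative'; families at or
    below it are 'quasi-communicative', following the rule in the text.
    """
    cut = median(scores.values())
    return {
        family: "communicative" if s > cut else "quasi-communicative"
        for family, s in scores.items()
    }


if __name__ == "__main__":
    # HYPOTHETICAL item responses for four families (not study data).
    responses = {
        "A": [4, 3, 4, 3, 4],
        "B": [2, 2, 3, 1, 2],
        "C": [3, 3, 3, 3, 3],
        "D": [1, 2, 2, 1, 1],
    }
    totals = {f: communication_score(r) for f, r in responses.items()}
    print(median_split(totals))
```

Because ties at the median fall into the lower group, the two groups need not be the same size, which is consistent with the unequal split of 7 and 9 families reported here.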

Although no conclusions can be drawn regarding
cause-effect relationships, results indicate
that the level of family communication about 
the illness, as expressed in the mother’s judg- 
ment of communication, is related to coping 
strategies in the child. Families in which levels 
of communication about the illness are high are 
those families in which the children (a) exhibit a 
nondefensive personal posture, (b) express a 
long-range close relationship with the parents, 
and (c) express a basic satisfaction with self. 
Freedom to express negative feelings openly 
within the family was not significantly corre- 
lated with level of communication. Further 
studies of a longitudinal and interventive na- 
ture with the use of the above instruments are 
needed to test whether in fact there is a cause- 
effect relationship, that is, whether openness in 
levels of family communication regarding the 
illness leads to closer family ties and to healthier 
coping responses in the child. 

Further studies from multiple sources are also 
needed to clarify whether in fact the mother’s
judgment of openness is a valid judgment of
the true situation; whether there is a difference
in communication patterns as the children go
through subsequent relapses; whether parents 
become more or less communicative regarding 
implications of the illness as the child nears 
death; and, above all, whether families differ 


Table 1 


Multiple Regression: Family
Communication Patterns


Predictor        R      [remaining column headings illegible]
Defense         .569    [illegible]
PSD—F           .709    342  213  4.22
Negative self   .716    9593, 12, 4.12


Note. PSD—F = personal space difference—father.


Table 2
Multivariate Analysis of Variance for
Communicative Versus Quasi-Communicative Families

                                    Quasi-
                  Communicative   communicative
Predictor            (n = 7)         (n = 9)       F^a      p

                                                   3.85    .038
Defense               10.86           14.33        2.07    .174
PSD—M^b                7.94            5.53        2.39    .146
PSD—F^b               11.20            6.56        9.32    .009
Negative self          2.29            2.56         .12    .732
Negative total        24.29           26.89         .38    .548

Note. PSD—M = personal space difference—mother; PSD—F = personal space difference—father.
a df = 1, 13.
b Higher score represents desire to have parents closer.


in the level of openness that their support mech- 
anisms can tolerate. 

If one of the goals of work with children with 
cancer is to give them access to the intra- 
familial sources of support that they need most 
to help them in their struggle, then the present 
effort pointing to the relationship between levels 
of communication and the life-threatened child’s 
adaptive strategies is a step toward that goal.


Repression—Sensitization and Health Behavior


William F. Gayton 
University of Maine at Portland-Gorham 


Joseph Tavormina 
University of Virginia 


Bell and Byrne (1978) have suggested that
repression-sensitization (R-S) may be related
to health behavior. The theoretical basis for
such a relationship stems from developments in
psychosomatic medicine that suggest a strong
relationship between illness behavior and coping
style. To the extent that the R-S dimension
reflects characteristic ways of coping with stress,
one might expect repressors and sensitizers to
differ in terms of health behavior. Byrne, Steinberg,
and Schwartz (1968) reported that sensitizers
[...] type behaviors in the presence of anxiety.

An alternative hypothesis would be to assume
physiological differences that lead sensitizers to
be more susceptible to illness. If this were the
case, the higher frequency of health visits on
the part of sensitizers would be related to an
increased incidence of actual disease. One way
of clarifying these two hypotheses would be to
divide the total number of visits to a health
facility into justified visits (i.e., when medical
attention was required) and unjustified visits
(i.e., when no medical attention was deemed
necessary). If physiological differences existed,
we would expect sensitizers to have significantly
more medically justified visits than repressors. If
the differences were strictly at the level of perceived
vulnerability, we would expect sensitizers
to have significantly more unjustified visits with
no differences in terms of medically justified
visits.


Method 


The records of all inmates who had been
given the Minnesota Multiphasic Personality
Inventory (MMPI) during the years 1973-1974
were taken from the files of the psychodiagnostic
center of a prison for adult males in western
Tennessee. The selection criteria yielded 392
protocols that were scored with templates made
from the Byrne Revised Repression-Sensitization
Scale (Byrne, Barry, & Nelson, 1963).

Three groups of 30 subjects each—repressors,
intermediates, and sensitizers—were selected
from the distribution on the basis of their R-S
scores. The groups did not differ in terms of
age (F < 1) or education (F < 1). A measure
of health behavior was obtained by recording
from the prison medical records the number
of sick-call visits made by each inmate in the
study during a 1-year period. The number of
opportunities to report voluntarily for sick call
was equal for all inmates in the study. A record
was also made concerning whether the prison
medical officer found the inmate in need of medication
and/or treatment during the dispensary
visit. When this was the case, the visit was
counted as medically justified. When there was
no indication of medication and/or treatment,
the visit was counted as medically unjustified.


Table 1
Means and Standard Deviations of R-S^a Groups on Health Behavior Measures

                                       Health behavior measure

                                 Total dispensary
                  R-S scores          visits        Justified visits   Unjustified visits
Group              M      SD        M       SD         M       SD         M       SD
Repressors       19.13   4.49      9.10    7.83       5.03    3.70       4.07    5.75
Intermediates    50.23   1.92     25.93   17.10      10.43    6.93      15.50   12.54
Sensitizers      78.50   8.50     48.13   26.56      17.90   11.68      30.23   21.53

a R-S = repression-sensitization.

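The visit classification described in the Method section amounts to a simple tally per inmate. The sketch below is illustrative only: the record format (one boolean per visit) and the names `VisitTally` and `tally_visits` are assumptions, not part of the original procedure.

```python
from dataclasses import dataclass


@dataclass
class VisitTally:
    total: int
    justified: int
    unjustified: int


def tally_visits(treated_flags):
    """Count sick-call visits for one inmate.

    treated_flags holds one boolean per visit: True when the medical
    officer indicated medication and/or treatment (a justified visit),
    False otherwise (an unjustified visit).
    """
    total = len(treated_flags)
    justified = sum(1 for flag in treated_flags if flag)
    return VisitTally(total, justified, total - justified)


if __name__ == "__main__":
    # HYPOTHETICAL record: five visits, two with treatment indicated.
    print(tally_visits([True, False, False, True, False]))
```

Since every visit falls into exactly one category, justified and unjustified counts must add up to the total. The group means in Table 1 satisfy this (for repressors, 5.03 + 4.07 = 9.10).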

Results and Discussion 


Table 1 presents the means and standard deviations
for the three groups on the R-S scale
and the three dependent measures: (a) total
number of sick-call visits; (b) number of justified
sick-call visits; and (c) number of unjustified
sick-call visits. A one-way analysis of
variance was used to examine the relationship
between each dependent variable and the R-S
dimension.

Results of the three analyses indicated significant
F ratios on each of the dependent measures:
(a) number of sick-call visits, F(2, 87) =
[...], p < .001; (b) number of medically justified
visits, F(2, 87) = 18.96, p < .001; and (c)
number of medically unjustified visits, F(2,
87) = 23.70, p < .001.

Multiple post hoc comparisons of the group
means were made using the Scheffé (1953) procedure.
Examination of the total number of sick-call
visits revealed that sensitizers made significantly
(p < .01) more trips to the prison dispensary
than either intermediates or repressors.
In addition, intermediates made significantly (p <
[...]) more trips than repressors. Examination of
the frequency of justified visits revealed that
sensitizers made significantly (p < .01) more of
these visits than did either intermediates or repressors.
The latter two groups did not differ
significantly from one another on this measure.
Examination of the frequency of unjustified
visits indicated that sensitizers made significantly
(p < .01) more of these visits than did
either intermediates or repressors. The latter
two groups did not differ significantly from one
another on this dimension.

The present study supports the Byrne et al. 
(1968) finding that male sensitizers seek medi- 
cal attention significantly more frequently than 
do male repressors. Our previous analyses did 
not allow us to determine whether differences 
between repressors and sensitizers were due to 
differences in susceptibility to illness, perceived 
vulnerability to illness, or both. In order to help 
choose between the two alternatives, a post hoc 
analysis of variance was performed on the pro- 
portion of justified/unjustified visits. The mean 
proportion for repressors was 1.79; intermediates,
1.05; and sensitizers, 1.61. Results of this
analysis, F(2, 87) = .94, ns, indicate that sensitizers
and repressors do not differ in the proportion
of justified/unjustified visits. This indicates
that the differences found between repressors
and sensitizers in total visits were most
likely due to differences in actual illness. 
Other possible explanations remain. Repressors 
may ignore or deny physical symptoms and 
avoid visiting a health facility even when they 
are actually ill. If this possibility were correct,
the results of this study would be interpreted
quite differently. Further research on the relationship
between repression-sensitization and
health behavior would seem warranted.


Relationship of Two Process Measurement Systems for Group Therapy


John E. Roe and Keith J. Edwards 
Rosemead Graduate School of Professional Psychology 
Biola College, LaMirada, California 


Ten-minute audiosegments from 42 group therapy sessions were rated on the Hill
interaction matrix variables and the Truax-Carkhuff variables of empathy, immediacy,
self-exploration, and confrontation. Canonical correlation analysis suggested
that the two systems converged along a dimension labeled initiating skills. A factor
analysis suggested three underlying factors of group process labeled Initiating Skills,
Responding Skills, and Discussion Skills. Factors 1 and 3 suggest a multidimensional
structure of immediacy clarified by the Hill variables. Factor 2, with high loadings
on empathy and self-exploration, identified a qualitative dimension of group process
not tapped by the Hill matrix.


Research on the psychotherapeutic process has 
produced a proliferation of process measure- 
ment systems from many diverse perspectives. 
Kiesler (1973) reported more than 27 direct 
process measures based on observations of ther- 
apy sessions and an additional 25 indirect mea- 
sures. The use of process measures has been 
productive in identifying factors in therapy that 
relate to improvement (Carkhuff & Berenson,
1967). However, theoretical integration of the 
results from process research has been difficult, 
since the studies range in disorganized fashion 
over the entire field of psychotherapy (Meltzoff 
& Kornreich, 1970). Research is needed that 
investigates the potential convergent validity of 
process measurement systems that have empirically
demonstrated utility. Understanding the
communality among systems of measurement is
essential to integrating the theories on which
they are based. The purpose of the present study
was to investigate the convergent validity of
two systems for assessing group process, the
Hill interaction matrix (Hill, 1971) and the
Truax-Carkhuff (1966) dimensions of facilitative
functioning.

The group process measurement system developed
by Truax and Carkhuff contains the variables
of empathy, immediacy, self-exploration,
and confrontation. These dimensions are well-known.
Detailed definitions can be found in
Truax and Carkhuff (1966). The second system
of process measurement investigated was Hill’s
interaction matrix (Hill, 1967). As used in the
present study, the Hill matrix involved a 2 × 2
fourfold classification modeled after the work of
Lewis and Mider (1973). The first dimension 
defines the content style of the group interac- 
tion, with the two categories being topic- 
centered and member-centered styles. Interac- 
tions are classified on this dimension by asking 
the question, “What are the group members talking
about—topics (there and then) or one another
(here and now)?” The second dimension
is a work-style dimension. The term work refers 
to the participation of members in the roles of 
patient and therapist with the goal of self-under- 
standing. The two categories on the work-style 
dimension are simply prework and work. 

Group process interaction segments were taken 
from 11 therapy groups of ministers and their 
spouses involved in a month-long training semi- 
nar in pastoral counseling. Groups were led by 
experienced counselors with graduate students 
as coleaders. Each group consisted of approximately
10-12 persons. Empathy, immediacy,
self-exploration, and confrontation were rated using
.5-point discriminations on 5-point scales. A
modification of the Hill interaction matrix
(Form G) was used to assess the group interactions.
The Hill matrix variables were evaluated
by means of a 32-item questionnaire that
required the raters to estimate the percentage 
of time spent on interactions represented by the 
four variables (a) topic-centered prework, (b) 
topic-centered work, (c) member-centered pre- 
work, and (d) member-centered work. 

The unit of measurement was a 10-minute
audiosegment taken from the midpoint of each
90-minute session. Each group was taped during
the second, fourth, sixth, and eighth sessions.
A total of 42 10-minute segments were placed
in random order on a master tape. The tapes
were evaluated by four raters who were experts
in the ratings of the systems used. Two raters
evaluated the tapes using the Truax-Carkhuff
variables. Correlations between the ratings given
by the two raters were .79 (empathy), .93 (immediacy),
.72 (self-exploration), and .87 (confrontation).
Two other raters rated the 42
segments using the Hill matrix. The raters’
quadrant scores correlated .91 (topic-centered
prework), .97 (topic-centered work), .89 (member-centered
prework), and .94 (member-centered
work). The use of independent sets of expert
raters to evaluate the group interactions is considered
a major strength of the present study.
Communality between the two systems of measurement
was examined using canonical correlation
analysis. The multivariate structure of the
eight process measures was investigated using
factor analysis.


Table 1
Canonical Variable Loadings

[Table body not recoverable from the source. Surviving fragments: column groups HIM-G and Truax-Carkhuff; values 994, 056, 430.]

Note. HIM-G = Hill interaction matrix, Form G; E = empathy; I = immediacy; SE = self-exploration; and C = confrontation.


Only the first canonical correlation (r_c = .58)
was statistically significant, χ²(16) = 28.5,
p = .03. There was .339 shared variance between
the two sets of variables. The first canonical variate,
given in Table 1, shows the highest loadings
for topic-centered work, member-centered work,
immediacy, [...]

The results of the fac[...]
[...] on member-centered work (.86),
[...], and confrontation (.72). The
[...] consisted of empathy (.91) and
self-exploration [...]. The third factor was
[...] group. [...] factor. The second
factor encompasses the facilitative responding
skills of Truax and Carkhuff. The third factor 
appears to be related to discussion skills. The 
multivariate nature of the relationships among 
these process measures supports Carkhuff’s view 
that effective interpersonal process involves both 
responding and initiating. It was expected that 
empathy and self-exploration would be more 
highly correlated with member-centered work 
than they were, based on Hill’s position that 
Quadrant 4 has the highest therapeutic value. 
However, the results show an independence of 
these variables. The important concepts of em- 
pathy and self-exploration are not accounted for 
by Hill’s variables. It appears that while Hill’s 
Quadrant 4 identifies the conceptually important 
here-and-now character of group process,
assessment of whether the interaction is facilitating
or detracting is not possible.

When considered together, the two process 
systems complement and enhance each other.
Hill’s system shows that immediacy as conceptualized
by Truax and Carkhuff is multidimensional,
involving both topic-centered and member-centered
interactions. The empathy and self-exploration
variables of Truax and Carkhuff
suggest a qualitative dimension not explicit in 
the Hill schema. Further research on the development
of a system that incorporates both aspects
holds promise for enhancing our understanding
of group therapy.


Development and Validation of a
Heterosocial Skills Inventory:
The Survey of Heterosexual Interactions for Females


Carolyn L. Williams and Anthony R. Ciminero
University of Georgia


A self-report heterosocial skills inventory for females (SHI-F) was developed and
was found to have satisfactory test-retest reliability, excellent internal consistency,
and significant correlations with self-reported assertiveness and anxiety measures. An
initial validity study compared high and low SHI-F scorers in analogue situations in
which self-report, behavioral, and heart rate measures were taken. Although heart
rate did not differ between groups, some behavioral and all self-report differences
were significant. Additional data suggest that the SHI-F is more a measure of social
skills and general negative self-evaluations and less a measure of interpersonal anxiety.


Requests for reprints and for an extended report
of this study should be sent to Carolyn L. Williams,
Department of Psychology, University of Georgia, [...]


Recently there has been a growing interest in
including persons who have difficulties in heterosocial
situations as subjects in analogue treatment
studies. Since the difficulties are thought
to be characteristic of “minimal daters,” college
students who report infrequent dating have often
served as subjects. Studies on minimal daters
have shown their problems to be frequently
occurring (Martinson & Zerface, 1970), clinically
relevant (Curran, 1975; Twentyman &
McFall, 1975), and not ameliorated by demands
or suggestions for improvement (Borkovec,
Stone, O'Brien, & Kaloupek, 1974). However,
most of the studies in this area have been primarily
concerned with treatment techniques and
have neglected subject selection and assessment
procedures. In addition, most of these studies
have focused exclusively on male undergraduates,
thus neglecting the female population.

The present study developed and initially validated
an instrument for females based on Twentyman
and McFall’s (1975) Survey of Heterosexual
Interactions (SHI). The Survey of
Heterosexual Interactions for Females (SHI-F)
was made as similar as possible to the SHI.
Each survey contains four questions on dating
frequency and 20 heterosocial situations in
which the subjects are requested to rate on a
[...] scale their ability to initiate or carry
on a conversation in that situation. An item
requesting the subject to rate her physical
attractiveness was added to the SHI-F. A low
score on the SHI-F indicates a less heterosocially
skilled individual.

The first part of this study collected norma- 
tive and reliability data and described the char- 
acteristics of subjects selected by the SHI-F. 
The SHI-F was administered to 256 under- 
graduate females in introductory psychology 
classes at the University of Georgia. The mean 
score on the SHI-F was 68.28, with scores rang- 
ing from 32 to 98. The standard deviation was 
11.78. The internal consistency of the SHI-F 
was substantially high as measured by the coefficient
alpha (α = .89), and its test-retest reliability
was acceptable, r(38) = .62, p < .001.
High and low scorers (at least 1 standard deviation
above or below the mean) on the SHI-F
did not differ on questions requesting them to
estimate their average number of dates. However,
high scorers reported dating a significantly
greater number of different males per year,
t(73) = 2.75, p < .01, rated themselves as participating
in a greater amount of heterosocial
behavior, t(72) = 3.35, p < .01, and rated themselves
as significantly more attractive, t(71) =
3.62, p < .001, than the low scorers. In addition,
the SHI-F was significantly correlated with the
Rathus Assertiveness Scale, r(119) = .558,
p < .001, and the Trait portion of the State-Trait
Anxiety Inventory, r(117) = −.404, p < .001.

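Two computations from the preceding paragraph, coefficient alpha and the 1-standard-deviation selection rule, can be sketched as follows. The report gives no computational detail, so the sketch assumes the usual textbook formula for alpha; the item ratings are hypothetical and the function names are illustrative.

```python
from statistics import pvariance


def coefficient_alpha(item_scores):
    """Cronbach's coefficient alpha.

    item_scores is a list of respondents, each a list of item ratings.
    """
    k = len(item_scores[0])
    item_vars = [pvariance([resp[i] for resp in item_scores])
                 for i in range(k)]
    total_var = pvariance([sum(resp) for resp in item_scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


def selection_cutoffs(mean, sd):
    """Scores 1 SD below and 1 SD above the sample mean."""
    return mean - sd, mean + sd


if __name__ == "__main__":
    # HYPOTHETICAL ratings: four respondents, three items each.
    ratings = [[4, 5, 4], [2, 3, 2], [5, 5, 4], [1, 2, 2]]
    print("alpha:", round(coefficient_alpha(ratings), 2))

    # Mean and SD as reported for the 256 SHI-F respondents.
    low, high = selection_cutoffs(68.28, 11.78)
    print("low scorers at or below", round(low, 2))
    print("high scorers at or above", round(high, 2))
```

With the reported mean of 68.28 and standard deviation of 11.78, the rule places low scorers at or below 56.50 and high scorers at or above 80.06.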
From the initial 256 subjects, a group of 15 
high scoring and a group of 15 low scoring 
subjects were compared in six heterosocial situ- 
ations in which physiological, behavioral, and 
self-report measures were collected. 

The heterosocial situations were adapted from 
Rehm and Marston (1968) and required each
subject to interact over an intercom with a male
confederate who was in an adjacent room. An
example of the situations is as follows: “You 
run into a guy you dated a few times in high 
school in a drugstore near campus. He says ‘Hi, 
I didn’t know you were going to school here.’ ” 

Subjects were asked to respond to the male 
as if they were actually in the situations de- 
scribed. During each situation the subject’s 
heart rate was monitored via biotelemetry equip- 
ment. No significant differences between the two 
groups were found on heart rate. This contrasts 
with Twentyman and McFall’s (1975) study, in 
which some analyses of the heart rate data re- 
vealed significant differences between males scor- 
ing high and low on the SHI. For females, the 
physiological response channel did not differ- 
entiate between the two groups. 

Several behavioral measures were also taken 
during the six social behavior situations. Two 
observers rated the subject’s overall anxiety 
and social skill plus three specific anxiety indicators
and three specific social skill indicators
during each scene. They also rated the subject’s 
physical attractiveness after the last situation. 
Two time measures, duration and latency of 
response, were also recorded for each scene. 
(Interrater reliabilities for the behavioral and 
physiological measures are presented in the ex- 
tended report.) 

Significant differences were found on observer-rated
social skill, with the high group being
rated as more socially skilled, t(28) = 2.17, p <
.05. Two of the three specific social skill indicators,
interest and initiation, also differentiated
between the two groups, with subjects in the
high group rated by observers as showing more
interest, t(28) = 2.29, p < .05, and more initiation,
t(28) = 1.95, p < .05. The groups were
not rated as different on the overall anxiety 
measure, on any of the specific anxiety indi- 
cators, on physical attractiveness, or on either 
of the time measures. 

The subjects also rated their own overall
anxiety and social skill in each scene. In contrast
to the behavioral measures, the two groups
differed on both social skill, t(28) = 4.36, p <
.001, and anxiety, t(28) = 4.89, p < .001, in
the situations. The low group subjects also rated
themselves as less attractive than the high group
subjects, t(28) = 3.03, p < .01. Thus, it appears
that low scoring subjects on the SHI-F per- 
ceive themselves and their behavior as less ade- 
quate than do independent observers. 

These results point to the necessity of a 
multiple channel approach when studying a new 
self-report device and assessing the differences 
between groups. If only the self-report channel 
had been used in the present study, it would 
have appeared that the SHI-F measured both 
heterosocial skill and anxiety. However, the addi- 
tional information from the behavioral channel 
suggested that the SHI-F was a heterosocial 
skills inventory. 

The results of this study demonstrated the 
SHI-F to be a reliable and potentially useful 
instrument for selection of female subjects for 
analogue research in the heterosocial skills area. 
Use of the SHI-F in this manner would ensure 
more precise specification of subjects and more 
standardization across studies with females. 
Further research is needed to examine addi- 
tional psychometric properties of the SHI-F and 
its use as an outcome measure in treatment
studies.


Interpersonal Liking and Self-Disclosure


Richard Gelman and Hugh McGinley 


University of Wyoming 


Sixty-six female subjects viewed a videorecording of a female stranger who discussed
her opinions about 10 social issues. After this, the subjects rated the stranger on the
Interpersonal Judgment Scale and indicated on the Jourard Self-Disclosure Questionnaire
what topics they would be willing to discuss with the stranger. What the subjects
would discuss was found to be positively related to their characteristic level of
disclosure and their attraction toward the stranger. The implications of the results 
for therapist-client interactions are discussed. 


Self-disclosure can be viewed as a process in
which an individual purposely communicates
information about himself or herself. It is a
topic that, in the last several years, has become
the focus of considerable research activity. Partially,
this activity can be related to the assumed
importance of self-disclosure to mental
health. Some therapists and counselors believe
that it is vital to have knowledge of significant
aspects of their clients’ lives if they are to work
well with them. Accordingly, a clear understanding
of the effect of their own self-disclosure
on the clients’ self-disclosure would be useful
to them.
Perhaps the strongest finding in the clinical
[...] study of self-disclosure is that individuals
are willing to disclose more about themselves
to others whom they like than to others
whom they dislike or regard with indifference
([...], 1963; Halvorsen & Shore, 1969;
Jourard & Lasakow, 1958; Worthy, Gary, &
Kahn, 1969). Based on the relationship between
self-disclosure and liking, it seems reasonable
that variables that affect interpersonal liking
would also influence self-disclosure. A variable
that has been unequivocally shown to influence
interpersonal liking is attitude similarity.

The assumption is that attitude similarity is
a positive reinforcement (Byrne, 1971) and
that people are interpersonally attracted to
individuals who are associated with the positive
emotional responses that are elicited by reward, the
similar attitude (Lott & Lott, 1968). Knecht,
Lippman, and Swap (1973) have used attitude
similarity as a manipulation of interpersonal
liking in a self-disclosure study. In this study,
subjects perused an attitude questionnaire that 
had been purportedly filled out by a stranger 
who was either similar or dissimilar to the sub- 
jects in attitude. After their perusal of the 
questionnaire, the subjects rated the stranger on 
the Interpersonal Judgment Scale (IJS) and 
then indicated what they would be willing to 
discuss with the stranger. Knecht et al. found 
that subjects were willing to discuss more items 
with a stranger who held similar attitudes as 
compared to dissimilar attitudes. However, based 
on trends in their data, Knecht et al. suggested 
that it was attraction (liking) toward the 
stranger that determined the subjects’ willing- 
ness to disclose, and not attitude similarity 
per se. 

The purpose of the present study was to 
further investigate the relationship between 
liking, attitude similarity, and disclosure to a
stranger.


Method 


The subjects were 66 females from introduc- 
tory psychology classes who had, 6-8 weeks 
earlier, completed a 36-item attitude survey. 
The materials of the study were four videotapes 
of a female stranger expressing her views on 10 
of the items from the 36-item questionnaire, a
Sony Videocorder, a 21-inch television monitor,
the 40-item Jourard Self Disclosure Question-
naire (JSDQ; Jourard & Resnick, 1970), and the
IJS (Byrne, 1971).
The subjects met in coacting groups of 2-8
[...] completed the JSDQ by [...]
which items they would be willing to [...]
they rated the stranger on the IJS. Finally, the
subjects indicated on the JSDQ what they would 
be willing to discuss about themselves with the 
stranger. The stranger expressed different view- 
points on each of the four videotapes. This was 
done to ensure a wide range of attitude simi- 
larity—dissimilarity between the stranger and the 
subjects. This procedure is a modification of 
Byrne’s (1971) standard stranger technique. 
The score compiled for the JSDQ was the sum 
of the items the subjects indicated that they 
would be willing to discuss. These scores could 
range from 0 to 40. The IJS score was the sum
of two items, liking and desirability as a work 
partner. This score could range from 2 to 14, 
with the lower score being positive. The attitude 
similarity—dissimilarity score was based on the 
differences between the stranger’s expressed atti- 
tudes about the 10 topics that she discussed and 
the subjects’ responses to these same 10 items 
on the attitude questionnaire. Each item offered 
eight possible answers ranging from “absolutely 
in favor” to “absolutely opposed.” The sub- 
ject’s response to a given item was scored as 
zero if it was within one scale unit of the 
stranger’s response, and it received a difference 
score of one for each unit of discrepancy there- 
after, regardless of direction. The theoretical 
range of the attitude similarity score was 0-60, 
with a low score indicating attitude similarity. 
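
The scoring rule just described can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the authors' scoring procedure: the function names are hypothetical, and coding the eight answer options as the integers 1 through 8 is an assumption.

```python
def item_discrepancy(subject_response: int, stranger_response: int) -> int:
    """Difference score for one attitude item.

    Assumes the eight answer options are coded 1..8 (an assumption;
    the report only says there were eight options).
    """
    gap = abs(subject_response - stranger_response)
    # Responses within one scale unit of the stranger score zero;
    # each further unit of discrepancy adds one point.
    return max(0, gap - 1)


def attitude_similarity_score(subject, stranger):
    """Sum of item difference scores; low totals mean similar attitudes."""
    if len(subject) != len(stranger):
        raise ValueError("both response lists must cover the same items")
    return sum(item_discrepancy(s, t) for s, t in zip(subject, stranger))
```

With 10 items and a largest possible gap of 7 scale units, the maximum total is 10 x 6 = 60, which agrees with the theoretical range of 0-60 given in the text.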


Results and Discussion 


The purpose of the first analysis was to dis- 
cern which of the three variables, Disclosure 1 
(the subject’s first completion of the JSDQ), 
attitude similarity, or interpersonal liking (the 
IJS score), would best predict the subject’s 
level of disclosure to the stranger (Disclosure 
2). The regression equation for these data was
Disclosure 2 = 9.75 + .77 Disclosure 1 - .94
Liking - .08 Attitude Similarity. The multiple
correlation coefficient was .77, F(3, 62) = 30.06,
p < .001. The next analysis described the in-
dividual relationships between Disclosure 2 and
the three predictor variables.
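
As a rough check on the figures above, the following Python sketch applies the reported regression equation and recomputes the overall F ratio from the multiple correlation. The function names are hypothetical, and the F formula is the standard one for a regression with k predictors and N subjects; it is offered as an illustration, not as the authors' computation.

```python
def predict_disclosure_2(disclosure_1, liking, attitude_similarity):
    """Apply the reported regression equation (coefficients from the text)."""
    return (9.75 + 0.77 * disclosure_1
            - 0.94 * liking
            - 0.08 * attitude_similarity)


def f_from_multiple_r(r, n_predictors, n_subjects):
    """Overall F for a multiple regression, computed from R.

    F = (R^2 / k) / ((1 - R^2) / (N - k - 1))
    """
    r_squared = r ** 2
    df_error = n_subjects - n_predictors - 1
    return (r_squared / n_predictors) / ((1 - r_squared) / df_error)


# With R = .77, 3 predictors, and 66 subjects the error df is 62.
f_value = f_from_multiple_r(0.77, 3, 66)
```

Because R is reported to only two decimals, the recomputed F comes out close to, but not exactly, the reported 30.06. Note also that lower IJS and attitude scores are the favorable ones, which is why both carry negative weights.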

The standard regression method of decom- 
position was used (Nie, Hull, Jenkins, Stein- 
brenner, & Bent, 1975). In this method, each 
predictor variable is treated as if it had been 
added to the regression equation in a separate 
step after the other two predictors had been in-
cluded. The amount of Disclosure 2 vari-
ance explained by Disclosure 1 was significant,
F(1, 62) = 82.86, p < .001, as was that ex-
plained by interpersonal liking, F(1, 62) = 6.73,
p < .025, whereas attitude similarity was not a
significant predictor (F < 1).
As expected, the best predictor of the sub-
ject’s disclosure to the stranger was her score
on the first administration of the JSDQ. Beyond 
this, disclosure to the stranger was best pre- 
dicted by her rating of the stranger on the IJS. 
This suggests that although self-disclosure may 
be an enduring trait, it is influenced by inter- 
personal liking, which is determined, in this 
study at least, by a variable or variables other 
than attitude similarity. This result raises a ques- 
tion. According to Byrne (1971), interpersonal 
attraction is directly related to the degree of 
attitude similarity between two people. Indeed, 
in the present study, the correlation between
these two variables was .73 (df = 64, p < .001).


In the decomposition of the multiple regres- 
sion, the liking score was significantly related 
to the subject’s disclosure to the stranger, but 
attitude similarity was not. This suggests that 
although attitude similarity was a contributor 
to the liking score, its contribution was not 
significantly related to the subject’s disclosure 
to the stranger. There was at least one other 
systematic contributor to the IJS score.

It is possible that there was an implicit 
attitude similarity that influenced the subject's 
attraction toward the stranger. The stranger's 
behavior of disclosing her attitudes about such 
topics as premarital sex and God implied both 
that she was willing to disclose and that she 
had a positive attitude toward disclosure. In
the same vein, it can be said that a subject
with a higher disclosure score would also have
a positive attitude toward disclosure. Thus, a
high disclosing subject shared a common and 
positive attitude with the stranger, whereas a 
low disclosing subject did not. Following from 
this, we would expect a high discloser, regard- 
less of attitude similarity on other topics, to be 
more attracted to the stranger than a low dis-
closer would be.

To test this possibility, the 66 subjects were
trichotomized on the basis of their initial JSDQ
scores, and the IJS scores made by the two
extreme groups were compared. The high disclosers
were significantly more positive toward the
stranger than were the low disclosers, t =
3.15, p < .01. Thus, similarity in self-disclosure
was an important factor in determining the
subject’s attraction toward the stranger, and it was,
apparently, this source of interpersonal liking
that influenced the subject’s disclosure to the
stranger.

The results indicate that people are more
attracted to others whose self-disclosure is
similar to their own level of disclosure and that this
attraction has an effect on the other’s disclosure.


This finding implies that the congruity of dis-
closure level between therapist and client may
not be as crucial for the client whose character- 
istic level of disclosure is high, since the client 
is likely to continue her or his personal dis- 
closure in the presence of the therapist. How- 
ever, if a client who has a characteristically low 
level of disclosure perceives her or his therapist 
to be overly disclosing, then there is the possi- 
bility that the client might develop negative 
feelings toward the therapist that could preclude 
or interfere with successful counseling. 


Reported Stressful Events During Developmental Periods
and Their Relation to Locus of Control Orientation
in College Students 


Stephen Nowicki, Jr. 
Emory University 


To ascertain whether high levels of stress at different periods of development may 
be related to an external locus of control, 30 externals (15 males and 15 females) and 
30 internals (15 males and 15 females) completed the Life Events Scale. Analysis of 
the data indicated that for females, stress in the preschool and pubescent, and for 
males in the elementary and pubescent, years was related to externality. It is sug- 
gested that there may be different critical developmental periods for males and females 
during which high levels of stress may be related to an external locus of control. 


Most authors studying antecedents of locus 
of control assumed that children develop a 
locus of control orientation by reacting to con- 
tinuous parental or environmental stimuli (see 
MacDonald, 1971). However, Bryant and Troc- 
kel (1976) have emphasized the possibility that 
noncontinuous events may determine whether a 
child becomes internal or external in orientation. 
Using Adler’s assumption that “individuals par- 
ticularly try to make sense of their stressful or 
perceived unusual life experiences” (cited in 
Bryant & Trockel, 1976, p. 266), Bryant and 
Trockel hypothesized that certain discrete stress- 
ful early life experiences may predispose the 
individual to an external orientation. To test 
such a proposition, these investigators admin- 
istered a life stress questionnaire (Coddington, 
1972) to females that covered four time periods: 
preschool, elementary, junior high school, and 
high school. As predicted, they found that fe- 
males who expressed an external locus of con- 
trol on the Adult Nowicki-Strickland Locus of 
Control Scale reported more stressful experiences 
during their preschool years than did the in-
ternal females. However, they neither replicated
their results for females nor generalized them 
to males. If their findings could be confirmed 
for females and males, they would suggest 
strongly that the significant antecedents of locus 
of control orientation may occur before a child 
begins school and as a result of specific events 
as well as continuous interactions. 


Requests for reprints and for an extended report 
of this study should be sent to Stephen Nowicki, 
Jr., Department of Psychology, Emory University, 
Atlanta, Georgia 30322. 


The subjects for the present study were the 
top and bottom 30% of a group of subjects (N 
= 103) whose median Nowicki-Strickland Locus 
of Control score of 9 was comparable to that 
of Bryant and Trockel (1976). These subjects
(15 female internals, 15 male internals, 15 fe- 
male externals, and 15 male externals) com- 
pleted Coddington’s (1972) Life Events Scale. 
In Coddington’s scale, the birth of a sibling was 
arbitrarily chosen as a midpoint and given a
score of 50. Other events receive more or less 
than this arbitrary midpoint value. 

Analysis of the data via a 2 (male vs. female) 
X 2 (internal vs. external) X 4 (preschool, ele- 
mentary, junior high school, and senior high 
school) analysis of variance with repeated mea- 
sures on the last factor revealed a significant 
three-way interaction, F(3, 168) = 3.04, p <.05. 
Subsequent post hoc testing via Newman-Keuls 
procedures suggested that as predicted, external 
females did report more stress during their 
preschool years than did internal females. How-
ever, further analysis also revealed that external
females reported more stress during their junior
high school years than did internal females, and
external males reported more stress than internal 
males during their elementary and junior high 
school years. 
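
The degrees of freedom reported for the three-way interaction follow from the design, as the short Python sketch below illustrates. The helper name is hypothetical, and the formulas are the standard ones for a mixed design with repeated measures on one factor and equal treatment of all between-subjects cells; they are assumed here rather than quoted from the report.

```python
from math import prod


def highest_interaction_df(between_levels, within_levels, n_subjects):
    """Return (df_effect, df_error) for the highest-order interaction
    in a design with several between-subjects factors and one
    repeated-measures factor.
    """
    n_groups = prod(between_levels)
    # Effect df: product of (levels - 1) over every factor involved.
    df_effect = prod(k - 1 for k in between_levels) * (within_levels - 1)
    # Error df: subjects-within-groups df times (within levels - 1).
    df_error = (n_subjects - n_groups) * (within_levels - 1)
    return df_effect, df_error


# 2 (sex) x 2 (locus of control) x 4 (time periods), 60 subjects
print(highest_interaction_df([2, 2], 4, 60))  # (3, 168)
```

The result matches the F(3, 168) reported for the Sex X Locus of Control X Time Period interaction.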

It appears from these results that females
and males do not share similar histories of
stressful experiences. For females, it seems that
the preschool, and for males the elementary,
years may be differentially significant for the
development of an external orientation.
However, it appears that events occurring at
the tumultuous time of pubescence may be
important for how much control males and females
perceive that they have over their life. More
work with children at each of the three levels
could substantiate whether stress or the locus of
control orientation comes first. Males and fe- 
males seem to have different vulnerabilities to 
stress and to the ability of that stress to affect 
their locus of control orientation. Further work 
needs to delineate the source of those sex differ- 
ences as a function of time periods. 


An Index of Premorbid Intelligence


Robert S. Wilson, Gerald Rosenbaum, Gregory Brown, 
Daniel Rourke, and Douglas Whitman 
Wayne State University 


James Grisell 
Lafayette Clinic, Detroit, Michigan 


The aim of this study was to develop a way of estimating Wechsler IQs from demo- 
graphic measures. The three summary IQs of the 1955 WAIS standardization sample 
were regressed on age, sex, race, education, and occupation. The resulting R²s were
.53, .42, and .54 for the Verbal IQ, Performance IQ, and Full Scale IQ, respectively.
The three regression equations provide actuarial indices of IQ that can be used as 
estimates of premorbid ability in neuropsychological assessment and research. 


In neuropsychological assessment and research, 
there are a number of situations in which knowl- 
edge of subjects’ premorbid IQ is desirable. 
Test data from the period preceding disease on- 
set are, however, rarely available. Thus, clini- 
cians are forced to estimate. 

These estimates have relied on one of two 
data bases: (a) present ability measures thought 
to be relatively insensitive to neurologic dis-
ease (e.g., vocabulary) or (b) demographic in-
formation known to be related to IQ (e.g., edu-
cation).

Although present ability measures such as 
vocabulary are highly correlated with IQ, their 
insensitivity to central nervous system disease 
is questionable. When compared with age- 
matched controls, neurologic patients in fact 
show a significant decline on all Wechsler Adult 
Intelligence Scale (WAIS) subtests (Russell, 
1972). Impairment indices based on this model 
of estimating premorbid IQ have been ineffec- 
tive in identifying cases of deterioration (e.g., 
Wechsler, 1958; Yates, 1956). 

There have been two attempts to estimate 
premorbid IQ from demographic measures (Fo- 
gel, 1964; Ladd, 1964). The aim of both studies 
was to differentiate neurologic patients from 
controls. Subjects were assigned to one of four 


This article is based on a dissertation submitted 
by the first author under the direction of the 
second author. 

Requests for reprints should be sent to Robert 
S. Wilson, who is now at the Department of Psy- 
chology and Social Sciences, Rush Presbyterian- 
St. Luke’s Medical Center, 1753 West Congress 
Parkway, Chicago, Illinois 60612. 
educational categories. Control group means on
several WAIS variables were then used as pre-
morbid IQ estimates for patients with equivalent
educational backgrounds. Use of these estimates
improved Ladd’s classification accuracy by 4%.
Fogel did not report on the utility of the esti-
mates.

These results are not dramatic, but they do 
suggest that the demographic approach may have 
merit, particularly given the reliance in these 
studies on a single demographic variable, edu- 
cation. Since adult onset disease should have 
little effect on demographic status, the accuracy 
of such estimates should be limited only by the 
correlation between IQ and the demographics.
We assumed that this correlation could be sub-
stantially increased by the addition of other
demographics known to be related to IQ (e.g.,
race, occupation). Our aim was, therefore, to
develop multiple regression equations, based on
a large, representative sample, that would per-
mit estimation of premorbid IQ from demo-
graphic status.

Subjects (N = 1,700) consisted of the 1955
WAIS standardization sample minus the Kansas
City elderly subjects (Wechsler, 1955, 1958).
The three WAIS summary IQs (Verbal, Per-
formance, and Full Scale) were regressed in a
stepwise fashion on five demographic variables
(age, sex, race, education, and occupation). Age
and education were treated as continuous vari-
ables, whereas sex, race, and the 13
census occupation categories used by Wechsler
were dummy coded (Cohen, [...]).

To simplify the equations, the unstandardized
regression coefficients for the 12 dummy occupation
categories were averaged across the three
regression runs to yield a composite raw score
weight for each occupation category. The three
IQs were regressed a second time on the demo-
graphic variables using these composite weights
as occupation scores.

Education and race were the most powerful
predictors in each equation. The r²s between
education and Verbal IQ, Performance IQ, and
Full Scale IQ were .44, .31, and .43, respectively.
The R²s between all five demographics and
Verbal IQ, Performance IQ, and Full Scale IQ
were .53, .42, and .54, respectively. Thus, the
four additional variables increased the amount
of explained IQ variance by about 10%. With
the exception of the negligible relation between
sex and Performance IQ, addition of each demographic
variable significantly (p < .01) increased
R².


The regression equations that can be used to
estimate premorbid IQ are as follows:


Estimated Verbal IQ =
(.18) Age - (2.02) Sex - (8.99)
Race + (3.09) Education +
(.97) Occupation + 70.80.


Estimated Performance IQ =
(.14) Age - (.66) Sex - (12.91)
Race + (2.44) Education + (.91)
Occupation + 81.55.


Estimated Full Scale IQ =
(.17) Age - (1.53) Sex -
(11.33) Race + (2.97)
Education + (1.01) Occupation + 74.05.


In these equations, male = 1, female = 2, white
= 1, nonwhite = 2, and the scores for Wechsler’s
(1955, p. 7) 13 occupation categories are
[...], 7, 7, 6, 3, 3, 5, 0, 1, 4, 10, and 0, respectively.
The standard errors of estimate for the
three equations are 10.2, 11.4, and 10.2,
respectively.
The equations are based on 1955 data. The
current applicability of the equations is an
empirical question. If the formulae do permit
meaningful estimates of premorbid IQ, their
use should aid in the identification of persons
who have deteriorated intellectually. In a sub-
sequent article we will show that these formulas
can increase the accuracy with which the WAIS 
classifies persons as deteriorated versus normal 
by more than 10%. 

Given the increase in educational attainment 
over the past 20 years, from a mean of 10.1 in 
the WAIS sample to a median of 12.3 in 1975 
(U.S. Bureau of the Census, 1976), these equa-
tions can be expected to overestimate premorbid
IQ. A partial solution to this problem consists
of adjusting (deflating) the education weights
in the equations to their 1955 level by multiply-
ing the weights by .82 (10.1/12.3).
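
The three equations and the suggested education adjustment can be written out as a short Python sketch. The names below are hypothetical, and two points are assumptions rather than statements from the report: education is taken to be in years, and the occupation score is supplied by the caller as a number from the list of category scores given above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Equation:
    """Raw-score weights for one estimated IQ (values from the text)."""
    age: float
    sex: float
    race: float
    education: float
    occupation: float
    intercept: float


EQUATIONS = {
    "verbal": Equation(0.18, -2.02, -8.99, 3.09, 0.97, 70.80),
    "performance": Equation(0.14, -0.66, -12.91, 2.44, 0.91, 81.55),
    "full_scale": Equation(0.17, -1.53, -11.33, 2.97, 1.01, 74.05),
}

# Ratio of the 1955 mean to the 1975 median educational attainment.
EDUCATION_DEFLATOR = 10.1 / 12.3


def estimate_iq(scale, age, sex, race, education, occupation,
                deflate_education=False):
    """Estimate premorbid IQ from demographic status.

    sex: 1 = male, 2 = female; race: 1 = white, 2 = nonwhite.
    occupation: composite score for the person's occupation category.
    deflate_education: if True, multiply the education weight by
    10.1/12.3 (about .82), the adjustment suggested in the text.
    """
    eq = EQUATIONS[scale]
    education_weight = eq.education
    if deflate_education:
        education_weight *= EDUCATION_DEFLATOR
    return (eq.age * age + eq.sex * sex + eq.race * race
            + education_weight * education
            + eq.occupation * occupation + eq.intercept)
```

For example, `estimate_iq("full_scale", age=40, sex=1, race=1, education=12, occupation=7)` works through the Full Scale equation term by term; passing `deflate_education=True` lowers the estimate by shrinking only the education term. Any estimate should be read against the standard errors of roughly 10 to 11 IQ points reported above.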


Relationships Among Marital Assessment Procedures:
A Correlational Study 


Gayla Margolin 
University of California, Santa Barbara 


This study examined the relationships among three methods for assessing marital 
adjustment: self-reports of marital satisfaction, spouse reports of pleasing and dis- 
pleasing behaviors, and trained observers’ coding of positive and negative commu- 
nication behaviors. Frequency of pleasing behaviors was the only measure that cor- 
related with global marital satisfaction. Inverse relationships were found between 
positive and negative scores for two of the three methods investigated. 


Recent investigations into the quality of mari-
tal relationships have resulted in multiple pro-
cedures to measure relationship functioning.
These procedures span a variety of observers
(e.g., self, spouse, and trained others), settings
(e.g., home and laboratory), targets (e.g., com- 
munication, companionship, sex, and affection), 
and methods (e.g., global ratings, daily observa- 
tions, multibehavioral coding systems) (Weiss 
& Margolin, 1977). The purpose of this study 
was to describe the relationships among several 
of the most commonly used assessment pro- 
cedures. 

A social-learning formulation of marital ad- 
justment suggests that marital dysfunction is 
related to (a) a disproportionately high ex- 
change of displeasing, as compared to pleasing, 
events between partners and (b) reliance on 
coercive, rather than constructive, methods to 
bring about change in problem areas. To ex- 
amine these facets of relationship functioning, 
this study used assessment methodologies that 
measure couples’ (a) daily exchanges of pleas-
ing and displeasing behaviors, (b) positive and
negative communication behaviors, and (c) glo- 
bal impressions of marital adjustment. Each 
assessment methodology contained a positive and 
a negative dimension. On the assumption that 
spouse behaviors are related to marital ad- 
justment, high correlations were expected be- 
tween each behavioral measure and self-reported 
marital satisfaction. 

The 27 couples who participated in this study 
were self-referred for marital counseling. Their 
average length of marriage was 10.8 years; their


Requests for reprints should be sent to Gayla 
Margolin, who is now at the Department of Psy- 
chology, University of Southern California, Los 
Angeles, California 90007, 


mean spouse score on the Locke-Wallace
inventory was 71.8. Couples were in the
pretreatment assessment phase of their therapy at
the time of this study. As part of this
assessment, couples (a) completed a packet of
self-report inventories, (b) came into the laboratory
to videotape two 10-minute negotiation
discussions, and (c) kept daily records of the pleasing
and displeasing behaviors received from the
partner during a 1-week period.

Two self-report measures of relationship
satisfaction contained in the initial assessment
packets were the Locke-Wallace Marital Adjustment
Inventory, a traditional measure of marital
adjustment, and the Areas of Change
Questionnaire, which assesses the changes that each
spouse desires for self and spouse in 34
common problem areas (Birchler & Webb, 1977).
Daily frequencies of pleasing and displeasing
behaviors came from the Spouse Observation
Checklist (SOC), a 400-item listing of
relationship behaviors that spans 12 categories of dyadic
functioning. Each evening, spouses read through
the checklist and indicated which pleasing or
displeasing events had occurred during the
previous 24-hour period. A pair of trained
observers used the Marital Interaction Coding
System (MICS) to assess communication
skillfulness from the videotaped negotiation
sessions. The 29 MICS codes include verbal and
nonverbal behaviors that have been a priori
assigned to positive (e.g., approval, agree),
negative (e.g., criticize, no response), or neutral
(e.g., question) summary scores. All of these
assessment procedures, with the exception of
the Locke-Wallace, were developed at the
University of Oregon and Oregon Research
Institute and are described elsewhere (cf. Weiss &
Margolin, 1977).


Table 1
Correlations Among Measurements for Assessing Positive and Negative
Dimensions of Marital Relationships

Measure                                     1        2        3        4        5        6

Self-report data
 1. Locke-Wallace adjustment scores      (.90)**
 2. Areas of Conflict scores              -.43*   (.83)**
Spouse observer SOC data
 3. Pleasing behaviors                     .40*    -.04    [...]
 4. Displeasing behaviors                 -.23      .25    [...]   (.78)**
Trained observer MICS(a)
 5. Positive communication behaviors       .02      .31      .21      .04   (.70)**
 6. Negative communication behaviors       .01     -.11     -.13     -.06    [...]   (.83)**

(a) Interobserver agreement = 83.8%.
* p < .05, two-tailed.
** p < .001, two-tailed.


The correlational matrix in Table 1 contains
split-half reliability coefficients, within-method
correlations, and across-method correlations.
Split-half reliability coefficients were consistently
significant. Within-method correlations revealed
significant correlations for two of the three
methods investigated: Adjustment scores were
inversely related to conflict scores, and positive
communication behaviors were inversely related
to negative communication behaviors. However,
the only significant correlation across methods
was between marital adjustment and mean daily
frequency of pleasing behaviors.

Results of this study illustrate the relative
independence among several commonly used
procedures to assess marital relationships. The
expectation that each of the two behavioral
measurements would correlate with a more
traditional measure of global marital satisfaction was
confirmed for SOC pleasing behaviors but not
for communication behaviors. Although previous
studies have shown that communication skills
discriminate between distressed and
nondistressed couples, data from this study on
distressed couples do not indicate a relationship
between observers’ coding of communication
skills and couples’ subjective satisfaction
ratings. Poor communication may be the earmark
of marital distress, but it appears from these
data that the intensity of the communication
problem does not correspond systematically with
the level of overall distress.


Note. Within-method correlations are in boldface; split-half reliability values corrected by the
Spearman-Brown formula are in parentheses; and the remaining values are across-method
correlations. Mean frequencies for odd and even days were correlated for Spouse Observation
Checklist (SOC) reliability values. Frequencies for two 10-minute segments were correlated for
Marital Interaction Coding System (MICS) reliability values. n = 27; scores are based on
husband-wife averages.
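
The Spearman-Brown correction mentioned in the table note can be illustrated with a brief Python sketch. The function names are hypothetical, and the half-test correlations behind the values in parentheses are not reported, so the second function only shows what uncorrected value a given corrected coefficient would imply.

```python
def spearman_brown(r_half, length_factor=2.0):
    """Step up a part-test correlation to the full-length reliability.

    For split halves (length_factor = 2) this is 2r / (1 + r).
    """
    return length_factor * r_half / (1 + (length_factor - 1) * r_half)


def half_correlation_from_corrected(r_full):
    """Invert the split-half correction: r = R / (2 - R)."""
    return r_full / (2 - r_full)
```

Under this formula, a corrected value of .90 would correspond to a correlation of about .82 between the two halves (for the SOC, odd versus even days; for the MICS, the two 10-minute segments).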


The lack of correspondence between SOC 
pleasing and displeasing behaviors lends sup- 
port to the repeated finding that these two di- 
mensions vary independently. However, the low 
correlation between behavioral frequencies on 
the SOC and on the MICS was contrary to an 
earlier finding by Patterson, Hops, and Weiss 
(1975) on a sample of 10 distressed couples. 
Patterson et al.’s rank-order correlations be- 
tween positive behaviors in the laboratory setting 
and daily reports of reinforcing exchanges at 
home were .85 for wives and .61 for husbands. 

The overall lack of correspondence among 
measures used in this study is not an indica- 
tion of invalid measurement procedures but 
rather is an indication that marital adjustment 
is not a unitary dimension. If marital adjust- 
ment does imply a set of independent dimen- 
sions, a thorough assessment must take account 
of these different dimensions and provide a 
profile analysis of each couple’s strengths and 
weaknesses. From the results of this study, it 
appears necessary to use different data collec- 
tion procedures to assess the various aspects of 
marital functioning. In light of these findings 
and the limited information about the psycho- 
metric properties of particular instruments, fur- 
ther investigation is needed to choose wisely 
from the wide range of available marital assess-
ment options.


Effects of Uncertainty About the Behavior of a Phobic Stimulus
on Subjects’ Fear Reactions


John Lick 
State University of New York at Buffalo 


Mark Condiotte 
University of Oregon 


Thomas Unger 
State University of New York at Buffalo 


Two groups of rat-phobic subjects were repeatedly exposed to a live rat on a nearby 
platform under instructional conditions designed to induce different expectancies about 
the probability of the rat remaining on the platform. Although the rat’s behavior 
was the same in both conditions, subjects who were led to believe that the rat might
leave the platform evidenced significantly greater cognitive and physiological arousal
to repeated rat exposures than did subjects who were informed that the rat would
definitely not leave the platform.


An important methodological issue in labora- 
tory research dealing with “small animal phobia” 
concerns stimulus differences between the lab- 
oratory and natural environment that may affect 
subjects’ uncertainty about how a phobic stimu- 
lus will behave. For example, in the laboratory, 
subjects are usually exposed to caged animals in
a context designed to foster the expectation that
events are “under control” and nothing
unpredictable will happen. In contrast, outside the
laboratory, subjects are likely to encounter
uncaged animals moving in an unpredictable
manner. The study reported here is designed to
examine the reactions of phobic subjects
to a fear-eliciting animal as a function of
instructions designed to affect uncertainty about
the behavior of that animal.

Requests for reprints and for an extended report
of this study should be sent to John Lick,
Department of Psychology, State University of
New York at Buffalo, Buffalo, New York [...].


Subjects were 19 females from introductory
psychology classes who received research credit
for their participation. They were selected from
a pool of 600 females who completed a
questionnaire designed to assess fear of rodents.
When subjects arrived at the laboratory, they
were told that the experiment involved
investigating their physiological reactions to various
stimuli. Subjects were then connected to a
polygraph.

After a baseline period of 10 min, the
experimenter [...] that she would be exposed to
two different stimuli. The first would be a roll
of masking tape,
and the second a live, harmless rat. Subjects 
were informed that the experimenter would place 
each stimulus on the platform, leave the room 
for 25 sec, and then return to remove the stim- 
ulus. At this point subjects were given an oppor- 
tunity to withdraw from the experiment, and, if 
they decided to continue, they were asked to sign 
a consent form. Three minutes after signing this 
form, the experimenter exposed subjects to the 
masking tape and the live rat, with an inter- 
stimulus interval of 2 min between presentations. 
(The interstimulus interval for all subsequent 
rat exposures was 25 sec, and subjects were asked 
to rate their cognitive anxiety on a scale from 1 
to 10 between trials.) Subjects who showed 
greater physiological reactivity to the rat than 
to the masking tape were retained in the study 
and were randomly assigned to one of two ex- 
perimental conditions. 

In the certainty condition (n= 10), subjects 
were told that the rat they would be repeatedly 
exposed to for the rest of the experiment had 
stayed on the platform for hundreds of hours 
and could not leave it. To enhance the credibility 
of this message, subjects were shown an electric 
grid on the platform and were told that this grid 
had been used to successfully train the animal to
stay there.

In the uncertainty condition (n = 9), subjects 
were informed that the rat would “probably” not 
leave the platform; however, in the highly un- 
likely event that this did happen, subjects were 
told that the experimenter would intervene im- 
mediately to capture the animal. 

During each of the last 10 rat exposures, the 
experimenter rated the rat’s activity level on a
5-point scale through a small window. After the 
last exposure, subjects were disconnected from 
the polygraph and were given a behavioral 
avoidance test. Following this, subjects completed 
the rodent questionnaire again and an additional 
question designed to assess how confident they 
were during the experiment that the rat would 
remain on the platform. 

Preliminary analyses indicated that (a) there 
were no significant between-groups differences on 
any measures before the introduction of the 
instructional manipulation defining the inde-
pendent variable, (b) the instructional manipu-
lation was highly effective in inducing appropri-
ate cognitive expectancies vis-à-vis the rat’s
probable behavior, and (c) the animal’s activity
level was not confounded with either trial or 
experimental conditions. 

Subjects’ cognitive and physiological responses 
to the rat following the induction of differential 
expectancies were analyzed by repeated measures 
analyses of variance. The results of this analysis 
for the cognitive data indicated significant main
effects for condition, F(1, 17) = 21.1, p < .001;
trials, F(9, 153) = 14.3, p < .001; and the
Condition × Trials interaction, F(9, 153) = 3.61,
p < .001. The results for the electrodermal response
data show significant main effects for
condition, F(1, 17) = 17.1, p < .001, and trials,
F(9, 153) = 7.43, p < .001, but no significant
Trials × Condition interaction, F(9, 153) = 1.19,
p > .30. Similarly, the heart rate data showed
significant main effects for condition, F(1, 17) =
7.85, p < .02, and trials, F(9, 153) = 6.82, p <
.001, and no significant Trials × Condition interaction,
F(9, 153) = 1.02, p > .50. Overall, these
results show that subjects in the uncertainty 
condition manifested significantly more cognitive 
and physiological fear during the 10 rat expo- 
sures than subjects in the certainty condition. In 
addition, subjects in the certainty condition 
showed a significantly faster rate of cognitive 
fear habituation than uncertainty condition sub- 
jects. 

Finally, results from the behavioral avoidance
test indicated that subjects in the uncertainty
condition completed an average of 9.9 steps
(higher numbers indicating more approach),
whereas certainty subjects completed a mean of
5.9 steps, t(17) = 3.55, p < .002. In contrast,
pre-post differences on the rodent questionnaire
failed to discriminate between the two experimental
conditions, t(17) = 1.45, p > .15.

The results of this study appear to have several
implications for laboratory research with
subjects who are afraid of small animals. First,
the results call into question the external validity
of the laboratory-based behavioral avoidance
test. Since this test assesses subjects’ fear reac- 
tions to a caged phobic stimulus in an experi- 
mental context associated with safety and pre- 
dictability, it is possible that subjects might show 
fear reductions on this measure without mani- 
festing concomitant fear reductions to uncaged 
animals moving in unpredictable ways outside the 
laboratory. Although this issue requires additional
study, the available information suggests
that this lack of generalization may not be
uncommon (e.g., Lick & Unger, 1975).

This study also has implications for research
investigating habituation phenomena (e.g., Watson,
Gaind, & Marks, 1972) and paradoxical fear
enhancement (e.g., Stone & Borkovec, 1975).
Specifically, the significant Trials × Condition
interaction for the cognitive fear measure suggests
that the slope of the curve describing subjects’
habituation to repeated presentations of a
phobic stimulus can be influenced by situational
variables affecting the threat value of that stimulus.
In this respect, cues that affect subjects’
cognitive appraisal of the probability of a phobic
stimulus suddenly approaching them may be
particularly important in determining level of
threat. This is important, since studies investigating
habituation and fear-enhancement phenomena
have used phobic stimuli consisting of
either caged animals or slides, and subjects
may habituate to these stimuli substantially
faster than they would to stimuli that are much
more threatening (e.g., encountering a moving
animal in close proximity).

Finally, the significant effect of the instructional
manipulation on subjects’ behavioral
avoidance test performance justifies additional
attention because it suggests that the effectiveness
of fear-reduction techniques using in vivo
exposure might be enhanced by structuring some
of these exposures under conditions more
closely approximating those characterizing
naturalistic encounters with phobic stimuli.

Some Useful Statistics for the Interpretation of the WISC-R


Terry B. Gutkin 
University of Nebraska—Lincoln 


A verbal comprehension deviation quotient (VCDQ) and a perceptual organization
deviation quotient (PODQ) are presented as alternatives to the Verbal and Performance
IQs when attempting to measure the Wechsler Intelligence Scale for Children-Revised
Verbal Comprehension and Perceptual Organization factors. The superiority
of the VCDQ and PODQ rests on their equally high reliabilities, equal ease of
computation, and improved factorial validity when compared with the Verbal and
Performance IQs. To aid in scatter analysis research, techniques that permit a
determination of whether individual subtests deviate significantly from the subtest mean of
the VCDQ and PODQ are discussed. Formulas are also presented for the computation
of a freedom from distractibility deviation quotient (FDDQ) to measure the Freedom
from Distractibility factor. Unlike the VCDQ and PODQ, an inconsistent pattern of
factor loading mandates the use of age-specific formulae for the FDDQ. Reliability
data indicate that the FDDQ should be used primarily for research purposes and only
with considerable caution in applied settings.


Kaufman’s (1975) factor-analytic investigation
of the Wechsler Intelligence Scale for
Children-Revised (WISC-R; Wechsler, 1974)
standardization data has provided psychologists
with a well-grounded empirical framework within
which WISC-R scores can be analyzed with a
maximum of objectivity. Among the study’s major
findings were the discovery of Verbal Comprehension,
Perceptual Organization, and Freedom
from Distractibility factors, all of which
appeared across the entire age range of the
standardization sample. Kaufman advocates that the
Verbal IQ be taken as a rough equivalent for
the Verbal Comprehension factor. Unfortunately,
the inclusion of the Arithmetic subtest score
in the computation of the Verbal IQ, in conjunction
with the failure of the Arithmetic subtest
to show any substantive loadings on the
Verbal Comprehension factor, results in the
Verbal IQ being an impure measure of the
Verbal Comprehension factor.

A better measure of this factor would be a
linear combination of only the Information (I),
Similarities (S), Vocabulary (V), and Comprehension
(C) scores, as all of these subtests
have substantial loadings on the Verbal Comprehension
factor. Working from procedures outlined
by Guilford (1954), it was determined
that such a linear combination would be as reliable
as Wechsler’s (1974) Verbal IQ. In fact,
the reliabilities for the proposed linear combination
and the Verbal IQ never differ by more
than .02 of a point at any age level between 6½
and 16½. A simple formula for the conversion
of this factor score into a verbal comprehension
deviation quotient (VCDQ), with a mean of
100 and a standard deviation of 15, was developed
with the aid of procedures discussed by
Tellegen and Briggs (1967):
VCDQ = 1.47 (I + S + V + C) + 41.2.

In a similar vein, it is suggested that the
Performance IQ is not the best measure of
a subject’s perceptual organization skills. The
Coding subtest, which is included in the computation
of the Performance IQ, shows only
trivial loadings on the Perceptual Organization
factor at every age level. Using the Tellegen and
Briggs (1967) procedures, the following formula
was derived for computing a perceptual organization
deviation quotient (PODQ) with a
mean of 100 and a standard deviation of 15:
PODQ = 1.60 (PC + PA + BD + OA) + 36.0.
(PC = Picture Completion; PA = Picture Arrangement;
BD = Block Design; OA = Object
Assembly.) Reliability coefficients for the PODQ
were computed for ages 6½-16½ and were never
more than .01 of a point discrepant with those
found by Wechsler (1974) for the Performance
IQ.
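
The two conversion formulas can be written out as a minimal Python sketch. The function names `vcdq` and `podq` are illustrative assumptions rather than anything defined in the WISC-R materials, and the inputs are assumed to be ordinary subtest scaled scores.

```python
def vcdq(information, similarities, vocabulary, comprehension):
    """Verbal comprehension deviation quotient from four scaled scores."""
    total = information + similarities + vocabulary + comprehension
    return 1.47 * total + 41.2


def podq(picture_completion, picture_arrangement, block_design, object_assembly):
    """Perceptual organization deviation quotient from four scaled scores."""
    total = (picture_completion + picture_arrangement
             + block_design + object_assembly)
    return 1.60 * total + 36.0


if __name__ == "__main__":
    # A subject with a scaled score of 10 on every subtest should land
    # at the intended mean of 100 on both quotients.
    print(round(vcdq(10, 10, 10, 10), 1))
    print(round(podq(10, 10, 10, 10), 1))
```

Because both formulas are linear in the sum of four scaled scores, each additional scaled-score point raises the VCDQ by 1.47 and the PODQ by 1.60.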

With equivalent reliability, equal ease of computation,
and superior factorial validity, the
VCDQ and PODQ should prove to be more
appropriate measures of verbal and nonverbal
intellectual skills than the more traditional Verbal
IQ and Performance IQ scores. The impact
of using the VCDQ and PODQ versus the Verbal
IQ and Performance IQ was examined using the
WISC-R scores of all children who were referred
for psychological examinations in a southwest
urban school district over the period of a
full school year (N = 275, average Full Scale
IQ = 76, SD = 13). The results showed that
despite very high correlations between the
VCDQ and Verbal IQ (r = .98) and between the
PODQ and Performance IQ (r = .96), important
score shifts were quite common. Wechsler
(1974) stated that verbal-performance discrepancies
of 15 points or larger are “important”
and call for further investigation. When comparing
the magnitude of verbal-performance discrepancies
using the VCDQ-PODQ scores instead
of the Verbal IQ-Performance IQ scores,
8% of the students changed from either an important
to an unimportant discrepancy or vice
versa. An additional 16% of the group changed
from either a nonsignificant to a significant discrepancy
(a difference of 12 points or more)
or vice versa. A further indication that the
use of the VCDQ and PODQ scores in lieu of
the Verbal IQ and Performance IQ scores may
substantially affect test interpretation comes from
the finding that 20% of the referred students
had a score differential between either the
VCDQ and Verbal IQ or the PODQ and Performance
IQ of 7 points or more.
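
The comparison of discrepancy categories described above can be sketched as follows. This is only an illustration: the function names and the example scores are hypothetical, and the sketch assumes that the same 12-point and 15-point cutoffs are applied to both score pairs.

```python
IMPORTANT_CUTOFF = 15    # Wechsler's "important" discrepancy
SIGNIFICANT_CUTOFF = 12  # significant discrepancy cited in the text


def discrepancy_category(verbal, performance):
    """Label the size of a verbal-performance discrepancy."""
    difference = abs(verbal - performance)
    if difference >= IMPORTANT_CUTOFF:
        return "important"
    if difference >= SIGNIFICANT_CUTOFF:
        return "significant"
    return "nonsignificant"


def category_changed(verbal_iq, performance_iq, vcdq_score, podq_score):
    """True when the factor-based scores change the discrepancy label."""
    traditional = discrepancy_category(verbal_iq, performance_iq)
    factor_based = discrepancy_category(vcdq_score, podq_score)
    return traditional != factor_based


if __name__ == "__main__":
    # Hypothetical student: a 14-point IQ discrepancy, a 16-point
    # VCDQ-PODQ discrepancy.
    print(discrepancy_category(90, 76))
    print(discrepancy_category(92, 76))
    print(category_changed(90, 76, 92, 76))
```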

Since many of the WISC-R subtests have ade- 
quate subtest specificity (Kaufman, 1975), the 
investigation of subtest scatter may be of in- 
terest to researchers. One common technique of 
scatter analysis has been to determine if any 
particular subtest score is significantly different 
from the average of other subtest scores that 
purport to measure similar abilities (Davis,
1959).

Tables 1 and 2 provide the difference scores 
needed to determine if any of the individual 


Table 1
Minimum Deviations from the VCDQ Subtest Average Required for Statistical
Significance

  p       I       S       V       C
 .05    2.07    2.24    2.02    2.29
 .01    2.72    2.94    2.66    3.02

Note. VCDQ = verbal comprehension deviation
quotient; I = Information; S = Similarities; V
= Vocabulary; C = Comprehension.

Table 2
Minimum Deviations from the PODQ Subtest Average Required for Statistical
Significance

  p      PC      PA      BD      OA
MoS 262 218. 217 
Osa 345 2.87 3,65 


Note. PODQ = perceptual organization deviation 
quotient; PC = Picture Completion; PA = Picture 
Arrangement; BD = Block Design; OA = Object 
Assembly. 


subtest scores that comprise the VCDQ and
PODQ are significantly different from the average
of all the subtest scores that comprise the
VCDQ and PODQ. For example, if a subject’s
scores on the PODQ subtests are PC = 8, PA =
14, BD = 9, and OA = 9, the average for these
subtests is equal to 10. The deviation from the
PODQ average for each subtest is as follows:
PC = −2, PA = 4, BD = −1, OA = −1. Table
2 shows that only the PA score differs significantly
from the mean of the PODQ subtests
(p < .01). The specific interpretation of any
difference obtained in this manner will hinge
on whether or not the subtest in question has
adequate subtest-specific variance at the age
level of the subject taking the test and on an
analysis of what a particular subtest purportedly
measures.
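
The scatter-analysis procedure can be illustrated with a short Python sketch. It uses the VCDQ critical values from Table 1; the function name, the dictionary layout, and the example scaled scores are assumptions made for illustration, and a deviation is treated as significant when its absolute value is at least the tabled minimum.

```python
# Minimum deviations from Table 1 (VCDQ subtests), keyed by alpha level.
VCDQ_CRITICAL = {
    0.05: {"I": 2.07, "S": 2.24, "V": 2.02, "C": 2.29},
    0.01: {"I": 2.72, "S": 2.94, "V": 2.66, "C": 3.02},
}


def scatter_analysis(scores, critical, alpha=0.05):
    """Return {subtest: (deviation from subtest mean, significant?)}."""
    mean = sum(scores.values()) / len(scores)
    report = {}
    for subtest, score in scores.items():
        deviation = score - mean
        significant = abs(deviation) >= critical[alpha][subtest]
        report[subtest] = (deviation, significant)
    return report


if __name__ == "__main__":
    # Hypothetical scaled scores with a subtest mean of 10.
    example = {"I": 8, "S": 14, "V": 9, "C": 9}
    for subtest, (deviation, significant) in scatter_analysis(
            example, VCDQ_CRITICAL, alpha=0.01).items():
        print(subtest, deviation, significant)
```

The same function applies to the PODQ subtests once a dictionary of the Table 2 values is supplied in the same shape.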

Although Sattler (1974) has presented a
formula for the calculation of a freedom from
distractibility deviation quotient (FDDQ), it is
of questionable validity for several of the age
levels that are encompassed by the WISC-R.
Specifically, the Sattler formula is based on a
linear composite score for the Arithmetic (A),
Digit Span (DS), and Coding (Co) subtests.
Even though these subtests do show substantial
factor loadings on the Freedom from Distractibility
factor at most age levels, the Co
subtest has trivial loadings at the 6½-, 7½-, 14½-,
and 16½-year levels (Kaufman, 1975). Sattler’s
formula, FDDQ = 2.2 (A + DS + Co) + 34,
is thus appropriate for all age levels other than
6½, 7½, 14½, and 16½. The Tellegen and Briggs
(1967) technique was used to compute an alternate
formula for use at these four specific age
levels: FDDQ = 2.94 (A + DS) + 41.2. It is
important to note that for all age groups, the
reliability of the FDDQ falls below .90 (ranging
from a low of .83 at the 14½-year level to a
high of .88 at the 12½-year level). Generally,
.90 is accepted as the minimum reliability
coefficient for scores that are used to discriminate
among individuals (Nunnally, 1967). The FDDQ
should thus be used primarily for research purposes
and only with considerable caution in applied
settings. The user must also keep in mind
that unlike the VCDQ and the PODQ, the
FDDQ measures a behavioral rather than a
cognitive domain.
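
A rough Python sketch of the age-specific FDDQ rule follows. The function name and the way ages are encoded (6.5 for the 6½-year level, and so on) are assumptions. The additive constant 34.0 in the three-subtest formula is also an assumption: it is the value that gives a quotient of 100 when all three scaled scores equal 10.

```python
# Age levels at which Coding shows only trivial loadings on the factor.
TWO_SUBTEST_AGES = {6.5, 7.5, 14.5, 16.5}


def fddq(age_level, arithmetic, digit_span, coding=None):
    """Freedom from distractibility deviation quotient for one age level."""
    if age_level in TWO_SUBTEST_AGES:
        # Alternate formula based on Arithmetic and Digit Span only.
        return 2.94 * (arithmetic + digit_span) + 41.2
    if coding is None:
        raise ValueError("Coding score is required at this age level")
    # Three-subtest formula; the constant 34.0 is assumed (see lead-in).
    return 2.2 * (arithmetic + digit_span + coding) + 34.0


if __name__ == "__main__":
    print(round(fddq(6.5, 10, 10), 1))
    print(round(fddq(10.5, 10, 10, 10), 1))
```

Given the reliabilities reported above, output from a helper like this is better suited to research summaries than to decisions about individual children.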

Symptom Contamination of the Schedule of Recent Events


Robert E. Lehman 
University of Idaho 


Three hundred fifty-seven subjects completed the Schedule of Recent Events (SRE) 
and a widely used symptom scale. The correlation between unit weighted SRE scores 
and the symptom measure was significant and did not improve when the SRE scores 
were weighted by life change units. Four items from the SRE were identified as 
possibly representing symptoms in and of themselves. Correlations between each 
SRE item and the symptom measure were calculated, and it was found that these 
four items correlated more highly with the symptom scores than all but two of the 
remaining items from the SRE. It is suggested that because certain SRE items ap- 
parently represent symptoms, observed correlations between the SRE and various 
symptom measures may be artifactually high. 


A growing body of research has demonstrated 
that an accumulation of life events may impair 
psychological functioning and may contribute 
to the development of psychopathology (Dohr- 
enwend, 1973; Myers, Lindenthal, Pepper, & 
Ostrander, 1972; Paykel et al., 1969; Vinokur 
& Selzer, 1973). The Schedule of Recent Events 
(SRE), developed by Holmes and Rahe (1967), 
has been widely used to assess the accumula- 
tion of life events in such studies. The scale 
comprises 43 items that represent changes in 
personal behavior and life circumstances. Each 
item is assigned a weight termed the life change 
unit (LCU), based on the average of judges’ 
estimates of the adjustive demand required by 
each event. These range from a low of 11 to a 
high of 100. The total score is the sum of the 
weights of those items checked as having 
occurred within the last 12 months. Numerous 
studies have demonstrated significant correla- 
tions between SRE scores and the development 
of both psychological and physical dysfunctions. 
(See review by Rahe, 1972.) 
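
The two scoring schemes compared in this report, unit weighting and LCU weighting, can be sketched in Python. The item names and weights below are hypothetical placeholders, not entries from the actual SRE.

```python
def sre_scores(checked_items, lcu_weights):
    """Return (unit-weighted score, LCU-weighted score) for one subject.

    checked_items: items reported as occurring in the last 12 months.
    lcu_weights: mapping from item name to its life change unit weight.
    """
    unit_score = len(checked_items)
    lcu_score = sum(lcu_weights[item] for item in checked_items)
    return unit_score, lcu_score


if __name__ == "__main__":
    # Hypothetical three-item scale spanning the reported 11-100 range.
    weights = {"event_a": 100, "event_b": 47, "event_c": 11}
    print(sre_scores(["event_a", "event_c"], weights))
```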

However, Rahe (1974) has noted that the 
LCU weightings, as compared to unit weight- 
ings, are not important in subject populations 
that report low to moderate magnitude events. 
One purpose of the present study was to further 
test this observation. 

In addition, examination of the individual 
items on the SRE suggests that some of the 
items may, themselves, represent symptoms of
disturbed functioning. For purposes of the
present study, four items were identified as generally
symptomatic of a wide range of disorders.
These are “change in eating habits,”
“change in sleeping habits,” “change in social 
activities,” and “revision of personal habits.” 
To the extent that these items on the SRE also 
represent symptoms, correlations between SRE 
scores and symptom measures will be arti- 
factually high. A second purpose of the present 
study was to further explore this possibility. 

Three hundred fifty-seven undergraduate students
at the University of Idaho were administered
the SRE and a widely used symptom
scale (Langner, 1962) that has been found to
correlate with the SRE. The mean LCU of items
checked by these subjects was 23.94 (SD =
4.57), which is quite close to the mean of
approximately 25 LCUs for Rahe’s
subjects. Scores on Langner’s symptom scale
had a mean of 4.29 (SD = 3.00). Subjects were
blind to the purpose of the experiment, and the
order of administration of the scales was
counterbalanced.

As anticipated for this low-LCU-magnitude sample,
use of LCU weightings did not improve
correlations between the SRE and the symptom
scale. The product-moment correlations for
weighted and unweighted scores were almost
exactly the same (r = .23 for the weighted scores).
Both were statistically significant (p <
.001, two-tailed). These results further confirm
previously observed relationships between SRE
scores and disturbed functioning. They also
support Rahe’s (1974) contention that, when
using the LCU magnitude scale with
“subjects who report primarily low to moderate
LCU events . . . one can dispense with the LCU
scale” (p. 82).

A likely explanation for this phenomenon is
that the LCU weights of the items most frequently
affirmed by low-LCU-magnitude subjects
do not differ a great deal. For example,
the lower half of the scale (i.e., that half of
the scale with the lowest LCU magnitude
weights) represents less than one third of the
range of the entire scale. Thus, because the
LCU weights of these items are relatively similar,
use of the weights may provide little advantage.

To assess the possibility that certain items
on the SRE represent symptoms and may thus
inflate correlations between SRE scores and
measures of maladjustment, the correlations
between each item from the SRE and the
symptom scale were calculated. The resulting
point-biserial correlations ranged from −.063 to
.202. The four items mentioned above as
probable symptoms were isolated and were
found to have correlations ranging from .133 to
.202, with a mean of .167. All were significant
at the .02 (two-tailed) level. The mean correlation
of the remaining items, in contrast, was
only .048. Furthermore, the four “symptom”
item correlations were larger than those of all but
two of the remaining items. This was in spite of the
fact that the symptom items had a considerably
lower mean LCU score (M = 18.25) than the
remaining items (M = 36.34). The fact that
these four symptom items generally exhibit
much higher correlations with the symptom
scale than do the other items suggests that they
strongly contribute to observed relationships
between SRE scores and symptom measures.
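
For readers who want to reproduce the item-level analysis, a point-biserial correlation between one checked/unchecked SRE item and the symptom score can be computed as in the sketch below. The function name and the toy data are assumptions; the formula is the standard point-biserial coefficient using the population standard deviation, which equals the Pearson correlation with a 0/1 variable.

```python
import math


def point_biserial(item_checked, symptom_scores):
    """Point-biserial correlation between a 0/1 item and a numeric score."""
    n = len(symptom_scores)
    checked = [s for flag, s in zip(item_checked, symptom_scores) if flag]
    unchecked = [s for flag, s in zip(item_checked, symptom_scores) if not flag]
    if not checked or not unchecked:
        raise ValueError("item must be checked by some but not all subjects")
    mean_all = sum(symptom_scores) / n
    sd_all = math.sqrt(sum((s - mean_all) ** 2 for s in symptom_scores) / n)
    if sd_all == 0:
        raise ValueError("symptom scores have no variance")
    proportion = len(checked) / n
    mean_checked = sum(checked) / len(checked)
    mean_unchecked = sum(unchecked) / len(unchecked)
    return ((mean_checked - mean_unchecked) / sd_all
            * math.sqrt(proportion * (1 - proportion)))


if __name__ == "__main__":
    # Toy data: four subjects, two of whom checked the item.
    print(round(point_biserial([1, 1, 0, 0], [3, 5, 1, 3]), 4))
```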

To the extent that the SRE contains items
that are symptomatic of disturbed functioning,
correlations between the SRE and measures of
symptomatology will be artifactually high. Indeed,
Hudgens (1974) has noted that as many
as 29 of the 43 SRE items could represent
symptoms or consequences of illness. It is not
suggested that the observed detrimental effects
of stressful life events on physical and psychological
health are entirely artificial by-products
of the existence of symptom items on the SRE.
However, it is probable that they do contaminate
the SRE. This possibility warrants attention
in future research.

The Polydrug Assessment Scale: A Psychometric Technique
for the Indirect Measurement of Drug Use 


Khalil A. Khavari and Frazier M. Douglass IV 


Midwest Institute on Drug Use, University of Wisconsin—Milwaukee 


Presently, there are no psychometric instruments for the indirect measurement of 
polydrug use. To ameliorate this situation, self-report data from 335 college students 
were examined to identify a set of personality-attitudinal items that correlated best 
with polydrug use. Twenty items emerged as reasonably stable correlates of drug use. 
Additional analysis showed that the 20 items, when treated as a scale, were reliable, 
diagnostically accurate, and applicable to either sex. Thus, this work represents an 
initial step in the development of a scale for indirect measurement of an individual’s 
extent of involvement with use of psychotropics. 


Psychometric instruments are valuable tools 
in the area of substance abuse. Several instru- 
ments and procedures have been developed to 
assess alcoholism (Jacobson, 1976) and narcotic 
addiction (Siegel, 1976). However, no indirect
instrument has been available for measurement 
of polydrug use. Thus, the purpose of the 
present study was to develop a psychometric 
instrument to measure indirectly the extent of 
each person’s involvement with a variety of 
psychoactive substances. 

Data for this purpose were obtained from 335 
college undergraduates, who voluntarily com- 
pleted a brief questionnaire and returned it 
anonymously. The questionnaire contained 132 
statements (similar to Minnesota Multiphasic 
Personality Inventory items) presented with a 
Likert response format and 19 questions about 
current usage of various licit and illicit drugs. 
Responses from the drug use questionnaire were 
transformed to T scores and were averaged to 
yield a criterion measure of polydrug usage. 
(See Douglass & Khavari, in press.)

To select items for the Polydrug Assessment
Scale (PAS), responses to the 132 items were
first correlated with the criterion. Then, all items
with correlations less than .20 were excluded
from further consideration. The remaining 50
items were then subjected to a series of multiple
regression and cross-validation analyses
until an accurate and stable set of 20 items was
identified. Finally, these items were further examined
to determine their reliability and their
applicability to selected subgroups of respondents.

The resulting items are listed in Table 1.
Their relationship with the criterion is reflected
by bivariate correlations (absolute value) ranging
from .21 to .51. When combined and regressed
on the criterion in a random sample (N =
166), these items accounted for 61% of the
variance associated with polydrug usage (R =
.78, p < .001). Resulting regression coefficients
were then used to cross-validate items using the
remaining respondents. The sum of these
weighted items (PAS score) accounted for 41%
of the polydrug variance (r = .69, p < .001).
Since results from these two analyses were
comparable, all subjects were pooled for summary
computations. The mean and standard
deviation for the pooled group were 49.64 and
4.29, respectively. Multiple regression coefficients
are presented in Table 1.


Discriminative efficiency of the PAS was assessed
by first dividing respondents into above
average and below average polydrug users and
then determining the correspondence between
PAS scores and actual usage. A PAS score of
52, used as a dividing point, resulted in 81.5%
correct classification; 24.18% were correctly
identified as above average polydrug users,
57.31% were correctly identified as below
average users, 6.57% were false positives, and
the remainder were false negative classifications.
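
The classification tally can be sketched in Python as follows. The function name and the toy data are hypothetical, and the sketch assumes that a PAS score at or above the dividing point is read as predicting above average use; the report does not say how a score of exactly 52 was handled.

```python
def classification_rates(pas_scores, above_average_user, cutoff=52.0):
    """Percentage of respondents in each cell of the 2 x 2 outcome table."""
    counts = {"true_positive": 0, "true_negative": 0,
              "false_positive": 0, "false_negative": 0}
    for score, actual in zip(pas_scores, above_average_user):
        predicted = score >= cutoff  # assumed tie rule, see lead-in
        if predicted and actual:
            counts["true_positive"] += 1
        elif not predicted and not actual:
            counts["true_negative"] += 1
        elif predicted:
            counts["false_positive"] += 1
        else:
            counts["false_negative"] += 1
    total = len(pas_scores)
    return {cell: 100.0 * n / total for cell, n in counts.items()}


if __name__ == "__main__":
    # Toy data for six hypothetical respondents.
    scores = [55, 60, 40, 45, 53, 50]
    actual = [True, True, False, False, False, True]
    rates = classification_rates(scores, actual)
    print(rates)
    print(rates["true_positive"] + rates["true_negative"])
```

The overall percentage of correct classifications is the sum of the true positive and true negative cells.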

To examine the clinical utility of the PAS,
respondents were divided into three groups
(greater than +1 SD, between +1 and −1 SD,
and less than −1 SD). Univariate analyses of
variance showed that these three groups were
significantly different in terms of all drugs except
anti-infectious agents. People in the highest
PAS group used every type of drug (except
anti-infectious) more frequently than people with
average PAS scores, and people with average
scores used drugs more frequently than people
with low PAS scores.
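
The three-group split can be expressed as a small Python helper. This is a sketch only: the function name and group labels are assumptions, and the defaults are the pooled mean (49.64) and standard deviation (4.29) reported earlier.

```python
def pas_group(score, mean=49.64, sd=4.29):
    """Assign a PAS score to the high, average, or low group."""
    if score > mean + sd:
        return "high"      # greater than +1 SD
    if score < mean - sd:
        return "low"       # less than -1 SD
    return "average"       # between -1 and +1 SD


if __name__ == "__main__":
    for score in (45.0, 49.64, 54.0):
        print(score, pas_group(score))
```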

Examination of sex differences showed that
males and females were not significantly different
in terms of overall polydrug criterion
scores or in terms of PAS scores. Thus, although
males were more likely to use alcohol,
marijuana, hashish, opiates, and cocaine, and
females were more likely to use anti-infectious
drugs, diet pills, tobacco, tranquilizers, sedatives,
and relaxants, these differences are adequately
controlled so that the PAS is applicable with
either sex.

Since five of the PAS items specifically refer
to drugs, some critics might contend that these
items should not be included in the scale. Therefore,
the multiple regression and cross-validation
analyses were repeated after omitting the five
items. The resulting correlation coefficients (.75
and .63, respectively) were not substantially
different from values resulting from the full
scale.

Reliability of the PAS was determined by
applying the Spearman-Brown prophecy formula
and by conducting a test-retest study with an
additional group (n = 17). The Spearman-Brown
reliability that was determined in the original
group was .80, and the test-retest correlations
were above .84 for most items (see Table 1).

These results show that the 20 items having
individual correlations with polydrug use can
be combined to produce a stable instrument
for indirectly assessing the degree of polydrug
use. Such an instrument is valuable for polydrug
diagnostic purposes. It could be further developed
by determining its use as a predictive device and identi-

Table 1
Polydrug Assessment Scale Items and Their Bivariate Correlation With
Polydrug Use (P), Multiple Regression Weights (RC), and Test-Retest
Coefficients (TR)

Item                                                    P    RC   TR

 1. I have participated in a political demonstration.
 2. I enjoy going to church.
 3. I have stolen (or shoplifted) something.
 4. My friends have been in trouble with the police.
 5. I have thought about suicide.                       26   19
 6. I deserve severe punishment for my sins.
 7. I have a cough most of the time.
 8. I have few or no pains.
 9. Christ performed miracles such as changing water to wine.
10. I frequently notice my hand shakes when I try to do something.
11. I have been quite independent and free from family rule.
12. I have never been in trouble with the law.
13. My parents have often objected to the kind of people I went
    around with.
14. I played hooky from school quite often as a youngster.
15. Religion is the most important system of human values.
16. Most people who use marijuana lead a normal life.
17. People should only use drugs for medical reasons.
18. The facts on crime and drug abuse show we will have to crack
    down harder on criminals and drug addicts.
19. People have the right to the pursuit of happiness and well
    being. If to achieve this requires taking illicit drugs,
    that’s their business.
20. Drugs are there to be used and enjoyed by people.

Note. Decimals have been omitted. Multiple regression
constant = 50.12.


37 104 85 
-34 -27 65 


wr 


38 3B 


“oe 


-21 


27 87 85 
—22 


-35 -56 85 


-55 86 


46 59 4 
fying new items that substantially improve its
assessment capability.

Comparison of Symbolic and Overt Aversion in the
Self-Control of Smoking 


Oscar A. Barbarin 
University of Maryland 


This research compares rapid smoking (overt aversion), covert sensitization (sym- 
bolic aversion), and a combination of the two in a self-punishment procedure for 
reducing and eliminating cigarette smoking. Sixty adults (M age=40) were ran- 
domly assigned to one of three experimental groups or to the control group. Ten 
training sessions were spaced over a 1-month period. The experimental groups smoked 
significantly less than the control group at each follow-up point. The overt group 
achieved significantly greater reduction than the symbolic group, and the combined 
group did not differ significantly from either of the single treatment groups. One year 
after treatment, 6 of 15 persons in the overt aversion group were completely abstinent,
as opposed to 1 in each of the other experimental groups.


The present study examines the relative effi- 
cacy of symbolic and overt aversion in self-con- 
trol procedures designed to eliminate smoking. 
Sixty adults (M age = 40) who smoked at least 
one pack of cigarettes a day were medically 
screened and assigned to one of three experi- 
mental groups (rapid smoking, symbolic aver- 
sion, combined treatment) or a control group. 
A week of self-monitoring prior to treatment 
provided the baseline against which treatment 
effects were to be measured. Training in each 
self-control strategy was conducted in 10 1-hour
sessions spaced over a 1-month period. Treatment
was carried out in groups of 3-7 persons.
At each session participants were asked to report
the number of cigarettes smoked since the
previous session and whether they had used the
self-control procedure to avoid normal smoking.

This study is based in part on a dissertation
completed at Rutgers University, under the
direction of Peter Nathan. Data analysis was made
possible through a grant from the Computer Science
Center at the University of Maryland.

After 10 minutes of relaxation exercises, the
overt aversion group began rapid smoking, which
called for normal inhalations every 6 sec. Participants
were paced by a prerecorded tone and
a voice signal that simultaneously counted out
consecutive numbers every 6 sec. Trial length
and the latencies between trials were determined
by the participants. They were instructed to
smoke as long as they could tolerate smoking.
Group members recorded the number that they
heard at the beginning and end of each trial,
making it possible to calculate the number of
inhalations per trial and the latencies between
trials. At the end of each trial, participants also
rated the aversiveness of their thoughts and
physical reactions.
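
The bookkeeping implied by the recorded numbers can be sketched in Python. This is only an illustration under stated assumptions: the function name and the example numbers are hypothetical, one inhalation is assumed at every counted tone from the first recorded number through the last (inclusive), and the latency is assumed to run from the last number of one trial to the first number of the next.

```python
TONE_INTERVAL_SEC = 6  # one counted tone every 6 sec


def summarize_trials(trials):
    """Summarize rapid-smoking trials.

    trials: list of (start_number, end_number) pairs, in session order.
    Returns (inhalations per trial, latencies between trials in seconds).
    """
    inhalations = [end - start + 1 for start, end in trials]
    latencies = [
        (trials[k + 1][0] - trials[k][1]) * TONE_INTERVAL_SEC
        for k in range(len(trials) - 1)
    ]
    return inhalations, latencies


if __name__ == "__main__":
    # Hypothetical session: first trial from count 1 to 20, second from 35 to 50.
    print(summarize_trials([(1, 20), (35, 50)]))
```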

The symbolic aversion group was first asked to
imagine as clearly as possible several scenes in
which unpleasant consequences occurred as a
result of smoking. Most scenes were drawn from
situations listed by participants in their smoking
logs compiled prior to treatment. Participants
called to mind situations in familiar places such
as homes, work, or with friends in which they
frequently experienced the temptation to smoke.
Then they were asked to imagine the occurrence
of a variety of aversive consequences. In
many instances these imagined consequences
were isomorphic with the real consequences for
the overt group, namely, nausea, coughing, dizziness,
and discomfort in the chest. Other graphic
images such as vomiting on self and uncontrollable
coughing were also used to enhance the
aversive nature of the imagined scenes. After
the experimenter described an aversive scene,
the participants imagined the scene. Following
the aversive scenes, participants imagined an
aversion relief scene that included the refusal to
smoke. In these scenes participants imagined
feeling calm and generally good about
themselves. After each trial, members of the
symbolic aversion group rated the vividness and



Table 1
Mean Percentage of Baseline Smoking for Experimental Groups During Treatment
and Follow-Up

                                    Group
               Symbolic^a    Overt^b     Combined^a   Control^b
Time period     M    SD      M    SD      M    SD      M    SD        F

Treatment
  1 week       22    11      9     9     11    16     91    12     58.41**
  2 weeks      13    13      4     7      9     9     88    20     86.36**
  3 weeks      16    21      1     2     11    20     87    29     59.12**
Follow-up
  1 week       29    29      1     7     20    28     87    24     35.40**
  2 weeks      42    36      8    13     36    48     90    22     19.27**
  4 weeks      47    39     18    24     36    46     85    29     11.17**
  8 weeks      55    39     21    17     50    48     90    23      9.37**
  12 weeks     59    33     28    31     53    37     92    26     10.32**
  1 year       84    28     56    40     75    39    100     0      4.38*

^a n = 14.
^b n = 15.
* p < .01.
** p < .001.


aversiveness of the scenes. Participants were 
asked to use the symbolic aversion procedure 
and the relaxation exercises when they felt an 
urge to smoke. In addition, they were asked to 
practice for 10 minutes twice a day. 

The combined aversion group used both sym- 
bolic and overt aversion, as described above. 
Participants used relaxation exercises, imagined 
unpleasant consequences for smoking, and en- 
gaged in rapid smoking while concentrating on 
these unpleasant images. Members of the con- 
trol group were told that they could not be 
accommodated in the training groups but that 
they would be sent descriptions of self-control 
procedures used in the program and that their 


Table 2 
Number of Abstinent Participants During Follow-Up 

                    Weeks after treatment
Group            1     2     4     8    12    52
Symbolic(a)      4     4     3     3    [illegible]
Overt(b)        14    12    12     8     6     6
Combined(a)      7     6     6     5     3     1
Control(b)       1     1     1     1     1     0

(a) n = 14. 
(b) n = 15. 


progress would be monitored through telephone 
contacts. Subsequently, they were sent publica- 
tions of the American Cancer Society describing 
a self-directed plan for monitoring and gradual 
reduction of smoking. The control group was 
contacted weekly for 1 month and at each of the 
follow-up points. During the 4 weeks of treat- 
ment, they were asked about their progress 
and were encouraged to maintain their efforts to 
abstain from smoking. All participants were con- 
tacted 1 week, 2 weeks, 1 month, 2 months, 3 
months, and 1 year following treatment. 

One participant each from the symbolic and 
combined groups dropped out before [illegible] 
could be completed. The resulting groups did not 
differ significantly with respect to sex, age, cig- 
arette consumption, or previous attempts to quit 
smoking. A one-way analysis of variance was 
performed on the percentage of baseline smoking 
reported at each follow-up point. Table 1 pre- 
sents the means, standard deviations, and F 
values for smoking rates from treatment to the 
1-year follow-up. Significant differences were 
found at each point between the experimental 
groups and the control group. In addition, overt 
aversion produced greater reductions in smoking 
than symbolic aversion. However, there were no 
significant differences between the combined 
group and either the symbolic or overt 
groups. Interestingly, this pattern held up even 
at the 1-year follow-up. 

A similar pattern was observed with respect to 
the number of participants maintaining absti- 
nence (see Table 2). 

The overt aversion group appeared decidedly 
superior to all other groups with respect to 
long-term maintenance of abstinence. 

On a 9-point scale, the symbolic aversion 
group rated their images as significantly clearer 
than did the overt aversion group (M = 7.72 vs. 
6.00). However, the symbolic group did not dif- 
fer from the combined group (M = 6.67). In 
rating the aversiveness of physical reactions dur- 
ing aversion training, the symbolic aversion 
group (M = 6.8) was significantly lower than 
either the overt (M=8.2) or the combined 
groups (M = 8.2). The combined group had 
less training with the rapid-smoking technique 
than the overt group with respect to the number 
of trials (25.4 vs. 31.4), the number of cig- 
arettes per session (5.1 vs. 7.1) and the total 
time in rapid-smoking training (167 vs. 200 min- 
utes). At the end of treatment, participants 
rated the usefulness of various treatment com- 
ponents. The overt and combined groups rated 
rapid smoking as most useful, with group support 
second. The symbolic group gave highest ratings 
to relaxation training and group support. 

The results of the study point to the su- 
periority of overt aversion over symbolic aver- 
sion in the self-control of cigarette consump- 
tion. This is true both with respect to the abso- 
lute level of smoking reduction and the number 
of individuals who were able to maintain ab- 
stinence. However, each of the experimental 
treatment groups smoked significantly less than 
the control group. These findings suggest that 
symbolic aversion shows promise as a self-modi- 
fication procedure for gaining control over smok- 
ing behavior. The failure of the combined treat- 
ment group to achieve a level of smoking reduc- 
tion superior to that of the symbolic treatment 
alone is perplexing. However, similar findings 
are reported by Danaher (1977), who combined 
several treatment components. It is likely that 
participants attempting to apply several tech- 
niques may not be able to master each of them 
fully. In this case, differences in the amount of 
training in the overt condition may also con- 
tribute to the diminished effect of the combined 
treatment group. Consequently, additional atten- 
tion must be given to the effects of combining 
treatment. 


Moderators of Racial Differences on the MMPI 


Arthur I. Rosenblatt and David A. Pritchard 
University of Mississippi 


Multiple discriminant analysis of Minnesota Multiphasic Personality Inventory 
(MMPI) scores between high-IQ white, high-IQ black, low-IQ white, and low-IQ 
black subjects yielded two significant canonical variates. The results suggest that 
racial differences on the MMPI do not occur in all racial comparisons but instead 
are restricted to low-IQ groups. 


In his review of racial differences on the Min- 
nesota Multiphasic Personality Inventory 
(MMPI), Gynther (1972) concluded that 


the degree of MMPI differences between blacks and 
whites appears to be affected by such variables as 
education, residence and cultural separation. (p. 390) 

The largest racial differences reportedly occur 
among poorly educated, rural, and isolated sub- 
jects, whereas such differences are attenuated or 
even eliminated among better educated, urban, 
and integrated subjects. The empirical basis for 
this conclusion is unclear, however, since only 
one of the studies reviewed by Gynther (Erd- 
berg, 1970) contained a factorial design to test 
the moderating effect of one of the hypothesized 
variables (i.e., residence: urban vs. rural). The 
remaining studies generally used fixed values for 
education (e.g., ninth graders), for residence 
(e.g., urban), and for cultural separation (e.g., 
segregated school), and thus provided no test of 
the moderating effect of these variables. 

Since Gynther’s (1972) review, only one 
study has specifically investigated the effect of 
one of Gynther’s hypothesized moderator varia- 
bles. Davis and Jones (1974) performed analy- 
ses of variance on each of the MMPI clinical 
scales using race, diagnosis (schizophrenic vs. 
“other disorders”), and education (12+ grades 
vs. 11 or fewer grades completed) as factors. 
They reported no significant main effects for 
race on the clinical scales, but they did find a 
significant Race × Education interaction on the 


This article is based on a master’s thesis com- 
pleted at the University of Mississippi by the first 
author. 


Pa and Sc scales, with the less educated blacks 
scoring higher than the other groups. 

In a related analysis of the same data, Cowan, 
Watkins, and Davis (1975) reported that only 
the poorly educated black nonschizophrenics were 
significantly misclassified as schizophrenic when 
an empirical rule was used to classify MMPI 
profiles into schizophrenic and nonschizophrenic 
groups. Both of these analyses support the hy- 
pothesis that racial differences on the MMPI 
are moderated by educational status. The present 
study [illegible] a new sample, using a more 
appropriate method of statistical analysis. 
The subjects were 104 black and 191 white 
male inmates at the Mississippi State Peniten- 
tiary, who were each administered the Wechsler 
Adult Intelligence Scale (WAIS) and the MMPI 
as part of an evaluation for vocational rehabili- 
tation services. Subjects ranged in age from 1[illegible] 
to 59 (white: M = 25.8, SD = 7.54; black: M 
= 24.5, SD = 5.83), and in education from [illegible] 
completed grades to 16 completed grades (white: 
M = 10.7, SD = 2.32; black: M = 11.2, SD = 
2.02). All subjects scored above the standard 
score of 70 on the reading portion of the Wide 
Range Achievement Test (WRAT) (white: M 
= 101.2, SD = 15.81; black: M = 91.1, SD = 
13.0). Over 90% of the subjects were new ar- 
rivals at the prison at the time of testing, 
whereas the remaining subjects had reached 
their time of parole eligibility. 

Highest grade completed was rejected as a 
potential moderator variable, since a majority of 
the subjects had been in school prior to integra- 
tion in Mississippi, and thus highest grade com- 
pleted might not indicate an equal educational 
experience for the two racial groups. Instead, 
the Full Scale WAIS IQ was adopted as an 
approximate indicator of the kinds of differences 
that might moderate racial differences on the 
MMPI. The product-moment correlations be- 
tween WAIS Full Scale IQ and WRAT scores 
were .64 (Reading, p < .001), .54 (Spelling, p 
< .001), and .70 (Arithmetic, p < .001), 
whereas the correlations between highest grade 
completed and WRAT scores were .32 (Read- 
ing, p < .001), .40 (Spelling, p < .001), and .35 
(Arithmetic, p < .001). Thus, Full Scale IQ 
appeared to be a better indicator of educational 
achievement than highest grade completed. 
The sample was divided into four subgroups 
defined jointly by race and by IQ score (di- 
chotomized at the sample median of 93.0). The 
four groups were labeled high-IQ white (M = 
106.0, SD = 8.81, n = 121), high-IQ black (M 
= 100.8, SD = 6.68, n = 23), low-IQ white (M 
= 85.7, SD = 5.69, n = 70), and low-IQ black 
(M = 81.2, SD = 6.47, n = 81). A multiple 
discriminant analysis (Klecka, 1975) was per- 
formed on these four groups using K-corrected 
raw scores on the MMPI’s 10 clinical scales as 
variables. If IQ functions as a moderator of 
racial differences on the MMPI, then the re- 
sultant canonical variates should separate the 
low-IQ white and low-IQ black groups but not 
the high-IQ white and high-IQ black groups. On 
the other hand, if IQ does not moderate racial 
differences on the MMPI, then canonical vari- 
ates should separate the racial groups within 
both the high and low IQ categories. 
The first variate produced a Wilks’ lambda 
of .6958, which is approximately equal to a 
chi-square of 104.1 (df = 30, p < .001; Overall 
& Klett, 1972). The second variate produced a 
Wilks’ lambda of .8508 (df = 18, p < .001), and 
the third variate yielded a nonsignificant Wilks’ 
lambda of .9766 (p < .560). The first variate 
accounted for 18.23% of the MMPI variance 
and separated the high and low IQ groups; the 
second variate accounted for 12.89% of the 
variance and separated the low-IQ white group 
from the low-IQ black group. Accordingly, the 
first variate can be interpreted as an IQ effect 
on MMPI scores, and the second can be in- 
terpreted as a (moderated) racial effect on 
MMPI scores. In support of this interpretation, 
it should be noted that the greatest contribu- 
tions to the second (moderated racial) variate 
were made by Scales 1, 3, 8, and 9, as indicated 
by the relative magnitudes of the standardized 
discriminant weights for these scales. Blacks 
scored higher than whites on Variate 2 for 
Scales 1, 8, and 9, whereas whites scored higher 
than blacks on Scale 3. This is precisely the 
pattern of racial differences on the MMPI that 
previous studies have reported most often 
(Gynther, 1972; Rosenblatt, 1976). It can thus 
be concluded that IQ moderated the effect of 
race on MMPI scores and that such racial dif- 
ferences were almost exclusively limited to 
lower IQ subjects. 
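
The conversion from Wilks’ lambda to chi-square reported above can be checked with a short sketch. The text cites Overall and Klett (1972) but does not print the formula, so the use of Bartlett's approximation here is an assumption; the function name and argument names are illustrative only. The inputs (295 profiles, 10 clinical scales, 4 groups, lambda = .6958) are taken from the text.

```python
import math


def bartlett_chi_square(wilks_lambda, n_subjects, n_variables, n_groups):
    """Bartlett's chi-square approximation for a Wilks' lambda (assumed formula)."""
    multiplier = n_subjects - 1 - (n_variables + n_groups) / 2
    return -multiplier * math.log(wilks_lambda)


# Values reported in the text: N = 295, p = 10 scales, k = 4 groups.
chi_sq_all = bartlett_chi_square(0.6958, n_subjects=295, n_variables=10, n_groups=4)
df_all = 10 * (4 - 1)  # p(k - 1) degrees of freedom for the overall test

print(round(chi_sq_all, 1), df_all)
```

Under these assumptions the result agrees with the reported chi-square of 104.1 on 30 degrees of freedom. The 18 degrees of freedom reported for the second variate likewise match (p - 1)(k - 2) = 9 × 2.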

Since the above results may have been influ- 
enced by the inclusion of “invalid” profiles 
(Costello, Tiffany, & Grier, 1972), the analysis 
was repeated five times on successive samples 
of the 295 profiles. The samples differed from 
each other only in the maximum F score allow- 
able for inclusion, and thus represented suc- 
cessively more stringent criteria for profile va- 
lidity. Results from these five analyses confirmed 
that the results obtained in the analysis of all 
profiles were not due to the inclusion of invalid 
profiles. In fact, the exclusion of profiles with 
high F scores increased the size of the IQ effect 
and the moderated racial effect on MMPI clini- 
cal scales. 


Problems Associated With the 
Typological Measurement of Sex Roles and Androgyny 


Wyndol Furman 


University of Denver 


Jeffrey A. Kelly 


Veronica Young 
Belhaven College 


Four new measures of sex role style (the Bem Sex-Role Inventory, BSRI; the Per- 
sonal Attributes Questionnaire, PAQ; the Personality Research Form, PRF, ANDRO 
scale; and the Masculinity and Femininity scales of the Adjective Check List, ACL) 
each define sex role style and androgyny in similar conceptual and psychometric 
terms. Although these scales have often been used interchangeably, the current study 
is the first to examine interscale comparability among these inventories. Although 
correlations among the respective Masculinity and Femininity raw (continuous) scale 
scores on the BSRI, PAQ, PRF ANDRO scale, and the ACL were moderately high, 
a large proportion of subjects were classified into different sex role categories (mas- 
culine typed, feminine typed, androgynous, or undifferentiated), with the category 
depending on the inventory used. In fact, when corrected for chance agreements, the 
majority of subjects (61%) were actually categorized discrepantly by any pair of 
inventories. This suggests limited comparability of sex role research findings based on 
different inventories, and when sex role styles are dichotomized into broad typological 
quadrants, as is the current practice in sex role research, substantial predictive utility 
is lost. 


Just as bipolar masculinity-femininity scales 
had once flourished, new sex role inventories 
based on a conceptual model using independent 
measurement of masculinity and femininity have 
proliferated recently. The Bem Sex-Role In- 
ventory (BSRI; Bem, 1974), the Personal At- 
tributes Questionnaire (PAQ; Spence, Helm- 
reich, & Stapp, 1975), the PRF ANDRO scale 
(Berzins, Welling, & Wetter, 1978), and the 
Masculinity-Femininity scales of the Adjective 
Check List (ACL; Heilbrun, 1976) each define 
sex roles in similar terms, and each has been 
used to designate sex-typed, androgynous, and 
undifferentiated roles. Because these four scales 
rely on highly similar definitions of sex roles, 
are all capable of designating androgyny, and 
all use similar scoring procedures to yield four 
sex role categories, the inventories have been 
used almost interchangeably in recent sex role 
research. 

However, Kelly and Worell (1977) have noted 
that while purporting to measure the same sex 
role constructs, each of these scales samples 
somewhat different content domains, was de- 
veloped using different psychometric and item 
selection procedures, and was subjected to dif- 
ferent criteria of validity and reliability. Fur- 
ther, although the four scales have been used 
interchangeably, no studies have directly com- 
pared these inventories. An important question 
is whether individuals categorized as sex typed, 
androgynous, or undifferentiated on one scale 
receive the same designation on another. If 
this were not the case, it would impose serious 
restrictions on the comparability of research 
findings based on different scales and would re- 
quire further examination of configural sex role 
scoring procedures. The current study examined 
the questions of interscale comparability and 
strategies of sex role categorization. 

One hundred thirty (65 male and 65 female) 
undergraduate college students served as sub- 
jects in the study. All inventories were admin- 
istered in fully counterbalanced order. Directions 
for completing each of the scales were those 
used by each respective inventory’s author. 
However, because time limitations precluded 
the administration of the entire ACL, only those 
ACL adjectives actually comprising the Mascu- 
linity-Femininity scales were administered. These 
adjectives were listed in random order on a 
checklist form. 

The scales were then scored to yield separate 
Masculinity and Femininity scores for each stu- 
dent on the BSRI, PAQ, ANDRO scale, and 
ACL. Following the procedure advocated by 
each scale’s author, subjects were considered 
masculine typed if they scored above the Mas- 
culinity scale median and below the Femininity 
scale median on an inventory. Those subjects 
whose Femininity scores were above the median 
and whose Masculinity scores were below the 
median were categorized as feminine typed. Per- 
sons were considered androgynous if they ex- 
ceeded an inventory’s masculinity and feminin- 
ity medians, and they were considered undiffer- 
entiated if they fell below both medians on an 
inventory. 
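
The median-split rule described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name and the sample scores are hypothetical, and the treatment of scores that fall exactly on a median (counted here as "below") is an assumption, since the text does not say how ties were handled.

```python
from statistics import median


def classify_sex_roles(masc_scores, fem_scores):
    """Assign each subject to one of four categories by median split."""
    masc_median = median(masc_scores)
    fem_median = median(fem_scores)
    labels = []
    for masc, fem in zip(masc_scores, fem_scores):
        high_masc = masc > masc_median  # assumption: ties count as below
        high_fem = fem > fem_median
        if high_masc and high_fem:
            labels.append("androgynous")
        elif high_masc:
            labels.append("masculine typed")
        elif high_fem:
            labels.append("feminine typed")
        else:
            labels.append("undifferentiated")
    return labels


# Hypothetical scores for four subjects on one inventory.
example_labels = classify_sex_roles([1, 2, 3, 4], [4, 1, 3, 2])
print(example_labels)
```

Because the medians are computed within each inventory's own score distribution, a subject close to either median can change category from one inventory to the next even when the raw scores correlate well.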

Pearson product-moment correlations were 
first calculated among the raw masculinity and 
femininity scores of all inventories. For sexes 
combined, interscale masculinity score correla- 
tions were BSRI/PAQ = .85; BSRI/ANDRO = 
.70; BSRI/ACL = .75; PAQ/ANDRO = .66; 
PAQ/ACL = .70; and ANDRO/ACL = .61. In- 
terscale femininity score correlations were BSRI/ 
PAQ = .73; BSRI/ANDRO = .62; BSRI/ACL 
= .68; PAQ/ANDRO = .59; PAQ/ACL = .51; 
and ANDRO/ACL = .57. The mean correlation 
between the four inventories’ raw Masculinity 
scales was .71, and the mean correlation between 
Femininity scales was .62. 

Since current approaches to sex role measure- 
ment assign subjects to one of four typologies, 
interscale comparability was further assessed by 
determining the percentage of subjects who 
were classified into the same sex role category by 
each pair of scales. A crucial question is whether 
the same individuals are assigned to the same 
sex role categories across inventories. Table 1 
reveals that the percentage of agreement in in- 
dividual subject classification was quite low, 
averaging only 56% of the subjects between 
any two scales’ categories (range = 52% agree- 
ment to 61% agreement for sexes combined and 
49% to 63% for sexes separately). Thus, an 
average of 44% of all subjects were assigned to 
discrepant sex role categories when the classifi- 
cation outcomes of two inventories were com- 
pared. 

To correct these results for interscale classi- 
fication agreements due to chance, kappa co- 
efficients (Cohen, 1960) were determined. When 
the percentage of agreement between the classi- 
fication outcomes of inventories has been ad- 
justed for chance, the mean kappa percentage 
agreement drops to 39. Thus, the majority of 
subjects are actually classified discrepantly when 
a second sex role inventory is used. When the 
percentage of agreement in subject classifica- 
tion across all four sex role inventories was cal- 
culated, only 30% of the subjects were found 
to be categorized the same on all four inven- 
tories. 


Table 1 
Interscale Classification Agreement Rates 

                        % assigned to    % corrected
Inventories compared    same category    for chance
BSRI/PAQ
  Sexes combined            60.8             37.9
  Males                     60.0             43.1
  Females                   61.5             45.9
BSRI/ANDRO
  Sexes combined            55.4             40.0
  Males                     52.3             28.1
  Females                   58.5             39.1
BSRI/ACL
  Sexes combined            56.2             42.0
  Males                     56.9             39.6
  Females                   55.4             35.7
PAQ/ANDRO
  Sexes combined            51.5             35.9
  Males                     49.2             28.9
  Females                   53.8             35.6
PAQ/ACL
  Sexes combined            55.4             40.0
  Males                     50.8             29.7
  Females                   60.0             44.0
ANDRO/ACL
  Sexes combined            56.2             42.2
  Males                     49.2             28.6
  Females                   63.1             47.4

Note. BSRI = Bem Sex-Role Inventory; PAQ 
= Personal Attributes Questionnaire; ANDRO 
= Personality Research Form ANDRO scale; and 
ACL = Adjective Check List. 
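
The chance correction cited above (Cohen, 1960) can be sketched as follows. This is a minimal illustration, not the authors' computation: the function name and the example labels are hypothetical, and the raw classification data are not given in the report.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two categorical classifications of the same subjects."""
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    # Agreement expected by chance from the two sets of marginal proportions.
    expected = sum(freq_a[cat] * freq_b[cat] for cat in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical classifications of eight subjects by two inventories
# (M = masculine typed, F = feminine typed, A = androgynous, U = undifferentiated).
inventory_1 = ["M", "F", "A", "U", "M", "F", "A", "U"]
inventory_2 = ["M", "F", "A", "U", "F", "M", "U", "A"]
example_kappa = cohen_kappa(inventory_1, inventory_2)
print(round(example_kappa, 3))
```

In this made-up example, raw agreement is 50% and chance agreement is 25%, so kappa is one third. The same logic explains why a raw agreement rate in the mid-50% range shrinks to roughly 40% once chance agreement across four categories is removed.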


It is apparent that persons classified as an- 
drogynous, sex typed, or undifferentiated using 
one sex role scale may well be a very different 
subsample than persons with the same desig- 
nation based on another scale. This is the case 
even when the scales purport to assess similar 
characteristics, use the same scoring procedures, 
and assign subjects identical labels based 
on their scores. These findings place serious limi- 
tations on the comparability of research 
results when differing inventories are used. 

In the current study, relatively high correla- 
tions were obtained between the Masculinity 
and, to a lesser degree, the Femininity scales of 
the four inventories when subjects’ scores were 
treated as continuous variables. However, when 
subjects’ scores were dichotomized by median 
split categorizations, many, and in some cases 
most, subjects were discrepantly classified. Lost 
variance and reduced predictive utility are ex- 
acerbated by the prevalent scoring procedure in 
which subjects are classified into four sex role 
categories on both the Masculinity and Femin- 
inity scales of a given inventory. Although mas- 
culinity and femininity as defined by these scales 
appear to be unrelated and should be measured 
separately, the current data indicate that there 
is little empirical basis for configurally com- 
bining these scores into broad, typological cate- 
gories based on median splits. Multiple linear 
measurement or multiple regression analyses may 
enable future research to more precisely examine 
the adjustive and social-behavioral implications 
of sex role orientations. 


Male and Female Treatment Differences: 
Can They Be Generalized? 


Andrew C. Del Gaudio, Paul J. Carpenter, and Gary R. Morrow 
Department of Psychiatry 
University of Rochester Medical Center 


The present study evaluated the generalizability of previous findings that male and 
female patients may receive differential psychiatric treatment. A comparison of 156 
female and 66 male outpatients on demographic, clinical, and self-report measures of 
mood, symptoms, and interpersonal concerns revealed no sex differences. The treat- 
ment variables of length of therapy and the prescription of medication were the 
dependent measures. The results indicated that with a broad sample of male and 
female patients, there were no significant sex-related differences on the dependent 
variables. Thus, it is suggested that the generalization of sex-specific treatment dif- 
ferences be qualified. 


A recent study by Stein, Del Gaudio, and 
Ansley (1976) found that a sample of neuroti- 
cally depressed female outpatients had signifi- 
cantly more therapy sessions and were signifi- 
cantly more likely to be prescribed psycho- 
tropic medications, especially the more potent 
varieties, than were a sample of male neurotic 
depressives. These findings were obtained de- 
spite the fact that the male and female patients 
were indistinguishable on demographic, clinical, 
and self-report measures of symptoms, mood, 
and interpersonal concerns. Their data supported 
the findings of others (Abramowitz, Abramowitz, 
Roback, Corney, & McKee, 1976) that differ- 
ential attitudes of clinicians, rather than pa- 
tient factors, are a salient element in the psy- 
chotherapeutic process. 

The present study examined the generaliza- 
bility of the earlier study (Stein et al., 1976) 
by attempting to replicate the sex-specific find- 
ings with a different sample of patients repre- 
senting a broader range of clinical diagnoses 
rather than a single category. 

Subjects were 222 patients who were admitted 
to the same outpatient clinic used in the Stein 
et al. (1976) study. Demographic and clinical 
data were gathered on each patient, as well as 
their responses to the following measures of 
mood, symptoms, and interpersonal concerns: 
Terminator-Remainer Scale (TR), Hopkins 


Requests for reprints and for an extended report 
of this study should be sent to Paul J. Carpenter, 
who is now at the University of Southern Missis- 
sippi, P.O. Box 8482, Southern Station, Hattiesburg, 
Mississippi 39401. 

Symptom Rating Scale (HSRS, five subscales), 
Profile of Mood States (POMS, six subscales), 
FIRO-B (six subscales), and the Marlowe- 
Crowne Social Desirability Scale. The majority 
were white (84.6%), married (41.6%), middle 
class (68.3%), and female (70.3%). 

A broad spectrum of psychiatric diagnoses 
was present in the sample: neurosis (37.9%); 
transient situational disorder (29.4%); person- 
ality disorder (15.8%); psychosis (7.7%); ad- 
dictive states (2.3%); and deferred, unknown, 
and other diagnoses (6.9%; the other diagnoses 
chiefly included psychophysiologic disorders, or- 
ganic brain syndrome, and mental retardation). 

Chi-square comparisons of the male and fe- 
male patients on the demographic variables of 
age, race, marital status, and education yielded 
no differences. Male-female differences in self- 
ratings of mood, symptoms, and interpersonal 
concerns examined by t tests revealed that (a) 
female patients had significantly higher scores 
than males on the Depression subscale of the 
HSRS (p < .05); (b) males, on the other hand, 
had significantly higher depression scores than 
females on the POMS (p < [illegible]); and (c) males had 
significantly higher expressed control scores 
than did females on the FIRO-B. 

Significantly elevated Depression scores for 
both male and female patients on different mea- 
sures make it safe to assume that there are es- 
sentially no sex-specific differences on depres- 
sion. The findings on the FIRO-B can be attrib- 
uted to chance, since 1 significant result out of 
a possible 19 is within the statistical expectation 
of chance. Therefore, this study yielded no evi- 
dence that male and female psychiatric patients 
entered treatment with significantly different 
levels or types of subjective distress and inter- 
personal needs and concerns, a result in agree- 
ment with the Stein et al. (1976) study. 
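
The chance argument above can be made concrete with a rough calculation. The sketch below assumes a .05 significance level for every comparison and treats the 19 comparisons as independent; neither assumption is stated in the report, and the variable names are illustrative.

```python
n_tests = 19   # number of subscale comparisons cited in the text
alpha = 0.05   # assumed per-test significance level

# Expected number of "significant" results if no true differences exist.
expected_false_positives = n_tests * alpha

# Probability of at least one such result, assuming independent tests.
p_at_least_one = 1 - (1 - alpha) ** n_tests

print(round(expected_false_positives, 2), round(p_at_least_one, 2))
```

Under these assumptions, about one spurious result is expected across 19 tests, and the chance of seeing at least one is roughly 0.62, which is consistent with the authors' reading of the FIRO-B finding.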

The present investigation did not, however, 
reveal any sex differences in the average number 
of therapy sessions or the extent to which medi- 
cations were prescribed. Thus, when the patient 
sample was expanded beyond a single diagnostic 
category (neurotic depression), no treatment 
differences emerged between male and female 
patients drawn from the same clinic population, 
as they did in Stein et al. (1976). This strongly 
suggests that limits be placed on the generaliza- 
tion that differential psychiatric treatment is 
afforded to men and women. 


Effect of Psychotomimetics (LSD and Dextroamphetamine) 
on the Use of Figurative Language During Psychoanalysis 


Michael Kowitt 
Philadelphia Psychiatric Center 
Philadelphia, Pennsylvania 


Charles C. Dahlberg and Joseph Jaffe 
Department of Communication Sciences 
New York State Psychiatric Institute 
New York, New York 


This study examined the effect of LSD, dextroamphetamine, and a placebo on pa- 
tients’ use of figurative language in psychoanalytic sessions. A longitudinal design was 
used with three male psychoanalytic patients who volunteered to receive each 
drug (LSD, 50-100 µg; dextroamphetamine, 15 mg; placebo) seven times over a period 
of 14 years. The patients, analyst, and raters were all blind concerning the drug 
condition. Forty-minute verbatim transcripts were scored for figurative language by 
two trained raters. Results showed that for two of the three patients, the use of 
nonliteral language, in particular, “novel” figurative phrases, was significantly in- 
creased by LSD. 


The rationale for the clinical use of psychoto- 
mimetics (LSD, anticholinergics) is that they 
augment traditional psychotherapy by intensify- 
ing and accelerating transference, recall, loosen- 
ing of defenses, and so forth (Abramson, 1967). 
Is there any empirical verification for these 
assertions or do they remain presuppositions? Fink 
(1974) has presented evidence suggesting that 
psychotomimetics induce a decrease in syntactic 
“defensive language behaviors” (use of third- 
person reference and qualifiers, use of past tense 
instead of present tense). LSD-induced language 
changes are hypothesized to resemble the speech 
of acute schizophrenics. However, Gottschalk and 
Gleser (1969) presented contrary evidence to 
the effect that psychotomimetics do not promote 
“schizophrenic” speech, but they do promote 
language that reflects a cognitive impairment 
(organicity). 


Requests for reprints should be sent to Michael 
Kowitt, [illegible], University of Pennsylvania, 3813 Walnut 
[illegible]. 


In light of the above-cited contradictory find- 
ings, it is fair to state that the effects of LSD 
on the speech of normals are undetermined at 
present. The goal of this study was to clarify the 
effect of psychotomimetics on speech through (a) the 
examination of the therapeutic language effects 
of LSD, dextroamphetamine, and a placebo on 
patients’ speech during actual psychotherapy 
sessions and (b) the use of a language measure 
that is significantly related to “insight” on the 
part of the patient. We (the experimenters) 
chose “figurative language” as a speech index of 
clinical change because investigators have be- 
come increasingly aware of the role of figurative 
language in psychotherapy; specifically, novel 
nonliteral language is associated with insight. 
(See Billow, 1977, for a review.) To sum up, 
the hypothesis of the present study can be stated 
as follows: Ingestion of LSD and dextroampheta- 
mine is expected to cause an increased usage of 
novel figurative phrases by patients during psy- 
choanalytic sessions.1 

Verbatim transcripts of three male psycho- 
analytic patients were used in the present study; 

1 The interested reader is referred to Mechaneck, 
Feldstein, Dahlberg, and Jaffe (1968) for a de- 
tailed description of the research project. What fol- 
lows is a brief outline of the methodology. 

the patients were white, college educated, and 
suffered from neurotic depression. All subjects 
provided informed consent to participate in the 
research project. 

Each patient received each of the three drugs 
(LSD = 50-100 µg; dextroamphetamine = 15 mg; 
placebo) seven times on a randomized schedule 
for a total of 21 experimental psychotherapy ses- 
sions over a period of 14 years. Patients, thera- 
pist (CCD), and raters (MN and MK) were 
blind concerning the drug condition. 

Each transcript was scored via Barlow, Ker- 
lin, and Pollio’s (1970) programmed scoring 
technique for the patients’ use of figurative 
language by two raters. Fourteen categories of 
figurative language were scored, with each in- 
stance being classified as either “frozen” or 
novel. Frozen was defined as a figure of speech 
that is considered a part of common vocabulary. 
Novel was defined as a figure of speech that 
the rater judged to be unique in the context. The 
following dependent variables were obtained: 
total figurative phrases (FP), FP/1,000 words, 
novel figurative phrases (N), N/1,000, and N/ 
FP%. 
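
The five dependent variables listed above are simple counts and rates, and a short sketch shows how the three derived measures follow from the raw counts. The function name and the session counts are hypothetical; the report gives no raw counts.

```python
def figurative_rates(total_words, figurative_phrases, novel_phrases):
    """Derive rate measures from raw counts for one transcript."""
    return {
        "FP_per_1000_words": 1000 * figurative_phrases / total_words,
        "N_per_1000_words": 1000 * novel_phrases / total_words,
        "N_over_FP_percent": 100 * novel_phrases / figurative_phrases,
    }


# Hypothetical session: 4,000 words, 60 figurative phrases, 15 of them novel.
example_rates = figurative_rates(total_words=4000, figurative_phrases=60, novel_phrases=15)
print(example_rates)
```

Expressing the counts per 1,000 words keeps sessions comparable when patients talk more or less under a given drug, and N/FP% separates a rise in novel phrasing from a general rise in figurative speech.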

The two judges reached a high level of agree- 
ment on their scoring of FP (96%); an 88.9% 
agreement was obtained for classification of novel 
versus frozen. Only phrases that were mutually 
scored were used in the analysis. A statistical 
analysis was performed on each subject; LSD 
and dextroamphetamine sessions were individu- 
ally compared to the placebo condition (n = 7 
for each subject). Dunn’s multiple comparison 
procedure was used to make the a priori com- 
parison. 

The results indicated that Patient 1 signifi- 
cantly increased (p < .01) his use of FP, FP/ 
1,000 words, N, N/FP%, and N/1,000 words 
when under the influence of LSD. Patient 2 dis- 
played LSD-induced increases in nonliteral lan- 
guage similar to the above-described findings for 
Patient 1, with the exception of FP and N/FP% 
(p < .05). In addition, FP/1,000 words was
significantly increased (p < .05) for Patient 2
when under the influence of dextroamphetamine.
Concerning Patient 3, the use of nonliteral language
during psychoanalysis was not affected by
either LSD or dextroamphetamine.

To sum up, the hypothesis that LSD increases 
the patient’s use of novel figurative language in 
the course of psychoanalysis has been supported 
in two out of three case studies. However, it 
should be noted that the overall therapeutic 
value of psychotomimetics has not been demonstrated
by this study. Although LSD does apparently
increase the use of novel metaphors,
and there is evidence that figurative language is 
associated with an important therapeutic process 
(insight), there is no evidence that drug-induced 
imagery will have the same beneficial effects as 
images that occur naturally. 

Psychometric Correlates of the
Mosher Forced Choice Guilt Inventory


Kevin E. O’Grady 


University of Connecticut 


Louis H. Janda 
Old Dominion University 


One-hundred one male and 135 female undergraduate psychology students were ad- 
ministered the Mosher Forced Choice Guilt Inventory (MFCGI), which consists of 
subscales measuring Sex Guilt, Hostility Guilt, and Morality-Conscience Guilt; the 
Repression-Sensitization Scale; the State-Trait Anxiety Inventory; the California F 
Scale; the Marlowe-Crowne Social Desirability Scale; and the Adult Nowicki-Strickland
Locus of Control Scale. Inspection of correlations and principal component
solutions for each sex indicated that guilt emerges as a construct separate from
anxiety or authoritarianism. The applicability of the male form of the MFCGI for
use with females is discussed.


The Mosher Forced Choice Guilt Inventory
(MFCGI; Mosher, 1966) consists of 79 forced-choice
items intended to measure three theoretically
independent aspects of guilt: sex guilt
(SG), hostility guilt (HG), and morality-conscience
guilt (MCG). The purpose of the present
research plan was twofold. First, an examination
of the literature indicated the importance of
assessing the relationship between the subscales
of the MFCGI and several other personality
inventories. Second, because the 1966 MFCGI,
originally intended for use with males, has been
used to measure guilt in females, an assessment
of the factor structure in males and females of
the 1966 MFCGI in relation to these other
scales also seemed called for.

The MFCGI, together with the Repression-Sensitization
(R-S) Scale; the California F
Scale (F); the Adult Nowicki-Strickland Locus
of Control (I-E) Scale; the State-Trait Anxiety
Inventory (STAI), A-Trait form; and the Marlowe-Crowne
Social Desirability Scale (M-C SDS),
was completed by 236 undergraduate
psychology students (101 males, 135 females).
Separate principal components solutions for each
sex were then obtained from the matrix of intercorrelations
of the test scores. In each case three
principal components with eigenvalues greater
than 1.0 were retained and were rotated to
orthogonal simple structure using the normalized
varimax method (see Table 1).

The results of the two principal components 
analyses were then compared using a technique 
developed by Schönemann and Carroll (1970). 
This procedure, a generalization of the orthog- 
onal Procrustes problem, uses a least squares 
approach to fit one matrix to another. The overall
degree of correspondence between the two
matrices is provided by a measure of normalized
symmetric error (e). Fitting the male rotated
component loadings to the female rotated com- 
ponent loadings indicated that the inventories 
were assessing the same underlying dispositions 
in the two groups (e= .013). 

The first component, accounting for 31.2% of 
the total variance for the male sample, and 
31.9% in the female sample, was labeled admis- 
sion of anxiety. For males, the primary loadings 
(≥ ±.5) consisted of the R-S (.892), F (.607),
STAI (.852), and I-E (.674) scales. Primary 
loadings for females consisted of the R-S (.929), 
STAI (.892), I-E (.715), and M-C SDS 
(—.548) scales. 


The second component, accounting for 26.9%
of the total variance for the male sample and
25.7% in the female sample, was almost entirely
marked by the three guilt subscales of the
MFCGI and clearly represents a general guilt
factor. Male loadings were .704 for SG, .850 for
HG, and .846 for MCG. Female loadings were
.790 for HG, .857 for MCG, and .425 for
SG.

The third component, accounting for 13.5% of
the total variance for the male sample and
15.0% in the female sample, seemed to be
largely determined by authoritarianism in both

Table 1
Means, Standard Deviations, and Intercorrelations of the Eight Scales for the Male and
Female Samples

Variable          1      2      3      4      5      6      7      8
1. R-S           --    .39    .16    .46   -.35   -.10    .09    .00
2. F            .24     --    .37    .32    .07    .21    .16   -.09
3. STAI         .82    .11     --    .39   -.24   -.09    .20    .07
4. I-E          .56    .27    [?]     --   -.15    .01    .03    .04
5. M-C SDS     -.41    .06   -.35    [?]     --    .27    .23    .09
6. SG          -.06    .39   -.01    .04    .25     --    .61    .38
7. MCG          .06    .16    .06   -.01    .31    [?]     --    .52
8. HG           .11   -.09    .13    .03    .16    .19    .45     --

Males
  M           39.60  11.04  34.69   9.38  15.39  10.53  11.86  17.54
  SD          17.80   4.91   8.47   3.87   5.54   7.08   4.58   5.86
Females
  M           43.76  10.82  37.50  10.20  15.93  12.81  13.44  19.52
  SD          17.57   5.21   8.24   4.95   5.49   6.61   3.94   4.52

Note. R-S = Repression-Sensitization Scale; F = Authoritarianism Scale; STAI = State-Trait Anxiety
Inventory, A-Trait Form; I-E = Internal-External Locus of Control Scale; M-C SDS = Marlowe-Crowne
Social Desirability Scale; SG = Sex Guilt subscale, MCG = Morality-Conscience Guilt subscale, HG =
Hostility Guilt subscale of the Mosher Forced Choice Guilt Inventory. The correlations for males are above
the diagonal, and the correlations for females are below it. For males, r(99) > .19, p < .05; r(99) > .2…,
p < .01. For females, r(133) > .17, p < .05; r(133) > .23, p < .01.



groups. Male loadings were composed of the F 
(.644) and M-C SDS (.707) scales, along with a 
smaller loading for SG (.453). Female loadings 
primarily involved the F (.857) and SG (.717) 
scales. 

Examination of the separate factor solutions, 
and the residual matrix (the matrix of differ- 
ences between the matrix of best fit and the 
initial target matrix) provided through the 
Procrustes procedure, revealed that the lack of 
correspondence in the solutions was mainly due 
to the fact that the SG subscale loaded pri- 
marily on the second factor for males, with a 
secondary loading on the third factor, whereas 
this pattern was reversed for the female sub- 
jects. 

These results generally support the notion that
guilt is a construct identifiably different from
anxiety or authoritarianism. Besides a nominal
amount of shared variance with SD for SG (r =
.25 for males, r = .27 for females) and MCG
(r = .23 for males, r = .31 for females, all ps <
.01), only three other significant relationships
emerged that involve the subscales of the
MFCGI: SG with F for males (r = .21, p <
.05) and females (r = .39, p < .01), and MCG
with STAI for males (r = .20, p < .05). In contrast,
the three subscales of the MFCGI share a
moderate amount of variance among themselves
(ranging from 4% to 38% across the two
groups). Together with the results of the multitrait-multimethod
analysis of Mosher (1966),
which indicated that the SG, HG, and MCG subscales
of the MFCGI measure discriminately
different aspects of guilt, this study, which indicates
that guilt is a discriminately different construct
from anxiety or authoritarianism, offers
strong support for the construct validity of the
MFCGI.

Conclusions regarding the use of the male form
of the MFCGI with females remain tenuous.
Although the intercorrelations of the three subscales
of the MFCGI exhibited the same overall
pattern in males and females, the female correlations
were somewhat lower than the corresponding
male correlations. Further, the difference
in factor structure for SG in the male
and female factor solutions suggests that the
SG subscale may be measuring different dispositions
in the two groups. A comparison of the
item factor structure of the MFCGI between
males and females would help to clarify this
issue.

Clinical Application of a Broad-Spectrum Behavioral
Approach to Chronic Smokers


Harry A. Lando and John A. McCullough 
Iowa State University 


The present investigation assessed the clinical applicability of a broad-spectrum behavioral
treatment (aversion, contractual management, booster sessions, group contact
and support) previously reported by Lando. The feasibility of reorienting nonabstinent
subjects to maintained reduction was also assessed. Twelve of 16 subjects
who completed treatment were abstinent at a 6-month follow-up (75%), and 2
others were smoking substantially less than at baseline. Results suggest that the
previous findings are replicable and that this program can be applied on a self-supporting
clinical basis. Additional work is needed to validate the maintained reduction
procedure.


Lando (1977) has called for the application of
promising laboratory approaches to smoking cessation
on a clinical basis. In that article he described
a broad-spectrum treatment consisting of
aversion, contractual management, booster sessions,
faded group contact, and support, which
led to a 76% abstinence rate at a 6-month follow-up.
The major purpose of the present study
was to assess the applicability of this treatment
in a cost-effective clinical program. The present
study was also intended to provide a replication
of the Lando (1977) findings, as well as to
allow a preliminary assessment of an additional
treatment component in which nonabstinent
subjects are reoriented toward maintained reduction.

Treatment was conducted by the second author,
who is an advanced psychology graduate
student at Iowa State University. He followed
the treatment procedures for aversion and maintenance
outlined in a treatment manual published
by Lando (1976). Participants were required
to deposit $20, of which $10 was refundable.
The remaining $10 was sufficient to cover
all costs of the program.

Subjects included 17 smokers. (M age was 35.7
years, M years as a smoker was 17.3, and M
baseline smoking was 28.7 cigarettes per day.)
Procedures for record keeping were as described
in previous studies.

Aversion included six treatment sessions over 
a 1-week period. In the current study all 17 sub- 
jects were seen in a single group. Sessions 
averaged approximately 45 minutes, including a 
prescribed 25-minute period of continuous smok- 
ing. Subjects were also instructed to smoke as 
much as possible on their own between sessions, 
with a daily minimum of twice their usual con- 
sumption. 

Subjects were instructed to abstain following 
aversion, and they attended seven maintenance 
sessions at the termination of this phase of the 
program. Formal support was faded over time, 
with the initial session occurring within 48 hours 
and the second session taking place 5 days later. 
Three additional maintenance sessions were then 
held at weekly intervals, followed by two final 
sessions that took place 2 weeks apart. These 
sessions consisted primarily of group discussion 
and the signing of contracts. Subjects contracted 
to forfeit money for every cigarette smoked, to 
reward themselves for increasingly longer periods 
of abstinence, and to undergo 48 hours of 
booster aversion (rapid smoking) in the event 
of relapse. (A more detailed description of treat- 
ment is available elsewhere; Lando, 1976, 1977.) 
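
The faded maintenance schedule described above can be laid out as dates. This is a rough sketch under two assumptions that the text does not state outright: the first session is placed exactly 2 days after aversion ends (the text says only "within 48 hours"), and the first of the two final sessions falls 2 weeks after the last weekly session. The constant, the function name, and the start date are hypothetical.

```python
from datetime import date, timedelta

# Days between consecutive events, starting from the end of aversion:
# first session, second session, three weekly sessions, two biweekly sessions.
GAPS_IN_DAYS = [2, 5, 7, 7, 7, 14, 14]


def maintenance_schedule(aversion_end: date) -> list:
    """Return the dates of the seven maintenance sessions."""
    sessions = []
    current = aversion_end
    for gap in GAPS_IN_DAYS:
        current = current + timedelta(days=gap)
        sessions.append(current)
    return sessions


# Arbitrary illustrative start date.
start = date(1977, 1, 7)
schedule = maintenance_schedule(start)
span_in_days = (schedule[-1] - start).days
```

Under these assumptions the seven sessions span 56 days, which is in line with the roughly 2-month maintenance program mentioned later in the report.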

In a departure from previous work, subjects 
who failed to remain abstinent following two 
booster treatments were reoriented toward maintained
reduction. There appear to be a certain
number of smokers for whom abstinence is virtually
impossible. In the current study maintained
reduction was explored as a possible
alternative for these smokers. The same essential
contractual techniques that had been directed
toward abstinence were adapted to a program of 
maintained reduction. Nonabstinent subjects con- 
tracted to limit their smoking to a set daily 
allotment (between 10 and 20 cigarettes). Sub- 
jects were maintained at this level rather than 
gradually reduced to minimize the likelihood that 
each remaining cigarette would become more 
reinforcing. Efforts were also made to minimize 
the reinforcing value of smoking by prescribing 
a stimulus control procedure in which smoking 
was confined to specified locations and was 
divorced from other rewarding activities. Sub- 
jects now forfeited money for every cigarette 
smoked over the daily allotment and rewarded 
themselves over increasingly longer intervals for 
adhering to targeted consumption. 

Sixteen subjects completed treatment, with 1 
other dropping out at the termination of aver- 
sion. Of the subjects who completed the program, 
100% remained abstinent at Week 1. Two subjects
relapsed during the 1st month following
aversion, and 2 others reported relapse in the 
4th month. This yielded a 2-month abstinence 
rate of 88% (14 of 16 subjects), and 4- and 6- 
month abstinence rates of 75% (12 of 16 sub- 
jects). If the person who dropped out of the 
program is counted as a treatment failure, the 
6-month abstinence rate is 71%. 
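
The reported percentages follow directly from the counts in this paragraph. A small sketch of the arithmetic, with the relapse counts taken from the text and the names chosen here for illustration only:

```python
import math

COMPLETERS = 16   # subjects who completed treatment
DROPOUTS = 1      # subject who left at the end of aversion

# Cumulative relapses among completers at each follow-up point.
RELAPSED_BY = {"Week 1": 0, "2 months": 2, "4 months": 4, "6 months": 4}


def percent(part: int, whole: int) -> int:
    """Percentage rounded half up to a whole number."""
    return math.floor(100 * part / whole + 0.5)


def abstinence_rates() -> dict:
    """Abstinence rate among completers at each follow-up point."""
    return {
        point: percent(COMPLETERS - relapsed, COMPLETERS)
        for point, relapsed in RELAPSED_BY.items()
    }


def six_month_rate_counting_dropouts() -> int:
    """Six-month rate when the dropout is treated as a failure."""
    abstinent = COMPLETERS - RELAPSED_BY["6 months"]
    return percent(abstinent, COMPLETERS + DROPOUTS)
```

Counting the dropout as a failure changes only the denominator (17 instead of 16), which is why the 6-month figure moves from 75% to 71%.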

The two subjects who relapsed during the 
first month failed to respond to booster treat- 
ment and were placed on maintained reduction 
(12 subjects maintained abstinence for the en- 
tire 6 months and did not undergo booster aver- 
sion). The 2 subjects who relapsed during the 
4th month had reverted to their original levels 
of smoking by the 6-month follow-up. (No pres- 
sure was placed on subjects to participate in 
booster sessions or maintained reduction follow- 
ing the 2-month maintenance program.) Both 
subjects who were assigned to maintained reduc- 
tion contracted to limit their smoking to 20 
cigarettes a day. (Both had previously been two-pack-a-day
smokers.) One of these subjects
averaged 13 cigarettes per day at the 4-month 
follow-up and 17 at the 6-month follow-up (more 
than 5 months after the maintained reduction 
program had been initiated). The other subject 
averaged 21 and 25 cigarettes per day at the 4- 
and 6-month follow-ups, respectively. Unfortu- 
nately, the appropriate contingencies were not 
fully implemented, with contractual responsibility
left to subjects and apparently only loosely
adhered to.

Overall, the present findings are extremely
encouraging, primarily because they suggest that 
the successful program reported by Lando (1977) 
is both replicable and generalizable to a clinical 
context. Treatment was offered on a service 
basis using a programmed treatment manual.
This was not a controlled laboratory study, nor 
was it intended to be. Rather, it was an explora- 
tory assessment of the applicability of a cost- 
effective laboratory-based program as a self- 
supporting clinical treatment. By this criterion, 
the program was clearly successful: The results 
are comparable to those obtained by Lando 
(1977) and are among the most encouraging 
findings reported in the literature to date. 

Caution is still indicated in interpreting these 
findings, however. Both the current study and 
that of Lando (1977) were based on relatively 
small numbers of subjects. Both studies were 
conducted in the same community, with a simi- 
lar population of smokers. There is considerable 
need for replication of this program by other 
investigators in different settings. The fact that 
all subjects were seen in a single treatment 
group also limits the generalizability of the 
current findings.

Considerable group cohesiveness was again 
evident in the present program and may be due 
at least in part to the specific treatment tech- 
niques used. Subjects were repeatedly informed 
of the importance of their own efforts in aiding 
themselves and in supporting each other. Participation
in the program itself represented a
significant commitment both in undergoing the
aversion and maintenance sessions and in remitting
the $10 fee. The commitment to the
program (and to each other) seemed to be reinforced
by the extremely encouraging initial
outcome in which 100% of the participants remained
abstinent throughout Week 1.

Another purpose of the present study was to
assess the effectiveness of a maintained reduction
program for nonabstinent subjects. Although
such a program appears very promising, particularly
in providing nonabstinent subjects with
an alternative, it cannot be meaningfully evaluated
on the basis of the current evidence. Only
two subjects underwent maintained reduction,
and although the program appeared to be at
least moderately successful in both cases, the
contractual and stimulus control procedures were
not fully implemented. The possibility that these
subjects might eventually resume their original
levels of smoking must also be considered. In
future work maintained reduction procedures
should be more carefully implemented, and care
must be taken to insure that they do not provide
abstinent subjects with an excuse for resumed
consumption.


References 


Lando, H. A manual for a broad-spectrum behavioral
approach to cigarette smoking. JSAS
Catalog of Selected Documents in Psychology,
1976, 6, 113 (Ms. No. 1371).

Lando, H. Successful treatment of smokers with a 
broad-spectrum behavioral approach. Journal of 
Consulting and Clinical Psychology, 1977, 45, 
361-366. 


Received December 16, 1977




Journal of Consulting and Clinical Psychology 
1978, Vol. 46, No. 6, 1586-1587 


Relationships Between Client Self-Perceptions of 
Self-Consciousness Levels and 
Therapist Awareness of These Perceptions 


Robert G. Turner and Mae Keyson 
Pepperdine University 


The present study investigated the congruity between clients’ self-perceptions related
to self-consciousness and their therapists’ awareness of these self-perceptions. Neurotic
(n = 47) and psychotic (n = 51) clients completed the Self-Consciousness Scale (SCS)
under standard self-descriptive instructions. Therapists completed the SCS as they
thought their client would respond. In both samples of clients, therapists’ reports
were significantly correlated with client self-reports on subscales of the SCS indicating
self-reflective habits (Private Self-Consciousness) and discomfort in the presence
of others (Social Anxiety) but not on a measure of awareness of oneself as a social
object (Public Self-Consciousness).


The Self-Consciousness Scale (SCS; Fenig- 
stein, Scheier, & Buss, 1975) is a measure of 
individual differences in habitual self-focused 
attention. Factor analysis of the items in the 
scale revealed two components of self-consciousness.
The factor labeled Private Self-Consciousness
concerns habitual attendance to one’s
thoughts, motives, and feelings and thus mea- 
sures self-reflective or introspective tendencies. 
High compared to low private self-consciousness 
subjects generate self-reports that are both 
more detailed (Turner, in press) and more pre- 
dictively valid (Turner, 1978). The second fac- 
tor, Public Self-Consciousness, was defined by 
a general awareness of the self as a social ob- 
ject and reflects a concern for social appear- 
ance. A third factor, Social Anxiety, also 
emerged from the analyses of the scale and 
reflects a measure of discomfort in the presence 
of others. Social anxiety presumably occurs as 
a possible reaction to public self-consciousness. 
(See Turner, Scheier, Carver, & Ickes, 1978, for 
a review of research involving these subscales.) 

Fenigstein et al. (1975) argued for the im- 
portance of individual differences in client self- 
consciousness levels in determining both the 
choice and goals of therapy. The goals of ther- 
apy might be directed toward the modification 
of a component of self-consciousness, for example,
decreased social anxiety or increased
insight and self-reflection.

A component of many therapeutic approaches
consists of the therapist understanding the phenomenological
world of the client. Since self-consciousness
would seem to have broad clinical
and behavioral implications (cf. Turner et al.,
1978), it would be useful to know the extent to
which therapists are aware of clients’ self-perceptions
as related to self-consciousness levels.
For example, if the goal of a therapeutic program
is to increase client insight, are therapists
aware of the extent to which their clients consider
themselves to be self-reflective or insightful?
Therefore, the purpose of the present investigation
was to determine the congruity between
clients’ self-perceptions related to self-consciousness
and their therapists’ awareness of
these self-perceptions.

The sample of clients consisted of 47 outpatients
of a private psychiatric center who
were clinically diagnosed as psychoneurotic disorder
and 51 resident patients of a private, nonprofit
psychiatric hospital who were clinically
diagnosed as psychotic disorder (schizophrenic
reaction). Each client was paired with the therapist
directly in charge of administering the therapeutic
program of the client. Part of the theoretical
orientation of each therapist involved
understanding the client’s phenomenological
world. There were four therapists directing programs
for the neurotic sample (each having …
clients) and seven directing programs for the
psychotic sample (each having 3-13 clients).

Each client completed the SCS under standard
self-descriptive instructions. Independently,

Table 1
Correlations Between Self-Reports and Therapists’ Reports of Clients’ Scores on the
Self-Consciousness Subscales

                                            Self-reports
                                  Neurotic sample      Psychotic sample
                                     (n = 47)              (n = 51)
Therapists’ reports                1     2     3         1     2     3
1. Private Self-Consciousness    .34*   [?]   [?]       [?]   [?]   [?]
2. Public Self-Consciousness     .05    [?]   [?]       [?]   [?]   [?]
3. Social Anxiety                .06    [?]   [?]       [?]   [?]   [?]

*p < .05.
**p < .01.
***p < .001.

therapists were instructed to “answer the items as
you think the client would respond.”

Table 1 presents correlations between self-reports
and therapists’ reports of the scores of
clients on the SCS subscales. In both samples
the correlations of primary interest lie along
the main diagonal. Completing the SCS as they
thought their clients would, therapists generated
scores significantly correlated with the
self-reports of their clients for private self-consciousness
and social anxiety in both samples
of clients. Thus therapists were significantly
aware of the extent to which clients would report
themselves to be self-reflective and uncomfortable
as the object of public attention.
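
The analysis described here amounts to correlating each therapist-completed subscale with each client-completed subscale, with the matching pairs on the main diagonal. A minimal sketch follows, assuming scores are stored as equal-length lists in the same client order. The subscale keys, function names, and scores are hypothetical and do not reproduce the study's data.

```python
import math

SUBSCALES = ("private", "public", "social_anxiety")


def pearson_r(x, y) -> float:
    """Pearson product-moment correlation of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


def cross_correlations(therapist: dict, client: dict) -> dict:
    """Correlate every therapist subscale with every client subscale.

    Keys are (therapist_subscale, client_subscale); the entries where both
    names match form the main diagonal.
    """
    return {
        (t, c): pearson_r(therapist[t], client[c])
        for t in SUBSCALES
        for c in SUBSCALES
    }


# Made-up scores for four clients.
therapist_scores = {
    "private": [10, 14, 18, 22],
    "public": [12, 9, 15, 11],
    "social_anxiety": [8, 6, 4, 2],
}
client_scores = {
    "private": [11, 15, 19, 23],
    "public": [7, 13, 9, 14],
    "social_anxiety": [9, 7, 5, 3],
}
matrix = cross_correlations(therapist_scores, client_scores)
```

Reading the diagonal entries such as `matrix[("private", "private")]` gives the congruity measure of interest, while the off-diagonal entries show whether a therapist's rating on one subscale tracks a different client self-perception.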

In both samples of clients, therapists’ reports
and self-reports on the Public Self-Consciousness
subscale were not significantly correlated.
To the therapists of the neurotic sample,
public self-consciousness was related to client
social anxiety levels. In the psychotic sample,
these perceptions tended to be related to therapists’
perceptions of the clients as self-reflective
and socially anxious. Thus the therapists’ reports
of public self-consciousness were neither
significantly nor distinctively related to clients’
self-reports of public self-consciousness.

The level of the off-diagonal correlations of
public self-consciousness with private self-consciousness
and social anxiety is consistent with
those among the subscales in single-sample
data (Turner et al., 1978). It would
appear that public self-consciousness is neither
as salient nor as relevant to insight-based therapeutic
processes as are private self-consciousness
and social anxiety. Self-reflection is a primary
goal of such approaches, and levels of social
anxiety are immediately salient in group therapy
procedures that often are a part of insight therapeutic
programs. Concern for how one appears
to others (public self-consciousness) would not
be relevant unless it resulted in discomfort in
the presence of others or social anxiety. Thus
normal levels of public self-consciousness would
not appear to be particularly relevant in many
insight therapeutic programs.

If the therapist is to facilitate the growth of 
self-insight in the client, awareness of self-per- 
ceptions that are relevant to the direction of 
therapy is imperative. A possibly fruitful area 
for future research would be to investigate the 
relationships between client self-perceptions and 
therapists’ awareness of these perceptions in 
other important areas of client self-concept. In- 
dications of common accurate and inaccurate 
perceptions might assist therapists in identifying 
areas requiring diligent attention if accurate 
client self-perceptions are to be achieved. 

Situational Management, Standard Setting, and Self-Reward
in a Behavior Modification Weight Loss Program 


Stanley L. Chapman 
Department of Rehabilitation Medicine 
Emory University 


D. Balfour Jeffrey 
University of Montana 


In a comprehensive weight loss program, overweight women exposed to instruction 
in self-standard setting as well as to situational management techniques lost more 
weight than those instructed only in situational management techniques. Addition of 
instruction in self-reinforcement to standard setting and situational management failed 
to produce additional weight loss. Findings illustrate the facilitative effect of teaching
individuals to set specific, objective, and realistic goals for eating behaviors and weight 
and the difficulties of incorporating self-reward procedures in a comprehensive program. 


Short-term behavioral programs have yielded 
promising results, but there has been insufficient 
research on the variables involved (Abramson, 
1977). Although several investigators have found 
greater weight losses in individuals taught to 
reward themselves for changing eating habits 
or weight than in controls instructed to self- 
monitor eating and weight (e.g., Bellack, 1976; 
Mahoney, 1974), the role of standard setting 
as a component of self-reinforcement has not 
been adequately studied. 

In the present study, 57 women, who ranged 
from 17.0% to 84.8% above ideal body weight 
(M = 45.4) on insurance tables were recruited 
from the community through newspaper ads. 
They ranged in age from 17 to 65 years (M = 
37.8), and most came from a middle or upper- 
middle socioeconomic level. They deposited $40, 
which was returned contingent on attendance at 
group meetings, and were divided into triplets 
of approximately equal body weights and per- 
centages of overweight. Members of each trip- 
let were assigned randomly to the three treat- 
ment groups, each of which met for 90 minutes 
for 8 consecutive weeks and for a follow-up 
meeting 8 weeks after treatment. Group meet- 
ings consisted of a private weigh-in and pre- 
sentation and discussion of specific behavioral 
techniques to be used during the following week. 
Three therapists, all of whom were familiar
with behavior modification but who were inexperienced
in behavioral treatment for weight,
rotated as leaders for the three groups. They
were unaware of the hypotheses of the study
and did not set standards or reward group members
verbally for any eating behaviors or weight
losses.
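
The matched-triplet assignment described above can be sketched in a few lines. This is an illustrative simplification, not the authors' procedure: it matches on body weight alone (the study also considered percentage overweight), and the function names, dictionary keys, and group labels as used here are assumptions.

```python
import random


def percent_overweight(weight: float, ideal_weight: float) -> float:
    """Percentage by which weight exceeds the ideal weight."""
    return 100.0 * (weight - ideal_weight) / ideal_weight


def assign_matched_triplets(participants, groups=("SM", "SS", "SR"), seed=None):
    """Sort by weight, cut into consecutive triplets, and randomly assign
    the members of each triplet to the three groups.

    participants: list of dicts with 'id' and 'weight' keys.
    Returns a dict mapping group label to a list of participant ids.
    """
    if len(participants) % len(groups) != 0:
        raise ValueError("participants must divide evenly into triplets")
    rng = random.Random(seed)
    ordered = sorted(participants, key=lambda p: p["weight"])
    assignment = {g: [] for g in groups}
    for start in range(0, len(ordered), len(groups)):
        triplet = ordered[start:start + len(groups)]
        rng.shuffle(triplet)  # random order within the matched triplet
        for group, person in zip(groups, triplet):
            assignment[group].append(person["id"])
    return assignment


# 57 hypothetical participants with made-up weights.
people = [{"id": i, "weight": 150 + i} for i in range(57)]
groups = assign_matched_triplets(people, seed=1)
```

Randomizing within weight-matched triplets keeps the three groups comparable at baseline while still leaving each individual's assignment to chance.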

Treatment groups included a situational management
group (SM), which focused mainly on
the control of environmental stimuli that lead to
overeating, as outlined by Ferguson (1975). Instructions
in self-monitoring, stimulus control
techniques, chaining, exercise, and diet were included.
A second group, the self-standard setting
group (SS), received the same instructions as
Group SM, but they also were taught to write
down standards each week for changes in weight
and eating behaviors. SS participants were encouraged
to set goals that they had an excellent
chance to achieve and to make them specific,
such as “I will eat a moderate breakfast (below
500 calories) each morning,” rather than “I will
never eat candy again.” (See Jeffrey & Katz,
1977, for more information on realistic standard
setting and other behavioral techniques.) Therapists
stressed the importance of standards and
examined each participant’s standards weekly.

The self-reward (SR) group received the same
training as Groups SM and SS, but members
also were instructed weekly to make contracts
with themselves and others to receive verbal and
material rewards for weight
losses and small daily achievements in changing
eating behaviors. Suggested examples included
telling oneself one performed well,
small treats such as a leisurely bath or a trip
to the beauty parlor, and arranging a special
outing with one’s spouse.

In addition, members of this group deposited
an extra $17.50, which was used as monetary
rewards for improvement in the form of a
check for $1.25 for eating habit changes and
a check for $1.25 for weight loss each week for
7 weeks of treatment. Participants were instructed
to reward themselves weekly by taking
their respective checks only if they reached
their respective weekly goals for eating and
weight loss. If they did not reach their goals,
they were instructed not to reward themselves,
but rather to deposit the checks through a slot
in a locked box outside the meeting room.
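
The $17.50 deposit matches two $1.25 checks per week over 7 weeks. A brief sketch of that bookkeeping, using `Decimal` to avoid floating-point cents; the function names and the sample goal record are hypothetical:

```python
from decimal import Decimal

CHECK_VALUE = Decimal("1.25")   # value of each weekly check
CHECKS_PER_WEEK = 2             # one for eating habits, one for weight loss
WEEKS = 7


def deposit_required() -> Decimal:
    """Total deposit needed to fund every possible check."""
    return CHECK_VALUE * CHECKS_PER_WEEK * WEEKS


def amount_kept(met_eating_goal, met_weight_goal) -> Decimal:
    """Money a participant takes home, given weekly True/False goal records."""
    if len(met_eating_goal) != WEEKS or len(met_weight_goal) != WEEKS:
        raise ValueError("expected one entry per treatment week")
    checks_earned = sum(met_eating_goal) + sum(met_weight_goal)
    return CHECK_VALUE * checks_earned


# Hypothetical participant: met every eating goal but no weight goal.
kept = amount_kept([True] * 7, [False] * 7)
forfeited = deposit_required() - kept
```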

One member of each group dropped out of
treatment, and one SR group member was excluded
from follow-up analyses after developing
thyroid difficulties. Mean weight changes were
compiled for each group for a 7-week baseline
period, for 7 weeks of treatment, and for a
follow-up 8 weeks after treatment. These results
were as follows: Group SM (n = 19)
gained 1.84 pounds (.83 kg) during baseline,
lost 4.03 pounds (1.82 kg) during treatment,
and lost an additional 1.13 pounds (.51 kg) after
treatment; Group SS (n = 18) gained .16
pounds (.07 kg) during baseline, lost 7.47
pounds (3.39 kg) during treatment, and lost 1.97
pounds (.89 kg) after treatment; Group SR
(n = 17) gained 1.57 pounds (.71 kg) during
baseline, lost 4.71 pounds (2.14 kg) during treatment,
and lost 1.46 pounds (.66 kg) after treatment.
One-way analyses of variance with repeated
measures indicated that all three groups
lost significant amounts of weight during the
treatment and from pretreatment to follow-up
(all ps < .01). Welch’s t test revealed that
Group SS lost significantly more weight than
Group SM during treatment and from the beginning
of treatment to the end of follow-up
(p < .05), but all other group comparisons were
nonsignificant.
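As a rough aid to reading these figures, the following Python sketch adds the reported treatment and follow-up means to approximate each group's net change from the start of treatment to the end of follow-up. It is an illustration only: the names are hypothetical, and summing group means assumes the same participants contributed to both periods, which is not strictly true here (one SR member was excluded from follow-up analyses).

```python
LB_TO_KG = 0.45359237  # exact definition of the avoirdupois pound in kilograms

# Reported mean changes in pounds (baseline, treatment, follow-up);
# positive values are gains, negative values are losses.
mean_change_lb = {
    "SM": (1.84, -4.03, -1.13),
    "SS": (0.16, -7.47, -1.97),
    "SR": (1.57, -4.71, -1.46),
}


def net_change_lb(group: str) -> float:
    """Approximate change from the start of treatment to the end of follow-up."""
    _, treatment, follow_up = mean_change_lb[group]
    return round(treatment + follow_up, 2)


def to_kg(pounds: float) -> float:
    """Convert pounds to kilograms, rounded to two decimals."""
    return round(pounds * LB_TO_KG, 2)


for group in mean_change_lb:
    net = net_change_lb(group)
    print(f"Group {group}: {net:+.2f} lb ({to_kg(net):+.2f} kg)")
```

On these assumptions the approximate net losses are 5.16 pounds for Group SM, 9.44 pounds for Group SS, and 6.17 pounds for Group SR, which matches the direction of the reported SS versus SM difference.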

At posttreatment and at follow-up, members
of each group rated the helpfulness of 19 behavior
change techniques. “Setting standards
for behavior” received the highest mean rating
in the SS group and also was rated highly in
Group SR, whereas Group SR rated “rewarding
myself” and “getting others to reward me” as the
two least helpful techniques that they learned.

Correlations calculated between weight change
during and after treatment and self-reports of
standard setting for eating habits reached
significance for the SS and SR groups combined.
These coefficients, which ranged from −.343 to
−.609, were calculated from self-reports on the
closeness of attention paid to eating standards,
the number of behaviors for which standards
were set, the frequency of setting standards for
eating, and the success in reaching them.
Closeness of attention paid to weight standards
correlated significantly with weight loss after
treatment but not during treatment.
Questionnaire data also showed that
SR participants gave themselves a mean of
only two verbal rewards per week at home and
one material reward every 3-4 weeks.

The significant weight loss of the SM group 
adds further evidence that weight loss can be 
produced without programming external or self- 
reinforcement. The data also support the su- 
periority of teaching individuals to set standards 
for eating and for losing weight in addition to 
teaching situational management over instruc- 
tion in situational management alone. On the 
other hand, comprehensive instruction in self- 
reward in addition to standard setting and situ- 
ational management did not facilitate further 
weight loss as predicted. This instruction may 
have been incompatible with instruction in the 
standard setting or it may have caused par-
ticipants to be overloaded with information.
However, these possibilities were not consistent
with questionnaire and anecdotal data, which 
indicated that the self-reward techniques were 
seen as being of little benefit, were used seldom, 
and made participants feel anxious and overly 
pressured to achieve standards. Many of them 
reported that losing weight was so important to
them anyway that instruction in self-reward 
was unnecessary. 

It is possible that implicit training in setting 
standards rather than training in self-reinforce- 
ment was the more important factor in previous 
studies in which members of self-reward groups 
have shown greater weight losses than controls.
External or self-standard setting was embedded 
in the self-reward condition but not in the con- 
trol conditions of these studies. The findings of 
this study do not imply that self-reinforcement 
is unimportant, but they illustrate the difficulties 
of incorporating self-reward procedures in a 
comprehensive clinical program. 


Psychotherapeutic Styles: Self-Perceptions of Therapists
Varying on A-B Type and Experience 


William B. Goodwin 
Yale University 


The possibility that psychotherapeutic styles might differ as a function of A-B type 
or experience was investigated by interviewing 40 A and B experienced and inexperi- 
enced psychotherapists on a wide range of issues relating to therapeutic approach and 
style. The importance of the experience level of the therapist was confirmed by ther- 
apists’ self-descriptions: Experienced therapists seemed to conform less to conventional 
stereotypes of the traditional psychotherapist than did inexperienced therapists, and 
experienced therapists appeared to emphasize immediacy and encounter while de-
emphasizing detachment, diagnoses, and psychodynamic interpretations. At the same
time, self-perceived differences between A and B therapists were not apparent. 


One approach to the study of psychotherapy 
that has seldom been tried is to interview the 
therapists themselves. Schiffman, Carson, and 
Falkenberg (Note 1) A-B scale scores were 
obtained by mail from a large sample of trainee 
therapists at the Yale University Department 
of Psychiatry. A-B scores were also available 
for a number of experienced therapists who had 
participated in a previous study (Goodwin, 
1975). Only male clinicians with at least 1 year 
of therapeutic experience and who were clearly 
Type A or Type B as defined by the approxi- 
mate outer quartiles of the distribution (14 or 
above =A; 8 or below=B) were invited to 
participate. Forty of the 41 therapists who were 
contacted consented to be interviewed about 
their own approach to psychotherapy, in ex- 
change for a précis of the results upon com- 
pletion of the study. Each clinician was then 
scheduled for a 1-hour standardized tape-re- 
corded interview in his own office, followed by 
a brief questionnaire to elicit extent of use of 
particular techniques.

All scales and coding systems for interview
responses were constructed after preliminary
inspection of the range of responses. The
interviewer (myself) and corater blindly rated the
material. I was blind to therapists’ A-B type
until the coding was completed.

This article is based on a doctoral dissertation
submitted to the Department of Psychology at Yale
University under the supervision of Donald M.
Quinlan and Jesse D. Geller.

Requests for reprints, an extended report
including the factor loadings matrix, and copies of
the interview schedule should be sent to William
B. Goodwin, 599 Whitney Avenue, Apartment 1A,
New Haven, Connecticut 06511.


Results 


Interscorer reliabilities for all response scales
and category systems were quite high, with
many in the .90s. Mean A-B scores were 16.75
for As (n = 20, SD = 1.92) and 6.05 for Bs
(n = 20, SD = 1.40). Therapists still in training
averaged 2.10 years of experience; experienced
therapists averaged 9.95 years of experience.


Outcome Index 


For responses to four questions regarding
retrospective assessment of psychotherapy outcome,
an outcome index was devised based on
a simplified version of the A-B interaction
hypothesis: Positive outcome with schizoids or
negative outcome with neurotics was scored 3;
positive outcome with neurotics or negative
outcome with schizoids was assigned 1; and other
responses were assigned 2. The questions
addressed the individual patient with whom the
therapist had enjoyed the greatest success, the
one with whom he had encountered the least
success, and the type of patient with whom he
had experienced the greatest and least success.
Scores for these questions were summed, and
the resulting linear combination was subjected
to a 2 × 2 analysis of variance. No main effect
for A-B type was found, nor were other effects
significant.

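The scoring rule for the outcome index can be written out as a short Python sketch. This is a hedged illustration rather than the author's procedure: the function names and the string labels for outcomes and patient types are hypothetical, and the sketch assumes each of the four questions yields exactly one (outcome, patient type) pair.

```python
def score_response(outcome: str, patient_type: str) -> int:
    """Score one response using the 3/1/2 rule described in the text."""
    favorable_to_3 = {("positive", "schizoid"), ("negative", "neurotic")}
    favorable_to_1 = {("positive", "neurotic"), ("negative", "schizoid")}
    if (outcome, patient_type) in favorable_to_3:
        return 3
    if (outcome, patient_type) in favorable_to_1:
        return 1
    return 2  # any other response


def outcome_index(responses) -> int:
    """Sum the scores over a therapist's responses to the four questions."""
    return sum(score_response(outcome, patient) for outcome, patient in responses)


# Hypothetical therapist: the labels below are illustrative only.
example = [
    ("positive", "schizoid"),
    ("negative", "neurotic"),
    ("positive", "neurotic"),
    ("negative", "other"),
]
index = outcome_index(example)
```

With four questions scored this way, the index can range from 4 to 12, and the hypothetical therapist above would receive 3 + 3 + 1 + 2 = 9.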
Factor Analysis 


The bulk of the responses were subjected to
a principal components factor analysis;
rotations were then computed for a four-factor
solution. Factor labels, and proportions of
variance accounted for, were 1: Human Potential
Movement (18.4%); 2: Rogerian Style
(14.1%); 3: Modified Psychoanalytic Style
(13.1%); and 4: Agitated Involvement Versus
Calm Neutrality (11.4%). A 2 × 2 multivariate
analysis of variance was computed on the four
factor scores for each therapist, treating A-B
type and experience level as the predictors.
Neither the overall A-B Type × Experience
Level interaction nor the A-B main effect was
significant. Thus, a pervasive difference in style
between A and B psychotherapists was not
evident.


Experience Level 


The multivariate main effect for experience
level was significant (p < .017), however, and
chiefly reflected experienced therapists’ low
scores on the Modified Psychoanalytic Style
factor (weighted −.905 in the discriminant
function) and high scores on the Human Potential
Movement factor (weighted .514 in the
discriminant function). Based on the most salient
components of these two factors, and noting
univariate Fs where significant, inexperienced
therapists seemed to perceive themselves as
rarely using an advice-giving educational
approach, F(1, 16) = 4.73, p < .036; as not
personally involved with, F(1, 16) = 6.15, p < .018,
or self-disclosing toward their patients; as making
extensive use of psychodynamic interpretations,
F(1, 16) = 6.31, p < .017; as seeing
diagnoses as very useful, F(1, 16) = 7.03,
p < [illegible]; as not expressing negative
reactions readily to patients, F(1, 16) = 4.82,
p < .035; and as being inclined to reject the
“sociopathology of mental illness” viewpoint of
Thomas Szasz. Experienced therapists perceived
themselves as not using depth interpretations
frequently; as tending to use an advice-giving
educational approach; as being more personally
involved with and self-disclosing toward their
patients; as expressing negative reactions readily;
as using encountering to a moderate extent; as
finding diagnoses comparatively useless; and as
tending to agree with the viewpoint of Thomas Szasz.


Possible Covariates 


No significant differences between As and Bs
were found with respect to type of practice,
place of training, avowed theoretical orientation,
or profession, but inexperienced therapists more
often had institutional practices (100% vs.
60%) and were more frequently trained at
Yale University than their experienced colleagues
(100% vs. 30%). In addition, a higher
proportion of the inexperienced (85%) than of
the experienced therapists (40%) were psychiatrists.
The possibility that experience-level differences
were actually mediated by profession
differences was indirectly evaluated by means of
a profession by experience level 2 × 2 multivariate
analysis of variance of the factor scores:
No main or interaction effects for profession
were found, however.

Discussion 

Perhaps broad differences in clinical style 
between A and B therapists do not exist, al- 
though it remains possible that such differences 
may not be reflected in therapist self-descrip- 
tion or may not be aligned along dimensions 
studied here. 

Perhaps the most striking thing about the 
self-descriptive portraits for the two experience 
levels is that popular stereotypes coincide more 
closely with the self-characterizations of the in- 
experienced therapists. It could be that one of 
the major effects of accruing therapeutic ex- 
perience is a moving away from traditional 
role expectations toward higher degrees of per- 
sonal involvement and encounter with patients. 
Longitudinal studies of professional psycho-
therapists are obviously needed to resolve this
question. 

Finally, if such differences between experi- 
enced and trainee therapists can be replicated 
in other populations using other methods, what 
is it about therapist experience level that cre- 
ates these differences? Maturation, aging, in- 
creased life experiences, change in socioeconomic 
status, and possible additional training are a 
few alternatives to the obvious increase in psy- 
chotherapy experience. By partitioning experi- 
ence level into more subgroupings, future re- 
search might also attempt to ascertain whether 
any such differences emerge gradually and con- 
tinuously or abruptly and discontinuously. 


Reference Note


1. Schiffman, H., Carson, R. C., & Falkenberg, P.
A psychometric analysis of the Whitehorn-Betz
A-B scale. Unpublished manuscript, Duke
University, 1967.


Reference


Goodwin, William B. The A-B variable and stylistic
differences in psychotherapy (Doctoral dissertation).
Dissertation Abstracts International.
(University Microfilms No. 75-24,537).


Received August 1, 1977


Journal of Consulting 
and Clinical Psychology 


Volume 46, 1978 


Brendan A. Maher, Editor 


Harvard University 


Judith P. Worell, Associate Editor 
University of Kentucky 


Published bimonthly by the 
American Psychological Association, Inc. 
1200 Seventeenth Street, N.W., Washington, D.C. 20036 
Copyright © 1978 by the American Psychological Association, Inc. 


Consulting Editors 


Thomas M. Achenbach, National Institute of Mental 
Health, Bethesda, Maryland 

Kent G. Bailey, Virginia Commonwealth University 

Barbara B. Brown, Beverly Hills, California 

James N. Butcher, University of Minnesota 

Jean Chapman, University of Wisconsin—Madison 

George M. Chartier, Arizona State University 

Andrew L. Comrey, University of California, 
Los Angeles 

W. Grant Dahlstrom, University of North Carolina, 
Chapel Hill 

Anthony Davids, Brown University 

Gerald C. Davison, State University of New York at 
Stony Brook 

Alan S. DeWolfe, Loyola University of Chicago 

Barbara Dohrenwend, Graduate School and Univer- 
sity Center of the City University of New York 

Richard M. Eisler, Virginia Polytechnic Institute and 
State University, Blacksburg, Virginia 

Norman S. Endler, York University, Downsview, 
Canada 

Sol Garfield, Washington University in St. Louis 

Donna M. Gelfand, University of Utah 

Goldine C. Gleser, University of Cincinnati Medical 
Center 

Harrison G. Gough, University of California,
Berkeley

Jesse G. Harris, Jr., University of Kentucky

Mary B. Harris, University of New Mexico

Kenneth Heller, Indiana University

Frances A. Hill, University of Montana

Alexandra G. Kaplan, University of Massachusetts—
Amherst

Alan S. Kaufman, University of Georgia

Alan E. Kazdin, Pennsylvania State University

Jeffrey A. Kelly, University of Mississippi Medical
Center

Barbara K. Keogh, University of California,
Los Angeles

James C. Kincannon, University of Minnesota

Richard I. Lanyon, Arizona State University

Herbert M. Lefcourt, University of Waterloo,
Waterloo, Canada

Peter M. Lewinsohn, University of Oregon

Michael J. Mahoney, Pennsylvania State University

G. Alan Marlatt, University of Washington 

Albert Marston, University of Southern California 

Eric J. Mash, University of Calgary, Calgary, 
Canada

Richard McFall, University of Wisconsin—Madison 

Martha T. S. Mednick, University of Connecticut 

Barbara G. Melamed, University of Florida 

David I. Mostofsky, Boston University

Benjamin J. Murawski, Peter Bent Brigham Hos- 
pital, Boston, Massachusetts 

Peter E. Nathan, Rutgers—The State University 

John M. Neale, State University of New York at 
Stony Brook 

K. Daniel O'Leary, State University of New York at
Stony Brook 

Oscar A. Parsons, University of Oklahoma

Nolan E. Penn, University of California, San Diego 

N. Dickon Reppucci, University of Virginia 

Jerome Resnick, Temple University 

Leonard G. Rorer, Miami University 

Gerald M. Rosen, Providence Family Medical Center,
Seattle, Washington

Robert Rosenthal, Harvard University

William Schofield, University Hospitals, Minneapolis,
Minnesota

Carolyn Simmons, University of Colorado at Denver 

Bonnie Strickland, University of Massachusetts— 
Amherst

Hans H. Strupp, Vanderbilt University 

Richard M. Suinn, Colorado State University

Richard S. Surwit, Duke University Medical Center 

Patricia B. Sutker, Medical University of South 
Carolina

Norman F. Watt, University of Massachusetts- 
Amherst 

Robert L. Weiss, University of Oregon

Joan Welkowitz, New York University

G. Terence Wilson, Rutgers—The State University 

Patricia Wisocki, University of Massachusetts — 
Amherst 

Melvin Zax, University of Rochester 




Acknowledgment 


In addition to the regular panel of Consulting Editors, the following people reviewed and
evaluated manuscripts for the Journal of Consulting and Clinical Psychology during the 
preparation of the 1978 volume. 


Allen Adinolfi 

James Alexander 
Norman H. Anderson 
Stephen Auerbach 


Bruce L. Baker 
Robert F. Bales 
Joseph Becker 
Irving Beiman 
Alan S. Bellack 
Herbert Benson 
Peter M. Bentler 
Allen Bergin 
Douglas A. Bernstein 
Ellen Berscheid 

J. Allen Best 
Anthony Biglan 
Henry Biller 
Robert C. Birney 
George E. Bigelow 
Gary R. Birchler 
Edward B. Blanchard 
Paul H. Blaney 
Sidney J. Blatt 
Jeanne H. Block 
Marcia M. Bok 
Philip Bornstein 
Stanley Brodsky 
Denis B. Bromley 
Inge K. Broverman 
Fred G. Brown 
Brenda Bryant 
Elaine P. Burgess 
Thomas G. Burish 
Arnold H. Buss 
Nelson Butters 
Billy Barios 


Dale Callner 
Bonnie W. Camp
Mary M. Campbell 
Albert S. Carlin 
Carol Carson 
Edmund Chaney 
Richard Chapman 
James R. Clopton 
Gep Colletti 
Paul T. Costa 
Cynthia Cowgill 
Daniel J. Cox
Rue Cromwell 
Herb J. Cross 
James Patrick Curran 


Richard J. Davidson 
Douglas A. Davis


John Lattimer Delk 
Florence L. Denmark 
Douglas R. Denney 
Zachary Dershowitz 
A. Devries 

Robert R. Dies 
Daniel Doleys 
Edward Dougherty 
Ralph M. Dreger 
Thomas J. D’Zurilla 


Craig Edelbrock 
Jeff Edelson 
Robert V. Erikson 
Peter J. Equi 
Stephen Etkind 
Hans J. Eysenck 


Norman Farberow 
Amerigo Farina 

A. David Feinstein
Robert D. Felner 
C. B. Ferster 

A. J. Finch, Jr. 
Eva Fogelman 
Donald C. Fowles 
Susan Frank 
Alfred S. Friedman 
Ronald J. Friedman 


Lawrence Gaines 
John P. Galassi 
Merna Dee Galassi 
B. L. Gant 

James B. Garrett 
Robert J. Gatchel 
Margaret J. Gatz 
William Gayton 
Joanne Gersten 
Alan Gessner 
Lewis R. Goldberg 
Leo Goldberger 
Marvin R. Goldfried 
Gerald Goldstone 
Irving Gottesman 
John Gottman 
John R. Graham 
Wilson Guertin 
Ruben C. Gur 
Alan S. Gurman 
Malcolm Gynther 


Richard Haier 
Judith Hall 

Scott B. Hamilton 
Constance 


A. Hakstian 

Donald P. Hartmann 
Robert Hauser 
Alfred Heilbrun
David E. Herr 
Michel Hersen 
Dorothy Hochreich 
Martin L. Hoffman 
Robert Hogan 
Steven D. Holon 
Philip Holzman 
Hyman Hopps 

John J. Horan 
Kenneth Howard 


Allen C. Israel 
Robert J. Ivnik 


Neil Jacobson 
D. Balfour Jeffrey 
Victor C. Joe 
Warren Jones 


Gershen Kaufman 
Thomas D. Kennedy 
Kevin Kennelly 
Michael Keown 

Bill N. Kinder 
Irving Kirsch 
Benjamin Kleinmuntz 
Irving J. Knopf 
Martin Kohn 
Vladimir J. Konečni
Gerald P. Koocher 
Elizabeth M. Koppitz 
William M. Kurtines 


Michael Lambert 
Nadine Lambert 
Harry A. Lando 
Elaine L. LaMonica 
Ellen Langer 
Richard Lazarus 
Herbert Lefcourt 


Theodore W. Lorei 
Raymond P. Lorion 
Maurice Lorr 
Bernard Lubin 
Lester Luborsky 


Alfred P. MacDonald, Jr. 


Marion L. MacDonald 
Stephen A. Maisto 
Martin Manosevitz 

K. Gerald Marsden 
Barclay Martin 
Raymond Martorano 
Suzanne Martorano 
Charles G. Matthews 
Janet R. Matthews 
Ralph Maurer 

Alfred McAlister 
Andrew McGhie 
William McIntosh 
Kate McMahon 
Quinn McNemar 
Paul McReynolds 
William T. McReynolds 
Hayden L. Mees 


Donald H. Meichenbaum 


Joseph Mendels 
Lovick Miller 

Peter M. Miller 
William R. Miller 
Taian Montgomery 
Richard Morris 
Don Mosher 
Thomas Mulholland 
Louise Musser 


Vincent J. Nerviano 
Charles S. Newmark 
Mike Nietzel 
Stephen Nowicki 
Vincent Nowlis 
Jum C. Nunnally 


Anita DeVivo 
Executive Editor 


Margaret O’Conner 
Robert O'Conner 
David E. Orlenski 
Jacob Lee Orlofsky 
Gennaro Ottomanelli 
John E. Overall 


Susan Pepper 
Thomas P. Petzel 
Roy W. Persons 
Douglas Powell 


Janet Rafferty 
Steven Reiss 
Alexander Rich 

C. Steven Richards 
Frank C. Richardson 
Manuel Riklan 
David C. Rimm 
Arnold Rincover 
Ross Rizley 

Sherry Rochester 
Joanna Rohrbaugh 
David Rosenthal 
Alan Rosenwald 
Timothy T. Ryan 


Ann Salomon 

Ina Samuels 

Irwin Sandler 
Victor D. Sanua 
Irwin G. Sarason
Jerome Sattler 

Earl Schaefer 
Harold M. Schroder 
Harold E. Schroeder 
Lee Sechrest 

M. E. P. Seligman 
Lewis J. Sherman 
Marianne L. Simmel 
Arthur B. Silverstein 


APA Journal Staff 


Ann I. Mahoney 


Manager, Journal Production 


Robert J. Hayward 
Advertising Representative 


Jacob O. Sines 
Harvey A. Skinner 
Jonathan C. Smith 
C. R. Snyder 

Linda C. Sobell 

John F. Solin 

Paul A. Spiers 

John J. Spinetta 
Bonnie Spring 

Fred F. Stauss 
Abigail Stewart 
Mark Stewart 

Judith M. Stillion 
Milton Strauss 
Kenneth R. Suckerman 
Norman D. Sundberg 
Stanley Sue 


Shelley E. Taylor 
Francis Terrell 
Robert M. Thorndike 
John Todd 

Craig T. Twentyman 
Forrest Tyler 


John Vincent 


James A. Wakefield 
Barbara Wallston 
Charles G. Watson 
John G. Way 

David Wechsler 
Richard Weigel 
Carol K. Whalen 
Cathy Spatz Widom 
Jerry S. Wiggins 
Wallace W. Wilkins 


Antonette M. Zeiss 
Marvin Zuckerman 
Miron Zuckerman 


Barbara R. Richman 
Production Supervisor 


Anne Redman 
Subscription Manager 




Author Index to Volume 46 


Key to Pagination 
Issue Month Pages Issue Month Pages 
1 February 1-206 4 August 595-838 
2 April 207-384 5 October 839-1169 
3 June 385-594 6 December 1171-1591 


ARTICLES 


Achenbach, Thomas M. The Child Behavior Profile: I. Boys aged 6-11 a 

Achenbach, Thomas M. Psychopathology of childhood: Research problems and issues. 

Achterberg, Gloria. See Stedman, James M. 

Acosta, Frank X., Yamamoto, Joe, and Wilcox, Stuart A.. Application of electromyographic 
biofeedback to the relaxation training of schizophrenic, neurotic, and tension headache a 


Allen, Deborah R. See Ferguson, Lucy Rau.
efirey 
'ogel, 


ephen. 
S. 


i tA the interaction hypothesis. . . ...+- s 
ire i by: fectiveness of widows’ groups in facilitating change. - 
Bassett, John È. See Gayton, William F. 
pa Lee Roy. See Fiedler, Deky. a AA 
e a and Beck, Rara T. Premature conclusions pes ; als Fa 
suicide attempters: A reply to Steele: oo ian oe ‘sarang ects 
i Irving, Eileen, and Johnson, Stephen A. g training and posat rom 
Beman ai and en extended progressive relaxation, aeli sane EG PA 
biofeedback. ..-2-+----55 40:54 Goldring. 


Bentler, P. M. See Zukow, 




Bentler, P. M., and Newcomb, Michael D. Longitudinal study of marital success and failure., 1053 
Berry, G. James. See Braucht, G. Nicholas. 
Bee fant e ke aaa Welk Robert E. A nė f psychol : 
Berzins, Juris I., Welling, ., and Wetter, Rol e new measure of psychological 

androgyny on the Personality Research Form iby. 
Best, Helen. See Craig, Kenneth D. 
Best, J. Allan. See Craig, Kenneth D. 7 
Beutler, Larry E., Pollack, Stephen, and Jobe, Avis. “Acceptance,” values, and therapeutic 

+ isn ete eed aaa A E E E ou E TT se vce vie « on cv 198 

Bhatia, Kiran, and Golin, Sanford. Role of locus of control in frustration-produced aggression. 364 
Bigelow, Llewellyn B. See Donnelly, Edward F., 
Biglan, Anthony. See Lewis, Clifford E. 


Borkovec, T. D., Grayson, J. B., and Cooper, K. M. Treatment of general tension: Subjective 
and physiological effects of progressive relaxation. ʻi 
ichael W., and Berry, 

empirical types of multiple drug abusers........ 1463 


567 


518 


contact as variables in the behavioral treatment of obesi 593 
pice Jeffrey A., and Graham, John R. The 4-3 MMPI proi 344 


Inventory in a university population using psychiatric estimate as the criterion 150 
i William L. 


mensions of anxiety and sensation Paec a ae T 194 


Butcher, James N., and Tellegen, Auke. Common methodological problems in MMPI research. 620 
Butt, James H. See Shipley, Robert H. See oe eee a 


Caffey, Eugene M., Jr. See Lorei, Theodore W. 

Campbell, Susan bi, and Steinert, Yvonne. Comparisons of rating scales of child psycho- 
thology ia clinic and nonclinic samples. <.. , oane 0a E aa 358 

Cardwell, Gilbert F. See Greenberg, Roger P. 

Carpena Paul . See Del Gaudio, Andrew C. 


Carroll, Charles 

for children. . 372 
Carroll, Jerome F, 

differences in 575 


Chamberlain, C. J. See Joslyn, Dan. 
Chaney, Edmund F., O'Leary, Michael R., and Marlatt, G. Alan. Skill training with alcoholics. 1092
Chapman, Jean P. See Hertler, Chris A.
Chapman, Loren J. See Hertler, Chris A. 

Chapman, Stanley L., and Jeffrey, D. Balfour. Situational management, standard setting, and
self-reward in a behavior modification weight loss program. 1588


Christian, William L., Burkhart, B: R. 
of MMPI items: New interest Ba ee ey Sabile 


Clopton, James R. A note on the M: 
cess James R., and Klein, Gary L. An 




Cooper, K. M. See Borkovec, T. D. 

Cortner, Robert H. See Stedman, James M. Mi. a 

Costello, Raymond M. Premorbid social competence construct generalizability across ethnic
groups: Path analyses with two premorbid social competence components. 1164

Counts, D. Kenneth, Hollandsworth, James G., Jr., and Alcorn, John D. Use of electromyo- 
graphic biofeedback and cue-controlled relaxation in the treatment of test anxiety. ...... - 

Cowen, Emory L. Some problems in community program evaluation oe 

Cowan, Philip A. See Dyckman, John M. 


792 


the 19708. caa ese aie soo cosets E sk en ee 


573 
1105 

Dahlberg, Charles. See Natale, Michael. 

Dahlstrom, W. Grant. See Gynther, Malcolm D. 

Dancis, Joseph. See Anderson, Lowell. i 

Davidson, Kay M., and Bailey, Kent G. Effects of “status sets” on Rotter's Locus of Control ae 
scale : z 

Davison, Gerald C. : The treatment of homosexuality,- -... - 170 

Dean, Raymond S. Distinguishing learning-disabled and emotionally disturbed children on aA 
the WISCER ARS E tame o oe ere ee 

DeHorn, Allan, and Klinge, alte. Coase and facto g lysis ae WISC-R and the 1160 
Peabody Picture Vocabulary Test for an adolescent psychiatric sampe. n.i e ont 

Del Gaudio, Andrew C., Carpenter, Paul J., and Morrow, Gary R. Male and female treatment 1577 
differences: Can they be generalized?..............s0sssss000e 

Denney, Douglas R. See Elliott, Charles H. 

Dies, Robert R. See Kirshner, Barry J. 

Dikmen, Sureyya S. See Dodrill, Carl B. r 1432 

Dodrill, Carl B. The hand dynamometer as a neuropsychological measure, =: 

Dodrill, Carl B., and Dikmen, Sureyya S. The Seashore Tonal Memory Test as w 

D psychological manie aan Baton a wlpa.ps vc « ¢ E e e i, 

onahoe, Clyde P., Jr. See Freedman, Barbara J. y. aw l 

Donnelly, Edward F., Nasrallah, Henry A., Wyatt, Richard Jed, Gilin, J. Christian, and Bigelow, 5 
Llewellyn B. Effects of dopamine synthesis inhibition on WAIS comprehension. ; ... . E 

Dor-Shav, Netta Kohn. On the long-range effects of concentration camp internment on Nazi
victims: 25 years later
Douglass TEA aban den ‘al d ity, and birth order on the Defense Mechanisms 


Dudley, Gary E. Effects of sex, social desirability, 1419 

D Taven to Ead E EE nasty E S a eee {eats 
univant, Noel. See usan. ee ae 

Dyckman, John M., and Cowan, Philip A. Imaging vividness and the outcome of in vivo ane 1155 


imagined scene desensitization 
and Kleinknecht, Ronald A. The Palmar Sweat Index as a function of 194 
repression-sensitization and fear of dentistry seen 


Endelman, Janet R. See Sade C: Ba, j 
Endler, Norman S. See Breen, Lawrence J. As tame sly of ATE a e 


Epstein, Norman, and Jackson, Elizabeth. 
1016 
202 
Esse, John T., and 
Eysenck, H. J. See Zuckerman, Marvin. 
Eysenck, Sybil. See Zuckerman, Marvin. 
i bert H. 
pub Joka B A e Ts and Groh, Thomas. Sex tance of 987 


Farina, Amerigo, Murray, Pauli 


validity issues. -Aeee scott" age 
Farudi, Parvis A. See Akamatsu, +. 


Cox, W. Miles. Frequent citations in the Journal of Consulting ond Clinical Psychology during Pa 




Ferguson, Lucy Rau, and Allen, Deborah R. Congruence of parental perception, marital .” 
satisfaction, and child adjustment............... 22.00.00 02sec eet eee eee tee ce eee eens 345 

Feyerabend, C. See Russell, M. A. H. a 

Fibel, Bobbi, and Hale, W. Daniel. The generalized Expectancy for Success Scaie—a new 


MCASUFE. |. 55:5 cs scuedn vote eee R Paden ee TESS A a AR 924 
Fiedler, Decky, and Beach, Lee Roy. On the decision to be assertive.......... 2.0.0... ....004. 537° 
Finch, A. J., Jr. See Kendall, Philip C. i 


Finch, Alfred J., Jr. See Newmark, Charles S. 
Fioravanti, Mario. See Gough, G. 

Fleece, L. ` See Melamed, B. G. 
Fogel, Arthur F. See Allgeier, Elizabeth Rice. 


Folkins, Carlyle H. See Miller, Stephen H. 

Follingstad, Diane R. See Moreault, Denise. ; 

Ford, Julian D. Fe ae relationship in behavior therapy: An apao analysis......... 1302 

Forehand, Re Wells, Karen C., and Sturgis, Ellie T. Predictors of child noncompliant behavior A f 
WS D AA T A O E A E T a a E 


Forsythe, Alan B. See Tuma, A. Hussain. 

Foy, David W. See Pachman, Joseph S. 

Freedman, Barbara J., Rosenthal, Lisa, Donahoe, Clyde P., Jr., Schlundt, David G., and McFall, 
Richard M. A social-behavioral analysis of skill deficits in delinquent and nondelinquent 
a A a a aa A ES OA 1448 

Furman Wyndol. See Kelly, Jeffrey A. 


Garfield, Sol L. Research problems in clinical diagnosis.. .............. 0000000000 000c 596 
Garrity, Linda I., Servos, Andria B. Comparison of measures of adaptive behaviors in preschool 


children... 9.04.5 (a eM NEMEC Types 288 
Garske, John P, See Kowitt, Michael R. 
Gayton, William F., Bassett, John E., Tavormina, Joseph, and Ozmon, Kenneth L. Repression- 
ot eee and eae Mee See Cre ott tec. a. casa neen a 
eller, rew M., an i vin. Cogniti d nality factors in suicida? behavior. . 
Geller, Jan D. See ramet ir ee ive and personality factors in suicida! be! 


parison of linear and MMPI diagnostic methods with an uncontaminated criterion 1046 
aapert, Gary S. See Parker, Jerry C. P 
Gillin, J. Christian. See Donnelly, Edward F. 

TRAET a E cases nespaieal ano Seneca 
al ag a Te e colag set talk: Aacity mao i 
Glasgow, Ri E. Effects Pa self-control manual, rapid smoking, and amount of therapist r 

contact on EO s conker AY o E RE 1439 

Glennon, Blais, and Waas Joka K. An ghscevationad wreck i sks ace ; 
{{oung, children MPM esis A sipsidsioluWMieln ei sleact yiscc'ssccscccucceavceseseeese? 
Golden, Charles J. See Purisch, AmoldD. UUU UUU 


Goodwin, William B. Psychotherapeutic styles: Self-perceptions of therapists varying on A-B
type and experience
Goodwin, William B., Quinlan, Donald M., and Geller, Jesse D. Practicing A an 
id and neurotic Patient prototypes. 

reply to Harris and Harvey 


Green, Samuel B. See Burkhart, Barry R. 
Green, Samuel B, See Schwarz, Raymond M. 




Greenberg, Roger P., and Cardwell, Gilbert F. Rorschach elopment intelligence 
factors E i E he ge) ree developmental laval and KA 
Greene, Roger L. Can clients provide valuable feedback to clinicians about their personali 
ies 


interpretation? Greene replies... .........cssducssessecsasesucace edsr 
Grimm, arene G. See fer, Frederick H. 4 ied 
1561 
Gynther, Malcolm D., Lachar, David, and Dahlstrom, W. Grant. Are special norms for minor- 
ities needed? Development of an MMPI F scale for blacks................ voodedcens S408 


Hart, Russell R. Therapeutic effectiveness of setting and monitoring goals, 
Harvey, John H. See Harris, Ben. 


i Ags fakin, Believable jenda oa perce Sa testing shia a 
eckerman, Carol L, See Brownell, Kelly D. 
Heilbrun, Alfred B., Jr. Projective and repressive styles of processing aversive information... 186 
Hertler, Chris A., Chapman, Loren J., and Chapman, Jean È. A scoring manual for literalness - 
eh in rover inte eaten Bri EAT N Pre A T dess aai 
ile, Matthew G. See Nietze! i . 
Hiscock, Merrill. Imagery assessment through self-report: What do imagery questionnaires 2 
200 
n 
clinic population....... jrrverseraenessssensusetaseeesensns tna p KA 1187 
Holland, Terrill R. Dimensions, patterns, and personality correlates of drug abuse lender s 
moea aaa a abi a Se Be oowlgheonytwnsbaexsn’ © r Á 
Hollandsworth, James G., Jr. See Counts, D. Kenneth.
Holroyd, Kenneth A., and Andrasik, Frank. Coping and the self-control of chronic tension
headache
Holzmuller, Ana. See Wiggins, Jerry S.
Horowitz, Leonard M., Sampson, Harold, Siegelman, Ellen Y., Weiss, Joseph, and Goodfriend,
Shirley. Cohesive and dispersal behaviors: Two classes of concomitant change in
psychotherapy
Horwitz, Bruce. See Shipley, Robert H.
losch, . See Trevi! 5 P, 
Huesmann, L. Rowell, Lefkowitz, Monroe M., and Eron, Leonard D. Sum of MMPI Scales F,
4, and 9 as a measure of aggression
Hutcherson, S. See Melamed, B. G. 


Jackson, Donet erates Norman iz 

fackson, Elizabeth. an. 

Jacobson, Neil S. Specific and nonspecific factors in the effectiveness of a behavioral approach
to the treatment of marital discord

Jaffe, Joseph. See Natale, Michael. 

Janda, Louis. See O'Grady, Kevin.


Jeffrey, D. Balfour. See man, 
Jeffery, Robert W., Vender, Michael, and Wing, Rena R. Weight loss and behavior change 1 year
after behavioral treatment for obesity
Jew, Charles. See Shawver, Lois.
Jobe, Avis. See Beutler, Larry E.




Johnson, James H. See Giannetti, Ronald A. 
Johnson, James H. See Sarason, Irwin G. 
Johnson, James H. See Smith, Ronald E.
Johnson, Stephen A. See Beiman, Irving.
Johnson, William G. See Stalonas, Peter M., Jr. 
Jones, Warren H., Chernovetz, Mary Ellen O’C., and Hansson, Robert O. The enigma of 
androgyny: Differential implications for males and females?.. Pa 
Joslyn, Dan, Grundvig, John L., and Chamberlain, C. J. Predicting confabulation from the 
Graham-Kendall Memory-For-Designs Test


Kanfer, Frederick H., and Grimm, Laurence G. Freedom of choice and behavioral change... 
Kantorowitz, David A., Walters, Joyce, and Pezdek, Kathy. Positive versus negative
self-monitoring in the self-control of smoking
Kay, Edwin J., Lyons, Arthur, Newman, William, Mankin, Donald, and Loeb, Roger C. A 
longitudinal study of the personality correlates of marijuana use 
Kazdin, Alan E. Evaluating the generality of findings in analogue therapy research. . 
Kazdin, Alan E. Methodological and interpretative problems of single-case experimental
designs
Keeley, Stuart M. See Waller, Ronald W. 
Kellerman, Jonathan. A note on psychosomatic factors in the etiology of neoplasms.......... 
Kelly, Jeffrey A., Furman, Wyndol, and Young, Veronica. Problems associated with the
typological measurement of sex roles and androgyny
Kendall, Philip C. See Newmark, Charles S. 
Kendall, Philip C. Anxiety: States, traits—Situations?..... 20.00... 00s eee renee 
Kendall, Philip C., Edinger, Jack, and Eberly, Carole. Taylor’s MMPI correction factor for 
spinal cord injury: Empirical endorsement
Kendall, Philip C., Finch, A. J., Jr., Little, Verda L., Chirico, Bernard M., and Ollendick, 
Thomas H. Variations in a construct: Quantitative and qualitative differences in children’s 
locus of control
Kendall, Philip C., and Finch, A. J., Jr. A cognitive-behavioral treatment for impulsivity:
A group comparison study
Kendall, Philip C., Finch, A. J., Jr., and Montgomery, L. E. Vicarious anxiety: A systematic
evaluation of a vicarious threat to self-esteem
Kera atthe ane Robert G. 
avari, | ., and Douglass, Frazier M., IV. The Polydruy ent Scale: A psycho- 
„~ metric technique for the inaivect measurement of drug Rice pa Auer Beth ote eset p y 1 
King, Glen D., Hannay, H. Julia, Masek, Bruce J., and Burns, Joan W. Effects of anxiety and 
E A a A E ETN 
Ki i Michael w ATE n NOAE 
Kirshner, Barry J., Dies, Robert R., and Brown, Robert A. Effects of experimental
manipulation of self-disclosure on group cohesiveness


Kole) ae sy Analy Baten re Personality and social factors in adolescent marijuana 
Kondo, Charles Y. See Nietzel, Michael T.
Kopel, Steven A. See Rosen, Raymond C. 
Korn ITNA See Natale Michael, 
owitt, Mi R. i i i 
$ ‘tp nycotherapeuti E: mi Na E. Mouenty, self-disclosure, and gender as determi 
Kremsdorf, Ross B., Palladino, Lucy J., Polenz, Douglas D., and Antista, Barbara J. Effects of
the sex of both interviewer and subject on reported manifest dream content
Kurtines, William. See Szapocznik, José.
Kurtines, William M., Ball, R., and Wood, Gloria H. Personality characteristics of
long-term recovered alcoholics: A comparative analysis
Kurtz, Richard M., and Garfield, Sol L. Illusory correlation: A further exploration of Chapman's
paradigm
Kushner, Kenneth. On the external validity of two psychotherapy analogues


havioral approach to chronic smokers 
Lang, Elizabeth. See Rierdan, Jill. 
Lang, Peter J. See Weerts, Theodore C. 
Lang, Rudie J. Multivariate classification of day-care patients: Personality as a dimensional
continuum
Lansky, David. See Nathan, Peter E.




Lansky, David, Nathan, Peter E., and Lawson, David M. Blood alcohol level discrimination
by alcoholics: The role of internal and external cues.......... 953
Lawlis, G. Frank. See Stedman, James M.
Lawson, David M. See Lansky, David. 
Lazzari, Renato. See Gough, Harrison G. 
Lefkowitz, Monroe M. See Huesmann, L. Rowell. 
Lehman, Ralph A.W. See Heaton, Robert K. 


Lehman, Robert E. Symptom contamination of the Schedule of Recent Events.............. 1564 
Lehrer, Paul M. Psychophysiological effects of progressive relaxation in anxiety neurotic 
patients and of progressive relaxation and alpha feedback in nonpatients................. 389 


Lenauer, Michael. See Sadd, Susan. 

Leonard, Calista V. Response to “A note on the MMPI as a suicide predictor”.............. 
Levenson, Robert W., and Gottman, John M. Toward the assessment of social competence... 453 
Levin, Harvey. See Overall, John E. 

Levin, Saul M., Gambaro, Salvatore, and Wolfinsohn, Lawrence. Penile tumescence as a
measure of sexual arousal: A reply to Farkas.......... 1517
Lewis, Clifford E., Biglan, Anthony, and Steinbock, Elizabeth. Self-administered relaxation 
training and money deposits in the treatment of recurrent anxiety.......... 1274


Licht, Mark H. See Paul, Gordon L. 

Lick, John, Condiotte, Mark, and Unger, Thomas. Effects of uncertainty about the behavior 
of a phobic stimulus on subjects’ fear reactions

Linehan, Marsha M. See Goldfried, Marvin R. 

Little, Verda L. See Kendall, Philip C.

Loeb, Roger C. See Kay, Edwin J. 

Longabaugh, Richard. See Vannicelli, Marsha.

Lorei, Theodore W., and Caffey, Eugene M., Jr. Goal definition by staff consensus: A
contribution to the planning, delivery, and evaluation of mental health services

Lubin, Bernard, Marone, Joseph G., and Nathan, Ronald G. Comparison of
self- and examiner-administered Depression Adjective Check Lists.......... 584
Lyle, J. G. See Rawling, P.
Lyons, Arthur. See Kay, Edwin J. 
Maher, Brendan A. Preface,.......+0+0+s+ceseceeessess 595 
Maher, Brendan A. A read 835 
clinical psychology , 6 
Maher, Brendan A. Stimulus samplin; 0 


Mahoney, Michael J. Experimental methods and outcome evaluation 
Maloney, Lorrie J. See Spinetta, John J.
Mankin, Donald. See Kay, Edwin J.

ann, Davi . See onnor, Kevin. | k f 
Margolin, Gayla. Relationships among marital assessment procedures: A correlational study.. 1556
Margolin, Gayla, and Weiss, Robert L. Comparative ev: 

associated with behavioral treatments 

Marlatt, G. Alan. See Chaney, Edmund F.
Marone, Joseph G. See Lubin, Bernard. 
Martin, April. See Silverman, Lloyd H.
Martindale, Colin, Hemispheric asymmetry and Jonit 
Martindale, Colin. The therapist-as-fixed-effect fallacy


Massey, Frank. See Pachman, 
Matarazzo, Joseph D., Wiens, 


A. 
ies See Lando, Harry ply to Reade and Wertheimer. -- seseeepeneee® 


McFall, Richard M. See Freedman, Barbara J. 703 


M ennis M. See Simons, Lynn S.
Meehl, Paul E. Theoretical risks and tabular asterisks: Sir Karl, Sir Ronald, and the slow
progress of soft psychology.......... 806
Melamed, B. G., Yurcheson, R., Fleece, L., Hutcherson, S., and Hawes, R. Effects of film
modeling on the reduction of anxiety-related behaviors in individuals varying in level of
previous experience in the stress situation


Mendelsohn, Eric. See Silverman, Lloyd H.
i z g Al ale Mal ir i peeewcesecess 
ae Ao ee to social stimuli.. ~ -- -+++ -+117 




Miller, John E. See Shiek, David A.

Miller, Martha P., Murphy, Philip J., and Miller, Terry P. Comparison of electromyographic
feedback and progressive relaxation training in treating circumscribed anxiety stress reactions

Miller, Stephen H., O'Reilly, Charles A., Roberts, Karlene H., and Folkins, Carlyle H. Factor 
structure and scale reliabilities of the Adjective Check List across time.........-........ 

Miller, Terry P. See Miller, Martha P.

Miller, William R. Behavioral treatment of problem drinkers: A comparative outcome study of
three controlled drinking therapies

Mishara, Brian L. Geriatric patients who improve in token economy and general milieu
treatment programs: A multivariate analysis

Montgomery, L. E. See Kendall, Philip C. 

Moos, Rudolf H. See Cronkite, Ruth C.

Moos, Rudolf H., and Bromet, Evelyn. Relation of patient attributes to perceptions of the
treatment environment

Moreault, Denise, and Follingstad, Diane R. Sexual fantasies of females as a function of sex
guilt and experimental response cues

Morrow, Gary R. See Del Gaudio, Andrew C. 

Morton, Teru L. See Simons, Lynn S. 

Murphy, Philip J. See Miller, Martha P. 

Murray, Pauline J. See Farina, Amerigo. 


Nasrallah, Henry A. See Donnelly, Edward F. 
Natale, Michael, Dahlberg, Charles C., and Jaffe, Joseph. Effect of psychotomimetics (LSD 
and dextroamphetamine) on the use of primary- and secondary-process language
Natale, Michael, Kowitt, Michael, Dahlberg, Charles, and Jaffe, Joseph. Effect of
psychotomimetics (LSD and dextroamphetamine) on the use of figurative language during
psychoanalysis
Nathan, Peter E. See Lansky, David.
Nathan, Peter E., and Lansky, David. Common methodological problems in research on the
addictions
Nathan, Ronald G. See Lubin, Bernard. 
Neale, John M. See Stone, Arthur A.
Nelson, Wilbur J., Jr., and Birkimer, John C. Role of self-instruction and self-reinforcement in 
the modification of impulsivity. . 
Nemec, Richard E. Effects of con 
and left Mec leans: a 
Nemetz, Georgia H., Craig, Kenneth D., and Reith, Gunther. Treatment of female sexual
dysfunction through symbolic modeling
Newcomb, Michael D. See Bentler, P. M. 
Newman, William. See Kay, Edwin J.
Newmark, Charles S., Ziff, David R., Finch, Alfred J., Jr., and Kendall, Philip C. Comparing 
the empirical validity of the standard form with two abbreviated MMPIs
Nietzel, Michael T., Hile, Matthew G., and Kondo, Charles Y. Diversity among lower-class
therapy clients: A comparison of Class IV and Class V psychotherapy recipients
Nolan, J. Dennis, , and Sandman, Curt. ‘“Biosyntonic” therapy: Modikcation of an operant 
conditioning fe Reporte sel ees A 


Novicki, Stephen, Je; Reported atest events during developmental periods and thi Si 
ba i in college students..............--:-seeer cert 
Nuessle, William. See Miller Gleam A an college students. ...........-.--+-05000 0+ 


the upper middle class: A replication of Spence............2...-.-00cee uty ys ice 
O'Grady, Kevin, and Janda, Louis. Psychometric correlates of the Mosher Forced Choice
Guilt Inventory........... 
Okada, Marilyn. See Breen, Lawrence J. 
O'Leary, K. Daniel, and Turkewitz, Hillary. Methodological errors in marital and child treat- 


P Ce 4 

Leay, Micheal R, See C E AE 

Oliver, J. M. See Bumberry, William.

Ollendick, Thomas H. See Kendall, Philip C. 

O'Reilly, Charles A. See Miller, Stephen H. 

Ottomanelli, Gennaro, Wilson, Peter, and Whyte, Richard. MMPI evaluation of 5-year
methadone treatment status

Orea pa È, Poa Noma G., and Levin, Harvey. Effects of aging, 

coholism, and functional psychopathol WA) Gleste see te 
Overall, Peggy B. See Hata, Nan Co aa aaa mas 
Ozmon, Kenneth L. See Gayton, William F. 


Pachman, Joseph S., Foy, David W., Massey, Frank, and Eisler, Richard M. A factor analysis 
of assertive behaviors
Palladino, Lucy J. See Kremsdorf, Ross B.




Post, Amy L., Wittmaier, Bruce C., and Radin, Mitchell E. Self-disclosure as a function of
state and trait anxiety

Poythress, Norman G., Jr. Selecting a short form of the MMPI: Addendum to Faschingbauer

Prigatano, George P. See Parsons, A. 

Pritchard, David A. See Rosenblatt, Arthur I. 

Prociuk, Terry J. See Breen, Lawrence J. 

Purisch, Arnold D. See Golden, Charles J. 

Purisch, Arnold D., Golden, Charles J., and Hammeke, Thomas A. Discrimination of schizo- 
phrenic and brain-injured patients by a standardized version of Luria’s neuropsychological 
tests


Quinlan, Donald M. See Goodwin, William B.


Radin, Mitchell E. See Post, Amy L.
Raw, Martin. See Russell, M. A. H.
Rawling, P., and Lyle, J. G. Cued recall and discrimination of memory deficit
Rehm, Lynn P. Mood, pleasant events, and unpleasant events: Two pilot studies
Red Devil W. w cine, pends z 

Reith, Gunther. See Nemetz, Georgia H.
Renson, Gisele J., Adams, John E., and Tinklenberg, Jared R. Buss-Durkee assessment and
validation with violent versus nonviolent chronic alcohol abusers
Reppucci, N. Dickon, and Clingempeel, W. Glenn. Methodological issues in research with
correctional populations
Reschly, Daniel J. WISC-R factor structures among Anglos, Blacks, Chicanos, and Native
American Papagos

el i, N. Dickon. See Carrol A 
Reynolds, William M. A question as to validity of Verbal Scale IQ as a WAIS short form. 
Rierdan, Jill, Lang, Elizabeth, and Eddy, Sara. Suicide and transparency responses on the
Rorschach: A replication
Roberts, Karlene H. See Miller, Stephen H. 


: fie. Dayid. 
Roa feet aad Edward Kath J. The relationship of two process measurement quem eed 


Roe, John È., and Edwards, 
R fon theray eae eR ool ave E T T Mato 
oe! ius. See Girodo, $ i 
Roff, James D., and Knight, Raymond. Young adult schizophrenics: Prediction of outcome
and antecedent childhood factors


Rosen, Raymond C., and Kopel, Steven A.
R behavioral treatment a sexual ven I r 

Rosenbaum, Gerald. See Wilson, Robert S.
Rosenblatt, Arthur I., and Pritchard, David A. Moderators of racial differences on the MMPI
Rosenthal, Lisa. See Freedman, Barbara J. 


Rourke, Daniel. See Wilson, Robert S.
Royce, W. Stephen, and Arkowitz, Hal. Multimodal evaluation of practice interactions as
treatment for social isolation


Russell, M. A. H., Raw, Martin, Taylor, C., Feyerabend, C., and Saloojee, Y. Blood nicotine
and carboxyhaemoglobin levels after rapid-smoking aversion therapy

Sadd, Susan, Lenauer, Michael, Shaver, Phillip, and Dunivant, Noel. Objective measurement
of fear of success and fear of failure: A factor analytic approach
Saloojee, Y. See Russell, M. A. H.
Sampson, Harold. See Horowitz, Leonard M.

Sandman, Curt. See Nolan, J. Dennis.
Santo, Yoav. See Carroll, Jerome F. X.
Sarason, Irwin G. See Smith, Ronald E.
Sarason, Irwin G., Johnson, James H., and Siegel, Judith M.
Sarason, Irwin G., and Stoops, Rick. Test anxiety and the passage of time




Scheff, Betty-Jane. See Vannicelli, Marsha.

Schlundt, David G. See Freedman, Barbara J. 

Schuld, Donald. See Watson, Charles G. 

Schuldt, W. John. See Souheaver, Gary T. 

Schwartz, Joseph M. See Gomes-Schwartz, Beverly. 

Schwarz, J. Conrad. See Zuroff, David C. 

Schwarz, Raymond M. See Burkhart, Barry R.

Schwarz, Raymond M., Burkhart, Barry R., and Green, Samuel B. Turning on or turning off:
Sensation seeking or tension reduction as motivational determinants of alcohol use

Scopetta, Mercedes Arca. See Szapocznik, José.

Servos, Andria B. See Garrity, Linda I.

Shapiro, David. See Surwit, Richard S. 

Shaver, Phillip. See Sadd, Susan. 

Shawver, Lois, and Jew, Charles. Predicting violent behavior from WAIS characteristics: A
replication failure

Shealy, Allen E. See Matarazzo, Joseph D.

Shiek, David A., and Miller, John E. Validity generalization of the WISC-R factor structure
with 10½-year-old children

Shipley, Robert H., Butt, James H., Horwitz, Bruce, and Farbry, John E. Preparation for a
stressful medical procedure: Effect of amount of stimulus preexposure and coping style

Shostak, David A., and McIntyre, Curtis W. Stimulus-seeking behavior in three delinquent
personality types

Siegel, Judith M. See Sarason, Irwin G.

Siegelman, Ellen Y. See Horowitz, Leonard M.

Silberfarb, Peter M., Phelps, Patricia J., Hauri, Peter, and Solow, Charles. Effects of intestinal
bypass surgery on body coni


Simons, Lynn S 
_ outcome and follow-up evaluation based on client case records in a mental health center.. « 
Skinner, Harvey A., and Jackson, Douglas N. A model of psychopathology based on an
integration of MMPI actuarial systems
Slaney, Robert B. Therapist and client perceptions of alternative roles for the facilitative
conditions


on the efficacy of de- 


Snyder, C. R. Can clients provide valuable feedback to clinicians about their personality
interpretations? A reply to Greene
Sobell, Linda C., and Sobell, Mark B. Validity of self-reports in three populations of alcoholics


Stake, Jayne E. See Walker, Elaine F.
Stalonas, Peter M., Jr., Johnson, William G., and Christ, Maryann. Behavior modification for
obesity: The evaluation of exercise, contingency management, and program adherence
Stedman, James M., Lawlis, G. Frank, Cortner, Robert H., and Achterberg, Gloria.
Relationships between WISC-R factors, Wide-Range Achievement Test scores, and visual-motor
maturation in children referred for psychological evaluation
Steele, Robert E. Steele’s reply to Bedrosian and Beck 
Steinbock, Elizabeth. See Lewis, Clifford E. 
Steinert, Yvonne. See Campbell, Susan B. 
Stewart, Abigail J. A longitudinal study of coping styles in self-defining and socially defined
women
Stone, Arthur A., and Neale, John M. Life Event Scales: Psychophysical training and rating
dimension effects on event-weighting coefficients
Stoops, Rick. See Sarason, Irwin G.
Strickland, Bonnie R. Internal-external expectancies and health-related behaviors
Sturgis, Ellie T. See Forehand, Rex.
Sturgis, Ellie T., and Adams, Henry E. The right to treatment: Issues in the treatment of
homosexuality




Surwit, Richard S., Shapiro, David, and Good, Michael I. Comparison of cardiovascular
biofeedback, neuromuscular biofeedback, and meditation in the treatment of borderline
essential hypertension
Sutker, Patricia B., Archer, Robert P., and Allain, Albert N. Drug abuse patterns, personality
characteristics, and relationships with sex, race, and sensation seeking
Sutker, Patricia B., Allain, Albert N., Smith, Charles J., and Cohen, Gary H. Addict descriptions
of therapeutic community, multimodality, and methadone maintenance treatment clients
Sutker, Patricia B., Allain, Albert N., and Geyer, Scott. Female criminal violence and
differential MMPI characteristics
Szapocznik, José, Scopetta, Mercedes Arca, Aranalde, Maria de los Angeles, and Kurtines,
William. Cuban value structure: Treatment implications


Torbet, 
Trevithi 
Choice, «5.5 E E T TE O AGEA N a 
Tuma, A. Hussain, May, Philip R. A., Yale, Coralee, and Forsythe, Alan B. Therapist
experience, general clinical ability, and treatment outcome in schizophrenia


Turkewitz, Hillary. See O'Leary, K. Daniel.
Turner, Judith, and McCreary, Charles. Short forms of the MMPI with back pain patients
Turner, Robert G., and Keyson, Mae. Relationships between client self-perceptions of
self-consciousness levels and therapist awareness of these perceptions


Ungaro, Roseann. See Silverman, Lloyd H.
Unger, Thomas. See Lick, John. 


Vail, Anthony. Factors influencing lower-class black patients remaining in treatment


Van Hoose, T. A. See Penk, W. E. 

Vannicelli, Marsha, Washburn, Stephen, Scheff, Betty-Jane, and Longabaugh, Richard.
Comparison of usual and experimental patient care in a psychiatric day center

Vender, Michael. See Jeffery, Robert W. 

Vogt, Arthur T. See Heaton, Robert K. 


Wade, Terry C. See Simons, Lynn S.
Wagner, Nathaniel N. Is masturbation still sinful? Comments on Bailey's comments
Waksman, Steven. Psychometric phrenology revisited: Comments on neuropsychological
testing
Walker, Elaine F., and Stake, Jayne E. Changes in preferences for male and female counselors
Waller, Ronald W., and Keeley, Stuart M. Effects of explanation and information feedback on
the illusory correlation phenomenon
Walters, Joyce. See Kantorowitz, David A.

Ward, Alan J. Early childhood autism and structural therapy: Outcome after 3 years
Washburn, Stephen. See Vannicelli, Marsha.
iia eel Schuld, Donald. Psychosomatic etiological factors 3 pen ee A 


response to Kellerman... -r-e ee AMPI scale to 

Watson, Charles G., and Plemel, Duane. An MMPI scale to separate brain-damaged from
functional psychiatric patients in neuropsychiatric settings

Weerts, Theodore C., and Lang, Peter J. Psychophysiology of fear imagery: Differences between
focal phobia and social performance anxiety

Weiss, Joseph. See Horowitz, Leonard M.

Weiss, Robert L., and Aved, Barbara M. Marital satisfaction and depression as predictors of
physical health status

Weisz, John R. See Glennon, Blair. 

Welling, Martha A. See Berzins, Juris I. 

Wells, Karen C. See Forehand, Rex.

Westlake, Robert J. See Brownell, Kelly D. 

Wetter, Robert E. See Berzins, Juris I. 

Whitman, Douglas. See Wilson, Robert S. 

Whyte, Richard. See Ottomanelli, Gennaro.

Wiens, Arthur N. See Matarazzo, Joseph D. 




Wiggins, Jerry S., and Holzmuller, Ana. Psychological androgyny and interpersonal behavior 40 
Wikoff, Richard L. Correlational and factor analysis of the Peabody Individual Achieves oat 


peas 5 322 
Wilcox, Stuart A. See Acosta, Frank X.
Wildman, Robert W., and Wildman, Robert W., II. Validity of the Verbal Scale IQ as a WAIS
short form: A reply to Reynolds


Wilson, Robert S., Rosenbaum, Gerald, Brown, Gregory, Rourke, Daniel, Whitman, Douglas,
and Grisell, James. An index of premorbid intelligence 


Worell, Judith. Sex roles and psychological well-being: Perspectives on methodology.......... 777 
Wyatt, Richard Jed. See Donnelly, Edward F.


Yale, Coralee. See Tuma, A. Hussain. 
Yamamoto, Joe. See Acosta, Frank X. 
Young, Veronica. See Kelly, Jeffrey A. 
Yurcheson, R. See Melamed, B. G. 


Zeiss, Robert A. Self-directed treatment for premature ejaculation......................... 1234

Ziff, David R. See Newmark, Charles S. 

Zuckerman, Marvin. See Mellstrom, Martin, Jr.

Zuckerman, Marvin, Eysenck, Sybil, and Eysenck, H. J. Sensation seeking in England and
America: Cross-cultural, age, and sex comparisons.......... 139

Zukow, Arnold H. See Zukow, Patricia Goldring. 

Zukow, Patricia Goldring, Zukow, Arnold H., and Bentler, P. M. Rating scales for the
identification and treatment of hyperkinesis

Zuroff, David C., and Schwarz, J. Conrad. Effects of transcendental meditation and muscle 
relaxation on trait anxiety, maladjustment, locus of control, and drug use................. 264 


OTHER 


52, 212, 416 
ii, 230, 431, 872 
1191 


CHARLES C THOMAS • PUBLISHER


SEX-RELATED COGNITIVE DIFFERENCES: An 
Essay on Theory and Evidence by Julia A. 
Sherman. Biological theories, including purported 
sex differences in brain lateralization, and related 
sociocultural determinants are discussed. cloth- 
$16.25, paper-$12.25 


COUNSELING IN COMMUNICATIVE DIS- 
ORDERS edited by R. E. Hartbauer. Each chapter 
focuses on a particular communicative problem, 
giving a general overview of the disorder, coun- 
seling techniques and procedures, and suggestions 
for evaluation of effectiveness. $18.25 


MUSIC THERAPY: An Introduction to Therapy 
and Special Education Through Music (2nd Ptg.) 
by Donald E. Michel. A history of music therapy, its
basic theoretical rationales, and its applications in
therapy, rehabilitation, and special education are 
presented. $11.25 


THE USE OF ALTERNATIVE MODES FOR 
COMMUNICATION IN PSYCHOTHERAPY: 
The Computer, The Book, The Telephone, The 
Television, The Tape Recorder by David Lester. 
The author discusses each communication mode 
and its unique psychotherapeutic qualities. $12.25


GROUP COUNSELING AND GROUP PSYCHOTHERAPY
WITH REHABILITATION CLIENTS
edited by Milton Seligman. Practical applications
of group strategies are given particular emphasis
along with group methods with vocational or
occupational goals. cloth-$19.00, paper-$14.75


CRISIS INTERVENTION AND HOW IT 
WORKS by Romaine V. Edwards. The primary 
focus is on the author’s proven method of crisis 
counseling. $9.25


THEORIES AND METHODS OF GROUP
COUNSELING IN THE SCHOOLS (2nd Ed.)
edited by George M. Gazda. Topics include group
methods for use with the early school child, family
group counseling, and Adlerian group
consultation. $18.25


A PRIMER ON SCHOOL MENTAL HEALTH
CONSULTATION by Morton I. Berkowitz. The
basics of the consultation process (how to start
school consultation, what goes on once it is in
progress, and to what ends it may conclude) are
presented. $11.50


MENTAL EXAMINER'S SOURCE BOOK edited
by Julian C. Davis and John P. Foreyt. Chapters
include the classification of mental retardation,
objective and projective personality testing,
intelligence testing and the mental status examination.


THE HAND TEST: A New Projective Test with
Special Reference to the Prediction of Overt
Aggressive Behavior (5th Ptg.) by Barry Bricklin, Zygmunt
A. Piotrowski and Edwin E. Wagner. The text
covers daily use of the test in clinical practice,
including diagnostic procedures for individual cases.


THE CARE AND MANAGEMENT OF THE SICK
AND INCOMPETENT PHYSICIAN by Robert C.
Green, Jr., George J. Carroll and William D.
Buxton. The authors detail the management,
prevention and identification of problems which
impair the ability of a physician to perform
adequately. $10.00


BASIC APPROACHES TO GROUP PSYCHOTHERAPY
AND GROUP COUNSELING (2nd
Ed., 2nd Ptg.) edited by George M. Gazda. Experts
in the field summarize such techniques as
psychodrama, reality therapy, rational-emotive therapy
and gestalt therapy. $19.50


PSYCHIATRIC PROBLEMS IN OPHTHALMOLOGY
edited by Jerome T. Pearlman, George
L. Adams and Sherwin H. Sloan. Discussions of the
eye and the “I,” reactions to the loss of sight, and
eye symptoms with no organic disease are included
along with helpful diagnostic and management
procedures. $14.50


HUMAN BEHAVIOR GENETICS compiled and
edited by Arnold R. Kaplan. In addition to such
topics as basic human genetics, measuring human
behavior, and genetic factors in intelligence and
personality development, the book shows the need
for careful controls and conservative judgement.
$78.00


CASE STUDIES OF THE CLINICAL INTERPRETATION
OF THE BENDER GESTALT
TEST: Illustrations of the Interpretive Process for
Graduate Training and Continuing Professional
Education by Clifford M. DeCato and Robert J.
Wicks. This book establishes the Bender Gestalt
Test (BGT) as an effective means for assessing
personality dynamics, developmental level and
psychopathology. $11.25


TECHNIQUES FOR BEHAVIOR CHANGE:
Applications of Adlerian Theory (3rd Ptg.) compiled
and edited by Arthur G. Nikelly. Sections include
assessment, group and individual therapy, and
techniques for dealing with special problems. $12.25


THE EMOTIONALLY DISTURBED CHILD: A
Book of Readings (2nd Ptg.) compiled and edited by
Larry A. Faas. The text covers man’s approach to
defining emotional disturbances and offers
discussion on the identification and dynamics of
emotionally disturbed children. $16.75


Orders with remittance sent, on approval, postpaid.


301-327 East Lawrence Avenue • Springfield • Illinois • 62717


Clinical Evaluation 
of Young Children 


with the McCarthy 
Scales 


By ALAN S. KAUFMAN, Ph.D.
and NADEEN L. KAUFMAN, M.A., Ed.D. 
Clinical Evaluation of Young Children with 
the McCarthy Scales is probably the only book 
available on the clinical interpretation of the 
McCarthy test. The authors, who have drawn 
on years of research and clinical experience 
with the test, share a psychometric-clinical- 
learning disabilities orientation towards assess- 
ment. In addition, both authors were involved 
in the development of the scale prior to its 
publication and have published extensively on 
the battery during the past five years. Crucial 
topics such as screening for learning problems, 
administration tips, and writing case reports
are also featured. Readers and users of this 
unique work will gain invaluable insight into 
the McCarthy profile of any children tested. 
1977, 320 pp., $16.50 ISBN: 0-8089-1013-2


The Personal 
Sphere Model 


By RAOUL A. SCHMIEDECK, M.D. 

The Personal Sphere Model offers reliable 
and valid data as an instrument for the assess- 
ment of object relationships and for personal- 
ity research in general. It is a projective test in 
which interpersonal relationships are presented 
by drawings. Both a manual for the Personal 
Sphere Model and an account of its develop- 
ment, this book contains instructions, norms, 
studies in reliability and validity, examples, and 
reviews of supporting work. 

1978, 240 pp., $18.50 ISBN: 0-8089-1093-0 

Send payment with order and save postage 

and handling charge. 

Prices are subject to change without notice.


Grune & Stratton 


A Subsidiary of 
Harcourt Brace Jovanovich, Publishers 
111 FIFTH AVENUE, NEW YORK, N.Y. 10003 
24-28 OVAL ROAD, LONDON NW1 7DX 


Contributions to the 
Psychopathology 
of Schizophrenia 


Edited by BRENDAN A. MAHER 


This collection contains reprints of important 
articles (both empirical reviews and theoretical 
formulations) on schizophrenia that have appeared in 
Progress in Experimental Personality Research. It 
provides a comprehensive review of the advances 
in research on schizophrenia over the past 12 years. 
All articles except the last are accompanied by 
postscripts written by the article authors, indicating 
changes that have occurred in their thinking and 
new data that bear on their original formulations. 
Clinical psychologists, psychiatrists, psychoanalysts, 
and psychopathologists, and abnormal, personality, 
developmental, cognitive, and experimental 
psychologists will all find interesting perspectives 
on their work and the history of its development in 
these seminal articles. 

CONTENTS: P. H. Venables, Input Disfunction in 
Schizophrenia. Postscript. L. J. Chapman et al., A 
Theory of Verbal Behavior in Schizophrenia. Postscript. 
I. I. Gottesman, Contributions of Twin Studies 
to Perspectives in Schizophrenia. Twin Studies 
and Schizophrenia a Decade Later. W. E. Broen, 
Jr., and L. H. Storms, A Theory of Response Interference 
in Schizophrenia. Response Disorganization 
and Narrowed Observation. J. M. Neale and R. L. 
Cromwell, Attention and Schizophrenia. Postscript. 
P. Magaro, Theories of the Schizophrenic Performance 
Deficit. 


1977, 400 pp., $11.25 


Progress in Experimental 


Personality Research 
VOLUME 8 
Edited by BRENDAN A. MAHER 
FROM THE REVIEWS: 
“. . . unusually excellent scholarly and closely 
reasoned review. . . highly recommended.” 
—AMERICAN JOURNAL OF PSYCHIATRY 
“. . . bound to be arresting to many readers, and 
further demonstrates that experimental psychology 
is not so much a subject as a ‘way of life’ in the 
reduction of scientific uncertainty.” 
—Vernon Hamilton in NATURE 
This serial publication is designed to synthesize 
current significant developments in the study of 
personality for psychologists, psychopathologists, 
psychiatrists, sociologists, educators, and other 
students of behavioral science. Contributions range 
from original works in important technical areas to 
summaries of data and integrated reviews of present 
knowledge. 
1978, 368 pp., $23.00 ISBN: 0-12-541408-0 
Send payment with order and save postage 
and handling charge. 
Prices are subject to change without notice. 


ACADEMIC PRESS, INC. 


A Subsidiary of 
Harcourt Brace Jovanovich, Publishers 
111 FIFTH AVENUE, NEW YORK, N.Y. 10003 
24-28 OVAL ROAD, LONDON NW1 7DX 


ISBN: 0-12-465250-6 




AHSEN'S EPT 


Description: Ahsen's Eidetic Parents Test 
(EPT) elicits eidetic images of parents, in 
the mind of the subject, in 30 situations. 
Each response is a triadic experiential unit: 
an image (I), a somatic response (S), 
and a meaning (M). A precise test for 
localizing personality structures, somatic 
and emotional patterns, and 
psychological themes in the form of 
images which do not change unless 
transformed through therapeutic 
dynamics. 


“A methodological advance.” 
~ American Journal of Psychiatry 


“An important addition to the fear 
inventories and reinforcement schedules 
currently in use...a rapid and effective 
method.” ~ Behavior Therapy 


“An exciting and ingenious way of 
getting at conflict areas and provid- 
ing a rich source of material for 
therapeutic sessions...A new and 
intriguing device for personality 
assessment.” 

~ Contemporary Psychology 


Uses: Easily administered within 
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